Utilizing Deep Learning and SVM Models for Schizophrenia Detection and Symptom Severity Estimation Through Structural MRI

Sheriff Alimi*, Afolashade Oluwakemi Kuyoro, Monday Okpoto Eze, Oyebola Akande

Department of Computer Science, Babcock University, Ogun State 121003, Nigeria

Corresponding Author Email: alimi0356@pg.babcock.edu.ng

Page: 993-1002 | DOI: https://doi.org/10.18280/isi.280419

Received: 30 April 2023 | Revised: 10 July 2023 | Accepted: 20 August 2023 | Available online: 31 August 2023

© 2023 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

The automated diagnosis of schizophrenia utilizing Magnetic Resonance Imaging (MRI) has been the subject of numerous investigations, the majority of which have primarily directed their focus towards disorder detection. This study, however, aims to transcend detection, endeavoring to estimate the severity of schizophrenia symptoms by leveraging structural MRI data. Such capabilities are anticipated to enhance the monitoring of treatment efficacy, guide clinical decision-making, and ultimately contribute to improved schizophrenia management. MRI datasets for schizophrenia patients (23) and control subjects (20) were sourced from the OpenNeuro database. Each structural MRI was processed to extract a grayscale image, which was then segmented into White Matter (WM), Gray Matter (GM), and Cerebrospinal Fluid (CSF). Statistical attributes, such as standard deviation, moment, and skewness, were derived from each segment to form feature representations of the grayscale images. An SVM with a linear kernel was trained to distinguish schizophrenia subjects from healthy controls. Furthermore, for the schizophrenia subjects, the sums of their respective Scale for the Assessment of Positive Symptoms (SAPS) and Scale for the Assessment of Negative Symptoms (SANS) scores were computed. A twelve-layer artificial neural network (ANN) was then trained to estimate these symptom severity scores. The SVM model achieved optimal classification accuracy at 81.8%, while the ANN demonstrated a correlation coefficient of 0.811 and a mean absolute error of 1.44 on the validation dataset. This performance surpasses that of a comparable study estimating schizophrenia symptom severity from electroencephalogram (EEG) data, which yielded correlation coefficients ranging from -0.6 to -0.702. The paper concludes with a proposed software architecture for practical application of these findings.

Keywords: 

schizophrenia, detection, severity, support vector, deep learning, Magnetic Resonance Imaging (MRI), regression

1. Introduction

Schizophrenia, a mental disorder typified by hallucinations, delusions, and disordered thinking, typically manifests in adolescence [1-3]. As one of the most prevalent mental disorders, it imposes substantial social disabilities, instigating social, economic, and psychological challenges [1-3]. Astonishingly high mortality rates, including a suicide rate twelve times that of the general population, are associated with schizophrenia [4].

Currently, the diagnosis of schizophrenia relies predominantly on clinical observations and patient interviews, which serve to assess mental status [1, 4]. Guiding this process are two systems: the Diagnostic and Statistical Manual (DSM-IV) and the International Classification of Diseases and Health-Related matters (ICD-10) [4]. However, such observational methods and symptom assessments are inherently subjective, heavily hinging on the clinician's expertise and familiarity with the disorder [2]. Particularly, early differentiation between schizophrenia and bipolar disorder poses a challenge due to shared psychotic symptoms [5]. This underscores the necessity for objective biomarkers that could render diagnoses more consistent, precise, and objective.

Schizophrenia has been observed to correlate with aberrations in brain structure [1, 2]. Studies suggest that schizophrenia patients exhibit decreased grey matter volume in specific brain regions such as the temporal cortex, prefrontal cortex, anterior cingulate cortex, and thalamus [1]. In comparison to healthy individuals, reductions in the grey matter of the fronto-temporolimbic section have been reported in patients with schizophrenia [3]. Structural abnormalities in regions like the middle temporal gyrus and corpus callosum have also been identified [2]. An overall reduction in grey matter is associated with schizophrenia, and these structural changes appear to be linked to the positive symptoms [6].

Structural Magnetic Resonance Imaging (SMRI) can capture these structural changes, making extracted features from such imaging potentially valuable biomarkers for diagnosing schizophrenia [1]. Numerous studies have employed SMRI for differentiating between schizophrenia patients and healthy subjects, typically involving feature extraction from the MRI data and subsequent classification using machine learning algorithms [1, 7]. The highest recorded performance in this field achieved a classification accuracy of 96.7%, utilizing multimodal MRI data and a Support Vector Machine (SVM) [8].

Discriminating between schizophrenia patients and healthy controls is crucial yet insufficient. Estimating the severity of the disorder's symptoms is equally valuable for monitoring treatment responses, particularly to antipsychotic drugs. With quantifiable treatment effectiveness and more precise decision-making facilitated by this novel capability, clinicians' effectiveness in managing schizophrenia could be significantly enhanced. This study proposes the use of structural MRI for classifying schizophrenia and estimating symptom severity, denoting severity scores by summing the Scale for the Assessment of Positive Symptoms (SAPS) and the Scale for Assessment of Negative Symptoms (SANS) from a secondary dataset. The research aims to develop two models: (1) a classification model to distinguish between schizophrenia patients and healthy controls, and (2) a regression model for estimating symptom severity in schizophrenia patients using structural MRI data.

2. Review of Related Works

A plethora of studies have been conducted focusing on the deployment of magnetic resonance imaging (MRI) data as neuro-biomarkers to aid in the diagnosis of schizophrenia. Some have utilized multimodal MRI data in their diagnostic procedures. The detection of schizophrenia is typically treated as a classification problem, with the objective being to differentiate between subjects with schizophrenia and healthy controls or non-schizophrenic subjects. Conversely, the estimation of symptom severity is a regression problem. Despite this, a comprehensive review of the literature reveals that the vast majority of research on the use of MRI for schizophrenia diagnosis addresses only the classification problem.

Following preprocessing of the MRI data, the brain is often segmented into Gray Matter (GM), White Matter (WM), and Cerebro-Spinal Fluid (CSF). Subsequently, features are extracted from these three segments, either automatically or manually. This review will delve into studies pertaining to both methods.

Research that incorporates automated feature extraction from MRI data includes studies [6, 9], while the majority of the reviewed work involves manual feature extraction from the WM, GM, and CSF segments, including but not limited to studies [1, 7, 8, 10, 11].

In the study conducted by Hu et al. [9], convolutional neural networks (CNNs) were utilized to automatically extract features from GM, WM, and CSF probability maps, subsequently performing classification. The achieved classification accuracy was 79.27%. It is noteworthy that this study also explored manual feature extraction. Alternatively, Oh et al. [6] applied 3D-CNN directly to the raw sMRI images of 443 schizophrenia patients and 423 healthy controls, achieving the highest correct classification rate of 97%. However, when a dataset from a different center was utilized, the performance declined to 71%.

Manual extraction of handcrafted features from sMRI data was employed by the aforementioned study [9] to distinguish 289 schizophrenia subjects from 210 healthy controls. After flattening the GM, WM, and CSF maps and conducting principal component analysis (PCA), support vector machine (SVM) was utilized for classification, resulting in a classification accuracy of 69.15%.

The study conducted by Hu et al. [1] merged GM averages from disparate brain areas with polygenic risk scores obtained from blood samples as genetic characteristics. These features were then used to train an ensemble learning classifier (SVM and logistic regression), resulting in an accuracy of 71.8%. It is noteworthy that this study used a dataset of 508 schizophrenia patients and 502 healthy controls.

In study [7], the Total Intracranial Volume (TIV) was calculated from the MRI images by summing up GM, WM, and CSF segments. This TIV, along with age, sex, and a constant (scanner factor) were leveraged to construct a linear model whose accuracy ranged from 69% to 76%. The dataset encompassed 541 schizophrenia patients and 1252 healthy controls.

Chatterjee et al. [10] utilized a multisite MRI dataset (28 schizophrenia patients and 32 healthy controls) that was preprocessed to focus on grey matter (GM) reduction. Voxel-based morphometry (VBM) analysis was performed on the GM segment for feature extraction. A non-dominated sorting genetic algorithm (NSGA-II) was then used to select fewer features that were combined with age and sex. These features were subsequently used in training an SVM for distinguishing schizophrenia subjects from healthy controls. This methodology achieved a 90% classification accuracy.

In the work conducted by Liu et al. [11], a multitude of markers/features were extracted from MRI and DTI data collected from 62 schizophrenia patients and 33 healthy controls. A two-step feature selection mechanism was introduced to narrow down the most discriminative features, which were subsequently employed to train an SVM classifier. The classifier achieved 91.28% accuracy, 90.85% sensitivity, 92.17% specificity, and an AUC of 0.9485.

The seminal study by Nemoto et al. [7] employed multiple MRI images from 446 schizophrenia patients and 1577 healthy controls, segmented into grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF). The total intracranial volume (TIV) of the region of interest (ROI) was computed by aggregating these segments. The residual value (e) for each participant was then established using a linear analysis with age, sex, TIV, and scanner factor as variables. This residual value was utilized as a distinguishing feature between schizophrenia patients and healthy individuals, achieving an accuracy between 69% and 76%.

In contrast, Yang et al. [8] employed a multimodal MRI dataset, incorporating resting-state functional MRI and structural MRI, derived from 44 schizophrenia patients and 56 healthy controls. An automated anatomical labelling atlas facilitated the extraction of representative features such as grey matter volume (GMV), regional homogeneity (ReHo), the amplitude of low-frequency fluctuation (ALFF), and degree of centrality (DC). The combination of recursive feature elimination and a support vector machine (SVM) facilitated the identification of schizophrenia patients, with the highest accuracy of 96.7% achieved when employing ReHo and ALFF as input features.

Notwithstanding the above, no study to date has ventured to estimate the severity of schizophrenia using MRI data. The current study is poised to address this gap and to develop a comprehensive tool for the detection and monitoring of schizophrenia, thereby providing valuable feedback on patient responsiveness during treatment.

Several established psychiatric rating systems exist for determining the severity of psychiatric syndromes, including the Positive and Negative Symptom Scale (PANSS), the Scale for the Assessment of Positive Symptoms (SAPS), and the Scale for Assessment of Negative Symptoms (SANS) [12, 13]. The PANSS system, for instance, encompasses three major components: positive, negative, and cognitive or general psychopathology scales, each with specific items [13, 14]. The SANS measures negative symptoms across five domains, while the SAPS assesses positive symptoms.

The only attempt to estimate symptom severity using neuroimaging data was made by study [15], utilizing electroencephalography (EEG) data. The developed regression models displayed correlation coefficients ranging from -0.6 to -0.702 and mean square errors of 3.34 ± 2.40 and 3.9 ± 3.01 for positive and negative symptom severity respectively. The current research is positioned to extend such efforts to MRI data, driving advancements in schizophrenia severity estimation.

3. Methods

Several studies have focused on using magnetic resonance imaging data as neuro-biomarkers for the diagnosis of schizophrenia. In some cases, multimodal MRI was used in the diagnosis.

The methodology is a multi-stage pipeline with six major stages: data acquisition, structural MRI preprocessing, extraction of statistical features as the dataset, splitting of the dataset into training and test sets, training of the models (classification and regression), and evaluation of the performance of the models. The methodology is represented in Figure 1.

Figure 1. Methods

3.1 Data acquisition

The neuroimaging dataset of schizophrenic and healthy individuals was acquired from OpenNeuro [16], where it is arranged according to the Brain Imaging Data Structure (BIDS) standard. It contains structural and functional MRI data of 102 subjects. Importantly, the dataset is de-identified (anonymized) to protect the privacy of the subjects, which makes it suitable for this research in adhering to ethical research practice.

The subjects are categorized into four groups:

  • Schizophrenia patients (SCZ), twenty-three (23) in total
  • Healthy controls (CON), twenty (20) in total
  • Schizophrenia patients’ siblings (SCZ-SIB), thirty-five (35) in total
  • Healthy controls’ siblings (CON-SIB), twenty-one (21) in total

This research primarily focuses on the structural MRI (T1w.nii.gz) of the twenty-three (23) schizophrenia patients (SCZ) and twenty (20) healthy controls (CON). The pulse sequence of the structural MRI (SMRI) is T1-weighted, with a short repetition time (TR) and a short echo time (TE). This pulse sequence provides contrast between the White Matter (light grey) and the Grey Matter (dark grey), with the Cerebrospinal Fluid appearing dark.

Additional information about the subjects was also provided, including gender, age, and the SAPS and SANS schizophrenia severity scores, among other details. Table 1 shows the demographic distribution of the subjects by gender and age.

Table 1. Statistics of the subjects' count and age distribution

| Group | Gender | Age |
|---|---|---|
| Schizophrenia (23) | Male (17) | 23.81 ± 4.73 |
| | Female (6) | 25.53 ± 4.71 |
| Control (20) | Male (12) | 21.41 ± 4.71 |
| | Female (8) | 19.54 ± 4.59 |

3.2 MRI image preprocessing

This stage comprises several operations: extraction of a greyscale (2D) image from the structural magnetic resonance image, image intensity rescaling or standardization, image denoising or smoothing, skull stripping, and brain tissue segmentation.

Extraction of greyscale image: A greyscale image is extracted from an SMRI file using NiBabel, a Python library for reading and writing neuroimaging files. The T1-weighted data (T1_data) is a 3-D array with three independent axes x, y and z and a data shape of (l, m, n). A greyscale image Y is obtained by taking the middle slice of T1_data along the third axis, defined as:

$Y=\text{T1\_data}[:, :, n // 2]$
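The slicing step can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `load_t1_volume` and `middle_axial_slice` are hypothetical, and a synthetic array stands in for a real scan so that the example runs without a NIfTI file.

```python
import numpy as np


def load_t1_volume(path):
    """Load a T1-weighted NIfTI file as a 3-D float array (needs NiBabel installed)."""
    import nibabel as nib  # imported lazily so the slicing helper works without it

    return nib.load(path).get_fdata()


def middle_axial_slice(t1_data):
    """Return the 2-D slice at index n // 2 along the third axis."""
    if t1_data.ndim != 3:
        raise ValueError("expected a 3-D volume of shape (l, m, n)")
    n = t1_data.shape[2]
    return t1_data[:, :, n // 2]


# Synthetic stand-in for a real scan, with shape (l, m, n) = (4, 5, 7).
volume = np.arange(4 * 5 * 7, dtype=float).reshape(4, 5, 7)
Y = middle_axial_slice(volume)
```

With a real subject, `load_t1_volume` would be called on the subject's T1w.nii.gz file and its result passed to `middle_axial_slice`.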

Image Intensity Standardization: Images from different scanners have different intensity ranges, so it is important that these images are standardized. The mathematical representation of this operation is presented in Eq. (1), and it is computed with NumPy (a Python numerical library).

Standardization $=\frac{I-\min (I)}{\max (I)-\min (I)} * 255$                 (1)

The min(I) and max(I) stand for the minimum and maximum pixel values in the image.
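Eq. (1) translates almost directly into NumPy. The sketch below is illustrative only; the function name `standardize_intensity` and the handling of a constant image (returning zeros to avoid division by zero) are assumptions not stated in the text.

```python
import numpy as np


def standardize_intensity(image):
    """Rescale pixel values to the range [0, 255] following Eq. (1)."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Assumption: a constant image carries no contrast, so map it to zeros.
        return np.zeros_like(image)
    return (image - lo) / (hi - lo) * 255.0


raw = np.array([[100.0, 300.0], [500.0, 900.0]])
scaled = standardize_intensity(raw)
```

After rescaling, images acquired with different intensity ranges share the same 0 to 255 scale, which is what makes the later intensity-based clustering comparable across subjects.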

Skull stripping: This is an important part of the preprocessing pipeline, as it separates the brain tissue (cerebellum and cortex) from the neighboring regions. The process consists of a series of operations: segmentation, binarization, and iterated erosion and dilation morphological operations [17]. The K-means segmentation algorithm partitions the image into foreground and background. For binarization, the background pixels are assigned a value of zero and the foreground pixels a value of one. The next key operations are dilation and erosion, represented by Eqs. (2)-(3), respectively, which were applied repeatedly in an appropriate order. The preliminary outcome is the brain tissue mask, which is used to mask out the skull from the standardized image to obtain the region of interest. The morphological and segmentation operations are performed with OpenCV, a computer vision library with Python bindings.

$A \oplus B=\left\{z \mid(\hat{B})_z \cap A \neq \emptyset\right\}$      (2)

$A \ominus B=\left\{z \mid B_z \subseteq A\right\}$           (3)

A is the image, B is the structuring element, $B_z$ is B translated by z, and $\hat{B}$ is the reflection of B about its origin.
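To make the two morphological operators concrete, the sketch below implements binary dilation and erosion with a square structuring element in plain NumPy and combines them into an opening (erosion followed by dilation) that removes a small isolated speck from a mask. It is a stand-in for the OpenCV routines used in the study, not a reproduction of them; the 3 × 3 element, the zero-padded border, and the single opening pass are assumptions.

```python
import numpy as np


def _translations(mask, size):
    """Yield views of `mask` shifted by every offset of a size x size square."""
    pad = size // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    rows, cols = mask.shape
    for dr in range(size):
        for dc in range(size):
            yield padded[dr:dr + rows, dc:dc + cols]


def dilate(mask, size=3):
    """Eq. (2): keep z if the element placed at z overlaps the foreground."""
    out = np.zeros(mask.shape, dtype=bool)
    for view in _translations(np.asarray(mask, dtype=bool), size):
        out |= view
    return out


def erode(mask, size=3):
    """Eq. (3): keep z if the element placed at z lies entirely in the foreground."""
    out = np.ones(mask.shape, dtype=bool)
    for view in _translations(np.asarray(mask, dtype=bool), size):
        out &= view
    return out


mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True  # compact blob standing in for brain tissue
mask[0, 6] = True      # isolated speck standing in for skull residue

opened = dilate(erode(mask))              # opening removes the speck
image = np.full((7, 7), 200.0)            # hypothetical standardized image
stripped = np.where(opened, image, 0.0)   # apply the mask to the image
```

Because the square element is symmetric, its reflection equals the element itself, so the same shifted views serve both operators.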

Brain tissue segmentation: The brain tissue is segmented into four compartments with the Gaussian-mixture model (GMM) from Scikit-learn, a machine learning tool in Python. The compartments are the background, grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF). The background segment is eliminated from further processing as it does not convey any useful information.
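A possible form of this step with scikit-learn's `GaussianMixture` is sketched below. Several details are assumptions made for illustration: the function name `segment_tissue`, the initialization of the component means from intensity quantiles, and the relabeling of components by ascending mean intensity (so that, for a T1-weighted image, label 0 would be the background and labels 1 to 3 would correspond to CSF, GM, and WM). A synthetic four-band image stands in for a skull-stripped slice.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def segment_tissue(image, n_components=4, seed=0):
    """Cluster pixel intensities with a GMM and label clusters from dark to bright."""
    pixels = np.asarray(image, dtype=float).reshape(-1, 1)
    # Assumption: spread the initial means over intensity quantiles for a stable fit.
    quantiles = (np.arange(n_components) + 0.5) / n_components
    means_init = np.quantile(pixels, quantiles).reshape(-1, 1)
    gmm = GaussianMixture(
        n_components=n_components, means_init=means_init, random_state=seed
    )
    labels = gmm.fit_predict(pixels)
    # Map each component index to its rank by mean intensity (0 = darkest).
    order = np.argsort(gmm.means_.ravel())
    rank = np.empty(n_components, dtype=int)
    rank[order] = np.arange(n_components)
    return rank[labels].reshape(np.shape(image))


# Synthetic image: four horizontal bands with increasing mean intensity.
rng = np.random.default_rng(0)
band_means = [5.0, 60.0, 130.0, 220.0]
image = np.vstack([rng.normal(m, 5.0, size=(10, 40)) for m in band_means])
segmented = segment_tissue(image)
```

Pixels labeled 0 would then be discarded, and the remaining three label sets would supply the pixel values for feature extraction.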

3.3 Feature extraction

Statistical information is extracted from the three segmented pixel clusters (GM, WM, and CSF), and together these statistics form the representation of each structural magnetic resonance image. The statistical features obtained are standard deviation, entropy, skewness, kurtosis, and moments, which are computed using NumPy from the pixel values ($v_i$) of each cluster as represented in Eqs. (4)-(8), where n is the number of pixels in the cluster and u is their mean.

Standard deviation (σ):

$\sigma^2=\left(\left(\frac{1}{n}\right) \sum_i\left(v_i-u\right)^2\right)$           (4)

Skewness: $s k w=\left(\frac{1}{n \sigma^3}\right) \sum_i\left(v_i-u\right)^3$                  (5)

Entropy: $E=-k \sum_i p_i \log _e p_i$                  (6)

Kurtosis: Kurt $=\left(\frac{1}{n \sigma^4}\right) \sum_i\left(v_i-u\right)^4$              (7)

Moment: $m_k=\left(\frac{1}{n}\right) \sum_i\left(v_i-u\right)^{\mathrm{k}}$              (8)
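Eqs. (4)-(8) can be sketched in NumPy as shown below. The text does not state how the probabilities $p_i$ are estimated, the value of the entropy constant k, or the moment order, so the 256-bin histogram, the constant of 1, and the default order of 2 used here are assumptions, and the function name `tissue_features` is hypothetical. The sketch is therefore not expected to reproduce the values reported in Table 2.

```python
import numpy as np


def tissue_features(values, order=2, bins=256):
    """Compute the statistics of Eqs. (4)-(8) for one tissue cluster."""
    v = np.asarray(values, dtype=float).ravel()
    u = v.mean()
    sigma = np.sqrt(np.mean((v - u) ** 2))        # Eq. (4)
    skew = np.mean((v - u) ** 3) / sigma ** 3     # Eq. (5)
    kurt = np.mean((v - u) ** 4) / sigma ** 4     # Eq. (7)
    moment = np.mean((v - u) ** order)            # Eq. (8)
    # Eq. (6), assuming p_i comes from an intensity histogram and k = 1.
    counts, _ = np.histogram(v, bins=bins)
    p = counts[counts > 0] / counts.sum()
    entropy = -np.sum(p * np.log(p))
    return {
        "std": sigma,
        "skew": skew,
        "entropy": entropy,
        "kurt": kurt,
        "moment": moment,
    }


features = tissue_features([2, 4, 4, 4, 5, 5, 7, 9])
```

Applying the function to the GM, WM, and CSF pixel sets of one subject and concatenating the results gives that subject's feature vector.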

Several studies have used these statistical parameters, extracted from an object's image or signal, as its features. The study of Kim et al. [18] shows that these parameters, considered conventional features extracted from EEG, can be used for the classification of schizophrenia, while Alimi et al. [19] successfully used the parameters extracted from images of red blood cells to distinguish cells infected by malaria parasites from uninfected cells. Some of the parameters are also used as acoustic features for discriminating schizophrenia subjects from healthy ones [20].

3.4 Training of classification and regression models

The extracted statistical features are used to train the classifier and regression models. The classifier will learn to differentiate between healthy control subjects and schizophrenia subjects based on the input features extracted from the three segments of the brain (CSF, WM, and GM). An SVM was chosen as the classifier for this study. Recursive feature elimination (RFE) is used to select the seven most important features for segregation between schizophrenia and healthy control datasets before the classifier is trained. The sum of the SAPS and SANS total scores represents the severity of schizophrenia symptoms. An artificial neural network with twelve layers is proposed as the regression model. With regards to the classification problem, 43 data points are used, which are divided into two in the ratio of 75%:25% for training and test sets, respectively. Concerning the regression problem, only data points from the 23 schizophrenia subjects were used, which were divided into training and test sets in a ratio of 75%:25% in favour of the training set.
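The classification branch described above can be sketched with scikit-learn as follows. The data are synthetic stand-ins with the same shape as the study's dataset (43 subjects, 17 features), and the feature standardization, the stratified split, and the random seeds are assumptions added for illustration rather than details taken from the study.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 23 schizophrenia subjects (1) and 20 controls (0).
rng = np.random.default_rng(0)
y = np.array([1] * 23 + [0] * 20)
X = rng.normal(size=(43, 17))
X[:, :3] += 1.5 * y[:, None]  # make the first three columns informative

# 75%:25% split; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE keeps seven features using a linear SVM, then a linear SVM classifies.
model = make_pipeline(
    StandardScaler(),
    RFE(SVC(kernel="linear"), n_features_to_select=7),
    SVC(kernel="linear"),
)
model.fit(X_train, y_train)

selected = model.named_steps["rfe"].support_  # boolean mask over the 17 features
accuracy = model.score(X_test, y_test)
```

With 43 data points, a 25% test share yields 11 test subjects. Placing the selector inside the pipeline means feature selection only ever sees the training split.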

3.5 Model validation and performance evaluation

After the training of the classifier and the regression model, the respective test datasets are used for validation. The performance of the SVM classifier is measured using four metrics: accuracy, precision, recall, and F1-score. For the 12-layer ANN regression model, the correlation coefficient and the mean absolute error are considered for its performance evaluation.
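For reference, the evaluation metrics named above can be computed as in the following sketch. The function names and the toy label and score vectors are illustrative only; both the mean absolute error and the mean square error are returned so that either can be reported.

```python
import numpy as np


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with class 1 as the positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": np.mean(y_true == y_pred),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def regression_metrics(y_true, y_pred):
    """Pearson correlation, mean absolute error and mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return {
        "r": np.corrcoef(y_true, y_pred)[0, 1],
        "mae": np.mean(np.abs(y_true - y_pred)),
        "mse": np.mean((y_true - y_pred) ** 2),
    }


clf = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
reg = regression_metrics([10, 20, 30, 40], [12, 18, 33, 39])
```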

4. Results

The acquired data for the research comprises structural magnetic resonance images of forty-three (43) subjects (control and schizophrenia groups) and their corresponding age and gender information. The control group consists of twelve (12) male and eight (8) female subjects, while the schizophrenia group consists of seventeen (17) males and six (6) females.

With the extraction of the grey scale image from the BIDS store for each of the subjects, the next operation performed on the image is the normalization to address variations in light intensities from different MRI scanners. Figure 2 shows the image before and after standardization. Also, a 5 by 5 Gaussian kernel was applied to the standardized image to remove noise (smoothing or denoising), and the resultant image is shown in Figure 3.
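The smoothing step can be illustrated with a separable 5 × 5 Gaussian filter written in NumPy. The kernel width is the only parameter given in the text; the standard deviation of 1.0, the edge-replicating border, and the function names are assumptions, and the sketch stands in for whichever library routine was actually used.

```python
import numpy as np


def gaussian_kernel_1d(size=5, sigma=1.0):
    """Normalized 1-D Gaussian kernel; sigma is an assumed value."""
    x = np.arange(size) - size // 2
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_smooth(image, size=5, sigma=1.0):
    """Apply a size x size Gaussian filter as two 1-D passes."""
    kernel = gaussian_kernel_1d(size, sigma)
    padded = np.pad(np.asarray(image, dtype=float), size // 2, mode="edge")
    # Horizontal pass over each row, then vertical pass over each column.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


noisy = np.zeros((9, 9))
noisy[4, 4] = 255.0  # a single bright outlier pixel
smoothed = gaussian_smooth(noisy)
```

The outlier's energy is spread over its 5 × 5 neighbourhood, which is the behaviour that suppresses isolated noisy pixels before skull stripping.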

Skull-stripping is performed with k-means segmentation, binarization, a series of iterative morphological operations, and masking. Figure 4 shows the smoothed image, the mask of the area of interest, and the stripped brain image. In addition, Figure 5 is a gallery of stripped brain images of some of the subjects.

Figure 2. Image before and after standardization

Figure 3. Image before and after the smoothing operation using Gaussian kernel

Figure 4. The normalized, smoothed images and the mask of the brain tissue

Figure 5. Gallery of some skull-stripped MRI images

Figure 6. Segmentation of the smoothed image into grey matter, white matter and cerebrospinal fluid

The Gaussian-mixture model (GMM) was used to segment the stripped image into four clusters: the background, the Grey Matter, the White Matter and the Cerebrospinal Fluid. The background cluster is excluded from further processing. The outcome of the segmentation into GM, WM, and CSF is depicted in Figure 6.

Figure 7 shows the histogram of the intensity of cerebral-spinal fluid, grey matter, and white matter segments represented with red, green, and blue colors, respectively, indicating clear separation and effectiveness of the segmentation process. The histogram shows that the higher the intensity or pixel value, the higher the probability of being in the WM cluster, while those with lower values have a higher probability of being in the CSF cluster.

Figure 7. Histograms of brain intensity distribution by tissue class (CSF, GM and WM)

Figure 8. Correlation coefficients of the features and schizophrenia severity score

Statistical information such as standard deviation, entropy, skewness, kurtosis, and moments was obtained from CSF, GM, and WM to form the extracted features for the brain tissue segments of each subject. Age and gender information is also included as part of the features representing each subject. Table 2 shows the dataset with these features; the table also includes the class label and severity scores. The class label 0 denotes a healthy control subject, while 1 represents a schizophrenia subject. The schizophrenia symptom severity score is the summation of the SANS total score and the SAPS total score.

Table 2. Features extracted from GM, WM and CSF segments and other important information

| Participant_ID | Condit | Gender | Age | GM_Entropy | WM_Entropy | CSF_Entropy | GM_Skew |
|---|---|---|---|---|---|---|---|
| sub-01 | 1 | 1 | 28.961 | 12.26378 | 12.72852 | 13.03354 | 2.691613 |
| sub-05 | 1 | 2 | 25.6454 | 11.76337 | 12.13751 | 12.37536 | 3.289669 |
| sub-07 | 1 | 2 | 29.9603 | 11.80682 | 12.47576 | 12.5361 | 3.3212 |
| sub-09 | 1 | 1 | 23.1157 | 12.02702 | 12.34806 | 12.52394 | 2.933773 |
| sub-11 | 0 | 1 | 27.5838 | 12.00111 | 12.35793 | 12.58978 | 2.955976 |
| sub-12 | 0 | 2 | 18.768 | 11.85462 | 12.33345 | 12.66736 | 3.175685 |
| sub-15 | 0 | 1 | 18.9706 | 12.17637 | 12.11962 | 12.44651 | 2.710635 |
| sub-17 | 1 | 2 | 25.3771 | 12.47371 | 12.49474 | 11.88929 | 2.358542 |
| sub-20 | 0 | 1 | 24.8323 | 11.96066 | 12.48804 | 12.4189 | 3.045928 |
| sub-27 | 1 | 1 | 22.2834 | 12.04604 | 12.6131 | 12.26011 | 2.914034 |
| sub-31 | 1 | 1 | 24.5749 | 12.42946 | 13.01694 | 12.86257 | 2.485679 |
| sub-35 | 0 | 1 | 12.6051 | 12.37117 | 12.30057 | 12.68474 | 2.473272 |
| sub-36 | 0 | 1 | 13.1307 | 12.57413 | 12.80382 | 12.72937 | 2.260366 |
| sub-37 | 0 | 1 | 14.6092 | 12.6049 | 12.34948 | 12.72825 | 2.233632 |
| sub-43 | 0 | 2 | 13.8344 | 12.39658 | 12.78699 | 13.16301 | 2.523116 |
| sub-44 | 1 | 1 | 19.9562 | 12.35579 | 12.47026 | 12.92791 | 2.522139 |
| sub-46 | 0 | 2 | 15.5099 | 11.96969 | 12.39618 | 12.53991 | 3.010734 |
| sub-49 | 0 | 2 | 17.2512 | 12.00598 | 12.82691 | 12.77376 | 3.022601 |
| sub-50 | 0 | 1 | 21.5168 | 12.30727 | 12.64311 | 13.01111 | 2.61066 |
| sub-54 | 0 | 2 | 20.4025 | 11.73907 | 12.05339 | 12.29319 | 3.340324 |
| sub-57 | 0 | 2 | 22.9541 | 12.14335 | 12.17836 | 12.32658 | 2.753877 |
| sub-60 | 1 | 1 | 24.9802 | 12.24745 | 13.12724 | 12.75381 | 2.682995 |
| sub-62 | 1 | 1 | 20.835 | 9.595401 | 13.08953 | 12.34063 | 7.533185 |
| sub-64 | 0 | 1 | 22.7105 | 12.20094 | 12.92608 | 13.08893 | 2.746763 |
| sub-70 | 1 | 1 | 19.6906 | 12.11621 | 12.58184 | 12.88524 | 2.878077 |
| sub-72 | 0 | 1 | 27.6468 | 12.35759 | 12.93961 | 12.45314 | 2.532012 |
| sub-74 | 1 | 1 | 19.4579 | 12.20314 | 12.72635 | 12.89215 | 2.757957 |
| sub-76 | 1 | 2 | 16.1725 | 12.12464 | 12.602 | 12.71963 | 2.835174 |
| sub-77 | 1 | 2 | 29.7221 | 12.03648 | 12.43141 | 12.48822 | 2.915327 |
| sub-79 | 1 | 1 | 29.2156 | 12.15814 | 12.74276 | 12.49014 | 2.806728 |
| sub-81 | 0 | 2 | 24.0329 | 11.57833 | 12.61144 | 12.21913 | 3.584144 |
| sub-82 | 1 | 1 | 20.7036 | 11.97128 | 12.6046 | 12.61114 | 3.07267 |
| sub-85 | 1 | 1 | 21.473 | 8.123896 | 12.65039 | 11.83086 | 12.80303 |
| sub-88 | 0 | 1 | 17.6865 | 7.565774 | 12.60962 | 12.63499 | 15.60003 |
| sub-91 | 1 | 1 | 23.0746 | 12.03369 | 12.96656 | 12.76931 | 3.013987 |
| sub-92 | 1 | 2 | 26.2861 | 5.551898 | 12.70285 | 12.43918 | 31.07688 |
| sub-94 | 1 | 1 | 26.4504 | 11.98281 | 12.4255 | 12.55032 | 2.992535 |
| sub-95 | 1 | 1 | 27.7728 | 6.691368 | 12.73918 | 12.62031 | 21.08423 |
| sub-96 | 1 | 1 | 26.0315 | 9.375612 | 12.85323 | 12.29352 | 8.17902 |
| sub-97 | 0 | 2 | 23.5893 | 11.82164 | 12.36482 | 12.35344 | 3.200212 |
| sub-99 | 1 | 1 | 26.2177 | 12.32393 | 12.88066 | 12.84452 | 2.622934 |
| sub-101 | 0 | 1 | 27.9069 | 11.74004 | 12.80538 | 12.54912 | 3.376228 |
| sub-102 | 0 | 1 | 27.6797 | 12.46726 | 12.89205 | 12.97723 | 2.409048 |

| Participant_ID | Condit | Gender | WM_Skew | CSF_Skew | GM_Kurt | WM_Kurt | CSF_Kurt |
|---|---|---|---|---|---|---|---|
| sub-01 | 1 | 1 | 1.985023 | 1.664887 | 5.625375 | 2.010741 | 0.874765 |
| sub-05 | 1 | 2 | 2.687912 | 2.404193 | 9.120774 | 5.281534 | 3.864119 |
| sub-07 | 1 | 2 | 2.284033 | 2.237405 | 9.536508 | 3.302552 | 3.146744 |
| sub-09 | 1 | 1 | 2.431038 | 2.231216 | 6.880804 | 3.98261 | 3.071991 |
| sub-11 | 0 | 1 | 2.410243 | 2.144477 | 6.994258 | 3.85864 | 2.66835 |
| sub-12 | 0 | 2 | 2.45376 | 2.065538 | 8.391828 | 4.106116 | 2.359412 |
| sub-15 | 0 | 1 | 2.710385 | 2.313825 | 5.549653 | 5.401905 | 3.426481 |
| sub-17 | 1 | 2 | 2.259034 | 3.060007 | 3.772621 | 3.180216 | 7.605851 |
| sub-20 | 0 | 1 | 2.2596 | 2.359489 | 7.619659 | 3.168865 | 3.669207 |
| sub-27 | 1 | 1 | 2.106314 | 2.557381 | 6.775699 | 2.482405 | 4.655326 |
| sub-31 | 1 | 1 | 1.666598 | 1.855158 | 4.54157 | 0.85202 | 1.551471 |
| sub-35 | 0 | 1 | 2.486544 | 2.037742 | 4.320545 | 4.247953 | 2.227597 |
| sub-36 | 0 | 1 | 1.904577 | 2.002929 | 3.352107 | 1.709058 | 2.123538 |
| sub-37 | 0 | 1 | 2.431768 | 2.003842 | 3.251137 | 3.990645 | 2.125017 |
| sub-43 | 0 | 2 | 1.93041 | 1.53436 | 4.727364 | 1.825502 | 0.474982 |
| sub-44 | 1 | 1 | 2.286157 | 1.773638 | 4.626502 | 3.302493 | 1.237416 |
| sub-46 | 0 | 2 | 2.36948 | 2.210769 | 7.357092 | 3.677938 | 2.976398 |
| sub-49 | 0 | 2 | 1.87586 | 1.95703 | 7.533216 | 1.592919 | 1.950048 |
| sub-50 | 0 | 1 | 2.089474 | 1.696708 | 5.132816 | 2.452331 | 0.999641 |
| sub-54 | 0 | 2 | 2.805965 | 2.515272 | 9.498311 | 5.956816 | 4.43835 |
| sub-57 | 0 | 2 | 2.636091 | 2.463638 | 5.792543 | 5.005786 | 4.154982 |
| sub-60 | 1 | 1 | 1.539838 | 1.973392 | 5.520499 | 0.429406 | 2.000313 |
| sub-62 | 1 | 1 | 1.588713 | 2.630482 | 55.30403 | 0.598988 | 5.374245 |
| sub-64 | 0 | 1 | 1.768428 | 1.602328 | 5.882329 | 1.208019 | 0.666821 |
| sub-70 | 1 | 1 | 2.154943 | 1.827423 | 6.67573 | 2.718866 | 1.444613 |
| sub-72 | 0 | 1 | 1.758182 | 2.304305 | 4.69582 | 1.176168 | 3.381497 |
| sub-74 | 1 | 1 | 1.991945 | 1.822172 | 5.98601 | 2.049448 | 1.430453 |
| sub-76 | 1 | 2 | 2.133213 | 2.015109 | 6.357356 | 2.629564 | 2.174868 |
| sub-77 | 1 | 2 | 2.330811 | 2.266968 | 6.765263 | 3.516472 | 3.216474 |
| sub-79 | 1 | 1 | 1.968295 | 2.287928 | 6.23255 | 1.944068 | 3.368965 |
| sub-81 | 0 | 2 | 2.115741 | 2.607577 | 11.21939 | 2.540522 | 4.913155 |
| sub-82 | 1 | 1 | 2.123087 | 2.138037 | 7.869415 | 2.569882 | 2.681093 |
| sub-85 | 1 | 1 | 2.07612 | 3.28911 | 163.9253 | 2.387369 | 9.315454 |
| sub-88 | 0 | 1 | 2.114881 | 2.100576 | 244.444 | 2.529592 | 2.500617 |
| sub-91 | 1 | 1 | 1.722732 | 1.964015 | 7.555999 | 1.044702 | 1.979462 |
| sub-92 | 1 | 2 | 2.002396 | 2.328674 | 966.8913 | 2.053562 | 3.508466 |
| sub-94 | 1 | 1 | 2.328989 | 2.194856 | 7.241432 | 3.474183 | 2.896855 |
| sub-95 | 1 | 1 | 1.969884 | 2.130251 | 446.5164 | 1.944178 | 2.655247 |
| sub-96 | 1 | 1 | 1.847686 | 2.67935 | 65.62735 | 1.489496 | 5.618877 |
| sub-97 | 0 | 2 | 2.40312 | 2.428423 | 8.516998 | 3.827371 | 3.97517 |
| sub-99 | 1 | 1 | 1.815508 | 1.879332 | 5.279062 | 1.368644 | 1.651719 |
| sub-101 | 0 | 1 | 1.891488 | 2.203198 | 9.812996 | 1.633736 | 2.949685 |
| sub-102 | 0 | 1 | 1.797209 | 1.722933 | 4.107153 | 1.289614 | 1.066185 |

| Participant_ID | Condit | Gender | WM_Std | CSF_Std | GM_Moment | WM_Moment | CSF_Moment |
|---|---|---|---|---|---|---|---|
| sub-01 | 1 | 1 | 54.70283 | 42.45232 | 436.7411 | 436.7411 | 1802.199 |
| sub-05 | 1 | 2 | 45.34751 | 37.8722 | 466.2618 | 466.2618 | 1434.304 |
| sub-07 | 1 | 2 | 49.20678 | 35.30661 | 303.0794 | 303.0794 | 1246.557 |
| sub-09 | 1 | 1 | 56.46307 | 44.46524 | 689.9869 | 689.9869 | 1977.158 |
| sub-11 | 0 | 1 | 39.95149 | 33.55183 | 379.3476 | 379.3476 | 1125.726 |
| sub-12 | 0 | 2 | 56.01636 | 46.01071 | 603.2624 | 603.2624 | 2116.986 |
| sub-15 | 0 | 1 | 54.51003 | 46.4504 | 910.362 | 910.362 | 2157.639 |
| sub-17 | 1 | 2 | 35.95828 | 38.36526 | 592.6041 | 592.6041 | 1471.893 |
| sub-20 | 0 | 1 | 56.01038 | 41.31276 | 560.1405 | 560.1405 | 1706.744 |
| sub-27 | 1 | 1 | 47.91075 | 32.00488 | 406.4956 | 406.4956 | 1024.312 |
| sub-31 | 1 | 1 | 69.08244 | 47.27261 | 650.1026 | 650.1026 | 2234.7 |
| sub-35 | 0 | 1 | 42.94915 | 36.56025 | 532.312 | 532.312 | 1336.652 |
| sub-36 | 0 | 1 | 56.25591 | 39.26871 | 609.6807 | 609.6807 | 1542.031 |
| sub-37 | 0 | 1 | 59.33484 | 47.96401 | 888.9289 | 888.9289 | 2300.546 |
| sub-43 | 0 | 2 | 53.34031 | 40.47425 | 390.2519 | 390.2519 | 1638.165 |
| sub-44 | 1 | 1 | 65.39031 | 54.18927 | 897.8488 | 897.8488 | 2936.477 |
| sub-46 | 0 | 2 | 48.84003 | 38.4539 | 484.1048 | 484.1048 | 1478.703 |
| sub-49 | 0 | 2 | 65.52571 | 45.0955 | 518.1019 | 518.1019 | 2033.604 |
| sub-50 | 0 | 1 | 62.54297 | 47.47646 | 602.0927 | 602.0927 | 2254.014 |
| sub-54 | 0 | 2 | 45.56498 | 36.52336 | 430.4714 | 430.4714 | 1333.956 |
| sub-57 | 0 | 2 | 49.94759 | 40.20628 | 725.0311 | 725.0311 | 1616.545 |
| sub-60 | 1 | 1 | 75.32959 | 48.6925 | 726.2765 | 726.2765 | 2370.96 |
| sub-62 | 1 | 1 | 62.12871 | 19.82407 | 14.49107 | 14.49107 | 392.9938 |
| sub-64 | 0 | 1 | 52.06992 | 38.56744 | 364.6892 | 364.6892 | 1487.447 |
| sub-70 | 1 | 1 | 51.37856 | 40.37275 | 411.7972 | 411.7972 | 1629.959 |
| sub-72 | 0 | 1 | 38.63164 | 45.47531 | 440.6541 | 440.6541 | 2068.004 |
| sub-74 | 1 | 1 | 58.26906 | 43.80368 | 487.2381 | 487.2381 | 1918.763 |
| sub-76 | 1 | 2 | 58.9191 | 43.65753 | 576.6256 | 576.6256 | 1905.98 |
| sub-77 | 1 | 2 | 47.517 | 38.08298 | 517.321 | 517.321 | 1450.314 |
| sub-79 | 1 | 1 | 50.65068 | 33.36673 | 373.8015 | 373.8015 | 1113.338 |
| sub-81 | 0 | 2 | 57.08382 | 37.755 | 439.3729 | 439.3729 | 1425.44 |
| sub-82 | 1 | 1 | 54.35659 | 40.12008 | 428.9033 | 428.9033 | 1609.621 |
| sub-85 | 1 | 1 | 62.8046 | 21.84646 | 7.706772 | 7.706772 | 477.268 |
| sub-88 | 0 | 1 | 69.1323 | 52.65021 | 7.937528 | 7.937528 | 2772.045 |
| sub-91 | 1 | 1 | 74.36122 | 49.86387 | 593.3442 | 593.3442 | 2486.405 |
| sub-92 | 1 | 2 | 48.84925 | 34.84938 | 1.195492 | 1.195492 | 1214.479 |
| sub-94 | 1 | 1 | 57.74932 | 46.61135 | 714.2899 | 714.2899 | 2172.618 |
| sub-95 | 1 | 1 | 51.55567 | 36.01391 | 2.013435 | 2.013435 | 1297.002 |
| sub-96 | 1 | 1 | 64.26879 | 22.00471 | 15.18876 | 15.18876 | 484.2071 |
| sub-97 | 0 | 2 | 49.15964 | 38.40915 | 514.305 | 514.305 | 1475.263 |
| sub-99 | 1 | 1 | 60.21483 | 41.48608 | 465.2156 | 465.2156 | 1721.095 |
| sub-101 | 0 | 1 | 49.77279 | 34.72399 | 318.166 | 318.166 | 1205.756 |
| sub-102 | 0 | 1 | 61.45561 | 45.48675 | 622.6346 | 622.6346 | 2069.044 |

Figure 8 shows the correlation coefficients between each of the seventeen extracted features and the symptom severity score. The strongest positive correlations were obtained for GM entropy, CSF entropy, CSF standard deviation, and CSF moment, with values of 0.224, 0.319, 0.234, and 0.234, respectively.
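The ranking described above can be sketched as follows. This is a minimal illustration that assumes the coefficients are Pearson correlations; the function names and the toy values are hypothetical and are not taken from the study's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: correlation is undefined, treat as 0
    return cov / (sx * sy)

def top_positive_features(features, target, k):
    """Return up to k (name, r) pairs with the largest positive correlation."""
    scores = {name: pearson(col, target) for name, col in features.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, r) for name, r in ranked if r > 0][:k]

# Toy, made-up values purely for illustration (not the study's data).
toy_features = {
    "feat_a": [1.0, 2.0, 3.0, 4.0],
    "feat_b": [4.0, 3.0, 2.0, 1.0],
    "feat_c": [1.0, 3.0, 2.0, 4.0],
}
toy_severity = [2.0, 4.0, 6.0, 8.0]
selected = top_positive_features(toy_features, toy_severity, k=2)
print(selected)
```

In this toy case, the negatively correlated column is dropped and the remaining two are returned in descending order of correlation.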

The twelve-layer neural network was trained on the training dataset to predict the schizophrenia symptom severity score from features extracted from the three brain tissue segments (GM, WM, and CSF). Figure 9 shows the loss versus iteration number during training. The five features with the best positive correlation coefficients were selected to build the regression model.

The test dataset was used to validate the ANN regression model. The predicted and actual severity scores are presented in Figure 10.

The correlation coefficient and mean absolute error (MAE) between the predicted and actual scores are 0.811 and 1.44, respectively. The correlation value indicates a strong relationship between the model output and the actual scores, which is corroborated by Figure 10, where the two lines follow the same trend.
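For reference, the two evaluation metrics can be written out as below. This assumes the reported correlation coefficient is Pearson's r, which the text does not state explicitly. The symbols are introduced here for illustration: y_i is the actual severity score of test subject i, \hat{y}_i is the predicted score, and n is the number of test subjects.

```latex
r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}
         {\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}},
\qquad
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert
```

Note that r measures how well the predictions track the trend of the actual scores, while the MAE measures the average size of the error in score units, so the two values are complementary.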

For classification, that is, distinguishing schizophrenia subjects from healthy subjects, recursive feature elimination (RFE) was used to select the seven most significant of the seventeen features. The selected features are age, gender, GM entropy, WM kurtosis, CSF entropy, CSF kurtosis, and CSF skewness, which together form a low-dimensional dataset.

Figure 9. Plot of loss vs iteration number during the ANN regression model training

Figure 10. Plot of the actual severity scores and ANN model predicted scores

Figure 11. Confusion matrix of the SVM classifier on the test dataset

An SVM was trained, and the test dataset was used to validate the classification model. Figure 11 is the confusion matrix representation of the validation output.

The classification accuracy, precision, recall, and F1-score computed from the confusion matrix are 81.8%, 87.9%, 81.8%, and 82.1%, respectively, as presented in Figure 12.

Figure 12. The evaluation outcome of the SVM classifier
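The sketch below shows how such metrics can be derived from a confusion matrix. It assumes support-weighted averaging across the two classes, which the text does not state; this assumption is suggested by the reported recall being equal to the accuracy, a property that weighted recall always has. The 11-sample matrix is hypothetical, chosen only because it reproduces the reported percentages. The actual counts are those in Figure 11 and may be arranged differently.

```python
def weighted_metrics(cm):
    """Accuracy and support-weighted precision, recall, and F1 from a
    confusion matrix where cm[i][j] counts true class i predicted as j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n_classes)) / total
    precision = recall = f1 = 0.0
    for i in range(n_classes):
        support = sum(cm[i])                               # true samples of class i
        predicted = sum(cm[r][i] for r in range(n_classes))  # predicted as class i
        p = cm[i][i] / predicted if predicted else 0.0
        r = cm[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1

# Hypothetical matrix (assumption): consistent with the reported figures,
# but not taken from the paper.
hypothetical_cm = [[4, 0],
                   [2, 5]]
acc, prec, rec, f1 = weighted_metrics(hypothetical_cm)
print([round(100 * v, 1) for v in (acc, prec, rec, f1)])
```

With only eleven test samples under this assumption, a single misclassification shifts the accuracy by about nine percentage points, which is worth keeping in mind when reading the reported figures.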

It is important to note that the skull stripping operations do not always produce a perfect result, and this occasionally misrepresents the brain clusters (a cluster could be eroded or extended) and can lead to bias in the features extracted from the affected cluster.

5. Discussion

With the statistical features extracted from the GM, WM, and CSF segments of the structural magnetic resonance image, the SVM classifier achieved an accuracy of 81.8% and a precision of 87.9%. For predicting symptom severity scores, defined as the sum of the SANS and SAPS psychiatric rating scales, the ANN regression model performed satisfactorily, with a correlation coefficient of 0.811 between the actual and predicted severity values.

For the regression problem, GM entropy, CSF entropy, CSF standard deviation, and CSF moment had the best positive correlation coefficients with schizophrenia symptom severity. This suggests that they are the most significant features in the regression problem and candidate biomarkers for estimating the severity of schizophrenia.

Likewise, based on the feature selection results, GM entropy, WM kurtosis, CSF entropy, CSF kurtosis, and CSF skewness appear to be effective biomarkers for identifying schizophrenia.

Previous studies focused on the classification problem, that is, differentiating schizophrenia patients from healthy subjects. The current work addresses both classification and regression, with more emphasis on regression, which is the existing gap. The classification accuracy recorded in this research (81.8%) is within the range reported in previous studies (69.15% to 96.7%). With our regression model, the estimated schizophrenia symptom severity has a correlation coefficient of 0.811 with the actual symptom severity scores.

A closely related work [15] developed regression models for estimating schizophrenia symptom severity from EEG data, with correlation coefficients ranging from -0.6 to -0.702. Our regression model compares favorably for two reasons: (1) it has a positive correlation, and (2) its absolute correlation coefficient is higher (0.811 > 0.702).

Estimating symptom severity is a capability that will help in monitoring treatment effectiveness, assist the clinician in making the right decisions, and lead to an overall improvement in the treatment of schizophrenia.

To make the outcome of this research practically useful, the two models will be integrated with the preprocessing modules as depicted in Figure 13. When magnetic resonance imaging data is submitted, a grayscale image is extracted from the structural MRI. The image then undergoes a preprocessing stage that includes standardization, denoising, skull stripping, and segmentation. Statistical features extracted from the brain segments are passed as input to the SVM classifier, which determines whether the subject has schizophrenia. If the subject is not classified as a schizophrenia patient, the flow ends. If the subject is classified as having schizophrenia, the features are forwarded to the ANN regression model, which estimates the symptom severity score.

Figure 13. Integration of the models and preprocessing program for practical use
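The control flow of this two-stage design can be sketched as below. The function name, the label convention (1 for schizophrenia), and the stand-in models are assumptions made for illustration. A real deployment would plug in the trained SVM classifier and ANN regressor, and would run the preprocessing stages before this step.

```python
from typing import Callable, Sequence

def diagnose(
    features: Sequence[float],
    classify: Callable[[Sequence[float]], int],
    estimate_severity: Callable[[Sequence[float]], float],
) -> dict:
    """Classify first; estimate severity only for subjects labeled 1
    (schizophrenia, by assumption). Otherwise the flow ends."""
    label = classify(features)
    if label != 1:
        return {"schizophrenia": False, "severity": None}
    return {"schizophrenia": True, "severity": estimate_severity(features)}

# Stand-in models for illustration only (hypothetical, not the trained models).
def stub_classifier(f):
    return 1 if sum(f) > 1.0 else 0

def stub_regressor(f):
    return 10.0 * sum(f)

healthy = diagnose([0.2, 0.3], stub_classifier, stub_regressor)
patient = diagnose([0.9, 0.6], stub_classifier, stub_regressor)
print(healthy, patient)
```

Passing the two models in as callables keeps the gating logic independent of any particular library, so either model could be retrained or replaced without changing the flow.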

The proposed approach produced promising results, but the dataset of 43 subjects is small; the procedure should therefore be validated on a larger dataset before its performance can be relied upon.

It is also recommended to use Deep Reinforcement Learning to drive the morphological operations (a repeated series of dilations and erosions) to improve the results of skull stripping and, consequently, the quality of the features extracted from the three segments (WM, GM, and CSF).

Exploring the combination of structural MRI with other neuroimaging techniques, such as functional MRI, for the possibility of improving the regression model's performance in terms of symptom severity estimation is also encouraged as further research.

6. Conclusion

The research objectives were achieved. The previous studies reviewed were concerned only with detecting schizophrenia from MRI data, whereas the diagnostic models in this work can (1) detect schizophrenia and (2) predict the symptom severity score from structural magnetic resonance imaging data, with a correlation coefficient of 0.811 between the predicted and actual scores.

We also proposed a software architecture of how the various components developed during this research can be integrated to function as an expert system supporting clinicians in this field.

In conclusion, this regression model for estimating symptom severity will aid in evaluating the efficacy of treatment and guide the physician in making the best decisions, leading to an overall improvement in the treatment of schizophrenia. The estimation of the severity of schizophrenia symptoms is an area that should be researched further.

References

[1] Hu, K., Wang, M., Liu, Y., Yan, H., Song, M., Chen, J., Liu, B. (2021). Multisite schizophrenia classification by integrating structural magnetic resonance imaging data with polygenic risk score. NeuroImage: Clinical, 32: 102860. https://doi.org/10.1016/j.nicl.2021.102860

[2] Chen, Z., Yan, T., Wang, E., Jiang, H., Tang, Y., Yu, X., Liu, C. (2020). Detecting abnormal brain regions in schizophrenia using structural MRI via machine learning. Computational Intelligence and Neuroscience. https://doi.org/10.1155/2020/6405930

[3] Takayanagi, Y., Takahashi, T., Orikabe, L., Mozue, Y., Kawasaki, Y., Nakamura, K., Suzuki, M. (2011). Classification of first-episode schizophrenia patients and healthy subjects by automated MRI measures of regional brain volume and cortical thickness. PloS One, 6(6): e21047. https://doi.org/10.1371/journal.pone.0021047

[4] Barbato, A. (1998). Schizophrenia and public health. World Health Organization Division of Mental Health and Substance Abuse, WHO Nations Mental Health Initiative, 1998. 

[5] Madeira, N., Duarte, J.V., Martins, R., Costa, G.N., Macedo, A., Castelo-Branco, M. (2020). Morphometry and gyrification in bipolar disorder and schizophrenia: A comparative MRI study. NeuroImage: Clinical, 26: 102220. https://doi.org/10.1016/j.nicl.2020.102220

[6] Oh, J., Oh, B.L., Lee, K.U., Chae, J.H., Yun, K. (2020). Identifying schizophrenia using structural MRI with a deep learning algorithm. Frontiers in Psychiatry, 11: 16. https://doi.org/10.3389/fpsyt.2020.00016 

[7] Nemoto, K., Shimokawa, T., Fukunaga, M., Yamashita, F., Tamura, M., Yamamori, H., Arai, T. (2020). Differentiation of schizophrenia using structural MRI with consideration of scanner differences: A real-world multisite study. Psychiatry and Clinical Neurosciences, 74(1): 56-63. https://doi.org/10.1111/pcn.12934

[8] Yang, Y., Zhang, Y., Wu, F., Lu, X., Ning, Y., Huang, B., Wu, K. (2017). Automatic classification of first-episode, drug-naive schizophrenia with multi-modal magnetic resonance imaging. Journal of Biomedical Engineering, 34(5): 674-680. https://doi.org/10.7507/1001-5515.201607084

[9] Hu, M., Sim, K., Zhou, J. H., Jiang, X., Guan, C. (2020). Brain MRI-based 3D convolutional neural networks for classification of schizophrenia and controls. In 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, pp. 1742-1745. https://doi.org/10.1109/EMBC44109.2020.9176610 

[10] Chatterjee, I., Kumar, V., Rana, B., Agarwal, M., Kumar, N. (2020). Identification of changes in grey matter volume using an evolutionary approach: An MRI study of schizophrenia. Multimedia Systems, 26: 383-396. https://doi.org/10.1007/s00530-020-00649-6 

[11] Liu, J., Wang, X., Zhang, X., Pan, Y., Wang, X., Wang, J. (2018). MMM: Classification of schizophrenia using multi-modality multi-atlas feature representation and multi-kernel learning. Multimedia Tools and Applications, 77: 29651-29667. https://doi.org/10.1007/s11042-017-5470-7

[12] Kumari, S., Malik, M., Florival, C., Manalai, P., Sonje, S. (2017). An assessment of five (PANSS, SAPS, SANS, NSA-16, CGI-SCH) commonly used symptoms rating scales in schizophrenia and comparison to newer scales (CAINS, BNSS). Journal of Addiction Research & Therapy, 8(3): 324. http://dx.doi.org/10.4172/2155-6105.1000324

[13] Kay, L., Fiszbein, S.R. (1987). Positive and negative syndrome scale (PANSS) rating criteria. Schizophrenia Bulletin, 13(2): 261-276. Available: www.ncbi.nlm.nih.gov.

[14] Leucht, S. (2014). Measurements of response, remission, and recovery in schizophrenia and examples for their clinical application. The Journal of Clinical Psychiatry, 75(suppl1): 11378. https://doi.org/10.4088/JCP.13049su1c.02

[15] Kim, D.W., Lee, S.H., Shim, M., Im, C.H. (2017). Estimation of symptom severity scores for patients with schizophrenia using ERP source activations during a facial affect discrimination task. Frontiers in Neuroscience, 11: 436. https://doi.org/10.3389/fnins.2017.00436

[16] Barch, D.M., Repovš, G., Csernansky, J.G. (2014). Working memory in healthy and schizophrenic individuals. OpenfMRI. https://openfmri.org/dataset/ds000115.

[17] Swiebocka-Wiek, J. (2016). Skull stripping for MRI images using morphological operators. In Computer Information Systems and Industrial Management: 15th IFIP TC8 International Conference, CISIM 2016, Vilnius, Lithuania, September 14-16, 2016, Proceedings 15, Vilnius, Lithuania, pp. 172-182. http://dx.doi.org/10.1007/978-3-319-45378-1_16

[18] Kim, K., Duc, N.T., Choi, M., Lee, B. (2021). EEG microstate features for schizophrenia classification. PloS One, 16(5): e0251842. http://dx.doi.org/10.1371/journal.pone.0251842

[19] Alimi, S., Adenowo, A.A., Kuyoro, A.O., Oludele, A. (2022). Quantitative approach to automated diagnosis of Malaria from Giemsa-Thin blood stain using support vector machine. In 2022 5th Information Technology for Education and Development (ITED), Abuja, Nigeria, pp. 1-8. http://dx.doi.org/10.1109/ITED56637.2022.10051472

[20] Espinola, C.W., Gomes, J.C., Pereira, J.M.S., dos Santos, W.P. (2021). Vocal acoustic analysis and machine learning for the identification of schizophrenia. Research on Biomedical Engineering, 37: 33-46. http://dx.doi.org/10.1007/s42600-020-00097-1