Alzheimer’s disease classification and detection by using AD-3D DCNN model

ABSTRACT


INTRODUCTION
The three most frequently occurring neurologic diseases are Parkinson's disease (PD), Alzheimer's disease (AD), and schizophrenia (SZ), distinguished as disorders from regular healthy brain functioning [1]. Individuals affected by any of these three diseases place an enormous burden on their families and on health care services. It is very difficult to identify these brain diseases at an early stage [2], [3]. AD is a chronic neuronal disorder whose relentless progression affects human retention, analytical capabilities, and memory. AD is caused by excess tau hyperphosphorylation and Aβ (amyloid-β) production [4]. The hippocampus is affected first, as it is inextricably linked with analysis and memory; thus, the most common and earliest symptom is memory loss [5]. To date, the main cause of this disease remains obscure, and it is considered hereditary. Thus, detecting the disease in its early stages impedes its development [6], [7]. Various imaging techniques such as magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT) are used for detecting AD. Tissue decay can be used for diagnosing AD, and MRI segmentations obtained at varying times can be used to determine morphological variations of the brain. Precise segmentation of the tumor region and its neighboring tissues is important for AD diagnosis and for classifying the disease type; recent research also notes that enormous amounts of data are necessary for accurate diagnosis. However, it is challenging for practitioners to manually examine and extract significant features from such enormous and complicated data. Since MRI scans are subject to inter-operator and intra-operator variability concerns, analyzing them manually is a time-consuming, difficult, and error-prone task [8]-[14]. Consequently, there is a need to develop a
system for automatic segmentation and detection that accurately detects the disease and enhances system performance. Dementia is the word used to describe memory-related neuro disorders; AD is one of the variants of dementia. According to the WHO, there are around 55 million cases of dementia worldwide, with millions of new cases every year, and 60-70% of dementia cases progress to AD. Mild cognitive impairment (MCI) is the period between a normal person's expected cognitive decline and the major decline of dementia. Memory loss and issues with thinking, decision-making, and language are some of the characteristics of MCI [15]. Subsequently, MCI can develop into dementia as a result of Alzheimer's disease, whereas in some cases it never progresses and in others the patient eventually recovers. Every year, 10 to 20% of MCI cases progress to AD, and this advancement takes years. It is still a challenge to distinguish stable MCI (sMCI), i.e., cases that will not advance to AD, from progressive MCI (pMCI), i.e., cases that do progress to AD. Hence, early detection is necessary to improve patients' health conditions and increase the survival rate. MRI is widely used to analyze, detect, and classify AD. Standard machine learning algorithms require domain experts for feature extraction, and it has been observed that user-specified feature approaches are confined to certain limitations and produce diminished outcomes. System performance can be enhanced by using approaches proficient in automatic feature learning (i.e., deep learning) based on the specified inputs and problems. Since deep learning provides automatic feature extraction, the performance of the system can be enhanced with accurate outcomes. Among all DL approaches, CNN and its variants are most used, alongside DNN, RNN, and so on. DL-based medical applications also include neuroimaging segmentation, as it enhances the complete analysis approach. MRI scans are used to segment the brain's abnormal tissues to detect and classify AD [16]-[18].
As AD advances, it damages brain tissue: the cerebral cortex and the hippocampus region shrink and the ventricles become enlarged. Different stages of AD (MRI scans) are presented in Figure 1, i.e., very mild, mild, and moderate. Suitable imaging modalities or biomarkers are necessary for precise detection of the disease at its early stage. fMRI, MRI, PET, DTI, and DWI are some of the brain modalities used for detecting and diagnosing brain disease; among all of these, fMRI and MRI are most used. Deep learning techniques have been applied for the classification of different stages of AD using MRI [19]: three preprocessing techniques were applied, i.e., skull stripping, cerebellum removal, and spatial normalization, and an autoencoder was applied for feature extraction, whereas SVM was applied for classification and obtained around 95% accuracy. In research by Saikumar et al. [20], a 3D CNN algorithm and autoencoder were proposed for classifying AD stages using MRI scans. A model based on multimodal deep learning techniques was developed for detecting the stages of AD early [21], [22]; a denoising autoencoder was applied for feature extraction and a CNN for classification. That model identifies three different affected regions of the brain, i.e., the amygdala region, the hippocampus region, and the Rey auditory verbal learning test (RAVLT), using ADNI datasets. A CNN variant algorithm was implemented in [23] for the detection and classification of AD using the OASIS dataset, and various AD detection machine learning algorithms were compared, such as SVM with automatically extracted features, SVM with manually extracted features, and AdaBoost. Higami et al.
[24] developed a multi-modal DL network to predict AD using MRI data, CSF biomarkers, longitudinal cognitive measures, and cross-sectional neuroimaging modalities; the system also predicted the risk of developing AD. A model using deep residual learning integrated with transfer learning algorithms was developed for classifying six different stages of AD. Ebrahimighahnavieh et al. [25] proposed a method for classification of AD using a CNN variant, i.e., the LeNet architecture, obtained around 96.86% accuracy, and the method also assists in predicting the different stages of AD across a mixed range of ages. As discussed earlier, there are three AD stages (i.e., very mild, mild, and moderate), and the detection of AD is inaccurate before the moderate stage. Therefore, a deep learning architecture is proposed to detect the various stages of AD. The key contributions of the proposed system are as follows: a CNN architecture is implemented that detects AD and can also classify the stages of AD; the work develops a DL-based system that detects AD and classifies its stages accurately; and a summary comparison of existing systems with the proposed system is provided. The rest of the paper is organized as follows: section 2 presents the literature survey, section 3 the methodology, section 4 the results and discussion, and section 5 the conclusions.

METHOD
2.1. Proposed architecture
A CNN is a deep-learning approach influenced by the visual cortex. CNNs are expert models for analyzing image data; they are developed for a better understanding of dimensional data obtained from 2D or 3D images and for feature extraction by employing a stack of convolutional layers. The key asset of a CNN is that classification and feature extraction are performed as a single task, i.e., a domain expert is not required for feature extraction. A CNN is implemented with several layers, such as the input layer, convolutional layers, fully connected layers, and output layer, and various operations like pooling, activation, and sampling. A CNN first acquires the input and assigns labels, weights, and biases to different image pixels to distinguish images. The convolutional operation is performed over the entire input with various trainable kernels by employing a sliding-window approach to generate different feature maps, which represent distinct input features. Pooling functions, activation functions like the rectified linear unit (ReLU), and transformation operations are also applied to converge the network.

2D CNN
The convolutional layers are the core layers of CNN-based networks. They achieve the output by a convolution operation of kernels of different sizes with the input. Kernels, or filters, are used in these layers, and their parameters are learned during model training. The kernel/filter size applied to the image is smaller than the actual input image size, and every filter applied convolves with the image to generate activation maps. Figure 2 represents the convolutional operation: the filter is slid over the entire image, and the dot product between the filter elements and the input is computed at every spatial position. The feature map is generated by convolving the filter over the input image (passing the filter over the image) as seen in the figure, and this process is repeated until the entire image is convolved and the complete feature map is generated. Each entry of the feature map is considered a neuron outcome. Thus, every neuron is linked with a region of the image whose area equals the filter size, and the parameters are shared among all the neurons; because of this local connectivity, the layer learns which filters respond most strongly to a local region of the input.
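The sliding-window convolution described above can be sketched in a few lines of NumPy. This is an illustrative "valid"-padding implementation, not the paper's actual code; the 5×5 image and 3×3 averaging filter are assumed purely for the example:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take the dot product at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    fmap = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            # Dot product between the filter and the local image region
            fmap[y, x] = np.sum(image[y:y+kh, x:x+kw] * kernel)
    return fmap

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 input
kernel = np.ones((3, 3)) / 9.0                     # simple averaging filter
fmap = conv2d_valid(image, kernel)
print(fmap.shape)  # (3, 3)
```

A 5×5 input convolved with a 3×3 filter yields a 3×3 feature map, matching the text's point that the output of each neuron covers a region the size of the filter.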
ReLU is the nonlinear function that introduces nonlinearity into the network to accelerate the learning process, ensuring that the output of the convolutional process is not merely a linear combination of the inputs. For the experimental analysis, the ReLU activation function is chosen, and activation is performed for each input region.
Pooling layers: the function of pooling layers is to minimize the dimensionality of the feature maps. This reduces the computational time needed to train the network, as it reduces the number of parameters to be learned. The key role of the pooling layer is to generalize the features generated by the convolutional layer. In this research work, a max-pooling layer with a 2×2 filter size is used.
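The ReLU activation and the 2×2 max pooling used here are both simple elementwise/regional operations; a minimal NumPy sketch (the 4×4 feature map values are invented for illustration) is:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: zero out negative activations."""
    return np.maximum(0.0, x)

def max_pool_2x2(fmap):
    """Take the maximum over non-overlapping 2x2 regions, halving each dimension."""
    h, w = fmap.shape
    h2, w2 = h // 2, w // 2
    return fmap[:h2*2, :w2*2].reshape(h2, 2, w2, 2).max(axis=(1, 3))

fmap = np.array([[-1.0,  2.0,  0.5, -3.0],
                 [ 4.0,  0.0, -2.0,  1.0],
                 [ 3.0, -1.0,  2.0,  2.5],
                 [-0.5,  1.0,  0.0, -4.0]])
pooled = max_pool_2x2(relu(fmap))
print(pooled)  # [[4.  1. ] [3.  2.5]]
```

Note how the 4×4 map shrinks to 2×2, which is exactly the parameter reduction the text attributes to pooling.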
FCNN layer: a fully connected layer is a feed-forward neural network (FFNN) whose input is the outcome of the previous layer, flattened into a vector. The flattened vector is passed to the fully connected layer, where the computation y = f(Wi + b) is performed, and this is repeated for each such layer in the network. Finally, a softmax function is employed to generate the probability of the input belonging to each particular class.
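A minimal sketch of the fully connected layer followed by softmax, using the four AD classes from this paper (the input dimension of 8 and the random weights are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    """Convert raw scores into a probability distribution."""
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

def dense(x, W, b):
    """Fully connected layer: weight matrix times input plus bias."""
    return W @ x + b

rng = np.random.default_rng(0)
x = rng.normal(size=8)        # flattened feature vector from previous layers
W = rng.normal(size=(4, 8))   # 4 classes: ND, very mild, mild, moderate
b = np.zeros(4)
probs = softmax(dense(x, W, b))
print(probs.sum())  # 1.0
```

The softmax output sums to one, so each entry can be read directly as the predicted probability of one of the four AD stages.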
Here f represents the activation function, W is the weight matrix, i is the input, and b is the bias vector. As seen in Figure 2, the model consists of various layers and their underlying operations, such as convolution, ReLU activation, and pooling. These layers are stacked on each other in a specific connectivity paradigm named "dense connectivity", i.e., all the layers in the network are connected. For classification, the softmax layer is used to classify the AD stages: non-demented (ND), very mild, mild, and moderate. The outcome of the model is modified by using the cost matrix, represented as "ξ". Since the least common categories, moderate, mild, and very mild, are underrepresented in the training dataset, the outcome was modified accordingly. For example, consider "o" as the outcome of the network, "s" as the specific stage of disease, and "L" as the loss function; then "O" denotes the modified result as shown in (1). The mathematical representation of the loss function is as shown in (2).
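Equations (1) and (2) are not reproduced in this text, but the idea of a cost matrix ξ compensating for rare classes can be sketched as a reweighting of the network outcome. This is a hypothetical illustration only: the actual ξ values and the exact form of (1) are not given in the source, so the numbers below are assumed.

```python
import numpy as np

# Hypothetical cost vector xi: up-weight the rare classes
# (very mild, mild, moderate) relative to non-demented (ND).
xi = np.array([1.0, 3.0, 3.0, 5.0])      # assumed weights, for illustration only
o = np.array([0.70, 0.15, 0.10, 0.05])   # raw network outcome o (assumed)
O = o * xi
O /= O.sum()                              # renormalize the modified outcome O
print(O)
```

After reweighting, the probability mass assigned to the rare "moderate" class rises from 0.05 to about 0.15, which is the qualitative effect the cost matrix is meant to achieve.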

3D CNN
The output of the CNN model is fused with the proposed AD-3D densely connected convolutional neural network (3D DCNN); the working of the proposed model is explained below. Over the years, the 3D neural network has become a frequently used model for classification, since a 3D CNN can examine and locate the region of interest within a frame of objects. The 3D DCNN generates a 3D feature map during the convolution step, which is used to analyze the data and map the features to the original features. A three-dimensional kernel is utilized for the 3D convolution of the dataset to compute low-level feature representations. The kernel moves in three directions (x, y, z), as displayed in the following figure. The value of every region of the feature map is calculated by the following mathematical formula.
v_ij^xyz = f(b_ij + Σ_m Σ_p Σ_q Σ_r w_ijm^pqr · v_(i-1)m^((x+p)(y+q)(z+r))), where w_ijm^pqr represents the kernel value connected to the m-th feature map of the previous layer, and the size of the kernel is given by P_i × Q_i × R_i.
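The three-direction movement of the kernel can be made concrete with a small NumPy sketch of a single-channel 3D convolution (an illustrative implementation with an assumed 4×4×4 volume and 3×3×3 kernel, not the paper's code):

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """3D convolution: the kernel slides along the x, y and z directions."""
    kd, kh, kw = kernel.shape
    D, H, W = volume.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Sum over the 3D region covered by the kernel at (z, y, x)
                out[z, y, x] = np.sum(volume[z:z+kd, y:y+kh, x:x+kw] * kernel)
    return out

volume = np.ones((4, 4, 4))     # toy volumetric input (e.g., an MRI patch)
kernel = np.ones((3, 3, 3))
fmap3d = conv3d_valid(volume, kernel)
print(fmap3d.shape)  # (2, 2, 2)
```

Each output value here is 27 (the sum over a 3×3×3 block of ones), and the output is itself a 3D feature map, as the text states.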
Recent years have seen a huge increase in the use of 3D CNN-based approaches that employ 3D convolution layers for feature extraction from input data. Therefore, a 3D densely connected convolution network is developed for better and more accurate classification. A 3D convolution window that slides along the input data serves as the foundation for the 3D convolution layer. The 3D convolution window applies many filters over the data (each filter detects a different pattern), and these 3D filters are moved in all three directions. Figure 3 depicts the proposed 3D CNN's architecture. The layers of this architecture are as follows: a. 3D-DCNN layers: 3D CNN layers facilitate the recognition of objects in images. A three-dimensional filter that moves in three directions, i.e., x, y, and z, is present in each layer. A convolutional map is produced during the 3D convolution process, which is required for data analysis as well as for temporal and volumetric context. b. MaxPooling layer: image data can be compressed using the MaxPooling layers for 3D data (MaxPooling3D). MaxPooling 3D is a mathematical operation that can be applied to 3D data as well as spatial or spatiotemporal data. For the max pooling procedure, the layers are defined using the × × areas as corresponding filters. In addition, a stride is specified, which determines how many pixels the filter advances over the image in each step. c. Batch normalization: the output of the previous layer is normalized over each batch using the batch normalization structure.
Batch normalization changes the mean activation to 0 and the standard deviation to 1. d. Dense layer: a dense layer is made up of fully connected neurons and is typically one of the final layers. e. Flatten layer: the flatten layer transforms the feature matrix into a vector that is passed to the dense layers. Three-dimensional convolutional layers dominate the architecture; however, it also includes flatten and dense layers, which are a crucial component of every architecture. Based on the research of Vrskova et al. [26], we worked with a 3D convolutional network, which prompted us to enhance the prior network and provide improved outcomes. Due to the excessively high number of layers used in the prior research, the architecture's performance was poorer; as a result, we chose to cut back on the layers in this proposal. As the outcomes show, our predictions were correct, and the outcomes were indeed better. Mathematical constraints, which require that the output of a layer must be a non-negative integer, impose restrictions on hyperparameters such as the number of filters and the kernel size of the 3D convolution and MaxPooling layers. To get the best outcomes, we adapted an optimization strategy for ConvLSTM that we had previously presented [27] and applied it to this network.
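The batch normalization step described above (mean activation 0, standard deviation 1 over each batch) can be sketched directly in NumPy; the 3×2 toy batch is assumed for the example:

```python
import numpy as np

def batch_norm(batch, eps=1e-5):
    """Normalize each feature over the batch to mean 0 and std 1."""
    mu = batch.mean(axis=0)
    var = batch.var(axis=0)
    return (batch - mu) / np.sqrt(var + eps)   # eps avoids division by zero

batch = np.array([[1.0, 10.0],
                  [3.0, 30.0],
                  [5.0, 50.0]])   # 3 samples, 2 features (toy activations)
normed = batch_norm(batch)
print(normed.mean(axis=0))  # ~[0. 0.]
```

In a trained network this is usually followed by learnable scale and shift parameters (gamma, beta), omitted here for brevity.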

Model configuration
Table 2 represents the architecture of the AD-3D DCNN. The first column depicts the deep CNN layer name, the number of filters used is represented in the second column, the third denotes the filter/pool size, the next indicates the number of parameters of the layer, i.e., its dimension, and the last depicts the layers that are concatenated together. A total of nineteen convolutional layers are employed, of which fourteen are 2D convolutional layers and the remaining are 3D convolutional layers. Each convolutional layer is followed by an activation layer, i.e., ReLU, followed by a pooling layer of 2×2 size, which uses sixteen filters of 3×3 size to reduce the dimensions of the image. The 2D and 3D convolutional networks are fused together to form a single framework; the fusion is done by using the concatenation operation, as seen in the table.
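The concatenation-based fusion of the 2D and 3D branches amounts to joining their flattened feature vectors into one vector before the classification layers. The branch feature sizes below (128 and 64) are assumed for illustration; the actual dimensions are those listed in Table 2:

```python
import numpy as np

# Hypothetical flattened feature vectors produced by the two branches.
features_2d = np.random.default_rng(1).normal(size=128)  # from the 2D CNN branch
features_3d = np.random.default_rng(2).normal(size=64)   # from the 3D CNN branch

# Fusion: concatenate the branch outputs into a single feature vector,
# which is then fed to the dense classification layers.
fused = np.concatenate([features_2d, features_3d])
print(fused.shape)  # (192,)
```

The fused vector simply has the combined length of the two branch outputs, so no information from either branch is discarded at the fusion point.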

RESULTS AND DISCUSSION
The dataset is acquired from the ADNI database. ADNI is a wide-ranging polycentric dataset intended for developing various modalities, such as neuroimaging, medical biomarkers, and biochemical and genetic biomarkers, for the identification and classification of AD stages and for diagnosing AD. Various neuroimaging modalities are included in this dataset, viz., fMRI, MRI, DTI, and PET. For the experimental analysis of our proposed model, we used fMRI scans of 138 subjects: 25 AD, 25 cognitive normal (CN), 25 EMCI, 25 LMCI, 13 MCI, and 25 SMC. These scans are of people aged above 71 who were diagnosed with different stages of AD. Eleven features of the dataset are described in Table 3. The dataset is divided into 80% training and 20% testing sets. Different performance measures are used for system evaluation, such as recall, precision, accuracy, and f-score. Figure 4 shows the precision and recall curve for the stages of Alzheimer's disease, and Figure 5 shows the f-score graph obtained for predicting Alzheimer's disease. Table 4 demonstrates all the results (accuracy, precision, recall, and f-score) obtained after classifying all the stages of AD using the ADNI dataset. The proposed model is compared to existing pre-trained models such as Xception, Inception V3, MobileNet, and DenseNet and achieved better accuracy, as shown in Table 5. Figure 6 depicts the overall test and train accuracy obtained at different stages of the disease. The proposed model's classification accuracy in comparison with existing algorithms is shown in Figure 7.
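The 80-20% split over the 138 subjects and the per-class evaluation metrics can be sketched as follows (a minimal implementation for illustration; the toy label vectors at the end are invented, not results from the paper):

```python
import numpy as np

def split_80_20(n_samples, seed=0):
    """Shuffle sample indices and split them 80% train / 20% test."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    cut = int(0.8 * n_samples)
    return idx[:cut], idx[cut:]

def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall and f-score for one disease stage."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

train_idx, test_idx = split_80_20(138)       # 138 subjects, as in the text
print(len(train_idx), len(test_idx))         # 110 28

y_true = np.array([1, 1, 0, 1, 0])           # toy labels for one stage
y_pred = np.array([1, 0, 0, 1, 1])
p, r, f1 = precision_recall_f1(y_true, y_pred, positive=1)
```

With 138 subjects, the 80-20 split yields 110 training and 28 testing subjects; the same per-class metric computation, averaged over the four stages, underlies the values reported in Table 4.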

CONCLUSION
AD is an untreatable neural disease with a high death rate throughout the world. Thus, early identification is important to decrease the mortality rate, increase the patient survival rate, and enhance treatment. This paper developed a proficient approach for predicting and categorizing the four distinct stages of AD, i.e., non-demented, very mild, mild, and moderate, at an early stage using the ADNI fMRI dataset. The proposed method uses a deep CNN model for training and classifying the four stages of AD. The AD-3D DCNN architecture obtained 97.53% accuracy and a 98% f-score. Compared to other pre-trained transfer learning models, the proposed model achieved better accuracy.

Figure 1. Different stages of AD

Figure 4. Precision and recall curve for stages of Alzheimer's disease

Table 1 describes the summary of existing machine learning and deep learning techniques used to detect and classify Alzheimer's disease.
ISSN: 2302-9285. Bulletin of Electr Eng & Inf, Vol. 12, No. 2, April 2023: 882-890

Table 1. Summary of literature survey on AD detection and classification using deep learning and machine learning technologies

Table 2. The architecture of the deep learning model; the classification layers are represented with *

Table 4. Results obtained for different stages of AD

Table 5. Comparison of the proposed model with existing AD stage classification models