Transfer learning for detecting COVID-19 on x-ray using deep residual network

Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has been a disaster for humanity, especially in the health sector. COVID-19 is a serious disease, and a large number of people lose their lives to it every day. It affects not just one country; the whole world suffers from this viral disease. In the fight against COVID-19, immediate and accurate screening of infected patients is essential, and one of the most widely used screening approaches is chest X-ray (CXR), which is regarded as faster and cheaper. This study aims to detect patients suffering from COVID-19 from chest X-rays using a transfer learning approach with several deep residual network architectures: ResNet50, RexNet100, SSL ResNet50, semi-weakly supervised learning (SWSL) ResNet50, Wide ResNet50, SK ResNet34, ECA ResNet50d, Inception ResNet V2, CSP ResNet50, and ResNest50d. The results are then compared with previous studies. The experiment was conducted ten times using different pre-trained models and obtained the best results with the SWSL ResNet50 architecture: an accuracy of 99.28% (an increase of 6.98% over previous studies), 99.51% F1-score, 99.41% precision, 99.61% sensitivity, and 98.33% specificity, which means this study obtained better results than previous studies.


INTRODUCTION
The COVID-19 disease, caused by the SARS-CoV-2 virus, has been a calamity for humanity, particularly in the health, economic, and education sectors [1]. For instance, a recent infection surge in India prompted many families to seek care at home due to a scarcity of intensive care units [2]. COVID-19 is a terrible disease, and a significant number of individuals are dying each day. This sickness does not affect just one country; it affects the entire world [3]. The modern world is beset by COVID-19 illness, and rapid, reliable detection of infected patients is critical in the fight against it. COVID-19 also affects the education sector, where students have difficulty obtaining an education because they have to learn from home, which results in little or even no progress for students [4], [5]. This has a significant impact on almost every element of one's life [6].
Several methods have been used to identify COVID-19 patients, including swab tests, rapid tests, and antigen tests. Chest X-ray (CXR) and computed tomography (CT) are frequently used screening techniques that aid in the diagnosis of COVID-19 patients by comparing infected and normal lungs, particularly when viral tests are unavailable [7]. Moreover, screening with a chest X-ray is quicker and less expensive [8].
There is much previous research on the detection of COVID-19, such as that conducted by Umair. Using the transfer learning method, Umair created a deep learning model capable of diagnosing COVID-19 patients from X-ray data, achieving an F1-score of 98.36% on ResNet-34 and 97.56% on ResNet50 with only 406 X-ray images [9]. In another study, Ismael [10], using 380 X-ray images, achieved an F1-score of 95.92% with ResNet50 features and an SVM approach. Another study by Imaduddin [11], using ResNet50, obtained 92.3% accuracy, 93% F1-score, 93% precision, 90.7% specificity, and 99% sensitivity on X-ray datasets. The research conducted here will be compared with this earlier work. The goal of this project is to develop a COVID-19 detection model using the transfer learning approach on multiple residual network architectures, in order to leverage weight models built on larger datasets. Once the learning process is complete, computers may be used to diagnose COVID-19, which is extremely beneficial for expediting screening in response to the current pandemic emergency. We conducted ten experiments using ten different pre-trained residual network models and the PyTorch framework to obtain even better accuracy scores. A breakdown of the paper's structure is provided below. We provide a brief summary of the dataset, data pre-processing, convolutional neural network (CNN) classifiers, and different pre-training strategies in section 2. Section 3 presents the experiment's results and discussion. Finally, in section 4, we draw a few conclusions based on the study.

METHOD
In this study, the transfer learning method was used to classify X-ray image data in order to identify COVID-19 disease. Chest X-ray images were used to conduct experiments with ten different residual network architectures. The pre-trained models include ResNet50, RexNet100, SSL ResNet50, semi-weakly supervised learning (SWSL) ResNet50, Wide ResNet50, SK ResNet34, ECA ResNet50d, Inception ResNet V2, CSP ResNet50, and ResNest50d. The pipeline includes data preparation, feature extraction, classification, and model evaluation with a confusion matrix. Accuracy, specificity, sensitivity, F1-score, and precision are the metrics used to assess model effectiveness. Figure 1 depicts the methodology employed in this research. The research was carried out using a dataset of X-ray images gathered from a variety of sources and available for free on the Kaggle website [12], [13]. We used only two classes from the dataset, namely the normal class and the COVID class. The total number of images in the dataset was 13,808, comprising 10,192 normal images and 3,616 COVID images. Detailed information about the dataset is shown in Table 1; an example of a COVID-19 X-ray image is shown in Figure 2(a), followed by an example of a normal X-ray image in Figure 2(b).

Preprocessing
Preprocessing transforms the data before it is used as model learning input. Preprocessing can improve the image quality of the dataset [14] and helps the dataset yield more information to strengthen the classification model. The dataset is divided into two classes, normal and COVID, with a total of 13,808 images consisting of 10,192 normal images and 3,616 COVID images. In this study, we resize each X-ray image to 224 x 224 and normalize it to ImageNet format with mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
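The normalization step above can be sketched as follows (a minimal NumPy illustration; the study itself applies the equivalent PyTorch transforms):

```python
import numpy as np

# ImageNet channel statistics used for normalization (RGB order)
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize_imagenet(image):
    """Normalize an H x W x 3 image with values in [0, 1] to ImageNet statistics."""
    return (image - IMAGENET_MEAN) / IMAGENET_STD

# Example: a dummy 224 x 224 RGB image with all pixels at 0.5
dummy = np.full((224, 224, 3), 0.5)
normalized = normalize_imagenet(dummy)
```

Resizing to 224 x 224 happens before this step, so every image enters the network with the same spatial shape and channel statistics that the ImageNet-pretrained weights expect.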

Transfer learning
Transfer learning is a method by which the knowledge gained by a CNN from one set of data is transferred to accomplish a separate but related task on another set of data [15]. Data is one of the most important components of a deep learning approach, and the lack of medical data or datasets is one of the most significant challenges for academics in medical-related research. Fortunately, the availability of medical data is improving. Transfer learning has the virtue of not requiring big datasets, which makes computation more accurate and less expensive. A model that has been previously trained on a big dataset is transferred to a new model trained on fresh data smaller than the original dataset. Through this technique, CNN training on small datasets is initialized with a pre-trained model built on a large-scale dataset for a specific task [16]. Our goal in this study is to use feature extraction from a previously trained model to train a classification model, after which we perform fine-tuning with a small learning rate in order to transfer information from the new dataset to the previously trained model.
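The feature-extraction variant of transfer learning described above amounts to freezing a pre-trained backbone and training only a new classification head. The sketch below illustrates this in PyTorch with a tiny stand-in backbone (hypothetical; in practice one would load a pre-trained ResNet such as `torchvision.models.resnet50` with ImageNet weights):

```python
import torch
import torch.nn as nn

# Hypothetical toy backbone standing in for a pre-trained ResNet feature extractor.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze backbone parameters so only the new head is trained (feature extraction).
for p in backbone.parameters():
    p.requires_grad = False

# New classification head: two output neurons (normal vs. COVID) with LogSoftmax.
head = nn.Sequential(nn.Linear(8, 2), nn.LogSoftmax(dim=1))
model = nn.Sequential(backbone, head)

x = torch.randn(4, 3, 224, 224)  # a batch of four normalized 224 x 224 images
log_probs = model(x)             # shape (4, 2): log-probabilities per class
```

Fine-tuning then unfreezes the backbone and continues training with a small learning rate, so the pre-trained weights shift only slightly toward the new dataset.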

ResNets
Deeper neural networks are challenging to train. Researchers have developed a residual learning approach for training networks that are significantly deeper than those previously employed. Instead of learning unreferenced functions, the layers are explicitly reformulated as learning residual functions with reference to the layer inputs. There is evidence that such residual networks are easier to optimize and can achieve accuracy at considerably greater depths [17].
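The residual formulation can be written as y = F(x) + x, where F is the residual function the stacked layers learn. A minimal NumPy sketch (illustrative only; real ResNet blocks use convolutions and batch normalization):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x): the block learns the residual F(x) = W2.ReLU(W1.x)
    relative to its input, rather than an unreferenced mapping."""
    f = w2 @ relu(w1 @ x)  # the residual function F(x)
    return relu(f + x)     # identity shortcut added before the final activation

# With zero weights the residual is zero, so the block reduces to the identity
# (for non-negative inputs) -- one intuition for why very deep residual
# networks are easier to optimize than plain stacked layers.
x = np.array([1.0, 2.0, 3.0])
w_zero = np.zeros((3, 3))
assert np.allclose(residual_block(x, w_zero, w_zero), x)
```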

ResNext
ResNext is a network architecture for image classification that is simple and highly modularized. The network is built by repeating a building block that aggregates a set of transformations with the same topology. This straightforward design yields a homogeneous multi-branch architecture with only a few hyperparameters to set. Beyond the depth and width dimensions, this approach exposes a new dimension, called cardinality (the size of the set of transformations), as an essential factor. The architecture demonstrates that increasing cardinality can improve classification accuracy even under restricted complexity, and that when expanding capacity, increasing cardinality is more effective than increasing depth or width [18].

Wide ResNets
Deep residual networks have been shown to scale up to thousands of layers and still gain performance. Researchers provide a residual learning approach that uses a wider residual network with reduced depth. Very deep residual network training suffers from diminishing feature reuse, which makes these networks very slow to train. By decreasing the depth and increasing the width of the residual network, this problem can be overcome [19].

RexNet
Rank expansion networks (RexNet) address the bottleneck design in image classification models by expanding the input channel size of the convolution layer, replacing the ReLU6 activation function only after the first 1x1 convolution in each inverted bottleneck, and using other nonlinear functions such as ELU, which is considered to improve accuracy [20].

ResNet SSL & ResNet SWSL
These models learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Rather than expecting every few stacked layers to directly fit a desired underlying mapping, the residual net lets these layers fit a residual mapping; residual blocks are stacked on top of each other to form a network. This model additionally utilizes semi-supervised (SSL) or semi-weakly supervised (SWSL) learning to improve model performance. This approach brings important advantages to standard architectures for image, video, and fine-grained classification [21].

ResNet CSP
CSP ResNet is a convolutional neural network that applies the cross-stage partial network (CSPNet) technique to ResNet. CSPNet divides the base layer feature map into two parts, which are subsequently merged through a cross-stage hierarchy. The split-and-merge approach allows more gradient flow through the network, and the network also increases gradient variability by incorporating feature maps from early and late network stages. It is estimated to reduce computation by 20% while maintaining or even improving accuracy [22].
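The split-and-merge idea can be sketched as follows (a toy NumPy illustration; a real CSP stage would route one half through a residual stage rather than the placeholder function used here):

```python
import numpy as np

def csp_stage(feature_map, stage_fn):
    """Cross-stage partial connection: split the channels in two, send only one
    half through the (expensive) stage, then merge by channel concatenation."""
    c = feature_map.shape[0] // 2
    part1, part2 = feature_map[:c], feature_map[c:]
    return np.concatenate([part1, stage_fn(part2)], axis=0)

fmap = np.ones((64, 14, 14))              # (channels, H, W)
out = csp_stage(fmap, lambda x: x * 2.0)  # toy stand-in for a residual stage
```

Because only half of the channels pass through the stage, the computation of that stage is roughly halved, while the untouched half preserves the original gradient path.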

SK ResNet
SK ResNet is a ResNet variant that uses a selective kernel (SK) unit rather than the traditional ResNet unit. All large-kernel convolutions in ResNet's original bottleneck block are replaced by the proposed selective kernel convolutions, which allows the network to adaptively select the most suitable receptive field size [23].

ECA ResNet
ECA ResNet is a ResNet variant that adds the efficient channel attention (ECA) module. It is an architectural unit based on squeeze-and-excitation blocks that minimizes model complexity, avoids dimensionality reduction, and adds only a handful of parameters while giving a significant gain in performance [24].

ResNest
ResNest is a ResNet variant in which split-attention blocks are stacked on top of one another. The cardinal group representations are concatenated along the channel dimension to form the final representation. As in the standard residual block, if the input and output feature maps have the same shape, a shortcut connection produces the final output of the split-attention block. For blocks with strides, an appropriate transformation is applied to the shortcut connection to align the output shape with the block's output. Because of the simplicity and modularity of this architecture, it can be parameterized with only a few variables, and it is assessed to surpass EfficientNet in the accuracy-versus-latency trade-off for image classification [25].

Inception ResNet V2
Inception ResNet V2 is a convolutional neural network from the Inception architecture family that adds residual connections (replacing the filter concatenation stage of the Inception architecture). A filter-expansion layer (1 x 1 convolution without activation) follows each Inception block to scale up the dimensionality of the filter bank before the addition, in order to match the input depth of the block; this compensates for the dimensionality reduction caused by the Inception block. The use of residual connections greatly accelerated Inception network training [26].

Training
This research was carried out using the Python programming language, PyTorch as the framework, and several specific libraries to run deep learning models. During our research, we also used Google's GPU-equipped cloud computing, which reduces the amount of time required to train a learning model. A random seed of 42 was used to ensure that the trials in this study can be replicated with the same findings. We use the LogSoftmax activation function on the output layer, a negative log-likelihood loss function, and the AdamW optimizer, together with an EarlyStopping callback that terminates the training process if the accuracy value has not improved within a specified number of epochs, on the presumption that training has converged. We set patience = 2 for EarlyStopping in the adaptation phase and patience = 5 in the fine-tuning phase, with a learning rate of 0.001 for the adaptation phase and 1e-5 for the fine-tuning phase. During preprocessing, the data is resized to 224x224 and then normalized to ImageNet format for the training, validation, and test phases. In the feature extraction process, we use models pre-trained on the large ImageNet dataset: ResNet50, RexNet100, SSL ResNet50, SWSL ResNet50, Wide ResNet50, SK ResNet34, ECA ResNet50d, Inception ResNet V2, CSP ResNet50, and ResNest50d. Because of this pre-training, we do not have to train the feature extractor from scratch; this technique is referred to as "transfer learning." The only modification we need to make is to replace the head of the architecture to meet the needs of the dataset we are using. Because the desired classification has two polarities, two neurons are used at the output.
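The early-stopping logic described above can be sketched in plain Python (a minimal illustration of the idea; the accuracy values shown are hypothetical, not the study's results):

```python
class EarlyStopping:
    """Stop training when the monitored metric has not improved for
    `patience` consecutive epochs (patience=2 in the adaptation phase,
    patience=5 in the fine-tuning phase)."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("-inf")
        self.counter = 0

    def step(self, metric):
        """Record one epoch's metric; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience

stopper = EarlyStopping(patience=2)
history = [0.90, 0.95, 0.94, 0.94, 0.96]  # hypothetical validation accuracies
stopped_at = next(i for i, acc in enumerate(history) if stopper.step(acc))
# Training stops at epoch index 3: two consecutive epochs without improvement.
```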

Evaluation
Evaluation is a method used to test the classification performance of a model. The evaluation of the model in this research uses a confusion matrix that produces true positive (TP), false positive (FP), false negative (FN), and true negative (TN) values [27]. To evaluate the implemented model, we use several metrics: accuracy, sensitivity, specificity, precision, and F1-score. The accuracy formula is shown in (1), sensitivity in (2), specificity in (3), precision in (4), and the F1-score in (5).
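Assuming (1)-(5) follow the standard confusion-matrix definitions, the five metrics can be computed as follows (the counts shown are hypothetical, for illustration only):

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics: accuracy, sensitivity, specificity,
    precision, and F1-score, computed from TP, FP, FN, and TN counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)   # recall / true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1

# Hypothetical counts for illustration only (not the paper's results)
acc, sens, spec, prec, f1 = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```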

RESULTS AND DISCUSSION
The testing phase was carried out using Python as the programming language and PyTorch as the framework; we ran the program on Google Colaboratory due to its high performance. The research used a dataset of 13,808 images, consisting of 3,616 COVID-infected images and 10,192 non-infected images. Despite the fact that the dataset is highly imbalanced, the transfer learning approach can still produce quite decent results on the test set. The classification of lung X-ray images (as normal or COVID) on the testing dataset is shown in Table 2. The model with the highest F1-score is SWSL ResNet50, at 99.51%; in contrast, the model with the lowest F1-score is Inception ResNet V2, at 97.96%. Likewise, SWSL ResNet50 has the highest sensitivity at 99.61%, whereas Inception ResNet V2 has the lowest at 96.48%. The SWSL ResNet50 and ResNest50d models have the highest precision and specificity scores, at 99.41% and 98.33%, respectively. It can therefore be inferred that the SWSL ResNet50 model is the most effective model for X-ray lung image classification. In previous research, Musleh and Maghari [28], using the CheXNet approach, obtained an accuracy of 89.7%, compared to 99.28% in this study. Wang et al. [29], using the COVID-Net approach to detect COVID in X-ray images, obtained a sensitivity of 80%, compared to 99.61% here. In addition, Imaduddin et al. [11], using ResNet50, obtained 92.3% accuracy, 93% precision, 93% F1-score, 99% sensitivity, and 90.7% specificity, compared to this study's 99.28% accuracy, 99.41% precision, 99.51% F1-score, 99.61% sensitivity, and 98.33% specificity.

CONCLUSION
It is challenging to identify COVID-19 disease because of the enormous amount of data involved and its very uneven distribution. Several transfer learning approaches based on the residual network architecture are proposed in this paper and applied to data classification. Even though the residual network weights are based on the ImageNet dataset, which contains little medical data, the transfer learning method can produce good results when applied to identifying medical image data. With transfer learning, we do not have to start from scratch; we only have to change the last layer of our model architecture. When it came to identifying the two classes of lung X-ray data, our model performed well, with the SWSL ResNet50 model producing the best results: accuracy of 99.28%, precision of 99.41%, F1-score of 99.51%, sensitivity of 99.61%, and specificity of 98.33%.

Table 1.
Lung X-ray dataset

Table 2.
Classification results on the test set (the entire set of classification results from the training sessions conducted)