© 2024 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
The COVID-19 epidemic has had a profound impact on our lives. What makes this virus so lethal is its rapid exponential transmission rate between individuals. There is a scarcity of COVID-19 test kits in hospitals due to the significant increase in cases. Efficient and prompt screening of individuals with this condition is essential. This condition must be distinguished from other lung disorders, including pneumonia and tuberculosis. Deep learning provides innovative ways for analyzing medical pictures and serves as a valuable tool in helping radiologists, clinicians, and researchers examine large quantities of X-ray images, which are crucial for COVID-19 screening. In this paper, we propose a new deep learning model that utilises an attention gated technique to classify pneumonia, tuberculosis, and COVID-19. A trainable attention estimator has been included in conventional convolutional neural networks for the purpose of image classification. The suggested model obtained 90.16% accuracy in distinguishing between pneumonia and tuberculosis, 94.02% in distinguishing between pneumonia and COVID-19, and 98.27% in distinguishing COVID-19. Our novel deep learning model, utilizing an attention gated mechanism, accurately detects pneumonia, tuberculosis, and COVID-19 from X-ray pictures. This method tackles the urgent requirement for enhanced COVID-19 screening by providing a simplified option for effective diagnosis. Our methodology improves diagnostic processes by utilizing deep learning, offering substantial advantages for healthcare professionals and patients.
COVID-19, pneumonia, tuberculosis, deep learning, classification
Pneumonia and tuberculosis are two major diseases that affect millions of people around the world. In 2018 alone, 1.5 million people died of tuberculosis, million died of pneumonia in 2016. Every minute, at least two children die of pneumonia. 80% of pneumonia deaths accounted for children themselves [1].
Pneumonia is an infection of the lungs. An affected person has difficulty breathing because the lungs fill with fluid. Pneumonia can affect the young, the elderly, and immunocompromised people, and those who are weak and vulnerable are at higher risk. More than 2,400 children under five years old died because of pneumonia in 2015. On average, 120 million cases are recorded per year; among them, 14 million are severe. In 2016 alone, 880,000 children under five died of pneumonia, most of whom were less than two years old [2]. A person with a weak immune system is very likely to contract pneumonia, and people with heart disease, asthma, or emphysema are also more likely to be affected [3]. Pneumonia is divided into two types: bacterial and viral. Bacteria are the most common cause of community-acquired pneumonia in adults [4]. Bacterial pneumonia can be treated with antibiotics prescribed by a doctor. Viral pneumonia is caused by viruses, including some of the same viruses that cause colds and flu, as well as the coronavirus that causes COVID-19 [5]. Antibiotics do not work against viral pneumonia because they act only on bacterial infections, so the affected person must be treated based on the symptoms.
Tuberculosis was the primary cause of death in the United States of America during the twentieth century [4]. In 2018, over 1.5 million people died as a result of tuberculosis. Around two-thirds of all tuberculosis cases were recorded in eight countries, with India leading the way, followed by China, Indonesia, the Philippines, Pakistan, Nigeria, Bangladesh, and South Africa. In 2018, 205,000 children died from tuberculosis, out of 1.1 million people who became sick [6].
Tuberculosis is an infectious illness that mostly affects the lungs. It can also spread to other areas of the body, such as the brain and spine. Tuberculosis is caused mostly by the bacterium Mycobacterium tuberculosis [4]. Not everyone infected with tuberculosis becomes sick. There are two forms of tuberculosis: latent and active. A person with latent tuberculosis carries the germs in the body, but the immune system keeps them in check, and the person is not contagious. In active tuberculosis, the bacteria multiply and the disease is contagious. Latent tuberculosis can be readily cured by taking the medication prescribed by doctors [4]. Since active tuberculosis is infectious, precautions must be taken, such as avoiding contact with others, wearing a surgical mask during treatment, and covering one's mouth while coughing or sneezing. The most popular and commonly used diagnostic tool is the X-ray [4], which images parts of the body. Alternatives to X-rays include CT scans, MRI (Magnetic Resonance Imaging), and ultrasound scans. For more sophisticated tests, doctors prefer X-rays. CT scans may give accurate imaging, but they expose the body to far more radiation: a single CT scan is equal to 150 X-rays in terms of radiation [7]. For women who undergo imaging during pregnancy, there is also a risk of childhood cancer and leukemia [7]. Furthermore, many rural and semi-urban areas do not have CT scanners because they are expensive. X-rays are therefore preferable to CT scans.
Deep learning-based approaches have recently found critical use and a wide variety of applications in fields such as computer vision, natural language processing, and robotics. In computer vision-based applications, image recognition is a well-known challenge. The cumulative volume of global data was projected to reach 42 ZB by the end of 2020, according to data from the International Data Corporation (IDC) [1].
Medical image processing demands more precise and reliable predictions. In remote areas around the world, skilled health staff and physicians are in short supply. Computer-assisted diagnosis based on AI approaches is therefore emerging as a major aid [8, 9]. Deep learning and computer vision methods for diagnosing biomedical images have proven effective in delivering fast and accurate diagnoses that can exceed the precision of a reliable radiologist [10]. When a disease presents multiple symptoms, some of which may go unnoticed by the radiologist, deep learning-based solutions can help by extracting additional information from the X-ray images. Such solutions may not replace trained clinicians or doctors, but they provide extra information about the disease that might otherwise have gone unnoticed. The use and relevance of convolutional neural networks have grown significantly in recent years. The first key objective is model performance evaluation: to assess the performance of the proposed deep learning model in the classification and detection of respiratory disorders, notably tuberculosis, pneumonia, and COVID-19, using medical imaging data such as X-rays or CT scans. Chest X-ray examination is challenging for radiologists. X-rays of pneumonia patients can be ambiguous, and the complexity of identifying illness characteristics makes disease prognosis difficult [11]. Another goal is to compare the proposed model to existing state-of-the-art models using performance metrics such as accuracy, precision, recall, and F1 score.
Multi-class classification is a further goal: the model should classify accurately across multiple classes, allowing several respiratory disorders to be detected simultaneously from medical images. The work was carried out on the Montgomery Dataset [12], the Shenzhen Hospital dataset [12], and the Pneumonia Dataset [13]. The contributions of this paper are as follows:
The paper is structured in the following manner: Section 1 discusses necessity of the study. Section 2 analyses the background and related work which describes previous work alongside the background. This section also describes a complete analysis of the dataset used in this paper to train and test. Section 3 discusses the proposed methodology that was used to address the problem. This section describes the novelty of the proposed methodology, advantages while comparing it with existing techniques. Section 4 analyses the final results of the proposed methodology with the datasets used for testing and comparing with state-of-the-art methods. Section 5 has the conclusion and future scope of the paper.
Image findings are becoming more widely accepted as a valuable method for quickly diagnosing diseases that infect lungs like COVID-19, pneumonia, tuberculosis, etc. Because of its authenticity and wide range of applications, research on disease classification has gained considerable importance in recent years. Due to its limited use in commercial and medical image processing, computer vision, deep learning and image classification have attracted a large number of researchers to work on it.
2.1 Traditional approaches
Owing to readily accessible datasets and ongoing visual analysis in the medical image diagnosis, the feature extraction technique has improved over time. Barstugan et al. [14] used different feature matrices like Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Local Directional Pattern (LDP), Gray-Level Size Zone Matrix (GLSZM), and Discrete Wavelet Transform (DWT) algorithms [15], as feature extraction algorithms. The extracted features are classified to get output predictions using Support Vector Machines (SVM).
Chandra and Verma [16] employed five established image classifiers - Multilayer Perceptron, Logistic Regression, Random Forest, Sequential Minimal Optimisation (SMO), and Classification by Regression - to classify pneumonia in chest X-rays. Pathak et al. [17] used ensemble-based feature extraction techniques and from the extracted features, final classification was performed using a random forest classifier. The accuracy in extracting micro-scale level features on medical images is hampered by methods of restricted scale feature extraction and training.
2.2 Deep learning approaches
2.2.1 CNN based methods
Recent advances in deep learning models, as well as the accessibility of large datasets, have enabled trained algorithms in outperforming healthcare practitioners in a variety of medical imaging tasks like diabetic retinopathy detection, cancer recognition, haemorrhage diagnosis, arrhythmia detection, and many more. Convolution Neural Networks (CNN)-based algorithms are used to extract a high-level representation of epidemiological data. Pathak et al. [17] which is the latest work on the classification of COVID-19, used transfer learning in image dataset. Some of the latest works which use transfer learning for the classification of COVID-19 include Jaiswal et al. [18] which used DenseNet201 based transfer learning technique, Aslan et al. [19] proposed a hybrid model that contains Bidirectional Long Short-Term Memories (Bi LSTM) layer, which considers temporal properties for classification. Transfer learning of feed forward convolutional networks has the advantage of using very deep structures [20-22] and decoder functions in the autoencoder, which is then taken from the feed forward mechanism. Several approaches, such as VGG [23], Inception, and residual learning, have also been recommended to improve the discriminative potential of deep convolutions. Functions like stochastic depth [24], batch normalization [25] and dropout [26] have been initialized to prevent overfitting and to take advantage of regularization for convergence. The above-mentioned models were unable to capture crucial micro-scale patterns from chest X-ray images. More than 70% of the information would be conveyed in the form of a 2D image or a video. Deep learning-based models are created to derive valuable knowledge from these images and videos. ResNet [27], VGG [23], Xception [28, 29], MobileNet [30], Inception [31], and EfficientNet [32] are some of the most advanced deep learning models. 
These models are used in numerous computer vision problems, including digital image processing, image classification, face recognition, object detection, object tracking, and medical imaging, among many others.
Hence, deep learning-based approaches have gained potential in medical image diagnosis in the recent past. The deep learning methods in medical diagnosis [33-35] are developed to distinguish chest X-ray images from pneumonia's absence or presence. Amit Kumar Jaiswal et al. used Mask-RCNN, which finds results by configuring regional context [35]. Stephen et al. [33] proposed their architecture and used image augmentation algorithms to improve the CNN model's validation and classification accuracy. Liang and Zheng [34] used transfer learning methods and proposed an architecture that dilated convolutions to classify pneumonia images. Deep learning-based methods are used in study [36] to classify the image into different TB manifestation category. Cao et al. [37] used CNN for feature extraction. Qin et al. [38] compared different deep learning based systems like CAD4TB [20], LunitINSIGHT [39], and qXR [40] for tuberculosis classification. In study [41], a robust normalisation approach was developed to improve detection and classification performance. Many techniques, including artificial neural networks (ANN), support vector machines (SVM), K-nearest neighbour (KNN), ensemble classifiers, recurrent neural network (RNN) and long short-term memory (LSTM) were employed for reliable lung disease diagnosis. For the accurate classification of tuberculosis images, Iqbal et al. [42] introduced TB-DenseNet, a model comprised of five dual convolution blocks, a DenseNet-169 layer, and a feature fusion block. The model obtained a classification accuracy of 95%. A CNN based model, named CovNet30 was proposed by Agrawal et al. [43], for automatically diagnosing COVID-19 from chest x-ray images and achieved an accuracy of 92.74% for the classification.
2.2.2 Segmentation based methods
In recent times, image segmentation followed by classification [44-47] has been successfully adopted in medical image diagnosis. In this setting, an additional segmentation stage is added before the feed-forward classification mechanism. The segmentation stage crops out the important parts of the image and passes only useful information to the classification layers. Unsupervised learning is commonly used to produce region proposals for image classification, unlike object recognition, which relies on complex segmentation masks to generate region proposals. Generating segmentation masks requires training an extra segmentation model in addition to the classification model. This extra model, however, is unnecessary for image classification tasks, given the high complexity of producing segmentation masks.
Recent works on COVID-19 classification use a similar type of ensemble technique, grouping different pre-trained models for fine-grained accuracy. Covidx-Net [48] and InstaCovNet-19 [49] are examples of such models. To compensate for the lack of training data, the InstaCovNet-19 model employs a variety of pre-trained models such as ResNet101 [27], Xception [29], InceptionV3, MobileNet [30], and NASNet [50]. That model detects and classifies COVID-19 and pneumonia by detecting the anomalies these diseases cause in the infected person's chest X-ray images. Covidx-Net [48] is composed of seven distinct deep convolutional neural network architectures, including the second version of Google MobileNet and the Visual Geometry Group network (VGG19). Each convolutional neural network model can determine the patient's condition by analysing the normalised intensities of the X-ray image. Soft attention networks have undergone recent development, and soft attention modules construct feed-forward neural networks using residual attention networks. This is the approach adopted by the authors in this paper. An end-to-end attention module for convolutional neural network (CNN) architectures designed for image classification was proposed by Jetley et al. [51]. This module accepts as input a two-dimensional feature vector that represents the input image's intermediate representations at different phases of the CNN pipeline. In return, it generates a two-dimensional matrix of scores for each map. Incorporating this module alters the conventional CNN architecture. The CNN is then trained under the condition that classification uses a convex combination of the intermediate 2D feature vectors, parameterized by the score matrices. Iqbal et al. [42] introduced a novel TB-UNet architecture that effectively segments lung regions by combining an attention block (AB) and a dilated fusion block (DF). The TB-UNet achieved an accuracy of 97.70%. To enhance segmentation in biomedical image processing and address the bottleneck issue in U-Net models, Turk [52] introduced the RNGU-NET architecture, which integrates ResNet, a non-local block, and a gated attention block. This architecture achieved the highest reported accuracy of 98.56%.
2.3 Dataset used
2.3.1 Tuberculosis
Montgomery dataset: The MC collection was acquired in collaboration with the Department of Health and Human Services, Montgomery County, Maryland, United States of America [12]. The set contains 138 frontal chest X-rays obtained from the Montgomery County Tuberculosis Screening Network, of which 80 are normal cases and 58 show tuberculosis symptoms. The X-rays were acquired with a stationary Eureka X-ray machine (CR).
Shenzhen hospital X-ray set: The X-ray images in this dataset were acquired at Shenzhen No.3 Hospital in Shenzhen, Guangdong Province, China [12]. The X-rays were obtained during standard medical care at Shenzhen Hospital. This set comprises images in JPEG format: a total of 336 normal X-rays and 326 pathological X-rays with TB signs.
2.3.2 Pneumonia
Chinese dataset: The dataset contains 5,863 X-ray pictures in JPEG format categorised into two classes: Pneumonia and Normal. Retrospective chest X-ray images (anterior posterior) were obtained from a cohort of paediatric patients aged one to five years at Guangzhou Women and Children's Medical Centre in Guangzhou. The images were released online by Kermany et al. [13]. Chest X-ray imaging was performed at both sites as part of patients' normal health care. All chest X-rays for the research of chest X-ray pictures were initially inspected for quality control by removing any low-quality, corrupted, or illegible scans. Medical images were evaluated by two specialised doctors and scored before being approved for AI processing. The assessment collection was evaluated by a third professional to check for any score errors.
We divided the images into a train dataset and a test dataset, each containing 3 classes, namely Pneumonia, Tuberculosis, and Normal. The train set has 245 images per category and the test set has 105 images per category, giving a train-to-test ratio of 7:3. The images were randomly sampled from the above-mentioned sources, which are summarised in Table 1. Sample images from the training dataset for each class are shown in Figure 1.
Table 1. Dataset description
| Name | Pneumonia | Tuberculosis | Normal | Total |
|---|---|---|---|---|
| Montgomery Dataset [12] | - | 58 | 80 | 138 |
| Shenzhen Hospital [12] | - | 326 | 336 | 662 |
| Pneumonia Dataset [13] | 3906 | - | 1341 | 5247 |
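The per-class sampling described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `split_per_class`, the seed, and the placeholder file names are assumptions, while the pool sizes come from Table 1 and the 245/105 counts from the text.

```python
import random

def split_per_class(images_by_class, train_n=245, test_n=105, seed=0):
    """Randomly draw train_n + test_n images per class, then split them 7:3."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, paths in images_by_class.items():
        if len(paths) < train_n + test_n:
            raise ValueError(f"class {label!r} has only {len(paths)} images")
        sample = rng.sample(paths, train_n + test_n)  # sampling without replacement
        train[label] = sample[:train_n]
        test[label] = sample[train_n:]
    return train, test

# Placeholder file names (hypothetical); per-class pool sizes follow Table 1.
pools = {
    "pneumonia": [f"pneumonia_{i}" for i in range(3906)],
    "tuberculosis": [f"tb_{i}" for i in range(58 + 326)],
    "normal": [f"normal_{i}" for i in range(80 + 336 + 1341)],
}
train, test = split_per_class(pools)
print({label: (len(train[label]), len(test[label])) for label in pools})
```

Drawing the same number of images from every class keeps the train and test sets balanced even though the source datasets are not.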
Figure 1. Sample images from the training dataset
The approach proposed in this paper, shown in Figure 2, rests on the hypothesis that attention maps can be used to identify important image regions and amplify their influence while suppressing irrelevant regions. Jetley et al. [53] proposed this approach. A trainable attention estimator is integrated into a standard CNN for image classification. In a CNN, local feature vectors are obtained at intermediate stages, and the global feature vector is fed to the linear classification layer. A notion of compatibility is defined between these local feature vectors and the global feature vector. Attention-aware classification is then implemented by restricting the image classifier to use only a collection of local feature vectors weighted by their compatibility scores.
Incorporating attention mechanisms into deep learning models significantly impacts medical image classification, especially in diagnosing respiratory diseases such as COVID-19. Unlike traditional convolutional neural networks (CNNs), attention mechanisms empower models to dynamically focus on crucial regions within the input image while disregarding irrelevant areas. This selective attention is achieved through a trainable attention estimator integrated into the standard CNN architecture. The attention mechanism operates by assigning varying weights to different spatial locations within the input image, allowing the model to prioritize disease-specific features. During training, the attention mechanism learns to allocate higher weights to regions containing salient information related to the target disease, while attenuating the influence of background noise or non-diagnostic regions. This selective focus enables the model to capture subtle patterns and features indicative of specific respiratory conditions, thus enhancing the accuracy and interpretability of the classification process. By dynamically adjusting its focus based on the relevance of different image regions, the attention mechanism facilitates more efficient and reliable screening of COVID-19 cases while effectively distinguishing them from other respiratory illnesses like pneumonia and tuberculosis.
3.1 Attention module
Let $L^s=\left\{l_1^s, l_2^s, \ldots, l_n^s\right\}$ denote the set of feature vectors obtained from convolutional layer $s$, where $s \in\{1, \ldots, S\}$. Each $l_i^s$ is the vector of output activations at spatial location $i$ of the $n$ total spatial locations. The global feature vector $g$ is the output of the network's series of convolutional and nonlinear layers and has only to pass through the fully connected layers. This global feature vector $g$ can be used to give the original architecture's class score for that input.
$a_i^s=\frac{\exp \left(c_i^s\right)}{\sum_{j=1}^n \exp \left(c_j^s\right)}, \quad i \in\{1, \ldots, n\}$ (1)
First, the set of compatibility scores $C\left(L^s, g\right)=\left\{c_1^s, c_2^s, \ldots, c_n^s\right\}$ is computed. Compatibility scores are explained in the next section. Here $L^s$ is taken after a linear mapping of each $l_i^s$ to the dimensionality of $g$. The softmax operation shown in Eq. (1) is used to normalise the compatibility scores.
$g_a^s=\sum_{i=1}^n a_i^s \cdot l_i^s$ (2)
The normalised compatibility scores $A^s=\left\{a_1^s, a_2^s, \ldots, a_n^s\right\}$ are then used to produce a single vector $g_a^s$ for each layer $s$ by the simple element-wise weighted sum shown in Eq. (2). If a network is trained under the constraint that only $g_a^s$ can be used to classify the input image, then this resembles 'attention'.
The probabilities of class predictions $\left\{P_1, P_2, \ldots, P_T\right\}$, where $T$ denotes the number of target classes, can be computed from the attention-incorporating global vector $g_a$. The vector $g_a$ is computed first and then mapped to a $T$-dimensional vector before passing through a softmax layer. The network parameters can be learned through end-to-end training with a cross-entropy loss function.
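The softmax normalisation and weighted sum described in this section can be illustrated with a small plain-Python sketch. The function names and the toy numbers are assumptions made for illustration; in the actual model these operations would run on feature maps inside the network.

```python
import math

def softmax(scores):
    """Normalise compatibility scores so they are positive and sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(c - peak) for c in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(local_feats, scores):
    """Weighted sum of local feature vectors, weighted by normalised scores."""
    weights = softmax(scores)
    dim = len(local_feats[0])
    pooled = [sum(a * feat[d] for a, feat in zip(weights, local_feats))
              for d in range(dim)]
    return weights, pooled

# Toy layer: n = 3 spatial locations, 2-dimensional features (made-up numbers).
local_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.0, 0.0]
weights, g_a = attention_pool(local_feats, scores)
print(weights, g_a)
```

The location with the largest compatibility score receives the largest weight, so its feature vector dominates the pooled descriptor; with equal scores the result reduces to a plain average.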
3.2 Compatibility scores
The concatenation of the global and local image descriptors in a convolutional neural network can be replaced by an addition operation without loss of generality. This reduces the number of parameters in the attention unit. The resultant descriptor can then be mapped to the compatibility scores in a fully connected manner, using a weight vector that can learn the universal features relevant to the object categories in the dataset.
$c_i^s=\left\langle l_i^s, g\right\rangle, \quad i \in\{1, \ldots, n\}$ (3)
Alternatively, the dot product between $g$ and $l_i^s$ can be used as the compatibility score, as shown in Eq. (3). In this case, the relative magnitude of the scores depends on the alignment between $g$ and $l_i^s$ in the high-dimensional feature space and on the strength of activation of $l_i^s$.
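As a rough illustration of the dot-product compatibility score, the sketch below computes one score per spatial location. The function name and the toy vectors are assumptions; the local vectors are taken to already have the same dimensionality as the global vector.

```python
def compatibility_scores(local_feats, g):
    """Dot product between each local feature vector and the global vector g."""
    return [sum(l_d * g_d for l_d, g_d in zip(feat, g)) for feat in local_feats]

g = [1.0, 2.0]
# The third local vector is the first one scaled by 2 (made-up numbers).
local_feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
scores = compatibility_scores(local_feats, g)
print(scores)  # [1.0, 2.0, 2.0]
```

Doubling the activation strength of a local vector doubles its score, which shows why the magnitude of the activations, and not only their direction, influences the resulting attention weights.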
3.3 Architecture construction
The proposed architecture is divided into 7 blocks, as shown in Table 2. Every convolution operation is followed by batch normalization and a ReLU activation function. After each block there is a max pooling layer, except for block-7, where an average pooling layer is used. Attention-1 and Attention-2 are formed by performing gated attention operations between the outputs of block-7 and block-4, and between the outputs of block-7 and block-5, respectively.
Table 2. Model description block wise
| Block Number | Output Size | Conv |
|---|---|---|
| Block1 | $64 \times 64$ | $2 \times 2$ maxpool, stride 2 |
| Block2 | $32 \times 32$ | $2 \times 2$ maxpool, stride 2 |
| Block3 | $16 \times 16$ | $2 \times 2$ maxpool, stride 2 |
| Block4 | $8 \times 8$ | $2 \times 2$ maxpool, stride 2 |
| Block5 | $4 \times 4$ | $2 \times 2$ maxpool, stride 2 |
| Block6 | $2 \times 2$ | $2 \times 2$ maxpool, stride 2 |
| Block7 | $2 \times 2$ | $2 \times 2$ maxpool, stride 2 |
| G1: $2 \times 2$ averagepool, stride 1 | | |
| Attention-1: Output of Block7 with Output of Block4 | | |
| Attention-2: Output of Block7 with Output of Block5 | | |
| G1 + Attention-1 + Attention-2, fully connected, N-d SoftMax | | |
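The output sizes listed in Table 2 follow from repeatedly halving the $128 \times 128$ input. The sketch below reproduces that arithmetic; the helper names are illustrative, and the assumption that the convolutions themselves preserve the spatial size (so only pooling changes it) is not stated in the paper.

```python
def pool_output_size(size, kernel=2, stride=2):
    """Spatial size after an unpadded pooling layer."""
    return (size - kernel) // stride + 1

def pooled_sizes(input_size=128, num_pooled_blocks=6):
    """Sizes after each 2x2, stride-2 max pool, assuming convolutions keep the size."""
    sizes = []
    size = input_size
    for _ in range(num_pooled_blocks):
        size = pool_output_size(size)
        sizes.append(size)
    return sizes

sizes = pooled_sizes()
print(sizes)  # [64, 32, 16, 8, 4, 2]

# A 2x2 average pool with stride 1 collapses a 2x2 map to a single location.
print(pool_output_size(2, kernel=2, stride=1))  # 1

# Number of spatial locations n available to each attention module.
print(sizes[3] ** 2, sizes[4] ** 2)  # 64 16
```

Under these assumptions, Attention-1 weighs 64 spatial locations from Block4 and Attention-2 weighs 16 locations from Block5, while the average-pooled output serves as the global descriptor.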
The final layer is a fully connected layer that takes the sum of the average pooling layer output, Attention-1, and Attention-2, and applies an N-dimensional SoftMax operation. The visual attention module learns to place more attention on specific areas and less attention elsewhere. These areas are selected through the learning steps in each epoch. As training progresses, the error converges toward a local minimum. Regions are weighted according to their effect on the error, so that by the end of training the most informative areas carry the highest weights and the least informative the lowest. Using this visual attention mechanism, the model learns the patterns for each class. Dropout is used during training so that the model accurately distinguishes COVID-19 features in the training data. By randomly deactivating neurons during training, dropout prevents overreliance on specific disease manifestations present in the training set, encouraging the model to explore a broader range of disease-related patterns and variations. This helps the model capture nuanced differences in imaging characteristics among different diseases, such as ground-glass opacities in COVID-19 versus infiltrates and nodules in pneumonia and tuberculosis. Through dropout, the model learns more generalizable and robust features, enabling it to make accurate and reliable classifications while mitigating the risk of overfitting to training data artifacts. Table 3 presents the statistical RGB values of the dataset.
Table 3. Statistical RGB values of dataset
| Statistic | Red | Green | Blue |
|---|---|---|---|
| Mean | 0.485 | 0.456 | 0.406 |
| Standard Deviation | 0.229 | 0.224 | 0.225 |
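Channel statistics such as those in Table 3 are typically used to standardise each pixel before it enters the network. The sketch below shows that operation for a single channel value; the function name is an assumption, pixel values are assumed to be scaled to the range 0 to 1, and only the red-channel mean and standard deviation from Table 3 are used.

```python
def normalize_channel(value, mean, std):
    """Standardise one channel value: subtract the mean, divide by the std."""
    return (value - mean) / std

RED_MEAN, RED_STD = 0.485, 0.229  # red-channel statistics from Table 3

# A pixel at the channel mean maps to zero; brighter pixels map to positive values.
print(normalize_channel(0.485, RED_MEAN, RED_STD))
print(normalize_channel(1.0, RED_MEAN, RED_STD))
print(normalize_channel(0.0, RED_MEAN, RED_STD))
```

Applying the same operation per channel gives inputs that are roughly centred at zero with unit spread, which generally helps gradient-based training.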
The proposed work introduces several key innovations, including the integration of attention mechanisms into deep learning models for enhanced accuracy and interpretability. Unlike prior research which focused solely on COVID-19 detection, this project aims to discriminate between COVID-19, pneumonia, and tuberculosis, broadening its clinical utility. The novel deep learning architecture, incorporating attention-gated mechanisms, outperforms traditional CNNs by integrating trainable attention estimators. Overall, the project holds promise for clinical application, facilitating efficacious and expeditious screening of respiratory diseases.
The experimental findings show that the suggested technique is successful in categorising and identifying COVID-19, pneumonia, and TB from chest X-ray pictures. The suggested model demonstrated an overall accuracy of 90.16% in distinguishing between pneumonia and TB, 94.02% in distinguishing between pneumonia and COVID-19, and 98.27% in distinguishing COVID-19, surpassing baseline techniques and current approaches. Furthermore, the attention maps generated by proposed deep learning model provide valuable insights into the discriminative regions in the input images, aiding in the interpretation of model predictions. The high interpretability of our approach makes it suitable for integration into clinical decision support systems, where transparency and explainability are paramount.
4.1 Pre-processing
Different CNNs require different input image sizes. The datasets were pre-processed to resize the X-ray images to the standard input size of each network. The images are resized to $224 \times 224$ pixels for ResNet50, ResNet34, VGG16, VGG19 and MobileNetV2. The proposed model uses a gated visual attention based neural network, and several training experiments were run with different input sizes; of these, an input size of $128 \times 128$ gave the best results. The transformations applied to the dataset are:
Table 4. Model specific attributes and comparison between proposed methodology and other state-of-the-art methods
| Model | Parameters | Learning Rate | Input Size |
|---|---|---|---|
| ResNet50 | 23,514,179 | 0.01 | $224\times224$ |
| VGG16 | 134,272,835 | 0.01 | $224\times224$ |
| VGG19 | 139,582,53 | 0.01 | $224\times224$ |
| Proposed Methodology | 49,726,405 | 0.1 | $128\times128$ |
| ResNet34 | 21,286,211 | 0.01 | $224\times224$ |
| MobileNet-V2 | 2,227,715 | 0.01 | $224\times224$ |
4.2 Implementation details
The details of the experiments conducted to test the proposed architecture are presented here. PyTorch, an open-source framework, was used to implement the deep learning networks, with 100 epochs, a learning rate of 0.1, a batch size of 32, an image size of $128\times128$, and all weights initialized using Kaiming Normal. All the experiments were carried out on the free version of Google Colab with a 16 GB GPU.
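The stated hyperparameters can be collected in a small configuration object, as in the hedged sketch below. The class and function names are illustrative and not taken from the authors' code; the training-set size of 3 × 245 images comes from Section 2.3, and the $3 \times 3$ convolution with 64 input channels is a hypothetical layer used only to show how Kaiming normal initialisation scales the weights.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    epochs: int = 100
    learning_rate: float = 0.1
    batch_size: int = 32

def batches_per_epoch(num_images, batch_size):
    """Number of mini-batches per epoch, counting a final partial batch."""
    return math.ceil(num_images / batch_size)

def kaiming_normal_std(fan_in):
    """Standard deviation of Kaiming (He) normal init for ReLU layers: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)

cfg = TrainConfig()
train_images = 3 * 245  # three classes, 245 training images each
steps = batches_per_epoch(train_images, cfg.batch_size)
print(steps, cfg.epochs * steps)  # 23 2300

# Hypothetical 3x3 convolution with 64 input channels: fan_in = 3 * 3 * 64.
print(kaiming_normal_std(fan_in=3 * 3 * 64))
```

With this configuration the model sees roughly 2,300 parameter updates over the full run, which gives a sense of how short the training schedule is.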
All the pre-trained models described in Table 5 use an input size of $224\times224$ with a learning rate of 0.01, and the proposed model uses an input size of $128\times128$ with a learning rate of 0.1. Table 6 gives the per-class accuracy of the state-of-the-art methods.
Table 5. Comparative analysis of overall accuracy with the state-of-the-art methods
| Model | Accuracy (%) | Precision | Recall | F1 Score |
|---|---|---|---|---|
| ResNet50 | 86.03 | 0.86 | 0.86 | 0.86 |
| VGG16 | 89.52 | 0.90 | 0.90 | 0.90 |
| VGG19 | 86.67 | 0.87 | 0.87 | 0.87 |
| Proposed Methodology | 90.16 | 0.90 | 0.90 | 0.90 |
| ResNet34 | 89.52 | 0.90 | 0.90 | 0.90 |
| MobileNet-V2 | 87.94 | 0.88 | 0.88 | 0.88 |
Table 6. Per class accuracy evaluation with the state-of-the-art methods
| Model | Tuberculosis (%) | Pneumonia (%) | Normal (%) |
|---|---|---|---|
| ResNet50 | 78.09 | 97.14 | 82.85 |
| VGG16 | 83.80 | 96.19 | 88.57 |
| VGG19 | 79.04 | 98.09 | 82.85 |
| Proposed Model | 80.0 | 98.09 | 92.38 |
| ResNet34 | 81.90 | 98.09 | 88.57 |
| MobileNet-V2 | 80.95 | 97.14 | 85.71 |
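Because the test set is balanced at 105 images per class, the overall accuracy in Table 5 can be recovered from the per-class accuracies in Table 6. The sketch below performs this check for the proposed model; the function names are illustrative, and the per-class correct counts are inferred by rounding rather than reported in the paper.

```python
def correct_counts(per_class_acc, n_per_class=105):
    """Infer correctly classified images per class from percentage accuracies."""
    return {label: round(acc / 100 * n_per_class)
            for label, acc in per_class_acc.items()}

def overall_accuracy(per_class_acc, n_per_class=105):
    """Overall accuracy (%) for a balanced test set."""
    counts = correct_counts(per_class_acc, n_per_class)
    return 100 * sum(counts.values()) / (n_per_class * len(counts))

proposed = {"tuberculosis": 80.0, "pneumonia": 98.09, "normal": 92.38}
print(correct_counts(proposed))               # inferred counts: 84, 103, 97
print(round(overall_accuracy(proposed), 2))   # 90.16
```

The result agrees with the 90.16% reported for the proposed model, which suggests that the overall figure is the fraction of all 315 test images classified correctly.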
4.3 Classification results and analysis
4.3.1 Pneumonia and tuberculosis classification
The new model surpassed the existing models in terms of accuracy, F1 score, precision, and recall when trained on the same dataset. Among the existing models, VGG16 and ResNet34 achieved similar scores to our model.
The proposed model gave an accuracy of 80.0%, 98.09%, and 92.38% on tuberculosis, pneumonia, and normal classes, respectively. Our model classified pneumonia and normal more accurately than any another model. From Table 7, our observations showed that ResNet50 is weak in classifying tuberculosis. For tuberculosis ResNet34 leads with 1.90% more accurate results. For pneumonia and normal, our model produced far more accurate results. For normal, our model scored the highest accuracy compared to all other models.
From Table 7, all the models except our proposed model tend to overfit at a very early stage, whereas our model started with very low accuracy and kept learning with each epoch. We also observed that attention-based mechanisms learn slowly rather than overfitting the data within a small number of epochs.
The confusion matrices in Figure 3 give a detailed comparison of the state-of-the-art models against our proposed model. Our model performed competitively against all of the state-of-the-art models in these results.
Table 7. Comparative analysis of accuracy with respect to number of epochs
Model | Epoch | Train Loss | Train Accuracy (%) | Test Loss | Test Accuracy (%)
ResNet50 | 35 | 0.0683 | 98.10 | 0.5874 | 86.03
VGG16 | 66 | 0.0175 | 99.32 | 0.6913 | 89.52
VGG19 | 36 | 0.1408 | 93.20 | 0.5324 | 86.67
Proposed Model | 97 | 0.0101 | 99.86 | 0.6313 | 90.16
ResNet34 | 50 | 0.0070 | 99.86 | 0.6289 | 89.52
MobileNet-V2 | 36 | 0.0138 | 99.46 | 0.6115 | 87.94
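One simple way to inspect the figures in Table 7 is the gap between train and test accuracy at the reported epoch. The sketch below only rearranges the numbers from the table; the helper name `generalization_gap` is hypothetical, and the gap is a rough indicator at best, since each model is reported at a different epoch.

```python
# (train accuracy %, test accuracy %) as reported in Table 7.
results = {
    "ResNet50": (98.10, 86.03),
    "VGG16": (99.32, 89.52),
    "VGG19": (93.20, 86.67),
    "Proposed Model": (99.86, 90.16),
    "ResNet34": (99.86, 89.52),
    "MobileNet-V2": (99.46, 87.94),
}


def generalization_gap(train_acc, test_acc):
    """Train minus test accuracy, in percentage points."""
    return round(train_acc - test_acc, 2)


gaps = {model: generalization_gap(*accs) for model, accs in results.items()}
for model, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{model}: {gap:.2f} points")
```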
Table 8. Comparative analysis of accuracy-precision-recall with over-sampled and under-sampled models
Model | Accuracy (%) | Precision | Recall
Proposed Methodology | 90.16 | 0.91 | 0.90
Liebenlito et al. [54] (UnderSampled) | 86.0 | 0.75 | 0.86
Liebenlito et al. [54] (OverSampled) | 90.0 | 0.80 | 0.87
Figure 3. Confusion matrix for normal, pneumonia and tuberculosis of each model
Figure 4. Comparative analysis of ROC and AUC curves for normal, pneumonia and tuberculosis classes
Figure 5. Comparative analysis of precision-recall curve for normal, pneumonia and tuberculosis classes
From Figure 4, the area under the ROC curve is 0.94, 1.00, and 0.94 for normal, pneumonia, and tuberculosis, respectively. For the proposed model, both the micro-average and the macro-average area under the curve are 0.96. For pneumonia, the area under the curve is 1.00, the highest among all the state-of-the-art models compared. Every curve has an area above 0.90, a level of discrimination that is desirable for medical images.
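As a minimal sketch of how such values arise, the area under a one-vs-rest ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, and the macro-average is the unweighted mean over classes. The toy labels and scores below are hypothetical; only the three per-class AUC values come from the text.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_average(values):
    """Unweighted mean of the per-class values."""
    return sum(values) / len(values)


# Hypothetical one-vs-rest example: 1 = target class, 0 = any other class.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Per-class AUC values reported for the proposed model.
reported = {"normal": 0.94, "pneumonia": 1.00, "tuberculosis": 0.94}
macro_auc = macro_average(list(reported.values()))
print(f"toy AUC = {toy_auc:.2f}, macro-average AUC = {macro_auc:.2f}")
```

The micro-average, by contrast, pools the decisions of all classes before computing a single curve, so it cannot be recovered from the per-class areas alone.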
We also evaluated our model using precision-recall curves. From Figure 5, the areas under the precision-recall curves are 0.86, 0.99, and 0.88 for normal, pneumonia, and tuberculosis, respectively.
There are very few deep learning works that address pneumonia and tuberculosis together. One such work is by Liebenlito et al. [54], who used simple convolutional networks to classify pneumonia and TB; the comparison can be found in Table 8. Sample output maps from the proposed model can be found in Figure 6.
Figure 6. Sample output results for normal, pneumonia and tuberculosis classes
4.3.2 Pneumonia and covid classification
We also extended our research by applying the same architecture to the classification of pneumonia and COVID-19. We used data collected from different sources, such as the COVID-19 image data collection by Cohen et al. [55] and the SIRM COVID-19 database [56]. Our data has three categories: pneumonia, COVID-19, and healthy images. The dataset is divided into train and test splits, which contain 1850 and 463 images per category, respectively.
Notable research on this classification task includes the work of Nishio et al. [57] and Luján-García et al. [58]. Nishio et al. [57] used a global average pooling (GAP) layer, a fully-connected layer, and a dropout layer along with the state-of-the-art VGG-19 model, omitting the activation function for brevity, and achieved an accuracy of 83.7%. Luján-García et al. [58] used Xception as the base network and added extra convolutions and GAP to the exit flow of the Xception network. Chandra et al. [59] used radiomic texture descriptors derived from CXR images to classify COVID-19 infected patients. Ozturk et al. [60] used the You Only Look Once (YOLO) real-time object detection system to find COVID-19 infected cases. The accuracy comparisons discussed here can be found in Table 9.
Table 9. Comparative analysis of accuracy with other state-of-art methods
Study | Accuracy | Method
Nishio et al. [57] | 83.7% | VGG-19+GAP
Luján-García et al. [58] | 91% | Xception (Modified)
Chandra et al. [59] | 93.41% | Majority vote based (ensemble methods)
Ozturk et al. [60] | 87.02% | DarkNet
Proposed Methodology | 94.02% | Attention
The per-class accuracy for this classification was 96.96%, 93.30%, and 91.79% for the covid, normal, and pneumonia classes, respectively. Figure 7 shows the same results as a confusion matrix.
From Figure 8, our proposed model achieved an area under the curve of 1.0 for all classes in the pneumonia and covid classification.
We also evaluated our model using precision-recall curves for the pneumonia and covid classification. From Figure 9, the areas under the precision-recall curves are 0.99, 1.00, and 1.00 for normal, pneumonia, and covid, respectively.
Figure 7. Confusion matrix for covid, normal, and pneumonia classes
Figure 8. Comparative analysis of ROC and AUC curves for covid, pneumonia, and normal classes
Figure 9. Comparative analysis of precision-recall curve for covid, pneumonia, and normal classes
4.3.3 Covid classification
We expanded our study on COVID-19 by performing a classification analysis between images of COVID-19 patients and healthy individuals, motivated by the current global pandemic. The data consists of two categories: COVID-19 images and healthy images. Our dataset is split into training and testing sets, with 1850 images in the training set and 463 images in the testing set for each category. Hemdan et al. [61] employed seven distinct DCNN architectures to detect individuals with COVID-19. Panwar et al. [62] utilised a basic convolutional neural network (nCOVnet) for this task. Narin et al. [63] suggested utilising five pre-trained deep convolutional networks (ResNet50, ResNet101, ResNet152, InceptionV3, and Inception-ResNetV2) for COVID-19 detection. Maghdid et al. [64] applied a customised pre-trained AlexNet model to a dataset of prepared X-ray and CT scan images for COVID-19 detection. Table 10 compares the accuracy of all the models described here with that of the suggested model.
Table 10. Comparative analysis of accuracy with respect to various state-of-the-art algorithms
Study | Accuracy | Algorithm
Ozturk et al. [60] | 98.08% | DarkNet
Hemdan et al. [61] | 90.00% | All state-of-the-art models like VGG, ResNet, Inception etc.
Panwar et al. [62] | 88.10% | nCOVnet
Chandra et al. [59] | 98.06% | Majority vote based (ensemble methods)
Narin et al. [63] | 98.00% | ResNet, Inception
Maghdid et al. [64] | 94.00% | AlexNet with Modified CNN
Proposed Methodology | 98.27% | Attention
The per-class accuracy of the covid classification is 96.96% and 99.57% for the covid and normal classes, respectively. Figure 10 shows the same results as a confusion matrix.
From Figure 11, the proposed model achieved an area under the curve of 1.0 for all classes in the covid classification.
Figure 10. Confusion matrix of covid classification
Figure 11. Comparative analysis of ROC and AUC curve for covid classification
Figure 12. Comparative analysis of precision-recall curve for covid classification
We also evaluated our model using precision-recall curves for the covid classification. From Figure 12, the area under the precision-recall curve is 1.00 for both the normal and covid classes.
4.4 Statistical comparison
Depending on the type of comparison, we used t-tests or ANOVA tests to statistically compare our approach with state-of-the-art works. More precisely, we compared our method's performance metrics (accuracy, precision, recall, and F1-score) with those reported in the relevant literature for comparable tasks (i.e., respiratory illness classification and detection from medical images).
The average classification accuracies of three classical classifiers varied significantly, with a p-value < 0.001 from an n-way ANOVA test. A p-value of less than 0.001 signifies statistically significant differences among the classification levels of the indicated DL models component. Our proposed model classifiers showed the most significant difference compared to other combinations of DL models and classifiers. The ResNet-34 model using an ensemble of subspace discriminant classifiers achieved second place. Table 7's results indicate that ResNet-34 required almost double the processing time compared to the suggested DL model.
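The idea behind such a test can be sketched with a one-way ANOVA, a simpler stand-in for the n-way analysis described above. The per-run accuracies below are hypothetical toy numbers, not results from this study, and the sketch stops at the F statistic; converting it to a p-value requires the F distribution, which a statistics library would provide.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical accuracies (%) of three models over three runs each.
toy_runs = [
    [86.0, 87.0, 88.0],
    [87.0, 88.0, 89.0],
    [90.0, 91.0, 92.0],
]
f_stat, df_b, df_w = one_way_anova_f(toy_runs)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F statistic means the differences between the model means are large relative to the run-to-run variation within each model, which is what a small p-value reflects.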
4.5 Discussion
We presented a practical approach for categorising images of COVID-19, pneumonia, and TB using pre-trained deep learning models. The main advantage of the features produced by these models is that they autonomously identify the most distinguishing parameters for categorising the original images, eliminating the need for manual feature extraction during model development. The deep features (DF) were derived from models trained on a wide range of generic images spanning over 1000 classes, rather than being specifically trained on COVID-19 images. The suggested model then developed its own distinct features or descriptors to effectively differentiate between the various classes of images. The examination of the DL models revealed an accuracy of 90.16% in diagnosing pneumonia and tuberculosis, 94.02% in classifying pneumonia and COVID-19, and 98.27% in classifying COVID-19 and TB. The statistical analysis of the data revealed substantial differences in the performance of the classification models, with p-values < 0.001 for all tests, which was supported by our n-way ANOVA investigation. Each category has 105 images. The results indicate that it typically requires many attempts to identify the key aspects in the images that were evaluated. Once the findings were statistically validated, we assessed the computing demands necessary to extract deep features from all 245 images for each category in the test set. Our model achieved the best performance on the existing collection of images, running on a standard Core i5 GPU PC with 16 GB RAM at a speed of around 0.19 seconds per image. The computational costs of the other models varied significantly. Our deep learning model took only around one-fifth of the average time of all models in the comparison table, confirming the earlier assessment of it being the top-performing model.
The specificity values for distinguishing between COVID-19 and pneumonia and for categorising COVID-19 and TB are 94.02% and 98.57%, respectively. These two classes were the most accurately classified among all classes examined in this study. Detecting pneumonia, TB, and COVID-19 is crucial, especially in low- and middle-income countries where tuberculosis has a higher mortality rate than COVID-19. Tuberculosis (TB) differs from other types of pneumonia and normal pneumonia by having two distinct clusters (Figures 7 and 10), which results in greater separation between samples within the TB class. The analysis focused on our model to assess the performance of our deep learning network across several classification datasets. The similarity in symptoms such as fever, exhaustion, and cough between COVID-19 and TB highlights the importance of this stage in the study. The trained DL model had an error rate of 1.7% in discriminating between COVID-19 and TB. The main challenge was differentiating between viral and bacterial pneumonia, with viral pneumonia being the most commonly mistaken category. The accuracy, recall, specificity, F-score, and AUC measures of our proposed model supported these findings. The metrics confirm that the proposed model has the potential to be an effective and reliable computer-aided diagnosis (CAD) tool for identifying COVID-19 and TB. Upon visually examining the t-SNE feature projections in the first two dimensions, there was significant overlap between COVID-19 and TB. We examined the classification performance of our proposed model for distinguishing between two classes (COVID-19 and normal) as well as between three classes (COVID-19, normal, and TB). Both sets of tests show that COVID-19 is completely different from both normal conditions and TB.
We contrasted our model outputs, which were based on the efficacy of our recommended combination of deep learning and machine learning models, with the performance results provided by several research organisations. Our proposed COVID-19 detection approach outperformed the majority of models in previous studies, as shown in Table 11. Each model assessed a unique dataset, testing split, and class. The comparison of these models is just illustrative but does showcase the potential of our strategy, which integrates DF with simple classifiers for this task.
Table 11. Comparison to the previous literature on COVID-19 detection with x-ray images
Study | Dataset | Evaluation Method | Techniques Used | Detection Accuracy
Narin et al. [63] | 2 class: 50 COVID-19 / 50 normal | 5-fold CV | Transfer learning with ResNet50 and InceptionV3 | 98%
Panwar et al. [62] | 2 class: 142 COVID-19 / 142 normal | Holdout 30% | nCOVnet CNN | 88%
Hemdan et al. [61] | 3 class: 219 COVID-19 / 1341 normal / 1345 pneumonia viral | Holdout 27% | 2D curvelet transform, chaotic salp swarm algorithm (CSSA), EfficientNet-B0 | 99%
Ozturk et al. [60] | 3 class: 125 COVID-19 / 500 normal / 500 pneumonia | 5-fold CV | DarkCovidNet CNN | 87.2%
Proposed Methodology | Classification of COVID-19 / normal / pneumonia bacterial / pneumonia viral / tuberculosis | Attention-vector image classification | CNN | 98.27%
We examined the classification accuracy for two-class (COVID-19 vs. normal) and three-class (COVID-19 vs. normal vs. TB) scenarios by utilising a combination of subspace discriminants and ResNet-50. Both sets of tests show that COVID-19 is completely different from both normal conditions and TB. We evaluated the performance of our proposed deep learning and machine learning model combination and compared it to the performance findings of other research groups, as shown in Table 11. Our proposed COVID-19 detection pipeline outperformed the state-of-the-art models mentioned in the literature, as seen in Table 11.
The study is limited by a very small number of COVID-19 and TB images; however, it is the largest compared to prior work. Additional COVID-19 and TB images are required to enhance the robustness of the proposed model.
Several significant advancements are shown by the suggested method of using Visual Attention Gated Networks (VAGN) for the categorization and detection of COVID-19, pneumonia, and TB. The model shows improved efficiency and accuracy in identifying various respiratory disorders using medical imaging data, such as X-rays or CT scans, by utilizing the power of deep learning and attention processes. By focusing on pertinent areas of the pictures, the model is able to improve interpretability and minimize computational overhead through the application of attention processes. Moreover, the capacity for multi-class categorization permits the concurrent identification of several respiratory ailments, simplifying the diagnostic procedure and perhaps assisting prompt intervention and therapy. All things considered, the method shows promise in using artificial intelligence to diagnose respiratory disorders quickly and accurately, improving patient outcomes and streamlining the healthcare system.
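How an attention mechanism focuses on pertinent image regions can be sketched in a few lines. The example follows the general dot-product formulation of trainable attention in Jetley et al. [51] and is not necessarily the exact estimator used in the proposed network: each local feature vector is scored against a global feature vector, the scores are normalised with a softmax, and the local features are pooled with the resulting weights. The function names and the tiny two-dimensional features are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(local_features, global_feature):
    """Weight local feature vectors by their compatibility with a global one."""
    # Compatibility score: dot product between each local vector and the global vector.
    scores = [sum(l * g for l, g in zip(vec, global_feature))
              for vec in local_features]
    weights = softmax(scores)
    dim = len(global_feature)
    # Attention-weighted sum of the local vectors.
    pooled = [sum(w * vec[d] for w, vec in zip(weights, local_features))
              for d in range(dim)]
    return weights, pooled


# Hypothetical example: two spatial locations with 2-dimensional features.
local = [[1.0, 0.0], [0.0, 1.0]]
global_vec = [1.0, 0.0]
weights, pooled = attention_pool(local, global_vec)
print("attention weights:", [round(w, 3) for w in weights])
```

The location whose features align with the global descriptor receives the larger weight, and the attention weights themselves can be visualised as a map over the image, which is the source of the interpretability benefit described above.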
Developments in deep learning models for image classification and detection are essential for improving medical imaging. These developments will enhance the ability to identify diseases such as COVID-19, pneumonia, and tuberculosis from X-ray images at early stages and to locate them accurately. In this study, a convolutional neural network incorporating an attention gated mechanism was introduced for image classification in order to identify COVID-19, pneumonia, and tuberculosis. To assess the performance of our proposed model, it was compared against several other deep learning models: ResNet50, ResNet34, VGG16, VGG19, and MobileNetV2. The datasets included the Montgomery Dataset, the Shenzhen Hospital Dataset, and the Pneumonia Dataset. A range of scores was obtained, including accuracy, recall, precision, and AUC, all of which illustrate the model's robustness in comparison to the most recent methods. In differentiating between tuberculosis and pneumonia, the proposed model achieved an accuracy of 90.16%, together with an F1 score of 0.90, an AUC score of 0.96, a precision of 0.91, and a recall of 0.90. Furthermore, the accuracy of the proposed model in differentiating between pneumonia and COVID-19 was 94.02%, while its accuracy in distinguishing COVID-19 was 98.27%. Compared with many existing advanced deep learning models, our approach demonstrated superior performance in accurately classifying COVID-19, thereby effectively detecting the disease. For future work, high-power object detection algorithms can be used to detect the exact area of infection in images, including COVID-19 radiographic chest images, by using deep learning with transfer learning [65]. In addition, object detection algorithms can be employed to determine the exact spread and amount of infection present through pixel density.
In future, it would be fascinating to see methods for more effectively estimating the weights associated with various models, as well as a model that makes forecasts while taking into account the patient's history.
[1] Liu, J.E., An, F.P. (2020). Image classification algorithm based on deep learning-kernel function. Scientific Programming, 2020(1): 7607612. https://doi.org/10.1155/2020/7607612
[2] Kim, S.J., Medina, M., Zhong, L., Chang, J. (2023). Factors associated with in-hospital death among pneumonia patients in US hospitals from 2016~2019. International Journal of Health Policy and Management, 12. https://doi.org/10.34172/ijhpm.2023.7390
[3] Johnson, S., Wells, D.H. (2019). Viral pneumonia: Symptoms, risk factors, and more. https://www.healthline.com/health/viral-pneumonia.
[4] Tawfick, M.M., Badawy, M.S.E., Taleb, M.H., Menofy, N.G.E. (2023). Tuberculosis diagnosis and detection of drug resistance: A comprehensive updated review. Journal of Pure & Applied Microbiology, 17(4). https://doi.org/10.22207/JPAM.17.4.56
[5] Watson, K., Reoch, J., Heales, L.J., Fernando, J., Tan, E., Smith, K., Austin, D., Divanoglou, A. (2022). The incidence and characteristics of ventilator-associated pneumonia in a regional nontertiary Australian intensive care unit: A retrospective clinical audit study. Australian Critical Care, 35(3): 294-301. https://doi.org/10.1016/j.aucc.2021.04.004
[6] Williams, P.M. (2024). Tuberculosis—United States, 2023. MMWR. Morbidity and Mortality Weekly Report, 73. https://www.who.int/news-room/fact-sheets/detail/tuberculosis.
[7] Davies, H.E., Wathen, C.G., Gleeson, F.V. (2011). The risks of radiation exposure related to diagnostic imaging and how to minimise them. BMJ, 342: d947. https://doi.org/10.1136/bmj.d947
[8] Kallianos, K., Mongan, J., Antani, S., Henry, T., Taylor, A., Abuya, J., Kohli, M. (2019). How far have we come? Artificial intelligence for chest radiograph interpretation. Clinical Radiology, 74(5): 338-345. https://doi.org/10.1016/j.crad.2018.12.015
[9] Liu, N., Wan, L., Zhang, Y., Zhou, T., Huo, H., Fang, T. (2018). Exploiting convolutional neural networks with deeply local description for remote sensing image classification. IEEE Access, 6: 11215-11228. https://doi.org/10.1109/ACCESS.2018.2798799
[10] Naicker, S., Plange-Rhule, J., Tutt, R.C., Eastwood, J.B. (2009). Shortage of healthcare workers in developing countries—Africa. Ethnicity & Disease, 19: 60-64.
[11] Hashmi, M.F., Katiyar, S., Keskar, A.G., Bokde, N.D., Geem, Z.W. (2020). Efficient pneumonia detection in chest xray images using deep transfer learning. Diagnostics, 10(6): 417. https://doi.org/10.3390/diagnostics10060417
[12] Jaeger, S., Candemir, S., Antani, S., Wáng, Y.X.J., Lu, P.X., Thoma, G. (2014). Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quantitative Imaging in Medicine and Surgery, 4(6): 475-477. https://doi.org/10.3978/j.issn.2223-4292.2014.11.20
[13] Kermany, D. (2018). Labeled optical coherence tomography (oct) and chest x-ray images for classification. Mendeley Data. https://doi.org/10.1371/journal.pone.0256630.g001
[14] Barstugan, M., Ozkaya, U., Ozturk, S. (2020). Coronavirus (COVID-19) classification using CT images by machine learning methods. arXiv preprint arXiv:2003.09424. https://doi.org/10.48550/arXiv.2003.09424
[15] Barstugan, M., Ozkaya, U., Ozturk, S. (2020). Coronavirus (COVID-19) classification using CT images by machine learning methods. arXiv preprint arXiv:2003.09424. https://doi.org/10.48550/arXiv.2003.09424
[16] Chandra, T.B., Verma, K. (2020). Pneumonia detection on chest X-ray using machine learning paradigm. In Proceedings of 3rd International Conference on Computer Vision and Image Processing, Singapore, pp. 21-33. https://doi.org/10.1007/978-981-32-9088-4_3
[17] Pathak, Y., Shukla, P.K., Tiwari, A., Stalin, S., Singh, S. (2022). Deep transfer learning based classification model for COVID-19 disease. IRBM, 43(2): 87-92. https://doi.org/10.1016/j.irbm.2020.05.003
[18] Jaiswal, A., Gianchandani, N., Singh, D., Kumar, V., Kaur, M. (2020). Classification of the COVID-19 infected patients using DenseNet201 based deep transfer learning. Journal of Biomolecular Structure & Dynamics, 39(15): 5682-5689. https://doi.org/10.1080/07391102.2020.1788642
[19] Aslan, M.F., Unlersen, M.F., Sabanci, K., Durdu, A. (2021). CNN-based transfer learning–BiLSTM network: A novel approach for COVID-19 infection detection. Applied Soft Computing, 98: 106912. https://doi.org/10.1016/j.asoc.2020.106912
[20] He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 770-778. https://doi.org/10.1109/CVPR.2016.90
[21] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A. (2015). Going deeper with convolutions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, pp. 1-9. https://doi.org/10.1109/CVPR.2015.7298594
[22] Murphy, K., Habib, S.S., Zaidi, S.M.A., Khowaja, S., Khan, A., Melendez, J., Scholten, E.T., Amad, F., Schalekamp, S., Verhagen, M., Philipsen, R.H.H.M., Meijers, A., van Ginneken, B. (2020). Computer aided detection of tuberculosis on chest radiographs: An evaluation of the CAD4TB v6 system. Scientific Reports, 10(1): 5492. https://doi.org/10.1038/s41598-020-62148-y
[23] Simonyan, K., Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. https://doi.org/10.48550/arXiv.1409.1556
[24] Huang, G., Sun, Y., Liu, Z., Sedra, D., Weinberger, K.Q. (2016). Deep networks with stochastic depth. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, Proceedings, Part IV 14. Springer: Berlin/Heidelberg, Germany, pp. 646-661. https://doi.org/10.1007/978-3-319-46493-0_39
[25] Ioffe, S., Szegedy, C. (2015). Batch Normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015. arXiv preprint arXiv:1502.03167. https://doi.org/10.48550/arXiv.1502.03167
[26] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1): 1929-1958.
[27] He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 770-778. https://doi.org/10.1109/CVPR.2016.90
[28] Hashmi, M.F., Katiyar, S., Hashmi, A.W., Keskar, A.G. (2021). Pneumonia detection in chest X-ray images using compound scaled deep learning model. Automatika: Časopis za Automatiku, Mjerenje, Elektroniku, Računarstvo i Komunikacije, 62(3-4): 397-406. https://doi.org/10.1080/00051144.2021.1973297
[29] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, pp. 1800-1807. https://doi.org/10.1109/CVPR.2017.195
[30] Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861. https://doi.org/10.48550/arXiv.1704.04861
[31] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A. (2015). Going deeper with convolutions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, pp. 1-9. https://doi.org/10.1109/CVPR.2015.7298594
[32] Tan, M., Le, Q.V. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. https://doi.org/10.48550/arXiv.1905.11946
[33] Stephen, O., Sain, M., Maduh, U.J., Jeong, D.U. (2019). An efficient deep learning approach to pneumonia classification in healthcare. Journal of Healthcare Engineering, 2019(1): 4180949. https://doi.org/10.1155/2019/4180949
[34] Liang, G., Zheng, L. (2020). A transfer learning method with deep residual network for pediatric pneumonia diagnosis. Computer Methods and Programs in Biomedicine, 187: 104964. https://doi.org/10.1016/j.cmpb.2019.06.023
[35] Jaiswal, A.K., Tiwari, P., Kumar, S., Gupta, D., Khanna, A., Rodrigues, J.J. (2019). Identifying pneumonia in chest X-rays: A deep learning approach. Measurement, 145: 511-518. https://doi.org/10.1016/j.measurement.2019.05.076
[36] Rajpurkar, P., O’Connell, C., Schechter, A., Asnani, N., Li, J., Kiani, A., Ball, R.L., Mendelson, M., Maartens, G., van Hoving, D.J., Griesel, R., Ng, A.Y., Boyles, T.H., Lungren, M.P. (2020). CheXaid: Deep learning assistance for physician diagnosis of tuberculosis using chest x-rays in patients with HIV. NPJ Digital Medicine, 3(1): 115. https://doi.org/10.1038/s41746-020-00322-2
[37] Cao, Y., Liu, C., Liu, B., Brunette, M.J., Zhang, N., Sun, T., Zhang, P., Peinado, J., Garavito, E.S., Garcia, L.L., Curioso, W.H. (2016). Improving tuberculosis diagnostics using deep learning and mobile health technologies among resource-poor and marginalized communities. In 2016 IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), Washington, DC, USA, pp. 274-281. https://doi.org/10.1109/CHASE.2016.18
[38] Qin, Z.Z., Sander, M.S., Rai, B., Titahong, C.N., Sudrungrot, S., Laah, S.N., Adhikari, L.M., Carter, E.J., Puri, L., Codlin, A.J., Creswell, J. (2019). Using artificial intelligence to read chest radiographs for tuberculosis detection: A multi-site evaluation of the diagnostic accuracy of three deep learning systems. Scientific Reports, 9(1): 15000. https://doi.org/10.1038/s41598-019-51503-3
[39] Hwang, E.J., Park, S., Jin, K.N., Kim, J.I., Choi, S.Y., Lee, J.H., Goo, J.M., Aum, J., Yim, J.J., Park, C.M. (2018). Development and validation of a deep learning–based automatic detection algorithm for active pulmonary tuberculosis on chest radiographs. Clinical Infectious Diseases, 69(5): 739-747. https://doi.org/10.1093/cid/ciy967
[40] Putha, P., Tadepalli, M., Reddy, B., Raj, T., Chiramal, J.A., Govil, S., Sinha, N., Manjunath, K.S., Reddivari, S., Jagirdar, A., Rao, P., Warier, P. (2018). Can artificial intelligence reliably report chest x-rays? Radiologist validation of an algorithm trained on 2.3 million x-rays. arXiv preprint arXiv:1807.07455. https://doi.org/10.48550/arXiv.1807.07455
[41] Goyal, S., Singh, R. (2023). Detection and classification of lung diseases for pneumonia and COVID-19 using machine and deep learning techniques. Journal of Ambient Intelligence and Humanized Computing, 14(4): 3239-3259. https://doi.org/10.1007/s12652-021-03464-7
[42] Iqbal, A., Usman, M., Ahmed, Z. (2023). Tuberculosis chest X-ray detection using CNN-based hybrid segmentation and classification approach. Biomedical Signal Processing and Control, 84: 104667. https://doi.org/10.1016/j.bspc.2023.104667
[43] Agrawal, R., Sarkar, H., Prasad, A.O., Sahoo, A.K., Vidyarthi, A., Barik, R.K. (2023). Exploration of deep neural networks and effect of optimizer for pulmonary disease diagnosis. SN Computer Science, 4(5): 471. https://doi.org/10.1007/s42979-023-01940-9
[44] Amyar, A., Modzelewski, R., Li, H., Ruan, S. (2020). Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: Classification and segmentation. Computers in Biology and Medicine, 126: 104037. https://doi.org/10.1016/j.compbiomed.2020.104037
[45] Zhou, Y., He, X., Huang, L., Liu, L., Zhu, F., Cui, S., Shao, L. (2019). Collaborative learning of semi-supervised segmentation and classification for medical images. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 2074-2083. https://doi.org/10.1109/CVPR.2019.00218
[46] Das, S., Kharbanda, K., Suchetha, M., Raman, R., Dhas, E. (2021). Deep learning architecture based on segmented fundus image features for classification of diabetic retinopathy. Biomedical Signal Processing and Control, 68: 102600. https://doi.org/10.1016/j.bspc.2021.102600
[47] Priya, T.S., Ramaprabha, T. (2021). Resnet based feature extraction with decision tree classifier for classification of mammogram images. Turkish Journal of Computer and Mathematics Education (TURCOMAT), 12(2): 1147-1153.
[48] Hemdan, E.E.D., Shouman, M.A., Karar, M.E. (2020). Covidx-net: A framework of deep learning classifiers to diagnose COVID-19 in x-ray images. arXiv preprint arXiv:2003.11055. https://doi.org/10.48550/arXiv.2003.11055
[49] Gupta, A., Gupta, S., Katarya, R. (2021). InstaCovNet-19: A deep learning classification model for the detection of COVID-19 patients using Chest X-ray. Applied Soft Computing, 99: 106859. https://doi.org/10.1016/j.asoc.2020.106859
[50] Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V. (2018). Learning transferable architectures for scalable image recognition. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp. 8697-8710. https://doi.org/10.1109/CVPR.2018.00907
[51] Jetley, S., Lord, N.A., Lee, N., Torr, P.H. (2018). Learn to pay attention. arXiv preprint, arXiv:1804.02391. https://doi.org/10.48550/arXiv.1804.02391
[52] Turk, F. (2024). RNGU-NET: A novel efficient approach in segmenting tuberculosis using chest X-Ray images. PeerJ Computer Science, 10: e1780. https://doi.org/10.7717/peerj-cs.1780.
[53] Jetley, S., Lord, N.A., Lee, N., Torr, P.H. (2018). Learn to pay attention. arXiv preprint, arXiv:1804.02391. https://doi.org/10.48550/arXiv.1804.02391
[54] Liebenlito, M., Irene, Y., Hamid, A. (2020). Classification of tuberculosis and pneumonia in human lung based on chest x-ray image using convolutional neural network. InPrime: Indonesian Journal of Pure and Applied Mathematics, 2(1): 24-32. https://doi.org/10.15408/inprime.v2i1.14545
[55] Cohen, J.P., Morrison, P., Dao, L. (2020). COVID-19 image data collection. arXiv preprint arXiv:2003.11597. https://doi.org/10.48550/arXiv.2003.11597
[56] López, C.E.B. (2024). Design of an application to detect COVID-19 using convolutional neural networks and X-ray images. https://www.sirm.org/en/category/articles/COVID-19-database/.
[57] Nishio, M., Noguchi, S., Matsuo, H., Murakami, T. (2020). Automatic classification between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray image: Combination of data augmentation methods. Scientific Reports, 10(1): 17532. https://doi.org/10.1038/s41598-020-74539-2
[58] Luján-García, J.E., Moreno-Ibarra, M.A., Villuendas-Rey, Y., Yáñez-Márquez, C. (2020). Fast COVID-19 and pneumonia classification using chest X-ray images. Mathematics, 8(9): 1423. https://doi.org/10.3390/math8091423
[59] Chandra, T.B., Verma, K., Singh, B.K., Jain, D., Netam, S.S. (2021). Coronavirus disease (COVID-19) detection in chest X-ray images using majority voting based classifier ensemble. Expert Systems with Applications, 165: 113909. https://doi.org/10.1016/j.eswa.2020.113909
[60] Ozturk, T., Talo, M., Yildirim, E.A., Baloglu, U.B., Yildirim, O., Acharya, U.R. (2020). Automated detection of COVID-19 cases using deep neural networks with X-ray images. Computers in Biology and Medicine, 121: 103792. https://doi.org/10.1016/j.compbiomed.2020.103792
[61] Hemdan, E.E.D., Shouman, M.A., Karar, M.E. (2020). Covidx-net: A framework of deep learning classifiers to diagnose COVID-19 in x-ray images. arXiv preprint arXiv:2003.11055. https://doi.org/10.48550/arXiv.2003.11055
[62] Panwar, H., Gupta, P.K., Siddiqui, M.K., Morales-Menendez, R., Singh, V. (2020). Application of deep learning for fast detection of COVID-19 in X-Rays using nCOVnet. Chaos, Solitons & Fractals, 138: 109944. https://doi.org/10.1016/j.chaos.2020.109944
[63] Narin, A., Kaya, C., Pamuk, Z. (2021). Automatic detection of coronavirus disease (COVID-19) using x-ray images and deep convolutional neural networks. Pattern Analysis and Applications, 24: 1207-1220. https://doi.org/10.1007/s10044-021-00984-y
[64] Maghdid, H.S., Asaad, A.T., Ghafoor, K.Z., Sadiq, A.S., Khan, M.K. (2020). Diagnosing COVID-19 pneumonia from x-ray and CT images using deep learning and transfer learning algorithms. arXiv preprint arXiv:2004.00038v1. https://doi.org/10.48550/arXiv.2004.00038
[65] Kermany, D.S., Goldbaum, M., Cai, W., Valentim, C.C.S., Liang, H., Baxter, S.L., McKeown, A., Yang, G., Wu, X., Yan, F., Dong, J., Prasadha, M.K., Pei, J., Ting, M.Y.L., Zhu, J., Li, C., Hewett, S., Dong, J., Ziyar, I., Shi, A., Zhang, R., Zheng, L., Hou, R., Shi, W., Fu, X., Duan, Y., Huu, V.A.N., Wen, C., Zhang, E.D., Zhang, C.L., Li, O., Wang, X., Singer, M.A., Sun, X., Xu, J., Tafreshi, A., Lewis, M.A., Xia, H., Zhang, K. (2018). Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell, 172(5): 1122-1131. https://doi.org/10.1016/j.cell.2018.02.010