Hyperparameter-Tuned Deep Learning Framework for Accurate COVID-19 Detection Using Chest X-Rays and Computed Tomography Scans

Arulmozhi Guru Gokul* Narayanaswamy Kumaratharan Perumal Leela Rani Natarajan Devi

Department of Information Technology, Sri Venkateswara College of Engineering, Sriperumbudur 602117, India

Department of Electronics and Communication Engineering, Sri Venkateswara College of Engineering, Sriperumbudur 602117, India

Corresponding Author Email: gurugokul@svce.ac.in

Page: 991-1009 | DOI: https://doi.org/10.18280/ts.430235

Received: 26 January 2026 | Revised: 5 April 2026 | Accepted: 13 April 2026 | Available online: 30 April 2026

© 2026 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Rapid and accurate identification of COVID-19 remains a major challenge in medical testing, especially in regions where RT-PCR tests are hard to obtain. This study introduces a method for detecting COVID-19 from chest X-rays and computed tomography (CT) scans. It uses a deep learning (DL) framework that extends the ResNet50 model with a CNN-Lattice structure and tunes its hyperparameters. The method begins with a pre-processing step that resizes and normalizes the images and enhances their contrast with Contrast Limited Adaptive Histogram Equalization (CLAHE), improving image quality and highlighting disease markers. Lung regions are isolated using an entropy-based U-Net segmentation model, which removes background artifacts and focuses the analysis on the region of interest. For feature extraction and classification, a CNN-Lattice module, which improves multi-scale feature learning through connections across many layers, is combined with a ResNet50 architecture. Bayesian optimization is used to tune the learning rate, batch size, and optimizer type. Through improved inter-layer connections and multi-scale feature fusion, the proposed CNN-Lattice integrated framework addresses shortcomings of current methods. Compared with a conventional U-Net and traditional methods, the proposed method achieved the best results, with a Dice Similarity Coefficient (DSC) of nearly 0.92, an Intersection over Union (IoU) of almost 0.95, and a classification accuracy of approximately 98.92% for segmentation and classification of COVID-19 infected regions in chest X-ray and CT images.

Keywords: 

Contrast Limited Adaptive Histogram Equalization, CNNs-Lattice, COVID-19 detection, entropy-based U-Net, medical image analysis, ResNet50

1. Introduction

The emergence of Coronavirus Disease 2019 (COVID-19), caused by the novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), has created major problems for healthcare systems, economies, and societies around the world [1]. It first appeared in late 2019 and spread rapidly worldwide, causing millions of cases and many deaths [2]. Rapid and accurate diagnosis is essential to contain transmission and to give patients appropriate treatment [3]. Although the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test is generally regarded as the reference standard for detecting COVID-19, it has several limitations: it can be slow, errors can occur during sample handling, and it may miss the virus when the viral load in the sample is low [4]. For this reason, medical imaging modalities such as chest X-rays (CXR) and computed tomography (CT) scans have become important for detecting and tracking lung infections caused by COVID-19 [5]. These modalities can reveal signs of infection, such as ground-glass opacities, lung infiltrates, and other changes in the lungs, helping radiologists assess how severe the illness is and how it is progressing. However, manually reviewing large volumes of imaging data during a pandemic is time-consuming, error-prone, and subject to inter-observer variability, which underlines the need for fast, reliable, and easy-to-use diagnostic tools [6].

Recent advances in artificial intelligence (AI), especially deep learning (DL), have shown great promise in analyzing medical images. DL methods such as convolutional neural networks (CNNs) are highly effective at finding complex patterns in images. They often outperform older techniques that depend on hand-crafted features for tasks such as disease detection, image segmentation, and image classification [7, 8]. For COVID-19, researchers around the globe have investigated DL systems that identify viral pneumonia characteristics directly from CXR and CT images, with the aim of providing fast, cost-efficient, and reproducible diagnostic assistance [9].

These models can learn to distinguish COVID-19 infections from other pneumonia types and from normal lungs by recognizing subtle radiographic indicators that human observers could overlook [10, 11]. By employing transfer learning with pre-trained models such as VGGNet, ResNet, DenseNet [12], and InceptionNet, researchers have achieved high diagnostic performance even with relatively small COVID-19 imaging datasets, an essential benefit during the pandemic's early days when well-annotated data were limited [13, 14]. Moreover, DL-based diagnostic systems could be deployed in real time in clinical environments, enabling healthcare professionals to prioritize critical cases, use resources more efficiently, and improve overall patient outcomes [15].

Despite these encouraging advances, considerable obstacles and open research questions remain before DL-based COVID-19 detection systems can move from experimental settings into broad clinical practice. Variations in image acquisition protocols across hospitals, differences in scanner types and patient positioning, and the natural diversity of COVID-19 presentations across demographic groups introduce domain shifts that can reduce the generalizability and robustness of trained models. In addition, because the opaque nature of deep neural networks can hinder physicians' trust and acceptance, it is crucial that the models be interpretable. Explainable AI methods such as Class Activation Mapping (CAM) and Grad-CAM can support clinical decision-making by visualizing the image regions that most influence the model's predictions.

Furthermore, effective data augmentation, cross-domain training techniques, federated learning approaches, and multi-modal integration methods, which combine imaging data with clinical and laboratory information, can improve the efficacy, reliability, and practical utility of these diagnostic systems. This research introduces an improved DL framework for accurately detecting COVID-19 from chest X-rays and CT scans, emphasizing the model's accuracy, reliability, and interpretability. The proposed approach combines advanced CNN architectures, comprehensive pre-processing, and visualizations that help explain the results, in order to assist radiologists in providing timely, consistent, and trustworthy diagnoses, ultimately contributing to better management of COVID-19 and better readiness for possible future outbreaks. The core contributions of this paper are as follows:

  • Developed a novel DL architecture that integrates CNN-Lattice with ResNet50 to improve feature extraction and classification of COVID-19 from chest X-ray and CT images.
  • Applied an entropy-guided U-Net model to accurately segment lung regions, enhancing disease localization by removing irrelevant background information.
  • Employed Bayesian optimization to tune key hyperparameters, including learning rate, batch size, and optimizer, leading to improved model performance and quicker convergence.
  • Assessed the model's efficacy using comprehensive evaluation metrics, verifying its dependability as a decision-support tool in real-world clinical scenarios, especially in areas with inadequate access to RT-PCR.

The remainder of the paper is organized as follows: Section 2 reviews related work, Section 3 describes the proposed model, Section 4 presents the experimental results, and Section 5 concludes the paper.

2. Literature Review

Gouda et al. [16] introduced a DL method for classifying CXR images. They used an improved version of the ResNet-50 model, running it multiple times to make the system more accurate. They evaluated the system on two commonly used datasets: the COVID-19 Image Data Collection and CXR images. The system outperformed traditional models such as VGG and DenseNet, achieving an accuracy of 99.63%.

Bhattacharyya et al. [17] proposed a scheme to detect COVID-19 and pneumonia from CXR images. Their method has three main steps. First, they used a conditional generative adversarial network (C-GAN) to segment the lung areas from the X-ray images. Then, they used a pipeline that combined trained neural networks with key point detection to extract important features from the lung images. Finally, they classified the images into classes such as COVID-19, pneumonia, or normal using different machine learning methods. They compared different model combinations and found that the VGG-19 model combined with the BRISK key point algorithm gave the highest accuracy of 96.6%. According to the authors, this method can effectively screen patients for COVID-19.

Malik et al. [18] developed a DL framework called DCDD_Net that can detect nine different chest diseases using CT scans, cough sounds, and chest X-rays. They converted cough sounds into images using scalograms. They balanced the training data using the borderline (BL) SMOTE method before training their model. They evaluated the model on twenty publicly available chest disease datasets. DCDD_Net achieved an Area under Curve (AUC) of 99.43%, an F1-score of 95.61%, a recall of 95.76%, a precision of 96.82%, and an accuracy of 96.67%.

Malik et al. [19] developed a multi-class DL classification system to automatically identify five different chest conditions, including COVID-19, lung cancer (LC), pneumonia, and tuberculosis (TB). Their model, called CDC Net, uses dilated convolution and residual networks to diagnose these conditions from chest X-rays. They evaluated the model on publicly available benchmark data. According to the authors, CDC Net was the first single DL model to identify five chest diseases. They compared it with three pre-trained CNNs (VGG19, ResNet-50, and Inception v3) using different performance metrics. CDC Net outperformed the other models, achieving an accuracy of 99.30%, a recall of 98.10%, a precision of 99.40%, and an AUC of 0.9953.

Kathamuthu et al. [20] proposed a CNN-based transfer learning framework to detect COVID-19 from CT scans. They tested several CNN models, including VGG16, VGG19, DenseNet121, InceptionV3, Xception, and ResNet50. The VGG16 model performed best, with an accuracy of 98.00%. The experiments showed that their method is effective for identifying and monitoring COVID-19 patients, which might aid in creating tools that help healthcare specialists choose the best treatment options.

In contrast to these approaches, the proposed CNN-Lattice integrated ResNet50 framework incorporates inter-layer lattice connections, multi-scale feature fusion, entropy-based U-Net segmentation, and Bayesian hyperparameter optimization. The improved extraction of discriminative COVID-19 features and the improved segmentation lead to a higher Dice Similarity Coefficient (DSC), Intersection over Union (IoU), and overall classification accuracy than existing methods.

2.1 Problem statement

The rapid and reliable identification of COVID-19 remains a major challenge, particularly in areas with restricted access to RT-PCR testing. While CXR and CT scans offer a non-invasive alternative, manual evaluation of these images is time-consuming and prone to diagnostic errors because COVID-19 findings resemble those of other lung conditions, such as pneumonia. Current automated systems often lack reliability, do not adequately target the relevant lung areas, or do not discriminate well among COVID-19, pneumonia, and normal cases. Thus, there is a pressing need for a precise, efficient, and fully automated diagnostic system that can improve image quality, effectively segment lung regions, and use DL for accurate classification. This research seeks to fill this gap by developing a hyperparameter-tuned CNN-Lattice-ResNet50 model combined with entropy-based U-Net segmentation and a contrast-enhancing pre-processing pipeline to enable real-time, explainable, and highly accurate COVID-19 detection.

3. Proposed Method

The proposed framework is an end-to-end pipeline for identifying COVID-19 in CXR and CT images, comprising pre-processing, segmentation, and classification. First, the images are resized, normalized, and enhanced using Contrast Limited Adaptive Histogram Equalization (CLAHE) to improve contrast. An entropy-based U-Net model then segments the lung regions by highlighting areas with high information content while filtering out irrelevant features. The segmented data are analyzed by a CNN-Lattice ResNet50 model, which extracts features across multiple scales and performs classification. Key hyperparameters are optimized with Bayesian optimization to improve the model's performance. The overall framework is shown in Figure 1.

Figure 1. Block diagram illustrating the proposed approach

3.1 Pre-processing

Before further processing, each input image is pre-processed. Pre-processing comprises image resizing, normalization, and CLAHE. Consider the dataset $\mathrm{CVID}=\left\{\mathrm{CVID}_1, \mathrm{CVID}_2, \mathrm{CVID}_3, \ldots, \mathrm{CVID}_n\right\}$, where $n$ is the number of images and each $\mathrm{CVID}_i$ is an image from the database of chest CT and CXR images. The associated feature values come in a variety of formats, including numeric, binary, and nominal, and it is challenging to handle these different formats of CT and CXR image data in the classification procedure. The CT and CXR datasets are therefore pre-processed: images are resized, pixel values are normalized, and contrast is enhanced. The pre-processed output is then passed on to the subsequent stages, which are described in detail below:

3.1.1 Resizing and normalizing images

CXR and CT images were resized to a standard resolution of $256 \times 256$ pixels. This resizing preserves spatial relationships while ensuring compatibility with the ResNet50 framework. In mathematical terms, resizing maps pixel coordinates from the original grid $(i, j)$ to the target grid $\left(i^{\prime}, j^{\prime}\right)$ using bilinear interpolation. The pixel value at the new location $\left(i^{\prime}, j^{\prime}\right)$ is determined using Eq. (1) as a weighted combination of the four nearest neighboring pixels in the original image:

$\mathrm{CVID}\left(i^{\prime}, j^{\prime}\right)=\sum_{m=0}^{1} \sum_{n=0}^{1} w_{m n}\, I\left(x_m, y_n\right)$    (1)

where $x=\frac{i^{\prime}}{H^{\prime}} H$ and $y=\frac{j^{\prime}}{W^{\prime}} W$ are the corresponding source coordinates, $H \times W$ and $H^{\prime} \times W^{\prime}$ are the original and target image sizes, $I$ is the original image, $\left(x_m, y_n\right)$ are the four pixels nearest to $(x, y)$, and $w_{m n}$ are weights determined by distance. After resizing, the pixel values were normalized to a standard range to improve convergence during model training. First, min-max normalization was applied, scaling pixel intensities from the original 8-bit range $[0,255]$ down to $[0,1]$ using Eq. (2):

$\mathrm{CVID}_{\text{norm}}(p)=\frac{\mathrm{CVID}(p)-\mathrm{CVID}_{\min}}{\mathrm{CVID}_{\max}-\mathrm{CVID}_{\min}}$     (2)

where $\mathrm{CVID}(p)$ is the original value of sample/pixel $p$, $\mathrm{CVID}_{\min}$ and $\mathrm{CVID}_{\max}$ are the minimum and maximum CVID values in the dataset, and $\mathrm{CVID}_{\text{norm}}(p)$ is the normalized value after scaling. To match the input statistics expected by the pre-trained ResNet50 model, channel-wise standardization was then carried out as per Eq. (3), based on the statistics of the ImageNet dataset:

$I_{\text{std}}(p)=\frac{\mathrm{CVID}_{\text{norm}}(p)-\mu}{\sigma}$    (3)

Here, μ = [0.485, 0.456, 0.406] and σ = [0.229, 0.224, 0.225] denote the ImageNet mean and standard deviation for the RGB channels, respectively. This pre-processing step places the input data on a consistent, numerically stable scale, which is important for optimizing DL models.
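The resizing and normalization steps of Eqs. (1)-(3) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the coordinate mapping follows the formula in the text literally (no half-pixel offset), the minimum and maximum are taken per image rather than over the dataset, and a grayscale image is replicated to three channels so that the ImageNet statistics can be applied.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D grayscale image with bilinear interpolation (Eq. 1)."""
    in_h, in_w = img.shape
    img = img.astype(np.float64)
    # Map each target index back to source coordinates: x = i' / H' * H.
    x = np.arange(out_h) / out_h * in_h
    y = np.arange(out_w) / out_w * in_w
    x0 = np.clip(np.floor(x).astype(int), 0, in_h - 1)
    y0 = np.clip(np.floor(y).astype(int), 0, in_w - 1)
    x1 = np.clip(x0 + 1, 0, in_h - 1)
    y1 = np.clip(y0 + 1, 0, in_w - 1)
    dx = (x - x0)[:, None]  # vertical distance to the upper neighbour
    dy = (y - y0)[None, :]  # horizontal distance to the left neighbour
    top = img[x0][:, y0] * (1 - dy) + img[x0][:, y1] * dy
    bottom = img[x1][:, y0] * (1 - dy) + img[x1][:, y1] * dy
    return top * (1 - dx) + bottom * dx


def min_max_normalize(img):
    """Scale intensities to [0, 1] (Eq. 2); per-image min/max is an assumption."""
    lo, hi = img.min(), img.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(img, dtype=np.float64)
    return (img - lo) / (hi - lo)


def standardize(img_norm):
    """Replicate to 3 channels and apply ImageNet statistics (Eq. 3)."""
    rgb = np.repeat(img_norm[:, :, None], 3, axis=2)
    return (rgb - IMAGENET_MEAN) / IMAGENET_STD


def preprocess(img, size=256):
    """Resize, normalize, and standardize one grayscale image."""
    return standardize(min_max_normalize(bilinear_resize(img, size, size)))
```

Calling `preprocess(image)` on a 2-D array returns an array of shape `(256, 256, 3)` that can be fed to a network expecting ImageNet-style inputs.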

3.1.2 CLAHE enhancement

To enhance the visual quality and highlight subtle disease-related features in CXR and CT images, CLAHE was applied as part of the pre-processing pipeline. Unlike global histogram equalization, CLAHE operates on small regions (tiles) of the image, improving local contrast while limiting noise amplification. The image is first divided into M × N tiles, and histogram equalization is applied separately to each tile. To avoid excessive enhancement, a contrast-limiting threshold T is used to clip the histogram, and the clipped pixel counts are redistributed uniformly across the bins. The transformation function for a pixel intensity p within a tile is expressed as follows:

$p_{e n h}=C D F(p) *(L-1)$       (4)

where CDF is the cumulative distribution function of the clipped histogram, and L is the number of gray levels. Bilinear interpolation is used to merge the enhanced tiles, giving seamless transitions at tile borders. By improving image clarity, the method helps identify radiological features such as lung consolidations and ground-glass opacities, which are necessary for accurate diagnosis.

To boost local contrast without over-amplifying noise in CXR and CT images, a clip limit of 2.0 and a tile grid size of 8 × 8 were used in the CLAHE pre-processing stage. The entropy-based U-Net model uses an input patch size of 256 × 256 pixels for lung segmentation, balancing spatial detail against computational cost and maintaining compatibility with the ResNet50 backbone.
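To make the clipping and redistribution step concrete, the following sketch applies Eq. (4) to a single tile. It is an illustration under assumptions rather than the authors' code: the function name is hypothetical, the clip limit is interpreted relative to the mean bin count (a common convention), and the bilinear blending between neighbouring tiles is omitted. In practice a library routine such as OpenCV's `cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))` would typically be used; the text does not state which implementation was chosen.

```python
import numpy as np


def equalize_tile(tile, clip_limit=2.0, levels=256):
    """Clipped histogram equalization of one uint8 tile (Eq. 4).

    Only the per-tile mapping is shown; full CLAHE additionally blends
    the mappings of neighbouring tiles with bilinear interpolation.
    """
    hist = np.bincount(tile.ravel(), minlength=levels).astype(np.float64)
    # Assumed convention: clip limit expressed as a multiple of the mean bin count.
    threshold = max(1.0, clip_limit * tile.size / levels)
    # Clip the histogram and spread the excess uniformly over all bins.
    excess = np.sum(np.maximum(hist - threshold, 0.0))
    hist = np.minimum(hist, threshold) + excess / levels
    # p_enh = CDF(p) * (L - 1)
    cdf = np.cumsum(hist) / tile.size
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[tile]


# Usage on one 8 x 8 tile with a smooth intensity ramp.
tile = np.arange(64, dtype=np.uint8).reshape(8, 8)
enhanced = equalize_tile(tile)
```

Because the excess counts are redistributed rather than discarded, the clipped histogram still sums to the number of pixels in the tile, so the CDF ends at 1 and the brightest input maps to L − 1.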

3.2 Segmentation: Entropy-based U-Net

To enhance the effectiveness of disease localization and segmentation, the entropy-driven U-Net segmentation framework is developed to accurately distinguish lung regions in CXR and CT images. By integrating entropy metrics into the U-Net architecture, the model emphasizes regions with high uncertainty, which often indicate potential clinical issues.

This methodology enables the segmentation network to dynamically concentrate on areas with differing intensities and textures, resulting in better boundary definition between healthy and affected lung tissues. The segmented lung areas provide refined inputs for subsequent classification models, ultimately improving diagnostic performance for various conditions. Figure 2 illustrates the segmentation process.

Figure 2. Segmentation using Entropy-based U-Net

3.2.1 Patch-wise entropy map computation

Patch-wise entropy map computation is a key pre-processing step in the entropy-based U-Net. It identifies informative regions, typically areas with greater uncertainty or variation, such as infected lung tissue. The steps are as follows.

The pre-processed chest image $\mathrm{CVID}(x, y) \in \mathbb{R}^{H \times W}$ is divided into small overlapping or non-overlapping patches of size $k \times k$, as given in Eq. (5):

$p_{i, j}=\{\mathrm{CVID}(x, y) \mid x \in[i, i+k),\ y \in[j, j+k)\}$   (5)

For each patch $p_{i, j}$, the histogram of pixel intensity values is computed and normalized using Eq. (6) to obtain the probability $p_b$ of each intensity bin $b$ (the bin index $b$ is distinct from the patch indices $i, j$):

$p_b=\frac{n_b}{\sum_{b^{\prime}=1}^{N} n_{b^{\prime}}}$    (6)

where $n_b$ is the number of pixels in the patch that fall into intensity bin $b$, and $N$ is the number of bins. The Shannon entropy of each patch is then computed as in Eq. (7):

$H(i, j)=-\sum_{b=1}^{N} p_b \log _2\left(p_b+\epsilon\right)$   (7)

In this context, ϵ is a small constant introduced to prevent log(0). A higher entropy value suggests greater texture, which is frequently associated with lung infections or abnormalities. The entropy value H(i,j) is assigned to the central pixel of each patch and is replicated throughout the image using Eq. (8) to create the entropy map EM(x,y):

$E M(x, y)=H(i, j)$ for $(x, y) \in$ center of $p_{i, j}$    (8)

Normalize the values of EM(x,y) to the range [0, 1] for consistency using Eq. (9):

$E M^{\prime}(x, y)=\frac{E M(x, y)-\min (E M)}{\max (E M)-\min (E M)}$    (9)

The normalized entropy map is subsequently used to weight the input image within the U-Net framework, steering the model's attention towards high-entropy regions of the lung, which are more likely to be abnormal.
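The entropy map computation of Eqs. (5)-(9) can be sketched as follows. This is a minimal illustration, assuming non-overlapping patches, 256 intensity bins for 8-bit images, and the standard Shannon entropy with its leading minus sign; the function names are hypothetical, and each patch is simply filled with its entropy value rather than interpolated from patch centers. The default patch size of 16 follows Table 1.

```python
import numpy as np


def patch_entropy(patch, bins=256, eps=1e-12):
    """Shannon entropy (in bits) of the intensity histogram of one uint8 patch."""
    hist = np.bincount(patch.ravel(), minlength=bins).astype(np.float64)
    p = hist / hist.sum()  # normalized histogram, Eq. (6)
    return float(-np.sum(p * np.log2(p + eps)))  # Eq. (7)


def entropy_map(img, k=16):
    """Normalized patch-wise entropy map EM'(x, y) in [0, 1], Eqs. (8)-(9)."""
    h, w = img.shape
    em = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h, k):
        for j in range(0, w, k):
            # Assumption: every pixel of a patch receives the patch entropy.
            em[i:i + k, j:j + k] = patch_entropy(img[i:i + k, j:j + k])
    lo, hi = em.min(), em.max()
    if hi == lo:  # flat map: nothing to emphasize
        return np.zeros_like(em)
    return (em - lo) / (hi - lo)


# Usage: one textured patch in an otherwise uniform image.
img = np.zeros((32, 32), dtype=np.uint8)
img[0:16, 0:8] = 255  # half of the top-left patch is bright
em = entropy_map(img, k=16)
```

In this toy image the top-left patch contains two equally likely intensities and therefore has an entropy of 1 bit, while the uniform patches have an entropy of approximately 0, so the normalized map is 1 in the top-left patch and 0 elsewhere.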

3.2.2 Weighted input to U-Net encoder

The entropy map is used in the weighted-input stage of the entropy-based U-Net, where it improves feature learning by directing the model's focus towards regions of uncertainty or anomaly, typically lung areas affected by disease. The input image is scaled by the entropy map to emphasize significant (high-entropy) areas. The weighted input $\mathrm{CVID}^{\prime}(x, y)$ is computed as per Eq. (10):

$\mathrm{CVID}^{\prime}(x, y)=\mathrm{CVID}(x, y)\left(1+\lambda\, E M^{\prime}(x, y)\right)$   (10)

where $\lambda \in \mathbb{R}^{+}$ is a hyperparameter controlling the influence of entropy. High values of $E M^{\prime}(x, y)$ increase the pixel intensity in uncertain (informative) regions, and low values of $E M^{\prime}(x, y)$ leave normal regions nearly unchanged. The weighted image $\mathrm{CVID}^{\prime}(x, y)$ is passed into the encoder path of the U-Net, and each encoder block applies Eq. (11):

$f^{(l)}=\sigma\left(W^{(l)} * f^{(l-1)}+b^{(l)}\right)$   (11)

where $f^{(l)}$ is the feature map at layer $l$, $\sigma$ is the activation function, and $W^{(l)}$ and $b^{(l)}$ are learnable weights and biases. This process ensures that the feature maps are rich in disease-relevant features, especially in boundary regions that are often difficult to segment accurately.
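The entropy weighting of Eq. (10) is a single element-wise operation. The sketch below assumes that the "Entropy Influence Parameter" of 0.7 in Table 1 is the value of λ; the function name is hypothetical.

```python
import numpy as np


def entropy_weighted_input(img, em_norm, lam=0.7):
    """Eq. (10): scale each pixel by 1 + lambda * EM'(x, y).

    `lam` = 0.7 is assumed to correspond to the entropy influence
    parameter listed in Table 1.
    """
    return img * (1.0 + lam * em_norm)


# Usage: pixels with EM' = 0 are unchanged, pixels with EM' = 1 are scaled by 1.7.
img = np.ones((2, 2))
em_norm = np.array([[0.0, 1.0], [0.5, 0.0]])
weighted = entropy_weighted_input(img, em_norm)
```

Since the factor is at least 1, low-entropy regions keep their original intensity and only high-entropy regions are amplified, by at most a factor of 1 + λ.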

3.2.3 U-Net segmentation

The U-Net segmentation module forms the backbone of the Entropy-based segmentation pipeline. It is a symmetric encoder-decoder architecture specifically designed for pixel-wise segmentation tasks, such as extracting lung regions from CXR or CT images.

The entropy-weighted input image $\mathrm{CVID}^{\prime}(x, y) \in \mathbb{R}^{H \times W}$, which carries enhanced focus on informative lung regions, is passed into the U-Net model. The encoder reduces spatial resolution while preserving hierarchical features. Each encoder block consists of two convolutional layers, as represented in Eq. (12), followed by max pooling:

$f^{(l)}=\operatorname{ReLU}\left(W^{(l)} * f^{(l-1)}+b^{(l)}\right)$    (12)

where $f^{(l)}$ denotes the output feature map of the $l$-th convolutional layer, and $f^{(l-1)}$ is the input feature map obtained from the previous layer. The kernel $W^{(l)}$ extracts important spatial features such as edges, textures, and infection-related patterns from the input image, and $b^{(l)}$ is the bias. Max pooling for down-sampling is given in Eq. (13).

$f_{\text {pool }}^{(l)}=\operatorname{MaxPool}\left(f^{(l-1)}\right)$    (13)

In Eq. (13), $f_{\text{pool}}^{(l)}$ represents the pooled feature map after max pooling at the $l$-th layer, and $f^{(l-1)}$ is the feature map of the previous convolutional layer. MaxPool() is a spatial downsampling operation that retains the maximum value in a user-defined pooling window (usually 2 × 2). This operation lowers the spatial dimensions of the feature map while keeping the most important and discriminative features. Max pooling also reduces computational cost, helps limit overfitting, and provides a degree of translation invariance, which helps the network capture predominant lung abnormalities and infection-related patterns in the CXR and CT images. To create a segmentation mask, the decoder upsamples the feature maps back to the original resolution. Each decoder block first performs a transposed convolution (upsampling) using Eq. (14):

$f_{\text{up}}^{(l)}=\operatorname{UpConv}\left(f^{(l-1)}\right)$    (14)

In Eq. (14), $f_{\text{up}}^{(l)}$ is the upsampled feature map at the $l$-th decoder layer, and $f^{(l-1)}$ is the input feature map from the previous decoder stage. UpConv() denotes transposed convolution (also called up-convolution or deconvolution), which is used in the decoding stage of the U-Net architecture to upsample the feature maps. This operation helps restore fine spatial detail and the original image size after the downsampling performed in the encoder path. By gradually expanding the feature maps, the decoder recovers lung boundaries and infection regions, allowing accurate segmentation of COVID-19 affected tissue. The upsampled map is then concatenated with the corresponding encoder features via a skip connection using Eq. (15).

$f_{\text {Concat }}^{(l)}=\operatorname{Cat}\left(f_{u p}^{(l)}, f_{\text {encode }}^{(l)}\right)$   (15)

In Eq. (15), $f_{\text{Concat}}^{(l)}$ represents the concatenated feature map at the $l$-th decoder layer, $f_{\text{up}}^{(l)}$ denotes the upsampled feature map, and $f_{\text{encode}}^{(l)}$ is the corresponding feature map from the encoder path. Cat() is the feature concatenation operation performed by the skip connections in the U-Net architecture. At the final layer of the decoder, a 1 × 1 convolution reduces the feature maps to one channel, and a sigmoid activation generates the segmentation probability mask as in Eq. (16):

$S(x, y)=\frac{1}{1+e^{-z(x, y)}}$   (16)

where $z(x, y)$ is the output of the final 1 × 1 convolution at pixel $(x, y)$. The output is a mask of the same size as the input, where a pixel value ≈ 1 indicates lung region and a pixel value ≈ 0 indicates background. This mask is then used for further classification or visualization.
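The data flow of one encoder-decoder level, Eqs. (13)-(16), can be traced with a small NumPy sketch. It only illustrates tensor shapes and the skip connection under simplifying assumptions: the weights are random rather than trained, the learned transposed convolution of Eq. (14) is replaced by nearest-neighbour upsampling, and all names and sizes (8 channels, 64 × 64 maps) are hypothetical.

```python
import numpy as np


def max_pool2x2(f):
    """2 x 2 max pooling on a (C, H, W) feature map (Eq. 13)."""
    c, h, w = f.shape
    return f.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def upsample2x(f):
    """Nearest-neighbour 2x upsampling; a stand-in for the learned UpConv of Eq. (14)."""
    return f.repeat(2, axis=1).repeat(2, axis=2)


def conv1x1(f, weight, bias):
    """1 x 1 convolution: mixes channels independently at every pixel."""
    return np.tensordot(weight, f, axes=([1], [0])) + bias[:, None, None]


def sigmoid(z):
    """Eq. (16): map logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)
enc = rng.standard_normal((8, 64, 64))       # encoder feature map
bottleneck = max_pool2x2(enc)                # (8, 32, 32)
up = upsample2x(bottleneck)                  # (8, 64, 64)
merged = np.concatenate([up, enc], axis=0)   # skip connection, Eq. (15): (16, 64, 64)
weight = rng.standard_normal((1, 16)) * 0.1  # untrained 1 x 1 kernel
mask = sigmoid(conv1x1(merged, weight, np.zeros(1)))  # (1, 64, 64)
```

The concatenation doubles the channel count, which is why the final 1 × 1 convolution here takes 16 input channels. Thresholding `mask` (for example at 0.5, a common but here assumed choice) would give the binary lung mask.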

Table 1. Model parameters and hyperparameter settings

| Parameter Category | Parameter | Value / Range |
|---|---|---|
| Entropy Map Computation | Patch Size | 16 × 16 |
| | Entropy Normalization Range | [0, 1] |
| | Entropy Influence Parameter | 0.7 |
| U-Net Segmentation | Input Patch Size | 256 × 256 |
| | Encoder Convolution Kernel | 3 × 3 |
| | Pooling Size | 2 × 2 |
| | Decoder Upsampling Kernel | 2 × 2 |
| Bayesian Hyperparameter Optimization | Learning Rate | $1 \times 10^{-15}$ to $1 \times 10^{-3}$ |
| | Batch Size | 8–32 |
| | Optimizer Types | Adam |
| | Number of Epochs | 50–100 |
| | Dropout Rate | 0.2–0.5 |
| Data Augmentation | Rotation Range | ±15° |
| | Horizontal Flipping | Applied |
| | Scaling Factor | 0.9–1.1 |
| | Contrast Enhancement | Applied |

Bayesian optimization was run with 5 randomly initialized starting points and 100 iterations to ensure good exploration of the hyperparameter search space. A Gaussian Process (GP) surrogate model and the Expected Improvement (EI) acquisition function were used to balance exploration and exploitation during optimization. The optimization searched for the best setting of the hyperparameters listed in Table 1.
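For reference, the standard closed form of the Expected Improvement acquisition function for a maximization problem is sketched below. It is written for a single candidate whose GP posterior mean and standard deviation are already known; the function name and the exploration margin `xi` are assumptions, since the text does not specify these details.

```python
import math


def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected Improvement for maximization at one candidate point.

    mu, sigma: GP posterior mean and standard deviation at the candidate.
    best:      best objective value observed so far.
    xi:        small exploration margin (an assumed default).
    """
    improvement = mu - best - xi
    if sigma <= 0.0:
        # No predictive uncertainty: EI reduces to the plain improvement.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + sigma * pdf


# Usage: two candidates with the same predicted accuracy but different uncertainty.
ei_certain = expected_improvement(mu=0.95, sigma=0.01, best=0.95)
ei_uncertain = expected_improvement(mu=0.95, sigma=0.05, best=0.95)
```

The first term rewards candidates whose predicted value exceeds the current best (exploitation) and the second rewards candidates with high predictive uncertainty (exploration), so `ei_uncertain` is larger than `ei_certain` here.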

3.3 Model architecture

The ResNet50 architecture is integrated with CNN-Lattice, combining residual deep feature extraction with lattice-based transformations for improved classification, as shown in Figure 3.

Figure 3. Proposed ResNet50 architecture integrated with CNNs-Lattice

3.3.1 ResNet50 encoder

To improve diagnostic accuracy in medical imaging, segmented input images typically generated using entropy-based U-Net are used as input to the ResNet50 network. The segmented image, denoted as $S(x, y)$, contains only the lung regions with the background removed, focusing the model on diagnostically relevant features. The ResNet50 model, made up of convolutional layers and residual blocks, takes the image as its input. To obtain low-level features, the initial convolutional layer utilizes a 7 × 7 kernel with a stride of 2 as per Eq. (17):

$F_1=\operatorname{ReLU}\left(\operatorname{BN}\left(W_1 * S+b_1\right)\right)$    (17)

This is followed by a max-pooling operation as given in Eq. (18).

$F_2=\operatorname{MaxPool}\left(F_1\right)$    (18)

The resulting features are then passed through a series of residual blocks from Conv2_x to Conv5_x. Each residual block computes as per Eq. (19):

$F^{(l)}=\sigma\left(W_2^{(l)} * \sigma\left(W_1^{(l)} * F^{(l-1)}+b_1^{(l)}\right)+b_2^{(l)}\right)+F^{(l-1)}$     (19)

where $\sigma$ is the ReLU activation, $W_1^{(l)}$ and $W_2^{(l)}$ are the weight tensors of the convolution layers, $b_1^{(l)}$ and $b_2^{(l)}$ are the biases, and $F^{(l-1)}$ is the input to the block, which is also used in the skip connection. These residual connections help maintain gradient flow and preserve spatial detail across many layers. After passing through the deepest block (Conv5_x), the network outputs a high-dimensional feature map as in Eq. (20):

$F_{\text{ResNet}} \in \mathbb{R}^{h \times w \times D}$    (20)

where $h \times w$ is the spatial size of the map and $D$ is its number of channels. This feature map encodes rich hierarchical information about the segmented lung region, including shapes, textures, and structural abnormalities, as illustrated in Figure 4. It serves as the deep feature representation and can be further processed using global average pooling as per Eq. (21):

$f=\operatorname{GAP}\left(F_{\text{ResNet}}\right) \in \mathbb{R}^{D}$     (21)

where GAP() is global average pooling, which averages each feature channel across the spatial dimensions, and $f$ is the resulting feature vector. The CNN-Lattice module is then applied to $f$ for non-linear transformation and classification. With segmented input, ResNet50 concentrates on the pertinent lung anatomy, which improves its generalization and reliability in identifying conditions such as COVID-19 or pneumonia.
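The residual computation of Eq. (19) and the pooling of Eq. (21) can be illustrated with a small NumPy sketch. To stay short it uses 1 × 1 convolutions in place of the spatial kernels of the real ResNet50 blocks and random, untrained weights; the function names and tensor sizes are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def conv1x1(f, weight, bias):
    """1 x 1 convolution on a (C, H, W) map; a simplification of the real kernels."""
    return np.tensordot(weight, f, axes=([1], [0])) + bias[:, None, None]


def residual_block(f, w1, b1, w2, b2):
    """Eq. (19): two convolution + ReLU stages plus the identity skip connection."""
    inner = relu(conv1x1(f, w1, b1))
    return relu(conv1x1(inner, w2, b2)) + f


def global_average_pool(f):
    """Eq. (21): average every channel over its spatial dimensions."""
    return f.mean(axis=(1, 2))


rng = np.random.default_rng(0)
channels = 4
features = rng.standard_normal((channels, 8, 8))
w1 = rng.standard_normal((channels, channels)) * 0.1
w2 = rng.standard_normal((channels, channels)) * 0.1
b1 = np.zeros(channels)
b2 = np.zeros(channels)

out = residual_block(features, w1, b1, w2, b2)  # same shape as the input
vector = global_average_pool(out)               # one value per channel
```

If the learned branch contributes nothing (all weights and biases zero), the block returns its input unchanged, which is the property that helps gradients flow through many stacked blocks.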

(a) Chest X-rays (CXR) images

(b) Computed tomography (CT) images

Figure 4. Feature map

3.3.2 Lattice layer transformation

The CNN-Lattice layer applies a trainable piecewise-linear function to the input features. Each output dimension is a function of multiple input dimensions through a lattice function, as given in Eq. (22):

$y_k=L_k\left(f ; \theta_k\right) \in \mathbb{R}$      (22)

where $L_k$ is the lattice function for output node $k$ and $\theta_k$ are the learnable lattice parameters. The transformation is nonlinear, monotonic, and interpretable. The CNN-Lattice layer models structured decision boundaries by learning interpolated outputs over a multi-dimensional grid (the lattice), which is effective for high-variance medical features.
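One common way to realize a lattice function such as $L_k$ in Eq. (22) is multilinear interpolation over values stored at the vertices of a grid. The sketch below assumes the simplest case, a lattice with two vertices per input dimension and inputs scaled to [0, 1]; the lattice size used in the proposed model is not stated in the text, and the function name is hypothetical.

```python
import itertools

import numpy as np


def lattice_output(x, theta):
    """Multilinear interpolation of vertex values over the unit hypercube.

    x:     d input features, each assumed to be scaled to [0, 1].
    theta: learnable vertex values, an array of shape (2,) * d.
    """
    x = np.clip(np.asarray(x, dtype=np.float64), 0.0, 1.0)
    y = 0.0
    for vertex in itertools.product((0, 1), repeat=x.size):
        # Weight of a vertex: product over dimensions of x_i (if the vertex
        # sits at 1) or 1 - x_i (if it sits at 0).
        weight = np.prod([xi if v else 1.0 - xi for xi, v in zip(x, vertex)])
        y += weight * theta[vertex]
    return float(y)


# Usage: a 2-D lattice whose vertex values increase along both inputs.
theta = np.array([[0.0, 1.0],
                  [2.0, 3.0]])
center = lattice_output([0.5, 0.5], theta)  # average of the four vertex values
```

Because the output is a convex combination of the vertex values, it is easy to inspect, and it is monotonic in an input whenever the vertex values are ordered along that dimension, as they are in this example.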

3.3.3 Final prediction layer

To reduce the dimensionality of the feature maps, GAP is applied, summarizing each feature channel $i$ into a single value $f_i$ using Eq. (23):

$f_i=\frac{1}{H \cdot W} \sum_{x=1}^H \sum_{y=1}^W y_k(x, y, i)$    (23)

The final prediction is obtained by applying a softmax function, as in Eq. (24).

$\hat{y}_i=\frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}} \quad$ for $i=1,2,3, \ldots, C$     (24)

where $z_i$ is the score (logit) for class $i$ and $C$ is the number of classes. The final output is a vector of class probabilities for the 3-class problem (COVID-19, pneumonia, normal).
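A numerically stable version of the softmax in Eq. (24) is sketched below for the three classes. The logit values and the class ordering are made up for illustration.

```python
import numpy as np

CLASSES = ("COVID-19", "Pneumonia", "Normal")  # assumed class order


def softmax(z):
    """Eq. (24), with the maximum subtracted first to avoid overflow."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()


# Usage with illustrative logits.
probs = softmax([2.0, 1.0, 0.1])
predicted = CLASSES[int(np.argmax(probs))]
```

Subtracting the maximum logit does not change the result, because the common factor cancels between numerator and denominator, but it keeps the exponentials bounded.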

3.4 Hyperparameter tuning

To improve the classification effectiveness of the ResNet50 model integrated with the CNNs-Lattice approach, Bayesian Optimization was employed for tuning hyperparameters. This technique effectively navigates the hyperparameter landscape by building a surrogate probabilistic model, often a Gaussian Process (GP), to approximate the target function, such as validation accuracy. The main hyperparameters that were optimized include the learning rate (η), batch size (B), and type of optimizer (O). The search space AD is defined as in Eq. (25):

$A D=\left\{\begin{array}{l}\eta \sim \operatorname{LogUniform}\left(1 \times 10^{-5}, 1 \times 10^{-2}\right) \\ B \in\{16,32,64,128\} \\ O \in\{\text{Adam}\}\end{array}\right.$     (25)
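One way to read the search space in Eq. (25) is sketched below: the learning rate is drawn uniformly on a logarithmic scale between 1e-5 and 1e-2, and the batch size is picked from the listed values. The function name `sample_config`, the dictionary keys, and the fixed random seed are assumptions made for illustration; a Bayesian optimizer would propose points from this space rather than sample them blindly.

```python
import math
import random

def sample_config(rng):
    """Draw one hyperparameter configuration from the assumed search space."""
    # Uniform in the exponent, i.e. log-uniform in the learning rate itself.
    log_eta = rng.uniform(math.log10(1e-5), math.log10(1e-2))
    return {
        "learning_rate": 10 ** log_eta,
        "batch_size": rng.choice([16, 32, 64, 128]),
        "optimizer": "Adam",
    }

rng = random.Random(0)                      # fixed seed for repeatability
configs = [sample_config(rng) for _ in range(5)]
```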

Bayesian Optimization iteratively chooses the next set of hyperparameters $\theta=(\eta, B, O) \in A D$ by maximizing an acquisition function $a(\theta)$, which weighs both the anticipated performance and the uncertainty derived from the surrogate model, as in Eq. (26):

$\theta^*=\underset{\theta \in A D}{\operatorname{argmax}} \; a(\theta)$    (26)
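The paper does not name the acquisition function it uses. As one common choice, the sketch below implements Expected Improvement for a maximization problem, given the surrogate's predicted mean and standard deviation at a candidate. The candidate names, their predicted values, and the `xi` exploration parameter are hypothetical.

```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected Improvement of a candidate over the best observed value.

    mu, sigma : surrogate mean and standard deviation at the candidate
    best      : best validation accuracy observed so far
    xi        : optional exploration margin (assumed, defaults to 0)
    """
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best - xi) * cdf + sigma * pdf

# Two hypothetical candidates: (predicted accuracy, predictive uncertainty).
candidates = {"A": (0.95, 0.01), "B": (0.93, 0.05)}
best_so_far = 0.94
scores = {name: expected_improvement(mu, sd, best_so_far)
          for name, (mu, sd) in candidates.items()}
next_choice = max(scores, key=scores.get)
```

In this toy case candidate B is selected even though its predicted accuracy is lower, because its larger uncertainty leaves more room for improvement; this is the balance between anticipated performance and uncertainty described above.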

The performance objective is to maximize validation accuracy, or equivalently, minimize validation loss, denoted as in Eq. (27):

$\theta^*=\underset{\theta \in A D}{\operatorname{argmax}} A C C_{\text {val }}(\theta)$    (27)

where θ is a candidate hyperparameter configuration, AD is the search space of allowable configurations, $A C C_{v a l}(\theta)$ is the validation accuracy achieved with configuration θ, argmax returns the value of θ that produces the largest accuracy, and $\theta^*$ is the best configuration found. By continuously updating the surrogate model with the observed outcomes, Bayesian Optimization converges efficiently to an optimal or near-optimal hyperparameter configuration, achieving better generalization performance with significantly fewer model evaluations than exhaustive search techniques.

Although the proposed CNN-Lattice integrated ResNet50 framework shows good classification and segmentation accuracy, its robustness under small-sample conditions and on noisy medical images has not been examined extensively. In practical clinical settings, the quality of CT and CXR images varies with acquisition settings, imaging devices, motion artifacts, and noise levels, all of which can affect model performance and generalization. Pre-processing methods such as CLAHE enhancement and entropy-based U-Net segmentation were employed to improve image quality and reduce noise and irrelevant artifacts, but further testing on more diverse and challenging clinical datasets is still required to validate the robustness and reliability of the proposed method in practice.

4. Results and Discussion

The proposed CNN-Lattice chest image classification framework was implemented in Python. The COVIDx CT and COVIDx CXR-4 datasets are used to classify images into Normal, Pneumonia, and COVID-19. The proposed model is compared with the existing DenseNet121 [21], EfficientNetB0 [22], and VGG16 [23] classifiers on the basis of accuracy, sensitivity, specificity, and AUC.

4.1 Data collection

4.1.1 COVIDx CXR-4 dataset

The COVIDx CXR-4 dataset, a collection of CXR images, is used to train and assess DL models for classifying COVID-19, pneumonia, and healthy individuals [24]. As shown in Table 2, it comprises 1626 images of COVID-19 cases, 1800 images of pneumonia, and 1802 images of normal individuals. The training subsets consist of 1300, 1440, and 1441 images, respectively, while the testing sets include 326, 360, and 361 images. Including pneumonia and normal cases alongside COVID-19 lets the model learn to distinguish between similar respiratory conditions, which improves its reliability and diagnostic value in clinical environments. Figure 5 shows sample outputs of the pre-processing and segmentation stages.

(a) Input images

(b) Resized images

(c) Normalized images

(d) Enhanced images

(e) Segmented images

Figure 5. COVIDx CXR-4 dataset sample output

Table 2. COVIDx CXR-4 dataset details

| Category | Training | Testing | Total |
|---|---|---|---|
| COVID | 1300 | 326 | 1626 |
| Pneumonia | 1440 | 360 | 1800 |
| Normal | 1441 | 361 | 1802 |

4.1.2 COVIDx CT dataset

The COVIDx CT dataset consists of three classes (COVID, Pneumonia, and Normal), each split into training and testing subsets in an 80:20 ratio [24]. As Table 3 shows, the COVID class contains 577 training images and 145 testing images (722 in total), the Pneumonia class has 262 training images and 66 testing images (328 in total), and the Normal class includes 118 training images and 30 testing images (148 in total). This split uses most of the data for training while preserving enough unseen samples for performance evaluation. COVID cases are more heavily represented, which favors precise COVID-19 detection while still allowing differentiation from pneumonia and normal cases during classification. Figure 6 displays sample outputs for the COVIDx CT dataset.

Table 3. COVIDx CT dataset details

| Category | Training | Testing | Total |
|---|---|---|---|
| COVID | 577 | 145 | 722 |
| Pneumonia | 262 | 66 | 328 |
| Normal | 118 | 30 | 148 |

The COVIDx CT dataset is split into training and test sets in an 80:20 ratio, but it is not balanced: the COVID-19 class has more samples than the Pneumonia and Normal classes. To mitigate this, data augmentation (scaling, contrast enhancement, horizontal flipping, and rotation) was applied during training to increase sample diversity and reduce model bias. Additionally, balanced batch sampling was used so that every class contributes more equally throughout the learning process. These strategies, applied without a weighted loss function, improve the robustness, generalization ability, and classification stability of the proposed CNN-Lattice integrated ResNet50 framework.
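Balanced batch sampling can be sketched as drawing the same number of samples from every class for each batch, so that minority classes are revisited more often. The sketch below uses the COVIDx CT training counts from Table 3; the function name `balanced_batch`, the choice of 16 samples per class, and sampling with replacement are assumptions, since the paper does not give these details.

```python
import random

def balanced_batch(indices_by_class, per_class, rng):
    """Build one batch with the same number of samples from each class.

    Samples are drawn with replacement, so small classes are oversampled.
    Returns a shuffled list of (class_name, sample_index) pairs.
    """
    batch = []
    for name in sorted(indices_by_class):
        pool = indices_by_class[name]
        batch.extend((name, rng.choice(pool)) for _ in range(per_class))
    rng.shuffle(batch)
    return batch

# Training-set sizes taken from Table 3 (COVIDx CT).
train_counts = {"COVID": 577, "Pneumonia": 262, "Normal": 118}
indices_by_class = {name: list(range(n)) for name, n in train_counts.items()}
batch = balanced_batch(indices_by_class, per_class=16, rng=random.Random(0))
```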

(a) Input images

(b) Resized images

(c) Normalized images

(d) Enhanced images

(e) Segmented images

Figure 6. COVIDx computed tomography (CT) dataset sample output

4.2 Performance metrics

Five metrics were used to evaluate performance, each computed per class $i$ from the true positives ($TP_i$), true negatives ($TN_i$), false positives ($FP_i$), and false negatives ($FN_i$). Table 4 summarizes these metrics and their formulas.

Table 4. Performance metrics

| Performance Metrics | Formula |
|---|---|
| Accuracy | $\frac{TP_i+TN_i}{TP_i+TN_i+FP_i+FN_i}$ |
| Precision | $P_i=\frac{TP_i}{TP_i+FP_i}$ |
| Recall | $R_i=\frac{TP_i}{TP_i+FN_i}$ |
| F1-score | $F_i=\frac{2 P_i R_i}{P_i+R_i}$ |
| Specificity | $\frac{TN_i}{TN_i+FP_i}$ |
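The following sketch shows how these per-class metrics can be computed from a multi-class confusion matrix using the standard one-vs-rest definitions. The 3×3 matrix is a made-up example, not a result from this study, and the function assumes no denominator is zero.

```python
def per_class_metrics(cm, i):
    """One-vs-rest metrics for class i.

    cm[r][c] counts samples whose true class is r and predicted class is c.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[i][i]
    fn = sum(cm[i]) - tp                      # class i predicted as something else
    fp = sum(row[i] for row in cm) - tp       # other classes predicted as class i
    tn = total - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }

# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [[90, 6, 4],
      [5, 80, 5],
      [2, 3, 95]]
metrics_class0 = per_class_metrics(cm, 0)
```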

4.3 Segmentation: Performance analysis

Figure 7(a) illustrates a comparison of segmentation accuracy for CT images using three distinct methods. The suggested method achieves the highest accuracy of around 93%, surpassing U-Net at approximately 84% and the traditional method at roughly 77%. This demonstrates that the proposed approach is more efficient in accurately segmenting CT images, making it a superior choice for medical applications such as diagnosis and treatment planning. Figure 7(b) illustrates a comparison of segmentation accuracy for CXR images among three algorithms: Proposed, U-Net, and Traditional Method (TM). The Proposed method accomplishes the highest accuracy of approximately 96%, underscoring its strong capability in segmenting complex structures within X-ray images. The U-Net model, recognized as a standard DL technique, reaches approximately 87%, while the TM trails at 76%. This emphasizes the effectiveness of the proposed method in accurately extracting anatomical information from X-ray images, which is vital for dependable diagnosis and clinical decision-making.

(a) Computed tomography (CT) dataset segmentation accuracy

(b) Chest X-rays (CXR) dataset segmentation accuracy

Figure 7. Segmentation accuracy comparison

Figure 8(a) presents a comparison of the Dice Similarity Coefficient (DSC) for CT image segmentation. The Proposed method achieves the highest DSC value of roughly 0.93, indicating an excellent similarity between the predicted and ground truth regions. U-Net receives a DSC of about 0.89, showing commendable performance but falling slightly short of the proposed model. With a DSC of approximately 0.84, the TM exhibits the poorest performance, suggesting a reduced level of segmentation precision. These results emphasize the enhancement in accuracy and reliability of CT image analysis through the proposed method.

(a) Computed tomography (CT) dataset DSC

(b) Chest X-rays (CXR) dataset DSC

Figure 8. Dice Similarity Coefficient (DSC) comparison

The DSC values for CXR image segmentation are compared in Figure 8(b). The proposed method matches the segmented regions to the ground truth most closely, with the highest DSC, nearing 0.92. U-Net also demonstrates strong performance with a DSC of about 0.88, while the TM ranks third with a score of approximately 0.85. These outcomes emphasize the efficacy of the proposed methodology in providing accurate and reliable segmentation results for CXR images, which is important for precise medical interpretation.

Figure 9(a) shows the Intersection over Union (IoU) evaluation for CT image segmentation. The proposed approach attains the highest IoU value of approximately 0.94, indicating a large overlap between the predicted and actual segmentation areas. U-Net follows with an IoU of about 0.83, indicating good performance, albeit lower than the proposed method. With an IoU close to 0.70, the TM gives the lowest result, indicating reduced segmentation precision. These results illustrate the ability of the proposed method to provide accurate and reliable CT image segmentation.

(a) Computed tomography (CT) dataset IoU

(b) Chest X-rays (CXR) dataset IoU

Figure 9. IoU comparison

In Figure 9(b), the IoU comparison for CXR images is shown. The proposed method reaches the highest IoU, around 0.95, indicating outstanding overlap between the predicted and actual segmented regions. U-Net also performs admirably with an IoU close to 0.88, while the Traditional Method exhibits the weakest performance, scoring an IoU of approximately 0.70. These findings emphasize the heightened segmentation accuracy of the proposed technique, making it more trustworthy for precise CXR image analysis in clinical contexts.
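For reference, DSC and IoU for a single pair of binary masks can be computed as in the sketch below. The two 3×4 masks are toy examples chosen by hand, and the helper name `dice_and_iou` is an assumption; the function assumes at least one foreground pixel is present.

```python
def dice_and_iou(pred_mask, true_mask):
    """Dice Similarity Coefficient and IoU for two binary masks.

    Both masks are nested lists of 0/1 values with the same shape.
    """
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]
    intersection = sum(p and t for p, t in zip(pred, true))
    pred_area = sum(pred)
    true_area = sum(true)
    union = pred_area + true_area - intersection
    dice = 2 * intersection / (pred_area + true_area)
    iou = intersection / union
    return dice, iou

true_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0]]
pred_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 1, 0, 1]]   # one pixel missed, one extra pixel
dice, iou = dice_and_iou(pred_mask, true_mask)
```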

4.4 Comparative analysis

Figure 10(a) illustrates a comparison of accuracy for classifying CT images. The Proposed method achieves the highest accuracy at 98.83%, showcasing its exceptional capacity to detect and classify features within CT scans accurately. VGG16 [23] comes next with an accuracy of 95.82%, reflecting robust performance but slightly less effectiveness. DenseNet121 [21] posts an accuracy of 92.64%, and EfficientNetB0 [22] ranks the lowest among the evaluated models with an accuracy of 90.07%. These outcomes emphasise the proposed model's improved feature extraction and classification capabilities for CT images, establishing it as the most dependable option for precise medical diagnosis in this area.

Figure 10(b) provides an accuracy comparison of the classification of CXR images. The Proposed method accomplishes the highest accuracy of 98.92%, signifying its outstanding performance in accurately classifying X-ray images. Following this, VGG16 achieves an accuracy of 95.1%, showing strong results but still falling short of the proposed model. DenseNet121 accomplishes an accuracy of 92.7%, while EfficientNetB0 has the lowest accuracy at 89.52%. These figures demonstrate the proposed model's superior capability to extract significant features from CXR images, rendering it more reliable for clinical diagnostics than the other standard architectures.

The precision comparison in Figure 11 for the classification of CT and CXR images demonstrates that the proposed model surpasses other well-known methods. For CT images, shown in Figure 11(a), the proposed model achieves the highest precision at 98.11%, with VGG16 following at 95.91%, DenseNet121 at 92.3%, and EfficientNetB0 at 89.41%.

(a) Computed tomography (CT) dataset accuracy

(b) Chest X-rays (CXR) dataset accuracy

Figure 10. Accuracy comparison

(a) Computed tomography (CT) dataset precision

(b) Chest X-rays (CXR) dataset precision

Figure 11. Precision comparison

(a) Computed tomography (CT) dataset recall

(b) Chest X-rays (CXR) dataset recall

Figure 12. Recall comparison

Likewise, Figure 11(b) demonstrates that for CXR images, the proposed model leads again with a precision of 98.91%, while VGG16, DenseNet121, and EfficientNetB0 score 95.37%, 93.6%, and 90.24%, respectively. These outcomes underscore the exceptional capability of the proposed model to accurately identify positive cases while reducing the rate of false positives. High precision is vital in medical diagnostics to prevent healthy individuals from being incorrectly identified as having a disease. Consequently, the proposed method proves to be more effective and trustworthy in aiding clinical decision-making for both CT and CXR image evaluations.

Figure 12(a) and Figure 12(b) compare the recall of the DL models (Proposed, VGG16, DenseNet121, and EfficientNetB0) on the CT and CXR image datasets. In both cases, the proposed model surpasses the others, achieving the highest recall values of 98.7% on the CT dataset and 98.98% on the CXR dataset. VGG16 ranks second with recall rates of 95.24% and 96.5%, respectively. DenseNet121 achieves a satisfactory recall of 92.28% and 93.51%, while EfficientNetB0 consistently registers the lowest recall, at 90.03% and 90.79%. These results indicate that the proposed model is more effective in identifying positive instances in CT and CXR images.

The F1-scores for the CT and CXR datasets indicate that the proposed model outperforms established architectures such as VGG16, DenseNet121, and EfficientNetB0. On the CT dataset, shown in Figure 13(a), the proposed model achieved an F1-score of 97.78%, compared with 96.35% for VGG16, 92.21% for DenseNet121, and 90.33% for EfficientNetB0. On the CXR dataset, shown in Figure 13(b), the proposed model again recorded the highest F1-score at 98.35%, ahead of VGG16 at 95.82%, DenseNet121 at 92.57%, and EfficientNetB0 at 89.71%. These findings underscore the ability of the proposed model to balance precision and recall, making it a strong option for medical diagnostics using CT and CXR images.

The AUC evaluations for CT and CXR classification further reinforce the robust performance of the proposed model. On both datasets, presented in Figure 14(a) and Figure 14(b), the proposed model recorded the highest AUC values at 98.4% and 98.51%, demonstrating outstanding discrimination between positive and negative cases. VGG16 performed slightly lower with scores of 96.01% and 95.89%, followed by DenseNet121 with moderate results of 92.69% and 92.71%. EfficientNetB0 consistently displayed the lowest AUC scores of 89.65% and 90.31%. These outcomes highlight the proposed model's superior capability to differentiate between infected and non-infected CT and CXR images, making it highly appropriate for clinical diagnostic assistance.

(a) Computed tomography (CT) dataset F1-score

(b) Chest X-rays (CXR) dataset F1-score

Figure 13. F1-score comparison

(a) Computed tomography (CT) dataset AUC

(b) CXR dataset AUC

Figure 14. Area under Curve (AUC) comparison

4.5 Confusion Matrix comparison

The confusion matrices (CMs) show the classification performance of four models (DenseNet121, EfficientNetB0, VGG16, and the proposed model) across three classes: COVID-19, Pneumonia, and Normal. On the CT dataset, the proposed model, illustrated in Figure 15(d), shows the strongest performance with almost flawless classification: 141 COVID cases were accurately identified with just 3 total misclassifications, 65 Pneumonia cases were correctly classified without errors, and only one out of 28 Normal cases was misclassified.

In contrast, DenseNet121 depicted in Figure 15(a) accurately classifies 134 COVID cases, but misclassifies 10 others (5 as Pneumonia and 5 as Normal), along with achieving 62 out of 65 correct Pneumonia predictions and 26 out of 29 for Normal. EfficientNetB0, shown in Figure 15(b), exhibits somewhat lower accuracy, particularly with 11 errors in COVID classification and 5 in Pneumonia, although the classification of Normal cases remains consistent. VGG16, represented in Figure 15(c), performs adequately, accurately classifying 138 COVID cases and 64 Pneumonia cases, but it does have a few mistakes in the Normal classification. In general, the suggested model outperforms all alternatives, exhibiting higher accuracy and fewer misclassifications in all three categories.

The CMs for the CXR image dataset are illustrated in Figure 16. The proposed model displayed in Figure 16(d) achieves the highest performance, securing near-perfect accuracy with 319 correct predictions for COVID, 354 for Normal, and 357 for Pneumonia, resulting in only 9 misclassifications in total. VGG16 illustrated in Figure 16(c), also performs well, appropriately classifying 310 COVID, 342 Normal, and 349 Pneumonia cases, but it has slightly more misclassifications compared to the proposed model. DenseNet121, shown in Figure 16(a), accurately predicts 302 COVID, 334 Normal, and 340 Pneumonia cases, yet it has slightly higher rates of misclassification, especially when classifying Normal and Pneumonia. EfficientNetB0, depicted in Figure 16(b), ranks as the least accurate among the four in this dataset, with 291 correct COVID classifications and greater confusion across all classes, particularly between COVID and Normal. Overall, the proposed model consistently stands out with the highest accuracy and the least confusion across all three classes, demonstrating its robustness on the CXR dataset as well.

(a) CT dataset CM using DenseNet121                       

(b) CT dataset CM using EfficientNetB0

(c) CT dataset CM using VGG16                        

(d) CT dataset CM using the proposed model 

Figure 15. Comparison of Confusion Matrix (CM) of various models using Computed tomography (CT) dataset

(a) CXR dataset CM for DenseNet121                   

 (b) CXR dataset CM for EfficientNetB0

(c) CXR dataset CM for VGG16                           

(d) CXR dataset CM for Proposed model

Figure 16. Comparison of CM of various models using CXR dataset

Discussion

The confusion matrix results show that the proposed CNN-Lattice enhanced ResNet50 achieves better classification performance than DenseNet121, EfficientNetB0, and VGG16. On the CT dataset shown in Figure 15, DenseNet121 correctly classifies 134 COVID-19 cases while misclassifying 10 samples (5 as Pneumonia and 5 as Normal), showing moderate confusion between the pulmonary disease classes. EfficientNetB0 performs worse, with 11 errors in COVID-19 classification and additional Pneumonia misclassifications, indicating weaker feature discrimination. VGG16 is more accurate, correctly classifying 138 COVID-19 and 64 Pneumonia cases, although some Normal cases are still misclassified. In contrast, the proposed model makes the fewest errors, indicating greater robustness and better feature learning.

The proposed model gives the highest classification accuracy on the CXR dataset presented in Figure 16, correctly classifying 319 COVID-19 cases, 354 Normal cases, and 357 Pneumonia cases, with only 9 cases misclassified. VGG16 also performs well, with 310 correct COVID-19 predictions, 342 Normal predictions, and 349 Pneumonia classifications, albeit with more confusion than the proposed method. DenseNet121 correctly classifies 302 COVID-19, 334 Normal, and 340 Pneumonia cases, while EfficientNetB0 achieves only 291 correct COVID-19 classifications and has the highest inter-class confusion.

A closer look shows that the majority of misclassifications occur between the COVID-19 and Pneumonia categories. This is mainly because the two diseases look similar on chest images, with ground-glass opacities, lung infiltrates, and areas of consolidation. The CNN-Lattice framework reduces these confusions through entropy-based U-Net segmentation and multi-scale feature fusion, which improve boundary localization and the extraction of disease-specific features. However, a few COVID-Pneumonia errors remain, reflecting the intrinsic difficulty of categorising pulmonary disease from CT and CXR images.

4.6 Hyperparameter tuning

Figure 17 shows accuracy and validation loss over the iterations of Bayesian Optimization for CT classification, and indicates that the proposed model consistently outperforms VGG16, DenseNet121, and EfficientNetB0. In the left graph, the accuracy of the proposed model gradually increases, surpassing 95% by the 100th iteration, while the other models underperform: VGG16 reaches nearly 89%, DenseNet121 remains around 86%, and EfficientNetB0 falls below 83%. The right graph shows that the proposed model reaches the lowest validation loss, which declines consistently to below 0.3, suggesting improved generalization. In comparison, the alternative models exhibit greater final losses, with EfficientNetB0 displaying the poorest performance. This supports the conclusion that the proposed method is effective in improving CT classification performance.

Figure 18, for CXR classification using Bayesian Optimization, again confirms the consistent superiority of the proposed model in both accuracy and validation loss. On the left, the accuracy curve of the proposed model rises steadily, reaching above 95% by the 100th iteration, higher than VGG16 (~89%), DenseNet121 (~86%), and EfficientNetB0 (~82%). On the right, the validation loss of the proposed model shows the quickest and largest decrease, falling below 0.3, which indicates effective learning and limited overfitting. In contrast, the other models plateau at higher loss levels, with EfficientNetB0 showing the weakest performance. These findings underscore the proposed model's strength and efficacy in CXR image classification.

Figure 17. Accuracy and validation loss over Bayesian Optimization for the Computed tomography (CT) image dataset

Figure 18. Accuracy and validation loss over Bayesian Optimization for the Chest X-rays (CXR) image dataset

4.7 Error analysis

Figure 19 presents a comparative error analysis of the proposed model, U-Net, and TM in terms of mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). The proposed model had the lowest errors, with MAE = 0.05, MSE = 0.01, and RMSE = 0.10, which shows that its predictions are more accurate and deviate least from the actual output. U-Net showed moderate error values (MAE = 0.12, MSE = 0.04, and RMSE = 0.20), and TM had the highest error values (MAE = 0.20, MSE = 0.08, and RMSE = 0.28), indicating comparatively low prediction performance. These error metrics indicate that the proposed model is effective, robust, and learns better than the compared methods.
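The three error measures are related by RMSE = √MSE, which the sketch below uses as a quick consistency check on the values reported in the text. The helper name `error_metrics` and the toy prediction vectors are assumptions for illustration; only the `reported` MSE/RMSE pairs come from the paragraph above.

```python
import math

def error_metrics(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equally long sequences."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return mae, mse, math.sqrt(mse)

# Toy example (hypothetical values, not data from the study).
mae, mse, rmse = error_metrics([1.0, 0.0, 1.0, 1.0], [0.9, 0.1, 0.8, 1.0])

# (MSE, RMSE) pairs as reported in the text for each method.
reported = {"Proposed": (0.01, 0.10), "U-Net": (0.04, 0.20), "TM": (0.08, 0.28)}
consistent = {name: abs(math.sqrt(m) - r) < 0.005
              for name, (m, r) in reported.items()}
```

All three reported pairs satisfy the relation to two decimal places.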

Figure 19. Comparative error analysis of the proposed model and existing methods

4.8 Confidence intervals

Figure 20 compares the confidence intervals of the accuracy achieved by the proposed model, U-Net, and TM. The proposed model achieved the highest accuracy of 98%, with a narrow confidence interval of 97% to 99%, showing stable and reliable performance. U-Net achieved moderate consistency with an accuracy of 92% (90% to 94%), and TM had the lowest accuracy of 85% (82% to 88%), suggesting lower prediction reliability. The proposed model thus combines the highest accuracy with the narrowest confidence interval, which supports the stability of its results.
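The paper does not state how the intervals in Figure 20 were obtained. As one possible approach, the sketch below computes a 95% Wilson score interval for an accuracy estimate. The sample size of 1047 is the CXR test-set total from Table 2, and applying it to this comparison is an assumption made purely for illustration.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion p_hat observed on n samples.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n
                               + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width

# Accuracy of 98% on an assumed test set of 1047 images.
low, high = wilson_interval(0.98, 1047)

# The same accuracy on a much smaller (hypothetical) test set gives a wider interval.
low_small, high_small = wilson_interval(0.98, 100)
```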

Figure 20. Confidence interval comparison of the proposed model with existing methods

4.9 Comparative analysis

Table 5 compares the proposed model with the referenced state-of-the-art methods [16-20, 24, 25] in terms of datasets, techniques used, and key performance metrics.

Table 5. Comparing the proposed model with the referenced state-of-the-art methods

| Ref | Year | Modality | Model/Technique | Dataset | Training/Class Distribution | Classification Type | Accuracy (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|
| [16] | 2022 | X-ray | Deep CNN | Public X-ray dataset | COVID-19 and Normal classes | Binary (COVID vs Normal) | 97.4 | 97 |
| [17] | 2022 | X-ray | CNN with data augmentation | COVIDx | Binary class-balanced dataset | Binary | 96.6 | 95.8 |
| [18] | 2023 | X-ray, CT, Cough | Multi-modal DL | Custom dataset | Multi-modal chest disease dataset | Multi-class (5 types) | 95.1 | 94.5 |
| [19] | 2023 | X-ray | CDC_Net (Custom CNN) | NIH CXR | Five chest disease categories | Multi-class (5 diseases) | 96.8 | 96.3 |
| [20] | 2023 | CT | Transfer Learning (ResNet, VGG) | Private CT dataset | COVID and Non-COVID CT scans | Binary | 96.3 | 95.9 |
| [24] | 2024 | CT + CXR | DenseNet + MobileNet | COVIDx + COVID-CT | COVID-19, Pneumonia, and Normal classes | Multi-class | 94.3 | 97.0 |
| [25] | 2024 | Chest X-ray | ResNet-101 | ChestX-ray8 | Pneumonia and Normal categories | Binary | 93.7 | 94.0 |
| Proposed | This work (2025) | CXR & CT | CNNs-Lattice + ResNet50 + U-Net + BO | COVIDx CXR-4, COVIDx CT | COVID-19: 577 train / 145 test, Pneumonia: 262 train / 66 test, Normal: 118 train / 30 test | Multi-class (COVID, Pneumonia, Normal) | 98.92 (CXR), 98.83 (CT) | 98.51 (CXR), 98.4 (CT) |

Notes: CT = Computed tomography; CXR = Chest X-rays; CNN = convolutional neural network; AUC = Area under Curve.

As Table 5 shows, the proposed model outperforms existing methods in accuracy and AUC for both CXR and CT modalities. Compared with previous studies such as Gouda et al. [16] and Islam et al. [24], which obtained AUC values of up to 97%, the proposed method achieves 98.51% for CXR and 98.4% for CT, providing improved multi-class classification of COVID-19, pneumonia, and normal cases. The enhanced performance stems from integrating the CNNs-Lattice framework with ResNet50, entropy-driven U-Net segmentation, and Bayesian optimization, which makes it more reliable and efficient in clinical environments than previous models.

5. Conclusion

This research introduces a dependable and accurate DL framework for the automatic detection of COVID-19, pneumonia, and healthy subjects using CXR and CT scans. The proposed architecture integrates ResNet50 with enhancements derived from CNNs-Lattice, resulting in a significant boost in feature representation and classification performance. A key advantage of this framework lies in its organized pre-processing pipeline, where image quality is enhanced through CLAHE and irrelevant areas are eliminated via entropy-based U-Net segmentation, allowing the model to focus on the lung region. The research employs two datasets, COVIDx CXR-4 and COVIDx CT, with data augmentation and balanced batch sampling applied to counter class imbalance during training. Comprehensive testing and comparative evaluations against leading models such as DenseNet121, EfficientNetB0, and VGG16 indicate that the proposed methodology consistently outperforms the others in accuracy, precision, recall, F1-score, and AUC. It also attains better segmentation outcomes, as demonstrated by higher DSC and IoU values for both CT and CXR images. Furthermore, the use of Bayesian Optimization for hyperparameter adjustment, including learning rate, batch size, and optimizer, has led to faster convergence and enhanced generalization. The confusion matrix analysis also shows that the proposed model has the lowest misclassification rate, highlighting its dependability in clinical applications. In summary, the proposed CNNs-Lattice enhanced ResNet50 model offers an efficient, interpretable, and scalable solution for the prompt and precise identification of COVID-19 and related pulmonary conditions, making it well suited for real-time application in medical diagnostic systems, especially in resource-limited environments where RT-PCR testing may be restricted.

The proposed framework performs very well, but it still has limitations. The model was tested primarily on publicly available datasets and has not been optimized for smaller datasets, images captured by different imaging devices, or real clinical settings. Future work will focus on multi-center clinical validation and on improving model generalization for real-time deployment.

Nomenclature

| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| AUC | Area Under Curve |
| BL | Borderline |
| CAM | Class Activation Mapping |
| C-GAN | Conditional Generative Adversarial Network |
| CLAHE | Contrast Limited Adaptive Histogram Equalization |
| CM | Confusion Matrix |
| CNN | Convolutional Neural Network |
| COVID-19 | Coronavirus Disease 2019 |
| CT | Computed Tomography |
| CXR | Chest X-rays |
| DL | Deep Learning |
| GAP | Global Average Pooling |
| IoU | Intersection over Union |
| LC | Lung Cancer |
| RT-PCR | Reverse Transcription Polymerase Chain Reaction |
| SARS-CoV-2 | Severe Acute Respiratory Syndrome Coronavirus 2 |
| TB | Tuberculosis |
| U-Net | U-shaped convolutional network for image segmentation |

References

[1] Bhatele, K.R., Jha, A., Tiwari, D., Bhatele, M., Sharma, S., Mithora, M.R., Singhal, S. (2024). COVID-19 detection: A systematic review of machine and deep learning-based approaches utilizing chest X-rays and CT scans. Cognitive Computation, 16(4): 1889-1926. https://doi.org/10.1007/s12559-022-10076-6

[2] Jangam, E., Barreto, A.A.D., Annavarapu, C.S.R. (2022). Automatic detection of COVID-19 from chest CT scan and chest X-Rays images using deep learning, transfer learning and stacking. Applied Intelligence, 52(2): 2243-2259. https://doi.org/10.1007/s10489-021-02393-4

[3] Xue, X., Chinnaperumal, S., Abdulsahib, G.M., Manyam, R.R., Marappan, R., Raju, S.K., Khalaf, O.I. (2023). Design and analysis of a deep learning ensemble framework model for the detection of COVID-19 and pneumonia using large-scale CT scan and X-ray image datasets. Bioengineering, 10(3): 363. https://doi.org/10.3390/bioengineering10030363 

[4] Hayat, A., Baglat, P., Mendonça, F., Mostafa, S.S., Morgado-Dias, F. (2023). Novel comparative study for the detection of COVID-19 using CT scan and chest X-ray images. International Journal of Environmental Research and Public Health, 20(2): 1268. https://doi.org/10.3390/ijerph20021268

[5] Abdullah, M., berhe Abrha, F., Kedir, B., Tagesse, T.T. (2024). A hybrid deep learning CNN model for COVID-19 detection from chest X-rays. Heliyon, 10(5): e26938. https://doi.org/10.1016/j.heliyon.2024.e26938

[6] Goyal, L., Dhull, A., Singh, A., Kukreja, S., Singh, K.K. (2023). VGG-COVIDNet: A novel model for COVID detection from X-Ray and CT Scan images. Procedia Computer Science, 218: 1926-1935. https://doi.org/10.1016/j.procs.2023.01.169

[7] Lee, M.H., Shomanov, A., Kudaibergenova, M., Viderman, D. (2023). Deep learning methods for interpretation of pulmonary CT and X-ray images in patients with COVID-19-related lung involvement: A systematic review. Journal of Clinical Medicine, 12(10): 3446. https://doi.org/10.3390/jcm12103446

[8] Alshahrni, M.M., Ahmad, M.A., Abdullah, M., Omer, N., Aziz, M. (2023). An intelligent deep convolutional network based COVID-19 detection from chest X-rays. Alexandria Engineering Journal, 64: 399-417. https://doi.org/10.1016/j.aej.2022.09.016

[9] Sekar, K., Dheepa, T. (2024). Accurate detection of COVID-19 and pneumonia from chest X-rays and CT images using DNN. In 2024 International Conference on Distributed Computing and Optimization Techniques (ICDCOT), Bengaluru, India, pp. 1-6. https://doi.org/10.1109/icdcot61034.2024.10516026

[10] Mathesul, S., Swain, D., Satapathy, S.K., Rambhad, A., Acharya, B., Gerogiannis, V.C., Kanavos, A. (2023). COVID-19 detection from chest X-ray images based on deep learning techniques. Algorithms, 16(10): 494. https://doi.org/10.3390/a16100494

[11] Talukder, M.A., Layek, M.A., Kazi, M., Uddin, M.A., Aryal, S. (2024). Empowering COVID-19 detection: Optimizing performance through fine-tuned EfficientNet deep learning architecture. Computers in Biology and Medicine, 168: 107789. https://doi.org/10.1016/j.compbiomed.2023.107789

[12] Mukherjee, H., Ghosh, S., Dhar, A., Obaidullah, S.M., Santosh, K.C., Roy, K. (2024). Shallow convolutional neural network for COVID-19 outbreak screening using chest X-rays. Cognitive Computation, 16(4): 1695-1708. https://doi.org/10.1007/s12559-020-09775-9

[13] Mukhi, S.E., Varshini, R.T., Sherley, S.E.F. (2023). Diagnosis of COVID-19 from multimodal imaging data using optimized deep learning techniques. SN Computer Science, 4(3): 212. https://doi.org/10.1007/s42979-022-01653-5

[14] Gaur, L., Bhatia, U., Jhanjhi, N.Z., Muhammad, G., Masud, M. (2023). Medical image-based detection of COVID-19 using deep convolution neural networks. Multimedia Systems, 29(3): 1729-1738. https://doi.org/10.1007/s00530-021-00794-6

[15] Aslani, S., Jacob, J. (2023). Utilisation of deep learning for COVID-19 diagnosis. Clinical Radiology, 78(2): 150-157. https://doi.org/10.1016/j.crad.2022.11.006

[16] Gouda, W., Almurafeh, M., Humayun, M., Jhanjhi, N.Z. (2022). Detection of COVID-19 based on chest X-rays using deep learning. Healthcare, 10(2): 343. https://doi.org/10.3390/healthcare10020343

[17] Bhattacharyya, A., Bhaik, D., Kumar, S., Thakur, P., Sharma, R., Pachori, R.B. (2022). A deep learning based approach for automatic detection of COVID-19 cases using chest X-ray images. Biomedical Signal Processing and Control, 71: 103182. https://doi.org/10.1016/j.bspc.2021.103182

[18] Malik, H., Anees, T., Al-Shamayleh, A.S., Alharthi, S.Z., Khalil, W., Akhunzada, A. (2023). Deep learning-based classification of chest diseases using X-rays, CT scans, and cough sound images. Diagnostics, 13(17): 2772. https://doi.org/10.3390/diagnostics13172772

[19] Malik, H., Anees, T., Din, M., Naeem, A. (2023). CDC_Net: Multi-classification convolutional neural network model for detection of COVID-19, pneumothorax, pneumonia, lung cancer, and tuberculosis using chest X-rays. Multimedia Tools and Applications, 82(9): 13855-13880. https://doi.org/10.1007/s11042-022-13843-7

[20] Kathamuthu, N.D., Subramaniam, S., Le, Q.H., Muthusamy, S., et al. (2023). A deep transfer learning-based convolution neural network model for COVID-19 detection using computed tomography scan images for medical applications. Advances in Engineering Software, 175: 103317. https://doi.org/10.1016/j.advengsoft.2022.103317

[21] Arulananth, T.S., Prakash, S.W., Ayyasamy, R.K., Kavitha, V.P., Kuppusamy, P.G., Chinnasamy, P. (2024). Classification of paediatric pneumonia using modified DenseNet-121 deep-learning model. IEEE Access, 12: 35716-35727. https://doi.org/10.1109/access.2024.3371151

[22] Kumar, S., Kumar, H. (2024). Efficient-VGG16: A novel ensemble method for the classification of COVID-19 X-ray images in contrast to machine and transfer learning. Procedia Computer Science, 235: 1289-1299. https://doi.org/10.1016/j.procs.2024.04.122

[23] Srinivas, K., Gagana Sri, R., Pravallika, K., Nishitha, K., Polamuri, S.R. (2024). COVID-19 prediction based on hybrid Inception V3 with VGG16 using chest X-ray images. Multimedia Tools and Applications, 83(12): 36665-36682. https://doi.org/10.1007/s11042-023-15903-y

[24] Islam, N., Mohsin, A.S., Choudhury, S.H., Shaer, T.P., Islam, M.A., Sadat, O., Taz, N.H. (2024). COVID-19 and pneumonia detection and web deployment from CT scan and X-ray images using deep learning. PLoS One, 19(7): e0302413. https://doi.org/10.1371/journal.pone.0302413

[25] Ibrahim, A.U., Ozsoz, M., Serte, S., Al-Turjman, F., Yakoi, P.S. (2024). Pneumonia classification using deep learning from chest X-ray images during COVID-19. Cognitive Computation, 16(4): 1589-1601. https://doi.org/10.1007/s12559-020-09787-5