© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
The integration of Internet of Things (IoT) and Artificial Intelligence (AI) offers significant opportunities for proactive cardiac healthcare. However, existing solutions often struggle with noisy ECG data, lack real-time IoT adaptability, and provide limited deployment strategies. This study introduces a novel IoT-enabled hybrid framework that combines Empirical Mode Decomposition (EMD)-based signal denoising with a CNN-LSTM deep learning model for robust and real-time ECG analysis. Unlike prior works, the proposed approach emphasizes adaptive preprocessing, cloud-based deployment with EHR interoperability, and strong security protocols for clinical integration. Experimental results on the MIT-BIH dataset demonstrate 96.5% accuracy, 95.2% precision, and an F1-score of 94.9%, outperforming conventional CNN and LSTM models. Statistical tests confirm significance (p < 0.05). This research bridges the gap between algorithmic accuracy and practical IoT deployment, paving the way for intelligent, secure, and real-time cardiac monitoring systems.
IoT, deep learning, AI in healthcare, heart monitoring, ECG, real-time health monitoring, smart devices, EMD
Cardiovascular diseases (CVDs) remain one of the leading causes of mortality worldwide, posing a critical challenge to healthcare systems. Traditional cardiac monitoring methods, which rely on periodic clinical evaluations, often fail to detect early signs of cardiac abnormalities, resulting in delayed diagnosis and treatment. Recent advancements in the Internet of Things (IoT) and Artificial Intelligence (AI), particularly Deep Learning techniques, offer a promising solution for continuous and intelligent cardiac monitoring.
IoT-enabled wearable and implantable devices allow real-time acquisition of physiological signals such as ECG, heart rate, and oxygen saturation in patients’ daily environments. This continuous stream of high-resolution data provides an opportunity for proactive and remote monitoring. However, the large volume and complexity of ECG signals require advanced AI models capable of extracting clinically relevant features and predicting potential abnormalities accurately.
Existing research primarily focuses on either Convolutional Neural Networks (CNNs) for spatial feature extraction or Long Short-Term Memory (LSTM) networks for temporal pattern analysis. While these models perform well individually, limited work has explored an optimized combination of both approaches in a real-time IoT framework. Moreover, issues such as interoperability with Electronic Health Records (EHR), data security, and deployment scalability are often overlooked in current studies.
To address these gaps, this paper proposes an IoT-enabled hybrid deep learning framework that integrates CNN and LSTM to leverage their complementary strengths. The CNN component effectively captures morphological features of ECG signals, such as QRS complex and ST-segment variations, while the LSTM component models temporal dependencies for accurate prediction of rhythm irregularities. This end-to-end architecture supports real-time data acquisition, cloud-based analytics, secure communication protocols, and integration with EHR systems. The proposed solution aims to enable early detection of cardiac anomalies, timely alerts, and improved patient outcomes, particularly for individuals with chronic cardiovascular conditions or those in remote locations.
This work [1] proposes an AIoT-based crowd counting system using deep learning and edge-cloud integration for accurate, real-time estimation, validated on benchmark datasets. As urbanization grows, cities struggle to ensure secure, sustainable living. IoT and AI together enable smarter, more efficient cities by connecting devices and analyzing data. This review [2] covers smart city concepts, IoT architecture, communication technologies, AI algorithms, and their integration with 5G, highlighting their role in improving urban sustainability and quality of life. This study [3] analyzes the rise of AI, IoT, and Big Data in enabling environmentally sustainable smart cities, highlighting rapid growth, evolving challenges, and implications for policy and practice. This paper [4] presents a data fusion and dynamic load balancing approach to reduce redundancy and optimize edge server use in dense IoT networks, improving QoS and energy efficiency. This paper [5] reviews the edge–fog–cloud computing paradigm, highlighting its role in enhancing IoT by addressing latency, bandwidth, and scalability challenges through distributed AI and analytics.
Deep learning mimics brain functions to analyze medical data for disease diagnosis and management. This paper [6] reviews DL applications in healthcare, compares key studies, and highlights challenges and future research. 5G-IoT is vital for e-health applications, where securing patient data is crucial. This paper [7] proposes CNN-DMA, a deep learning model using CNN layers to detect malware attacks in cloud-stored health data. Tested on the Malimg dataset, CNN-DMA achieved 99% accuracy, outperforming existing methods. Accurate brain tumor classification is vital for brain cancer diagnosis in IoT healthcare. This study [8] proposes an improved CNN model using MRI data, enhanced by data augmentation and transfer learning, achieving higher accuracy than existing methods, making it suitable for IoT-based diagnosis. This study [9] presents a lightweight CNN for cardiovascular disease classification from ECG images that runs efficiently on a single CPU and improves traditional machine learning performance through feature extraction. This study [10] presents a 1D CNN model for early heart disease detection using balanced data and clinical parameters, overcoming limitations of traditional machine learning and improving diagnostic accuracy.
This paper [11] analyzes LSTM-based deep learning models for ECG heartbeat classification using the MIT-BIH dataset. The bi-directional LSTM variant performs best, showing strong accuracy and reliability, confirming LSTM’s suitability for heart disease diagnosis. This study [12] presents a hybrid method combining PCA and LSTM for classifying 16 types of cardiac arrhythmias from ECG data. It improves accuracy by reducing noise and handling ECG signal variability more effectively than traditional approaches. This study [13] proposes a smart healthcare system for heart disease diagnosis using ECG signals by combining CNN and LSTM to extract signal features. It addresses data imbalance with SMOTE and uses gated pooling for feature reduction, showing strong performance on standard ECG datasets. This study [14] enhances remote cardiac care by combining IoT data and clinical records, using XGBoost and Bi-LSTM for accurate heart disease prediction, outperforming traditional models. Combining IoT with cloud computing enables proactive healthcare through deep learning. Using Bi-LSTM on IoT and health record data, the system in [15] predicts heart disease risk with high accuracy, outperforming existing methods.
The literature reveals three main gaps:
(1) limited critique of hybrid approaches in real-world IoT deployments,
(2) inadequate focus on preprocessing strategies tailored for ECG noise reduction, and
(3) absence of an end-to-end architecture addressing data acquisition, AI-driven analytics, and secure integration with healthcare systems.
The proposed CNN-LSTM-based IoT framework aims to fill these gaps by combining morphological and temporal feature learning, implementing robust preprocessing pipelines and ensuring cloud-based scalability with secure EHR integration for continuous cardiac health monitoring.
Table 1 presents a comparative analysis of various deep learning architectures applied to cardiac signal processing, while Table 2 summarizes key research studies focusing on CNN-LSTM and hybrid deep learning frameworks in healthcare applications.
Table 1. Comparison of deep learning models for cardiac signal analysis
| Model | Strengths | Suitable Tasks |
|---|---|---|
| CNN | Spatial feature extraction | Beat classification, waveform analysis |
| LSTM | Temporal sequence modeling | Rhythm prediction, temporal pattern recognition |
| CNN + LSTM | Combined spatial and temporal learning | Complex heartbeat classification, predictive modeling |
| Transformer | Long-range dependency and attention | Long sequence analysis, multi-modal data |
| Autoencoder | Unsupervised anomaly detection | Detecting unknown or rare anomalies |
Table 2. Key studies on CNN-LSTM and hybrid deep learning models in healthcare
| No. | Authors | Application Domain | Dataset/Platform | Key Contributions/Findings |
|---|---|---|---|---|
| 1 | Rayan et al. [16] | Medical systems enhancement | Not specified | Improved medical system performance using CNN-LSTM hybrid |
| 2 | Rai and Chatterjee [17] | Myocardial infarction detection | Large ECG datasets | Automated MI detection with high accuracy using ensemble approach |
| 3 | Elbagoury et al. [18] | Stroke prediction | Mobile AI smart hospital platform | Novel hybrid deep learning model for stroke prediction on mobile platform |
| 4 | Begum et al. [19] | Breast cancer diagnosis | Not specified | Combined CNN-LSTM with RF for improved breast cancer diagnosis |
| 5 | Amin et al. [6] | Healthcare broadly | N/A | Discusses issues, challenges, and opportunities in DL healthcare |
| 6 | Ayus and Gupta [20] | Alzheimer’s identification | Biomedical signal datasets | Novel hybrid DL system for Alzheimer’s detection |
| 7 | Hiriyannaiah et al. [11] | Heartbeat classification | MIT-BIH arrhythmia dataset | Comparative analysis of LSTM models for ECG heartbeat classification |
| 8 | Sowmya and Jose [21] | Arrhythmia signal classification | ECG signals datasets | CNN-LSTM based model for arrhythmia classification |
The proposed IoT-enabled cardiac monitoring system consists of three interconnected layers: IoT Layer, Data Processing Layer, and AI/Deep Learning Layer. These layers collaboratively enable secure, real-time data acquisition, analysis, and integration with clinical systems such as Electronic Health Records (EHR). The components of the IoT architecture layer are depicted in Figure 1.
Figure 1. IoT layer components
3.1 IoT layer
3.2 Data processing layer
3.3 AI/Deep learning layer
The Convolutional Neural Network (CNN) component forms the first phase of the architecture and extracts spatial features. It identifies significant localized signal features, such as peaks, slopes, and durations within the waveform, and in particular the P-wave, QRS complex, and T-wave of the ECG signal. To find these local features, 1D convolutional layers are typically employed, in which filters slide across the ECG signal. Activation functions (such as ReLU), pooling layers (such as MaxPooling1D) to reduce dimensionality, and batch normalization to enhance training stability may also be included in this step. Once the CNN has extracted the spatial features, the output is passed through a Flatten or Reshape layer to prepare the data for temporal modeling. This transformation reshapes the data into a format compatible with the Long Short-Term Memory (LSTM) network, converting the multi-channel spatial outputs into a time-series-like sequence of feature vectors.
The LSTM component, which follows, handles the model's temporal learning stage. LSTMs are a type of Recurrent Neural Network (RNN) that is well suited to learning sequential patterns and long-term dependencies in time-series data. By tracking how features change over time, the LSTM learns to recognize irregular intervals, rhythmic patterns, and temporal anomalies between heartbeats in ECG signals. This is particularly important for identifying conditions such as arrhythmias, which are characterized by erratic heartbeat patterns and irregular temporal spacing.
After the LSTM layers, the output is passed into one or more Dense (fully connected) layers for final feature integration and decision-making. The output layer uses an activation function suited to the task: a sigmoid for binary classification (such as normal vs. abnormal), or a softmax for multi-class classification (such as distinguishing between different cardiac conditions).
Using the CNN and LSTM components, the model first analyzes the ECG signal to extract key spatial and temporal features. These features are then used to determine the type of heartbeat in each ECG segment. For instance, the model may distinguish normal sinus rhythm from different arrhythmic patterns, including bundle branch blocks, supraventricular ectopic beats, and premature ventricular contractions (PVCs). This classification helps clinicians determine whether the heart is beating normally or whether there are anomalies that call for further investigation or medical attention.
The model can be trained to predict possible risk patterns in addition to classification. Learning from a wide range of structured ECG signals, including those linked to previous cardiac events, the model is able to identify minute irregularities or shifts from normal patterns that could point to a higher risk of cardiac problems in subsequent years. Potential warning indicators could include, for example, persistent irregularities in heartbeat intervals or a gradual lowering of specific ECG components.
IoT-based wearable devices and remote health monitoring systems, which regularly gather and evaluate real-time ECG data, benefit greatly from these predictive features. The CNN-LSTM model can be used in these systems to continuously track patients and notify medical professionals of possible risks before clinical signs appear.
The model is divided into the following phases:
4.1 Data acquisition
To ensure complete representation and generalization, cardiovascular signals from multiple sources are gathered and aggregated during the data acquisition phase. The MIT-BIH Arrhythmia Database, which provides comprehensive ECG recordings with appropriate diagnostic labels, is one example of an annotated dataset. Furthermore, IoT-enabled biomedical sensors, such as wearable and implantable devices, provide real-time data streams by gathering physiological signals from patients in clinical or mobile settings. Integrating these sources makes the dataset robust for training and testing intelligent models, since it reflects both normal and abnormal cardiac behavior under various conditions.
4.2 Preprocessing
Raw cardiac signals from sensors are vulnerable to noise and irregularities caused by technical constraints, patient movement, and environmental disturbances. Preprocessing is therefore required to transform noisy, unstructured inputs into clean, standardized data that can be used to train models.
4.2.1 Signal denoising
Denoising reduces unwanted noise while retaining clinically significant features. Baseline wander, powerline interference, and muscle artifacts are removed, and the frequency components of interest are isolated using methods such as bandpass filtering, wavelet decomposition, and Empirical Mode Decomposition (EMD).
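As a rough illustration of baseline-wander removal, the sketch below subtracts a centered moving-average estimate of the baseline from the signal. This is a deliberately simple stand-in for the filtering step, not the EMD procedure itself; the function name `remove_baseline` and the window length are assumptions chosen for the example.

```python
def remove_baseline(signal, window=5):
    """Subtract a centered moving average (a crude high-pass filter)."""
    half = window // 2
    cleaned = []
    for i in range(len(signal)):
        lo = max(0, i - half)                  # clip the window at the edges
        hi = min(len(signal), i + half + 1)
        baseline = sum(signal[lo:hi]) / (hi - lo)
        cleaned.append(signal[i] - baseline)
    return cleaned

# A constant offset (pure baseline) is removed entirely.
flat = remove_baseline([2.0] * 10, window=5)
# A narrow spike survives, reduced by its own share of the local mean.
spike = remove_baseline([0.0, 0.0, 10.0, 0.0, 0.0], window=3)
```

In practice the window would be chosen to be much longer than one QRS complex so that only slow drift is treated as baseline.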
4.2.2 Normalization
Normalization methods like min-max scaling or z-score standardization are used to remove bias brought on by variances in signal amplitudes among sensors or subjects. This enhances efficiency and reliability during training by ensuring that the input signals received by the deep learning model fall within a consistent range.
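Both normalization schemes can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library; the function names and the sample amplitudes are hypothetical and not taken from the dataset.

```python
import statistics

def min_max_scale(segment):
    """Rescale a segment linearly to the range [0, 1]."""
    lo, hi = min(segment), max(segment)
    if hi == lo:                       # flat segment: avoid division by zero
        return [0.0 for _ in segment]
    return [(v - lo) / (hi - lo) for v in segment]

def z_score(segment):
    """Standardize a segment to zero mean and unit variance."""
    mean = statistics.fmean(segment)
    std = statistics.pstdev(segment)   # population standard deviation
    if std == 0:
        return [0.0 for _ in segment]
    return [(v - mean) / std for v in segment]

segment = [0.2, 0.5, 1.4, 0.5, 0.2]    # made-up amplitudes in mV
scaled = min_max_scale(segment)
standardized = z_score(segment)
```

Min-max scaling is sensitive to isolated spikes, since a single outlier sets the range, whereas z-score standardization is less affected by one extreme sample.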
4.2.3 Segmentation
Because ECG signals are continuous, temporal analysis is simplified by segmenting the data into distinct windows or cardiac cycles. Individual heartbeat segments can be extracted with techniques such as peak-detection-driven segmentation and fixed-length windowing. Deep learning models then use these segments as input units.
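The two segmentation strategies can be illustrated as follows. This is a simplified sketch: the peak detector is a plain thresholded local-maximum search rather than a clinical-grade R-peak detector, and the function names, threshold, and toy signal are assumptions made for the example.

```python
def fixed_windows(signal, length, step=None):
    """Split a signal into windows of `length` samples (non-overlapping by default)."""
    step = step or length
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, step)]

def detect_peaks(signal, threshold):
    """Indices of local maxima above `threshold` (simplified R-peak detector)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > threshold
            and signal[i] > signal[i - 1]
            and signal[i] >= signal[i + 1]]

ecg = [0, 0.1, 1.0, 0.1, 0, 0, 0.1, 0.9, 0.1, 0, 0, 0.2]  # toy signal
windows = fixed_windows(ecg, length=4)
peaks = detect_peaks(ecg, threshold=0.5)
```

Here `windows` contains three segments of four samples each, and `peaks` contains the indices 2 and 7 of the two toy beats.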
4.2.4 Artifact removal
Noise removal methods like threshold-based filters or Independent Component Analysis (ICA) are used to reduce non-cardiac noise and improve signal quality even more. Cleaner signal inputs enhance model precision and lower false alarms as a result.
These preprocessing methods work together to convert noisy, unstructured raw data into clean, standardized, and well-segmented inputs, which help deep learning algorithms detect and anticipate cardiac abnormalities. Therefore, the success of the intelligent IoT-enabled heart monitoring system recommended in this study depends critically on efficient preprocessing.
4.3 Input representation
After preprocessing, each ECG segment is formatted as a one-dimensional time-series vector:
$x=\left[x_1, x_2, \ldots, x_T\right], \quad x \in \mathbb{R}^T$ (1)
where, T denotes the number of time steps in a single segment. This representation serves as the input to the convolutional layers, capturing both local and global morphological features necessary for downstream classification.
4.4 CNN-based feature extraction
The initial stage of the deep learning pipeline involves Convolutional Neural Networks (CNNs) that extract spatial features from the input signal. CNNs are particularly effective in identifying morphological patterns in ECG signals, such as QRS complexes, P waves, and T wave irregularities.
4.4.1 Convolution operation
Convolutional filters slide across the input signal to detect local patterns:
$y_i=\sum_{j=1}^k w_j \cdot x_{i+j}+b$ (2)
where, k is the kernel size, $w_j$ are the filter weights, and b is the bias.
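Eq. (2) can be checked on a toy signal with the following sketch. It assumes no padding and a stride of 1, and uses 0-based indices, so the sum runs over j = 0, ..., k-1. The function name `conv1d` and the filter values are illustrative only.

```python
def conv1d(x, w, b=0.0):
    """Sliding dot product of filter w over signal x, plus bias b (no padding)."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) + b
            for i in range(len(x) - k + 1)]

x = [0.0, 0.0, 1.0, 0.0, 0.0]   # a single sharp deflection
w = [-1.0, 0.0, 1.0]            # hypothetical slope-detecting filter
y = conv1d(x, w)                # [1.0, 0.0, -1.0]
```

The output is positive on the rising edge and negative on the falling edge, which is the kind of local pattern a learned filter can respond to around a QRS complex.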
4.4.2 Batch normalization and ReLU
Batch normalization is applied to stabilize learning, followed by a ReLU activation function:
$f(x)=\max (0, x)$ (3)
These steps improve model convergence and address vanishing gradient problems.
4.4.3 Max pooling and dropout
Max pooling downsamples the feature map, reducing computational load and emphasizing dominant features:
$y_i=\max \left(x_i, x_{i+1}, \ldots, x_{i+p}\right)$ (4)
Dropout regularization prevents overfitting by randomly disabling a fraction of neurons during training.
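The activation in Eq. (3) and the pooling in Eq. (4) can be sketched on a short feature map. The pool size and stride below are assumptions for illustration. Each pooling window here covers `pool_size` consecutive samples.

```python
def relu(values):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, pool_size=2, stride=None):
    """Keep the maximum of each window; stride defaults to the pool size."""
    stride = stride or pool_size
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, stride)]

features = [-0.5, 1.2, 0.3, -2.0, 0.8, 0.1]   # toy convolution outputs
activated = relu(features)                     # [0.0, 1.2, 0.3, 0.0, 0.8, 0.1]
pooled = max_pool1d(activated, pool_size=2)    # [1.2, 0.3, 0.8]
```

With a pool size of 2, the feature map is halved in length while the strongest response in each pair is kept.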
4.4.4 Fully connected layer
Extracted features are passed through a dense layer to consolidate local patterns into higher-order representations:
$y=W x+b$ (5)
4.5 LSTM-based temporal modeling
The output of the CNN block is fed into Long Short-Term Memory (LSTM) layers to model the temporal dependencies between heartbeats across segments.
4.5.1 LSTM mechanics
LSTM networks capture long-term dependencies using gates and memory cells. The core operations include:
Forget gate:
$f_t=\sigma\left(W_f\left[h_{t-1}, x_t\right]+b_f\right)$ (6)
Input gate:
$i_t=\sigma\left(W_i\left[h_{t-1}, x_t\right]+b_i\right), \quad \tilde{C}_t=\tanh \left(W_c\left[h_{t-1}, x_t\right]+b_c\right)$ (7)
Cell state update:
$C_t=f_t \odot C_{t-1}+i_t \odot \tilde{C}_t$ (8)
These mechanisms allow the model to retain clinically important temporal dynamics that span across multiple cardiac cycles.
4.5.2 Final dense and softmax layers
The LSTM output is passed through one or more fully connected layers, followed by a softmax layer:
$P\left(y_i\right)=\frac{e^{z_i}}{\sum_{j=1}^N e^{z_j}}, y=\operatorname{argmax} P\left(y_i\right)$ (9)
This produces the class probabilities, allowing the model to predict heart conditions such as normal rhythm, arrhythmia, or other cardiac abnormalities.
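A minimal sketch of Eq. (9) follows. The logits are hypothetical scores for three classes, and the maximum is subtracted before exponentiation, a standard trick that leaves the result unchanged but avoids overflow.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)                          # for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.5, -1.0]                        # hypothetical class scores
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)   # argmax
```

The predicted class is the index with the largest probability, here index 0.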
4.6 Customized CNN–LSTM for ECG signal processing
To address ECG-specific characteristics and ensure robust detection of arrhythmias and related anomalies, we introduce a customized CNN–LSTM model that combines spatial and temporal feature learning with physiological constraints. Unlike generic CNN and LSTM formulations, this design incorporates beat-synchronous segmentation, multi-scale convolutions matched to ECG wave durations, morphology-aware regularization, RR-interval conditioning, and a temporally consistent early-alarm loss. The proposed model processes ECG signals as follows:
4.6.1 Beat-synchronous input representation
Each ECG segment is extracted around R-peak positions to ensure alignment with cardiac cycles:
$x_t=\Pi\left(X,\left[r_t-\lfloor W / 2\rfloor, r_t+\lfloor W / 2\rfloor\right]\right) \in \mathbb{R}^{L \times W}$ (10)
where, X is the multi-lead ECG input, $r_t$ is the R-peak index for beat t, L is the number of leads, W is the window length in samples, and $\Pi$ extracts the samples of X within the given index range.
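The windowing in Eq. (10) can be sketched as below. The sketch assumes each window holds exactly W samples starting at the R-peak index minus half the width (rounded down), and that beats too close to the recording edges are skipped. Both choices, along with the function name `beat_windows`, are assumptions for illustration.

```python
def beat_windows(leads, r_peaks, width):
    """Cut a window of `width` samples around each R-peak, for every lead."""
    half = width // 2
    n = len(leads[0])
    beats = []
    for r in r_peaks:
        start, stop = r - half, r - half + width
        if start < 0 or stop > n:
            continue                   # skip beats too close to the edges
        beats.append([lead[start:stop] for lead in leads])
    return beats

# Two toy leads of 20 samples each; values are just sample indices.
leads = [list(range(20)), [v * 2 for v in range(20)]]
beats = beat_windows(leads, r_peaks=[1, 10, 15], width=6)
```

The first R-peak is dropped because its window would start before the recording, so `beats` holds two beats, each of shape 2 leads by 6 samples.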
4.6.2 Lead-attention fusion
Physiologically important leads (e.g., V1, V2 for QRS morphology) are adaptively weighted:
$\tilde{x}_t(\tau)=\sum_{\ell=1}^L \alpha_{\ell}\, x_t^{(\ell)}(\tau), \quad \alpha_{\ell}=\frac{\exp \left(e_{\ell}\right)}{\sum_{j=1}^L \exp \left(e_j\right)}$ (11)
where, $\alpha_{\ell}$ are the normalized attention weights for each lead, obtained from the unnormalized attention scores $e_{\ell}$.
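A minimal sketch of the lead fusion in Eq. (11), under the assumption that the attention weights are a softmax over one raw score per lead and that the fused beat is the weighted sum of the leads. How the scores are produced is not specified here, so they are passed in as a hypothetical argument.

```python
import math

def lead_attention(beat, scores):
    """Fuse L leads into one signal using softmax weights over `scores`."""
    shift = max(scores)                          # for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    length = len(beat[0])
    fused = [sum(a * lead[t] for a, lead in zip(alphas, beat))
             for t in range(length)]
    return fused, alphas

beat = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]        # 2 leads, 3 samples (toy)
fused, alphas = lead_attention(beat, scores=[0.0, 0.0])
```

With equal scores, both leads receive a weight of 0.5 and the fused signal is their average; a larger score for one lead shifts the fused signal toward that lead.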
4.6.3 Multi-scale dilated convolutions
To capture P-wave, QRS, and T-wave patterns, we use multiple convolution kernels with physiologically tuned receptive fields:
$f_t^{(m)}(\tau)=\sigma\left(b_m+\sum_{k=0}^{K_m-1} w_k^{(m)} \tilde{x}_t\left(\tau+d_m k\right)\right), \quad m=1, \ldots, M$ (12)
where, $d_m$ controls dilation (spacing) and $K_m$ is the kernel size.
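The effect of the dilation factor in Eq. (12) can be seen in the sketch below, which computes only the linear part of the equation (the activation σ is omitted for clarity). The function name and the filter values are illustrative assumptions.

```python
def dilated_conv1d(x, w, dilation=1, b=0.0):
    """1D convolution whose taps are spaced `dilation` samples apart."""
    span = dilation * (len(w) - 1)          # receptive field minus one
    return [b + sum(w[k] * x[t + dilation * k] for k in range(len(w)))
            for t in range(len(x) - span)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]     # toy ramp signal
w = [1.0, 0.0, -1.0]                        # hypothetical difference filter
narrow = dilated_conv1d(x, w, dilation=1)   # compares samples 2 apart
wide = dilated_conv1d(x, w, dilation=2)     # compares samples 4 apart
```

The same three-tap kernel covers 3 samples with a dilation of 1 and 5 samples with a dilation of 2, so wider waves such as the T-wave can be covered without adding weights.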
4.6.4 Morphology regularization for QRS complex
To enforce sharp, near-symmetric QRS detection, we regularize CNN kernels:
$R_{\text{morph}}=\lambda_{\text{sym}} \sum_{m \in Q} \sum_k\left(w_k^{(m)}+w_{K_m-1-k}^{(m)}\right)^2+\lambda_{tv} \sum_{m \in Q} \sum_k\left(w_{k+1}^{(m)}-w_k^{(m)}\right)^2$ (13)
where, Q is the set of QRS-focused kernels.
4.6.5 RR-interval conditioning
Rhythm variability is encoded by concatenating RR intervals with CNN features:
$s_t=\varphi\left(G A P\left(f_t\right) \| z_t\right)$ (14)
where, $z_t$ represents a short RR history and GAP is global average pooling.
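A minimal sketch of Eq. (14) follows, assuming for illustration that φ is the identity mapping, so the output is simply the pooled CNN features concatenated with the RR history. The feature maps and RR intervals are made-up values.

```python
def global_average_pool(feature_maps):
    """Average each channel over time, giving one value per channel."""
    return [sum(channel) / len(channel) for channel in feature_maps]

def condition_on_rr(feature_maps, rr_history):
    """Concatenate pooled CNN features with recent RR intervals."""
    return global_average_pool(feature_maps) + list(rr_history)

feature_maps = [[0.2, 0.4, 0.6], [1.0, 1.0, 1.0]]   # 2 channels (toy)
rr_history = [0.80, 0.82, 0.60]                      # seconds, made up
s_t = condition_on_rr(feature_maps, rr_history)
```

The resulting vector has one entry per CNN channel followed by one entry per RR interval, and it is this vector that is passed to the LSTM at each beat.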
4.6.6 LSTM temporal modeling
$\begin{gathered}i_t=\sigma\left(W_i s_t+U_i h_{t-1}+b_i\right), \\ f_t=\sigma\left(W_f s_t+U_f h_{t-1}+b_f\right), \\ \tilde{c}_t=\tanh \left(W_c s_t+U_c h_{t-1}+b_c\right), \\ o_t=\sigma\left(W_o s_t+U_o h_{t-1}+b_o\right), \\ c_t=f_t \odot c_{t-1}+i_t \odot \tilde{c}_t, \\ h_t=o_t \odot \tanh \left(c_t\right)\end{gathered}$ (15)
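A single step of Eq. (15) can be written out with plain Python lists, as sketched below. The parameter layout (a dictionary holding one `(W, U, b)` triple per gate) and the all-zero weights in the usage example are assumptions made to keep the example easy to verify by hand.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, U, b, s, h):
    """Compute W s + U h + b for list-based matrices and vectors."""
    return [sum(wi * si for wi, si in zip(w_row, s))
            + sum(ui * hi for ui, hi in zip(u_row, h)) + bi
            for w_row, u_row, bi in zip(W, U, b)]

def lstm_step(s, h_prev, c_prev, params):
    """One LSTM update: gates, candidate state, cell state, hidden state."""
    i = [sigmoid(v) for v in affine(*params["i"], s, h_prev)]
    f = [sigmoid(v) for v in affine(*params["f"], s, h_prev)]
    g = [math.tanh(v) for v in affine(*params["c"], s, h_prev)]
    o = [sigmoid(v) for v in affine(*params["o"], s, h_prev)]
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c

# Input size 2, hidden size 1, all weights zero (illustrative only).
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {gate: zero for gate in "ifco"}
h, c = lstm_step(s=[0.3, -0.1], h_prev=[0.0], c_prev=[1.0], params=params)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the cell state is halved to 0.5 and the hidden state is 0.5·tanh(0.5).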
4.6.7 Early-alarm prediction & loss
To encourage early detection, predictions are made h beats ahead:
$p_{t+h}=\operatorname{softmax}\left(V h_t+b\right)$ (16)
The overall loss combines class-weighted cross-entropy, temporal consistency, early-alarm emphasis, and morphology regularization:
$J=L_{CE}+\lambda_{\text{temp}} L_{\text{temp}}+\lambda_{\text{early}} L_{\text{early}}+\lambda_{\text{morph}} R_{\text{morph}}+\lambda_2 \sum\|\Theta\|^2$ (17)
where, Θ denotes all learnable parameters.
4.7 Model training
The training process for the proposed CNN-LSTM model involves leveraging annotated cardiovascular signal datasets, where each signal segment is labeled to indicate specific heart conditions such as normal rhythm, arrhythmias, or other abnormalities. Initially, the preprocessed data—consisting of clean, normalized, and segmented heart signal windows—is fed into the Convolutional Neural Network (CNN) component. The CNN automatically extracts spatial features, such as morphological patterns in the heartbeats, by applying a series of filters across the signal. These high-level features are then passed to the Long Short-Term Memory (LSTM) layers, which are responsible for learning the temporal dependencies and sequential dynamics present in cardiac signals over time. The model undergoes supervised learning, where the predicted outputs are compared to ground truth labels, and the error is minimized through backpropagation and optimization algorithms such as Adam or RMSprop. To enhance generalization and robustness, techniques like dropout, early stopping, and batch normalization are employed during training. Furthermore, the trained model is validated using a separate set of real-time or unseen input data to evaluate its accuracy, sensitivity, specificity, and overall performance in detecting cardiac anomalies. This ensures the model's reliability and readiness for deployment in real-world, IoT-enabled heart monitoring systems.
The CNN-LSTM model was trained using the MIT-BIH Arrhythmia dataset and real-time ECG segments. The training process involved supervised learning with categorical cross-entropy as the loss function, optimized using the Adam optimizer with an initial learning rate of 0.001. Hyperparameters were tuned through a grid search over learning rates [0.0001, 0.001, 0.01], batch sizes [32, 64, 128], and dropout rates [0.2, 0.5]. The final configuration employed a batch size of 64, dropout of 0.3, and early stopping with a patience of 10 epochs to prevent overfitting. The model was trained for 50 epochs, with an 80:10:10 split for training, validation, and testing. Regularization was achieved through L2 weight decay (λ = 0.0001) and dropout layers.
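The 80:10:10 split and the early-stopping rule with a patience of 10 epochs can be sketched as follows. This is an illustration of the logic only: the split is taken in order without shuffling or stratification, and the validation losses are invented for the example.

```python
def split_indices(n, train=0.8, val=0.1):
    """Return index lists for the training, validation, and test partitions."""
    n_train = int(n * train)
    n_val = int(n * val)
    idx = list(range(n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training would stop."""
    best, waited = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0       # improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)               # patience never exhausted

train_idx, val_idx, test_idx = split_indices(1000)
losses = [1.0, 0.8, 0.7] + [0.75] * 12   # made-up validation losses
stop = stopping_epoch(losses, patience=10)
```

In this example the best loss occurs at epoch 3, and ten epochs without improvement follow, so training would stop at epoch 13.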
4.8 Deployment
Once the hybrid CNN-LSTM model is trained and validated using preprocessed cardiac data, it is deployed to a cloud-based server environment. This cloud infrastructure provides the scalability, processing power, and storage capacity required to handle continuous streams of data from multiple IoT-enabled wearable or implantable sensors. The model is integrated into a backend system that supports real-time inference, allowing it to analyze incoming heart signals and detect anomalies as they arrive. To ensure accessibility and usability, the system is connected to a user-facing mobile or web application. This application serves as the interface through which patients, caregivers, or healthcare professionals can monitor heart activity in real time, receive alerts about irregular patterns, and access historical health data. The deployment provides continuous availability and supports remote healthcare by bridging the gap between real-time data acquisition and clinical decision-making. Security and privacy measures, such as encryption and authentication protocols, are implemented to safeguard sensitive health data during transmission and storage. The methodology is illustrated in Figure 2.
Figure 2. CNN-LSTM model architecture
Table 3 presents a comparative analysis of the performance of three deep learning models—CNN, LSTM, and the hybrid CNN+LSTM—evaluated using four key metrics: Accuracy, Precision, Recall, and F1-Score. This comparison highlights the strengths and weaknesses of each model in processing and interpreting cardiac signals. Correspondingly, Figure 3 illustrates these results in a graphical format, clearly showing that the CNN+LSTM hybrid model consistently outperforms both standalone models across all evaluation criteria. This superior performance underscores the hybrid model's robustness and enhanced capability in accurately detecting and classifying heart signal anomalies, making it a more effective solution for real-time cardiac monitoring applications.
Table 3. Performance comparison of CNN, LSTM and proposed model
| Metric | CNN Value | LSTM Value | CNN+LSTM Value |
|---|---|---|---|
| Accuracy | 92.41% | 91.89% | 96.5% |
| Precision | 92.12% | 91.34% | 95.2% |
| Recall | 91.98% | 90.68% | 94.7% |
| F1-Score | 92.03% | 91.11% | 94.9% |
Figure 3. Performance comparison of CNN, LSTM and proposed model
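For reference, the four metrics can be computed from confusion-matrix counts as sketched below, treating one class as positive. The counts are invented for illustration and are not taken from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Made-up counts for illustration only.
acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

For a multi-class problem, these per-class values would typically be averaged across classes.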
These intervals indicate that the proposed CNN+LSTM model consistently outperforms standalone models with a high level of confidence, reinforcing its robustness for real-world deployment.
Statistical Significance Analysis:
To validate whether the performance improvements of the proposed CNN+LSTM model over standalone CNN and LSTM models are statistically significant, we conducted a paired t-test on cross-validation folds. The null hypothesis (H₀) assumes no significant difference between models, while the alternative hypothesis (H₁) assumes the CNN+LSTM model performs better.
Results indicate p-values < 0.05 for all metrics (Accuracy, Precision, Recall, F1-score), confirming that the performance gains of the hybrid model are statistically significant as shown in Table 4.
Table 4. Statistical significance (p-values) of performance comparison between models
| Comparison | Accuracy p-Value | Precision p-Value | Recall p-Value | F1-Score p-Value |
|---|---|---|---|---|
| CNN vs CNN+LSTM | 0.002 | 0.004 | 0.003 | 0.005 |
| LSTM vs CNN+LSTM | 0.001 | 0.003 | 0.002 | 0.004 |
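The paired t-test statistic can be computed from per-fold scores as sketched below. The fold accuracies are hypothetical and are not the values from this study. The sketch returns only the t statistic and the degrees of freedom; converting these to a p-value requires the t distribution, which is omitted here.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """t statistic and degrees of freedom for paired samples (b minus a)."""
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical per-fold accuracies, NOT the values from this study.
cnn_folds = [0.921, 0.925, 0.919, 0.928, 0.923]
hybrid_folds = [0.962, 0.968, 0.961, 0.966, 0.965]
t_stat, dof = paired_t_statistic(cnn_folds, hybrid_folds)
```

With five folds there are 4 degrees of freedom, and a one-sided test at the 0.05 level rejects the null hypothesis when the t statistic exceeds approximately 2.132.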
The integration of IoT and AI-driven deep learning provides a transformative approach for continuous, real-time cardiac monitoring. By leveraging wearable devices for data acquisition and a hybrid CNN–LSTM model for spatial and temporal feature analysis, the proposed framework enhances diagnostic accuracy and enables proactive interventions. Cloud integration further ensures scalability, secure data access, and seamless connectivity for remote healthcare applications. Experimental results validate its robustness, with the hybrid model achieving 96.5% accuracy, 95.2% precision, and an F1-score of 94.9%, outperforming standalone CNN and LSTM models. This work demonstrates a practical, intelligent solution for early detection and management of cardiac anomalies, bridging the gap between algorithmic innovation and real-world deployment.
The proposed CNN-LSTM IoT framework, while effective on benchmark datasets, faces key limitations. First, reliance on the MIT-BIH dataset introduces demographic bias, limiting generalizability. Second, real-world deployments may encounter noise, motion artifacts, and inconsistent sensor quality.
Future work will explore bias reduction via federated learning, lightweight edge-compatible models, adaptive noise handling, and large-scale clinical trials for real-world validation.
[1] Zhang, T., Zhao, Y., Jia, W., Chen, M.Y. (2021). Collaborative algorithms that combine AI with IoT towards monitoring and control system. Future Generation Computer Systems, 125: 677-686. https://doi.org/10.1016/j.future.2021.07.008
[2] Alahi, M.E.E., Sukkuea, A., Tina, F.W., Nag, A., Kurdthongmee, W., Suwannarat, K., Mukhopadhyay, S.C. (2023). Integration of IoT-enabled technologies and artificial intelligence (AI) for smart city scenario: recent advancements and future trends. Sensors, 23(11): 5206. https://doi.org/10.3390/s23115206
[3] Bibri, S.E., Alexandre, A., Sharifi, A., Krogstie, J. (2023). Environmentally sustainable smart cities and their converging AI, IoT, and big data technologies and solutions: an integrated approach to an extensive literature review. Energy Informatics, 6(1): 9. https://doi.org/10.1186/s42162-023-00259-2
[4] Jan, M.A., Zakarya, M., Khan, M., Mastorakis, S., Menon, V.G., Balasubramanian, V., Rehman, A.U. (2021). An AI-enabled lightweight data fusion and load optimization approach for Internet of Things. Future Generation Computer Systems, 122: 40-51. https://doi.org/10.1016/j.future.2021.03.020
[5] Firouzi, F., Farahani, B., Marinšek, A. (2022). The convergence and interplay of edge, fog, and cloud in the AI-driven Internet of Things (IoT). Information Systems, 107: 101840. https://doi.org/10.1016/j.is.2021.101840
[6] Amin, R., Al Ghamdi, M.A., Almotiri, S.H., Alruily, M. (2021). Healthcare techniques through deep learning: issues, challenges and opportunities. IEEE Access, 9: 98523-98541. https://doi.org/10.1109/ACCESS.2021.3095312
[7] Anand, A., Rani, S., Anand, D., Aljahdali, H.M., Kerr, D. (2021). An efficient CNN-based deep learning model to detect malware attacks (CNN-DMA) in 5G-IoT healthcare applications. Sensors, 21(19): 6346. https://doi.org/10.3390/s21196346
[8] Haq, A.U., Li, J.P., Khan, S., Alshara, M.A., Alotaibi, R.M., Mawuli, C. (2022). DACBT: Deep learning approach for classification of brain tumors using MRI data in IoT healthcare environment. Scientific Reports, 12(1): 15331. https://doi.org/10.1038/s41598-022-19465-1
[9] Abubaker, M.B., Babayiğit, B. (2022). Detection of cardiovascular diseases in ECG images using machine learning and deep learning methods. IEEE Transactions on Artificial Intelligence, 4(2): 373-382. https://doi.org/10.1109/TAI.2022.3159505
[10] Hussain, S., Nanda, S.K., Barigidad, S., Akhtar, S., Suaib, M., Ray, N.K. (2021). Novel deep learning architecture for predicting heart disease using CNN. In 2021 19th OITS International Conference on Information Technology (OCIT), Bhubaneswar, India, pp. 353-357. https://doi.org/10.1109/OCIT53463.2021.00076
[11] Hiriyannaiah, S., GM, S., MHM, K., Srinivasa, K.G. (2021). A comparative study and analysis of LSTM deep neural networks for heartbeats classification. Health and Technology, 11(3): 663-671. https://doi.org/10.1007/s12553-021-00552-8
[12] Khan, M.A., Kim, Y. (2021). Cardiac arrhythmia disease classification using LSTM deep learning approach. Computers, Materials & Continua, 67(1): 427. https://doi.org/10.32604/cmc.2021.014682
[13] Bukhari, M., Yasmin, S., Naz, S., Durrani, M.Y., Javaid, M. (2023). A smart heart disease diagnostic system using deep vanilla LSTM. Computers, Materials & Continua, 77(1): 1251. https://doi.org/10.32604/cmc.2023.040329
[14] Alzakari, S.A., Menaem, A.A., Omer, N., Abozeid, A., Hussein, L.F., et al. (2024). Enhanced heart disease prediction in remote healthcare monitoring using IoT-enabled cloud-based XGBoost and Bi-LSTM. Alexandria Engineering Journal, 105: 280-291. https://doi.org/10.1016/j.aej.2024.06.036
[15] Nancy, A.A., Ravindran, D., Raj Vincent, P.D., Srinivasan, K., Gutierrez Reina, D. (2022). IoT-cloud-based smart healthcare monitoring system for heart disease prediction via deep learning. Electronics, 11(15): 2292. https://doi.org/10.3390/electronics11152292
[16] Rayan, A., Alaerjan, A.S., Alanazi, S., Taloba, A.I., Shahin, O.R., Salem, M. (2023). Utilizing CNN-LSTM techniques for the enhancement of medical systems. Alexandria Engineering Journal, 72: 323-338. https://doi.org/10.1016/j.aej.2023.04.009
[17] Rai, H.M., Chatterjee, K. (2022). Hybrid CNN-LSTM deep learning model and ensemble technique for automatic detection of myocardial infarction using big ECG data. Applied Intelligence, 52(5): 5366-5384. https://doi.org/10.1007/s10489-021-02696-6
[18] Elbagoury, B.M., Vladareanu, L., Vlădăreanu, V., Salem, A.B., Travediu, A.M., Roushdy, M.I. (2023). A hybrid stacked CNN and residual feedback GMDH-LSTM deep learning model for stroke prediction applied on mobile AI smart hospital platform. Sensors, 23(7): 3500. https://doi.org/10.3390/s23073500
[19] Begum, A., Dhilip Kumar, V., Asghar, J., Hemalatha, D., Arulkumaran, G. (2022). A combined deep CNN: LSTM with a random forest approach for breast cancer diagnosis. Complexity, 2022(1): 9299621. https://doi.org/10.1155/2022/9299621
[20] Ayus, I., Gupta, D. (2024). A novel hybrid ensemble based Alzheimer’s identification system using deep learning technique. Biomedical Signal Processing and Control, 92: 106079. https://doi.org/10.1016/j.bspc.2024.106079
[21] Sowmya, S., Jose, D. (2022). Contemplate on ECG signals and classification of arrhythmia signals using CNN-LSTM deep learning model. Measurement: Sensors, 24: 100558. https://doi.org/10.1016/j.measen.2022.100558