© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Recognizing emotional states from physiological data has become an important scientific problem, with major consequences for mental health assessment and advanced human-computer interaction. This study employs multi-channel EEG signals obtained during rigorously controlled affective trials to classify human emotional states. The data processing pipeline applies strict pre-processing and feature extraction steps designed to improve signal quality while preserving important temporal information. To capture the complex sequential dependencies in EEG data, advanced DL architectures such as Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRU), and Deep Neural Networks (DNN) are used. Comparative studies demonstrate that recurrent neural networks surpass conventional models, achieving classification accuracy above 97%. These results show that deep temporal models can distinguish diverse emotional states from EEG data. The work supports advances in emotion-adaptive systems and brain-computer interfaces, enabling more natural and responsive interaction driven by real emotional data.
EEG, emotion recognition, DL, LSTM, GRU, DNN, affective computing, brain-computer interface, time-series analysis, mental health monitoring
People's reactions differ according to their emotions, and they act according to their mental state. For this reason, researchers have sought to develop machines that can detect human emotions in advance and respond appropriately to the situation. Detecting human emotions early can help avoid adverse outcomes, and building advanced emotion recognition into systems allows them to adapt to the user's emotional state.
EEG signals play a major role in detecting emotional differences because they capture electrical activity in the brain. This means they can find subtle fluctuations in neuronal firing that are connected to different moods [1]. However, EEG signals naturally exhibit a composite nature: they consist of high-dimensional, non-stationary time-series data with inherent noise; hence, extracting meaningful patterns is a particularly difficult task for standard machine learning models [2]. Traditional methods usually fail to model the continuously evolving temporal patterns within EEG signals, which lie at the core of identifying swift emotional state changes.
Advances in DL have provided practical methodologies for addressing these challenges. Sequential architectures such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRU) are well suited to modeling time-dependent data. They can preserve information over longer spans than simple RNNs, and are therefore capable of representing the complex patterns related to different emotional states in EEG sequences [3]. In addition, DNNs are powerful feature extractors, which strengthens classifier performance when they are combined with sequential layers.
Figure 1 illustrates how LSTM, GRU, and DNN architectures are employed for classifying emotional states using multi-channel EEG signals collected in controlled emotional stimulation experiments. This paper aims to identify the best technique for the detection of emotional patterns in EEG data through systematic preprocessing and the use of sophisticated DL models. The comparative analysis shows that recurrent neural network models yield better accuracy, confirming their suitability for emotion recognition tasks [4]. The findings support further progress in affective computing and lay solid foundations for intelligent systems capable of perceiving, interpreting, and responding to human emotions in real time.
Figure 1. Schematic representation of the DL framework for EEG-based emotion recognition
Emotion recognition using physiological signals has recently gained considerable attention because of its wide usage in human–computer interaction, mental health assessment, and brain–computer interface systems. Among the various biosignals, electroencephalogram data are particularly popular in emotion research since they reflect brain activity in real time with high temporal precision and are relatively affordable to acquire.
Early works on EEG-based emotion recognition primarily used hand-engineered features coupled with traditional machine learning algorithms. Lin et al. [5] analyzed EEG signals to classify people's emotional states (aroused/unaroused) while listening to music by extracting frequency-related features, accomplishing the classification task with a Support Vector Machine classifier. Zhang et al. [6] used k-Nearest Neighbors on multichannel EEG data by extracting power spectral density features to classify the subjects' emotional states. These methods achieved promising classification performance, but they required time-consuming feature engineering by domain experts and often failed to model the nonlinear, complex characteristics of EEG signals. Deep learning, in contrast, allows models to automatically learn informative representations from raw or minimally preprocessed EEG signals.
Bashivan et al. [7] introduced a hybrid deep architecture that combined CNNs with RNNs to jointly model spatial and temporal information in EEG recordings, outperforming earlier state-of-the-art approaches. Tripathi et al. [8] proposed deep and convolutional architectures for emotion classification on the DEAP dataset. Recurrent neural networks—especially Long Short-Term Memory (LSTM) and Gated Recurrent Unit models—are particularly effective here because they model temporal dependencies and maintain contextual information across a sequence. Lawhern et al. [9] presented EEGNet, a compact convolutional architecture for EEG-based brain-computer interfaces that yields strong performance on multiple benchmark datasets.
Other recent deep learning models for emotion recognition include attention-driven models, graph-based neural architectures, and CNN-BiLSTM hybrid systems that capture more intricate spatial-temporal patterns. However, most of these sophisticated approaches are computationally expensive and often fail to deliver consistent performance in subject-independent scenarios. To bridge this gap, the work in reference [10] conducted a focused comparison between LSTM, GRU, and DNN models under the same experimental setting. Although some improvement has been achieved, very few works directly compare these sequential and non-sequential architectures on the same EEG dataset, leaving a gap in understanding their relative strengths and the role of temporal modeling in robust emotion recognition.
3.1 Dataset description
This work employs the DEAP (Database for Emotion Analysis using Physiological Signals) dataset, a widely used and publicly accessible benchmark for research in EEG-based affective computing. The dataset includes recordings from 32 subjects (16 male and 16 female), each exposed to 40 one-minute emotional video clips intended to evoke different levels of valence, arousal, dominance, and liking [11]. The video stimuli were selected from well-established affective media repositories and subsequently validated to ensure their relevance and effectiveness in eliciting specific emotional responses.
EEG recordings were acquired with a 32-channel BioSemi ActiveTwo system, with electrodes placed according to the international 10-20 system. The raw signals were digitized at 512 Hz and downsampled to 128 Hz in the preprocessed version used in this paper. Although DEAP also includes peripheral physiological signals such as GSR, respiration, and ECG [12], the analysis in this paper was limited to the EEG data alone.
Following each trial, the subjects rated their emotional reactions on a 9-point Likert scale regarding valence, arousal, and dominance. These ratings were then converted into three categorical emotion labels—namely, negative, neutral, and positive—using threshold-based quantization. The building blocks and gate mechanisms of the LSTM cell are shown in Figure 2. Each one-minute EEG recording was further divided into non-overlapping 4-second segments, thus producing a large number of labeled time-series samples suitable for supervised learning.
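As a concrete illustration of the threshold-based quantization described above, the sketch below maps 9-point ratings to the three classes. The cut-off values are assumptions for illustration only; the exact thresholds used in the study are not restated here.

```python
import numpy as np

def quantize_valence(ratings, low=4.0, high=6.0):
    """Map continuous 1-9 ratings to 0=negative, 1=neutral, 2=positive.
    The low/high thresholds are illustrative assumptions."""
    ratings = np.asarray(ratings, dtype=float)
    labels = np.full(ratings.shape, 1, dtype=int)   # default: neutral
    labels[ratings < low] = 0                        # negative
    labels[ratings > high] = 2                       # positive
    return labels

print(quantize_valence([2.5, 5.0, 8.1]))  # -> [0 1 2]
```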
Figure 2. Building blocks of an LSTM cell with gates
• Input Gate: What to add to memory
• Forget Gate: What to forget from old memory
• Output Gate: What to send out as hidden state
All models are built using the Keras Functional API, as sketched below.
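The following minimal sketch shows an LSTM classifier in this style; the layer sizes follow Table A2, while the input shape assumes 4-second segments at 128 Hz across 32 channels (an assumption consistent with Section 3.1).

```python
from tensorflow.keras import layers, Model

# Input: (time steps, EEG channels) = 4 s x 128 Hz, 32 channels (assumed)
inputs = layers.Input(shape=(512, 32))
x = layers.LSTM(128, return_sequences=True)(inputs)  # first recurrent layer
x = layers.Dropout(0.3)(x)
x = layers.LSTM(64)(x)                               # second recurrent layer
x = layers.Dropout(0.3)(x)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(3, activation="softmax")(x)   # negative/neutral/positive

lstm_model = Model(inputs, outputs, name="lstm_emotion")
lstm_model.summary()
```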
3.2 Pre-processing
EEG signals are inherently sensitive to a variety of artifacts and noise sources, such as involuntary eye movements, muscular activity, and electrical interference from external sources. To ensure signal reliability and enable subsequent feature learning, the raw EEG data undergo multistep pre-processing. After band-pass filtering to retain the relevant frequency components, channel amplitudes are normalized. Continuous signals are divided into fixed-length segments and aligned with their emotional labels. Techniques such as Independent Component Analysis (ICA) are applied for baseline correction and artifact removal, improving overall signal quality.
Figure 3 gives a full picture of the preprocessing pipeline, including the parameter values and how they were used. A band-pass filter was applied to the EEG data to retain activity between 0.5 and 45 Hz. This bandwidth preserves the main frequency bands relevant to emotional analysis, from delta to gamma, while reducing low-frequency drift and high-frequency noise. Independent Component Analysis (ICA), implemented with the FastICA algorithm, was then used to remove ocular and muscular artifacts. Components potentially associated with eye blinks or muscle activity were identified through their spatial configurations and excluded prior to signal reconstruction. To keep amplitude scaling consistent across subjects, all EEG channels were normalized with z-scores. The continuous EEG data were then split into 2-second windows with 50% overlap, balancing temporal detail against the robustness of the emotional information. These settings make the pipeline reproducible and ensure that all subsequent model assessments receive input of consistent quality.
Figure 3. Workflow of EEG signal pre-processing steps
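A hedged sketch of the filtering and artifact-removal stages in Figure 3, assuming the recordings are loaded with MNE-Python; the file name and the set of excluded ICA components are placeholders, not values from the study.

```python
import mne
from mne.preprocessing import ICA

# Hypothetical file; DEAP recordings would first be converted to an MNE Raw object.
raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)

raw.filter(l_freq=0.5, h_freq=45.0)   # band-pass 0.5-45 Hz, as in Figure 3

# FastICA-based artifact removal
ica = ICA(n_components=20, method="fastica", random_state=0)
ica.fit(raw)
ica.exclude = [0, 1]                  # components flagged as ocular/muscular (assumed)
ica.apply(raw)

# Per-channel z-score normalization
data = raw.get_data()
data = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)
```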
3.3 Feature extraction and input preparation
Deep learning architectures can directly handle raw EEG data, though fundamental statistical descriptors such as mean, variance, and power spectral density can also be used to enrich the input when needed. In sequential models such as LSTM and GRU, data are organized into time-series segments that maintain the original temporal dynamics [13]. Each segment is arranged as a multi-dimensional array, with each dimension representing the time-varying signal of a different EEG channel.
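As one possible realization of this input layout, the sketch below slices a multi-channel recording into overlapping windows shaped (segments, time steps, channels), the time-major form expected by Keras recurrent layers. The window and step sizes mirror Section 3.2 but are otherwise assumptions.

```python
import numpy as np

def segment_eeg(signal, win_len, step):
    """Slice a (channels, samples) EEG array into
    (n_windows, win_len, channels) segments."""
    n_ch, n_samp = signal.shape
    starts = range(0, n_samp - win_len + 1, step)
    return np.stack([signal[:, s:s + win_len].T for s in starts])

# Example: 60 s of 32-channel EEG at 128 Hz, 2 s windows with 50% overlap
eeg = np.random.randn(32, 60 * 128)
X = segment_eeg(eeg, win_len=2 * 128, step=128)
print(X.shape)  # (59, 256, 32)
```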
3.4 Models
Three deep learning architectures were developed to classify the EEG data. All models were created with the Keras Functional API, which makes it easy to combine multiple layers and adjust the overall structure of the network.
3.4.1 Proposed LGN-hybrid deep learning model
We introduce an LGN-Hybrid DL architecture that amalgamates LSTM, GRU, and DNN models to improve the precision and generalization of EEG-based emotion recognition. This combined approach exploits the strengths of each architecture: LSTM for capturing long-term temporal patterns, GRU for efficient gating with a lower processing load, and DNN layers for learning high-level nonlinear representations.
The architecture begins with stacked LSTM and GRU layers that jointly capture short- and long-term temporal relationships in the EEG sequences. Dropout layers help prevent overfitting, making the model more stable. The rich temporal features extracted by these recurrent layers are fed into fully connected DNN layers that refine the learned representations and contribute to the final decision. ReLU activation is applied in the hidden layers to introduce nonlinearity, while a softmax layer performs the multi-class emotion classification.
This unified architecture achieves much better performance on the evaluation metrics, offering added stability and adaptability compared with the individual models. Because the hybrid LSTM-GRU-DNN effectively captures the complex spatiotemporal patterns in EEG signals, it is a reliable architecture for real-time brain-computer interface applications and emotion-aware intelligent systems, supporting progress in mental health monitoring and human-machine interaction; a minimal sketch follows.
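The sketch below illustrates the LGN-Hybrid idea — stacked LSTM and GRU layers feeding dense (DNN) layers — under assumed unit counts consistent with Table A2; it is not the authors' exact published configuration.

```python
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(512, 32))                # (time steps, channels), assumed
x = layers.LSTM(128, return_sequences=True)(inputs)   # long-term temporal patterns
x = layers.Dropout(0.3)(x)
x = layers.GRU(64)(x)                                 # efficient gating
x = layers.Dropout(0.3)(x)
x = layers.Dense(64, activation="relu")(x)            # DNN refinement stage
outputs = layers.Dense(3, activation="softmax")(x)    # multi-class emotion output

lgn_model = Model(inputs, outputs, name="lgn_hybrid")
```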
(a) Network with Long Short-Term Memory (LSTM)
LSTM layers are employed to acquire long-term temporal relationships within the EEG sequences. The model generally consists of multiple stacked LSTM layers [14], which are followed by fully connected layers that translate the extracted temporal features into the final emotion classification outputs.
Each LSTM cell at time $t$ computes:
$f_t=\sigma\left(W_f \cdot\left[h_{t-1}, x_t\right]+b_f\right)$ (1)
$i_t=\sigma\left(W_i \cdot\left[h_{t-1}, x_t\right]+b_i\right)$ (2)
$\tilde{C}_t=\tanh \left(W_C \cdot\left[h_{t-1}, x_t\right]+b_C\right)$ (3)
$C_t=f_t * C_{t-1}+i_t * \tilde{C}_t$ (4)
$o_t=\sigma\left(W_o \cdot\left[h_{t-1}, x_t\right]+b_o\right)$ (5)
$h_t=o_t * \tanh \left(C_t\right)$ (6)
These steps repeat for every time step in the EEG sequence, allowing the LSTM to maintain and update relevant temporal information over time.
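To make Eqs. (1)-(6) concrete, the following NumPy sketch executes one LSTM step with the toy scalar values from Appendix 2 (unit weights, zero biases); it is illustrative only, not the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step implementing Eqs. (1)-(6)."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])         # forget gate, Eq. (1)
    i = sigmoid(W["i"] @ z + b["i"])         # input gate, Eq. (2)
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate state, Eq. (3)
    c = f * c_prev + i * c_tilde             # cell update, Eq. (4)
    o = sigmoid(W["o"] @ z + b["o"])         # output gate, Eq. (5)
    h = o * np.tanh(c)                       # hidden state, Eq. (6)
    return h, c

# Toy values matching the worked example in Appendix 2
W = {k: np.ones((1, 2)) for k in "fico"}
b = {k: np.zeros(1) for k in "fico"}
h, c = lstm_step(np.array([0.5]), np.array([0.1]), np.array([0.2]), W, b)
print(h, c)  # h ~= [0.285], c ~= [0.475]
```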
(b) Gated Recurrent Unit (GRU) Network
GRUs, being computationally efficient alternatives to LSTMs, are similarly employed to process the EEG sequences. The GRU architecture mirrors the LSTM structure but uses GRU layers in place of LSTM units to reduce training complexity while retaining performance.
The GRU combines the gating into fewer gates. Its inputs are the same as the LSTM's: the EEG vector $x_t$ and the previous hidden state $h_{t-1}$.
$z_t=\sigma\left(W_z \cdot\left[h_{t-1}, x_t\right]+b_z\right)$ (7)
$r_t=\sigma\left(W_r \cdot\left[h_{t-1}, x_t\right]+b_r\right)$ (8)
$\tilde{h}_t=\tanh \left(W_h \cdot\left[r_t * h_{t-1}, x_t\right]+b_h\right)$ (9)
$h_t=\left(1-z_t\right) * h_{t-1}+z_t * \tilde{h}_t$ (10)
These steps repeat for each time step, efficiently handling long-term dependencies with fewer parameters than an LSTM.
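A matching NumPy sketch of one GRU step, Eqs. (7)-(10), again with the toy values from Appendix 2 (unit weights, zero biases):

```python
import numpy as np

def gru_step(x_t, h_prev, W, b):
    """One GRU time step implementing Eqs. (7)-(10)."""
    z_in = np.concatenate([h_prev, x_t])
    z = 1 / (1 + np.exp(-(W["z"] @ z_in + b["z"])))   # update gate, Eq. (7)
    r = 1 / (1 + np.exp(-(W["r"] @ z_in + b["r"])))   # reset gate, Eq. (8)
    h_tilde = np.tanh(W["h"] @ np.concatenate([r * h_prev, x_t]) + b["h"])  # Eq. (9)
    return (1 - z) * h_prev + z * h_tilde             # new hidden state, Eq. (10)

W = {k: np.ones((1, 2)) for k in "zrh"}
b = {k: np.zeros(1) for k in "zrh"}
print(gru_step(np.array([0.5]), np.array([0.1]), W, b))  # ~= [0.365]
```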
(c) Deep Neural Network (DNN)
A baseline DNN model is developed to evaluate how a purely feed-forward architecture performs in comparison with recurrent neural networks [15]. This network is composed of several dense layers with nonlinear activation functions, designed to extract discriminative representations from the EEG feature vectors [16].
The DNN is purely feed-forward; its input is a pre-processed EEG feature vector. Each hidden layer $l$ computes Eq. (11), and the output layer $L$ applies a softmax, Eq. (12):
$a^{(l)}=\sigma\left(W^{(l)} a^{(l-1)}+b^{(l)}\right)$ (11)
$y=\operatorname{softmax}\left(W^{(L)} a^{(L-1)}+b^{(L)}\right)$ (12)
(d) Loss Function and Metrics
$L=-\sum_{i=1}^{C} y_i \log \left(\hat{y}_i\right)$ (13)
where $y_i$ is the true label and $\hat{y}_i$ is the predicted probability for class $i$, summed over the $C$ classes.
3.4.2 Training and evaluation
All models are trained on labeled EEG segments, with the data split into training, validation, and test sets to assess their ability to generalize. Cross-entropy loss is used for classification, and the Adam optimizer provides adaptive optimization during training. Regularization methods such as dropout help reduce overfitting [17].
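A minimal training sketch consistent with this setup, assuming the Keras model built earlier and pre-split tensors `X_train`/`y_train` and `X_val`/`y_val` (hypothetical names). The early-stopping patience of 10 follows Appendix 3; the batch size and epoch cap are assumptions.

```python
from tensorflow.keras.callbacks import EarlyStopping

lstm_model.compile(optimizer="adam",
                   loss="categorical_crossentropy",   # Eq. (13)
                   metrics=["accuracy"])

history = lstm_model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100, batch_size=64,                        # assumed values
    callbacks=[EarlyStopping(patience=10, restore_best_weights=True)])
```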
Standard metrics—accuracy, precision, recall, and F1-score—are used to assess how well each model performs. Confusion matrices are generated to examine how well each model separates the emotional categories. This comparison gives a more precise view of how the different architectures capture temporal patterns and distinguish diverse emotional states [18].
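These metrics can be computed as sketched below; `X_test`/`y_test` are the held-out segments and one-hot labels from the split described above (hypothetical variable names).

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

y_pred = np.argmax(lstm_model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)

# Per-class precision, recall, F1, and support
print(classification_report(y_true, y_pred,
                            target_names=["negative", "neutral", "positive"]))
print(confusion_matrix(y_true, y_pred))
```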
3.4.3 EEG acquisition procedures and reproducibility measures
All parameters required for EEG research are documented. Participants were seated in an electrically shielded, acoustically treated laboratory to reduce background noise. All sessions were conducted in a quiet environment with controlled lighting and temperature to minimize environmental confounds. Each recording began with a 5-second baseline [19], during which the participant was instructed to remain still and emotionally neutral; this baseline segment was later used for baseline correction. Electrode impedance was checked frequently and kept below 5 kΩ to guarantee the best possible signal quality.
The DEAP repository provides complete access to the full set of preprocessed EEG signals, participant self-assessment ratings, and raw recordings, enabling independent verification, validation, and replication of the study's findings [20].
Additional baseline assessments were conducted to increase the empirical robustness of the study, covering both traditional classifiers and advanced methodologies in EEG-based emotion recognition. Traditional classifiers—Support Vector Machines, k-Nearest Neighbors, and Random Forest—were trained on the same pre-processed EEG features as the DNN model.
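A sketch of how such baselines can be run with scikit-learn on flattened feature matrices (`X_flat_train`, `y_train_labels`, etc. are hypothetical names); hyperparameters here are library defaults, not the tuned values behind the reported results.

```python
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

baselines = [("SVM", SVC()),
             ("k-NN", KNeighborsClassifier(n_neighbors=5)),
             ("Random Forest", RandomForestClassifier(n_estimators=200))]

for name, clf in baselines:
    clf.fit(X_flat_train, y_train_labels)          # flattened EEG features
    print(name, clf.score(X_flat_test, y_test_labels))
```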
The proposed models were also evaluated against more advanced methodologies documented in recent literature, including CNN–LSTM hybrid networks, attention-based architectures, and graph-convolutional models with Bi-LSTM layers, while the LSTM and GRU models require less computing power. A comparative table summarizes the differences between traditional approaches, deep learning methods, and the newest procedures, creating a broad and fair comparison framework.
4.1 Model evaluation
Traditional methods give less accurate results than the DL models, indicating that standard algorithms do not adequately capture the spatial and temporal structure inherent in EEG-based affective signals. This difference highlights how important deep sequential architectures such as LSTM and GRU are for accurate and meaningful emotion recognition from EEG data.
Figure 4 shows two line graphs summarizing the LSTM model's behavior over 11 training epochs. The upper plot compares training and validation accuracy. The red curve shows the training accuracy, which starts at about 99% and approaches 100% by the final epoch [21], indicating that the model fits the training data well. The blue curve shows the validation accuracy, which stays slightly lower, between 92% and 96%. These fluctuations indicate that performance on new data is less stable, suggesting room for improvement in generalization [22].
Figure 4. LSTM model training and validation performance over epochs
(a) LSTM
The lower chart shows training and validation loss over time. The red training loss remains low and approaches zero, indicating a good fit to the training samples [23]. The blue validation loss, by contrast, is noticeably higher and fluctuates considerably, meaning performance on the validation set is unsteady. This pattern—minimal training loss alongside elevated, unstable validation loss—indicates slight overfitting: the model memorizes patterns specific to the training data rather than learning features that generalize well. Overall, the LSTM performs very well on the training set, but additional regularization or hyperparameter tuning could stabilize validation performance and reduce overfitting [24].
The LSTM model performs well, reaching 97% accuracy on 640 test samples, as shown in Table 1. All three emotion categories have high precision, recall, and F1-scores, meaning classification across the categories is well balanced and reliable. The macro and weighted averages follow the same trend, so no class suffers from bias, underrepresentation, or overfitting. The model makes correct predictions across the emotional classes with very little error.
Table 1. LSTM model classification performance metrics
| Items | Precision | Recall | F-score | Support |
|---|---|---|---|---|
| 0 | 0.95 | 0.98 | 0.97 | 190 |
| 1 | 1.00 | 0.99 | 0.99 | 231 |
| 2 | 0.97 | 0.95 | 0.96 | 219 |
| Accuracy | | | 0.97 | 640 |
| Macro Avg | 0.97 | 0.97 | 0.97 | 640 |
| Weighted Avg | 0.97 | 0.97 | 0.97 | 640 |
The LSTM architecture achieves the highest accuracy among the compared models, around 97%. This indicates that it can reliably capture long-term patterns in EEG data, which is essential for detecting small transitions between emotional states [24]. The GRU model performs similarly to the LSTM, with only a small difference in accuracy. These results show that the GRU is an efficient alternative, able to learn the essential sequential patterns with fewer parameters and lower processing requirements.
Figure 5. Performance analysis: LSTM normalized confusion matrix for multi-class sentiment
Figure 5 gives a normalized confusion matrix that illustrates how well the LSTM model differentiates the three emotion classes: Negative, Neutral, and Positive. The diagonal cells reflect correctly classified samples, whereas the off-diagonal ones indicate misclassifications. The dark blue regions along the diagonal demonstrate that the model accurately identifies the majority of samples in each category [25]. For example, 98% of Negative samples are correctly classified, with only 2% misclassified as Positive and none as Neutral. Neutral emotions are recognized with similarly high precision, with 99% correctly identified and just 1% confused with the Positive class.
For the Positive class, the model correctly identifies 95% of samples, while a small portion (4%) is misclassified as Negative. This distribution shows the model is effective at identifying Negative and Neutral emotions but slightly less precise at separating Positive emotions. The LSTM model's strong diagonal dominance and low misclassification rates show that it distinguishes the three emotional categories reliably, confirming it as a dependable multi-class classifier [26]. By showing proportions instead of raw frequency counts, the normalized matrix format is easier to interpret.
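A row-normalized matrix in the style of Figure 5 can be produced as sketched below; `y_true` and `y_pred` come from the evaluation step above.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# normalize="true" divides each row by its class total, giving proportions
cm = confusion_matrix(y_true, y_pred, normalize="true")
ConfusionMatrixDisplay(cm, display_labels=["Negative", "Neutral", "Positive"]).plot()
plt.show()
```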
(b) GRU
Figure 6 shows the model's training and validation performance over 20 epochs. The training accuracy gradually increases, approaching 100%, while the training loss decreases steeply. The validation curves show steadily strong performance with only small fluctuations, indicating that the model generalizes well with little variability.
Precision, recall, and F1-scores for all three emotion classes each indicate strong performance [27], contributing to an overall accuracy of 97%, as shown in Table 2. The macro and weighted averages suggest that the model performs well across all categories, giving balanced and reliable classification.
Figure 6. GRU model training and validation performance over epochs
Table 2. GRU model classification performance metrics
| Items | Precision | Recall | F-score | Support |
|---|---|---|---|---|
| 0 | 0.97 | 0.98 | 0.98 | 190 |
| 1 | 0.98 | 0.98 | 0.98 | 231 |
| 2 | 0.96 | 0.95 | 0.96 | 219 |
| Accuracy | | | 0.97 | 640 |
| Macro Avg | 0.97 | 0.97 | 0.97 | 640 |
| Weighted Avg | 0.97 | 0.97 | 0.97 | 640 |
As shown in Figure 7, negative, neutral, and positive emotions are classified with high accuracy and very few mistakes. With most predictions falling on the diagonal, the model discriminates well between the three categories, leading to strong overall classification results.
(c) DNN
The DNN model, with a feed-forward structure that does not inherently preserve the temporal order of inputs, achieved lower accuracy than the recurrent architectures. Although it extracts meaningful features, its limited ability to model the temporal dynamics of EEG data leads to a higher rate of misclassifications.
Figure 8 shows good, steady validation and training accuracy with a low cross-entropy loss over the epochs for the DNN model. The pattern in Figure 8 indicates that the learning process proceeds well without overfitting.
Table 3 presents high precision, recall, and F1-scores across all three emotion classes, resulting in an overall accuracy of 98%. The macro and weighted averages further indicate that the model delivers strong, well-balanced performance across categories.
Figure 7. GRU model normalized confusion matrix
Figure 8. DNN model training and validation performance over epochs
To confirm the reliability of the reported results, additional statistical analyses were performed to evaluate whether the performance differences between the models are statistically meaningful. A repeated cross-validation procedure was conducted, and paired t-tests were applied to compare the accuracies of the LSTM, GRU, and DNN models across multiple folds. The improvements achieved by the LSTM and GRU models over the DNN baseline were statistically significant, with p-values below 0.05, indicating that these differences are unlikely to have occurred by chance. Furthermore, 95% confidence intervals were computed for the accuracy scores, demonstrating low variance across folds and confirming the stability of each model's performance. These statistical validations support the robustness of the conclusions drawn from the comparative analysis.
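A sketch of this statistical procedure, assuming arrays of per-fold accuracies; the numbers shown are placeholders for illustration, not the study's fold scores.

```python
import numpy as np
from scipy import stats

# Placeholder per-fold accuracies (illustrative, not the reported data)
acc_lstm = np.array([0.97, 0.96, 0.98, 0.97, 0.96])
acc_dnn = np.array([0.93, 0.92, 0.94, 0.93, 0.92])

# Paired t-test across folds
t_stat, p_val = stats.ttest_rel(acc_lstm, acc_dnn)
print(f"paired t = {t_stat:.2f}, p = {p_val:.4f}")

# 95% confidence interval for mean LSTM accuracy
ci = stats.t.interval(0.95, df=len(acc_lstm) - 1,
                      loc=acc_lstm.mean(), scale=stats.sem(acc_lstm))
print(f"LSTM accuracy 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```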
Table 3. DNN model classification performance metrics
| Items | Precision | Recall | F-score | Support |
|---|---|---|---|---|
| 0 | 0.99 | 0.98 | 0.99 | 190 |
| 1 | 0.99 | 0.99 | 0.99 | 231 |
| 2 | 0.97 | 0.98 | 0.98 | 219 |
| Accuracy | | | 0.98 | 640 |
| Macro Avg | 0.98 | 0.98 | 0.98 | 640 |
| Weighted Avg | 0.98 | 0.98 | 0.98 | 640 |
Figure 9. DNN model normalized confusion matrix
The confusion matrix indicates that the DNN model effectively distinguishes among positive, neutral, and negative sentiments (Figure 9). Reliability in predictions across all classes is demonstrated by the minimal number of misclassifications. Analysis of the confusion matrices generated for each model offered additional validation: the LSTM and GRU consistently achieved higher true positive rates across the emotional categories, with fewer false positives and false negatives than the DNN. Additional performance metrics—precision, recall, and F1-score—confirm the effectiveness of the recurrent designs. The LSTM model showed a strong balance between sensitivity and specificity, underscoring its reliability for sequential emotion categorization tasks [28]. These results demonstrate the advantages of recurrent neural network architectures for EEG-based emotion recognition and highlight their potential for advancing practical affective computing and brain-computer interface applications.
4.2 Comparison with traditional baselines and state-of-the-art methods
To enhance the strength of the comparative analysis, this study additionally evaluated traditional machine-learning baselines alongside the deep learning models. Classical algorithms such as SVM, k-NN, Random Forest, and LDA were tested using the same pre-processed EEG features. Although commonly used in early EEG emotion-recognition work, these models showed limited ability to capture the complex temporal structure of EEG signals, with accuracy generally falling between 72% and 85%. This outcome highlights the challenges of relying on handcrafted features and shallow classifiers for emotion recognition tasks [29].
The performance of the proposed models was also compared with recent state-of-the-art methods reported in the literature. Approaches such as CNN–BiLSTM hybrid networks, GCN-enhanced recurrent models, and attention-based architectures generally achieve accuracies in the range of 93%-97% on commonly used datasets. The LSTM and GRU models developed in this study reached similar levels of accuracy, whereas the DNN baseline performed lower due to its inability to capture temporal dependencies. Overall, these comparisons reaffirm that deep sequential architectures are better suited for EEG-based emotion recognition, a trend clearly summarized in Table 4.
Table 4. Comparative accuracy of methods
| Method Category | Model | Accuracy (%) |
|---|---|---|
| Traditional Machine Learning Baselines | Support Vector Machine (SVM) | 82.4 |
| | k-Nearest Neighbour (k-NN) | 78.9 |
| | Random Forest (RF) | 85.1 |
| | Linear Discriminant Analysis (LDA) | 74.6 |
| State-of-the-Art Deep Learning Methods | CNN-BiLSTM | 94-96 |
| Proposed Models (This Study) | LSTM | 97 |
| | GRU | 97 |
| | DNN | 98 |
4.3 Interpretive analysis of recognition performance
A thorough examination of performance across emotional categories and EEG regions reveals notable differences. The models achieved stable accuracy overall; however, the confusion patterns show that categorizing positive emotions was somewhat more challenging, owing to neural signatures that overlap with neutral states. By contrast, negative and neutral emotions exhibited clearer differences, which led to higher recognition rates.
Region-based analysis shows that electrodes located in the frontal and temporal regions contributed most to emotion discrimination, while occipital channels had a smaller effect on classification accuracy. These findings highlight the distinct roles of different brain regions in emotional processing. Moreover, the improved performance of the LSTM and GRU models underscores the importance of capturing temporal dependencies, as both models represented well how EEG sequences evolve over time. Together, these observations explain the achieved accuracies and clarify why sequential models consistently outperform feed-forward approaches.
4.4 Computational efficiency and model runtime analysis
The study also examined how efficiently each model runs in terms of computing resources, which is essential for recognizing emotions in real time. The results show that the DNN had the fastest training and inference times thanks to its simple feed-forward design. The GRU model struck a good balance, running faster than the LSTM while remaining nearly as accurate, making it a good choice for latency-sensitive applications. The LSTM, although the most accurate, required more processing power because of its larger parameter count. These results underline the importance of weighing both expected accuracy and processing speed before deployment, as shown in Table 5.
Table 5. Runtime and complexity comparison across models
| Model | Training Time (Relative) | Inference Speed (Relative) | Parameter Complexity | Efficiency Summary |
|---|---|---|---|---|
| LSTM | High | Slow | High | Accurate but slow due to higher computational load. |
| GRU | Moderate | Moderate-Fast | Medium | Balanced performance with faster execution than LSTM. |
| DNN | Low | Fast | Low | Fastest model, suitable for lightweight and real-time tasks. |
4.5 Cross-subject generalization analysis
A study was conducted to assess how well the proposed models generalize across individuals. EEG data from each participant were withheld from the training phase and used exclusively for testing, ensuring that the models did not rely on subject-specific information during learning. The LSTM and GRU models performed well in this setup, with accuracy drops of less than 5%. These findings suggest that the recurrent structures capture characteristics consistent across people through their temporal representations, despite the intrinsic diversity of EEG signals [30]. The DNN model, in contrast, showed a major drop, indicating that it is less capable of modeling subject-invariant patterns. These results show that recurrent models such as LSTM and GRU offer improved cross-subject resilience.
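A leave-one-subject-out evaluation of this kind can be organized as sketched below; `X`, `y`, `subject_ids`, and `build_model()` are hypothetical names for the segmented data, labels, per-segment subject assignments, and a factory returning a fresh Keras model.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# `groups` ties every EEG segment to its participant, so no subject
# appears in both the training and the test fold.
logo = LeaveOneGroupOut()
scores = []
for train_idx, test_idx in logo.split(X, y, groups=subject_ids):
    model = build_model()                                   # fresh, untrained model
    model.fit(X[train_idx], y[train_idx],
              epochs=30, batch_size=64, verbose=0)          # assumed settings
    _, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)
    scores.append(acc)

print(f"mean cross-subject accuracy: {np.mean(scores):.3f}")
```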
This paper demonstrates the accuracy of sophisticated deep learning models in classifying diverse emotional states from multichannel EEG data. The development and assessment of Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Deep Neural Network (DNN) architectures shows that sequential models regularly beat non-recurrent networks in the analysis of time-dependent EEG data. The LSTM model achieved the greatest accuracy, about 97%, indicating that it is very good at capturing long-term temporal dependencies in brain activity. The GRU model produced comparable results, offering a viable solution for scenarios requiring rapid computation and reduced processing complexity.
The results underscore the importance of temporal modeling for accurate EEG-based emotion recognition. The DNN and other feed-forward networks fail to fully capture the sequential relationships present in brainwave signals. Consequently, these findings highlight the necessity of employing model architectures that maintain temporal context in the advancement of affective computing systems and brain-computer interface applications.
This study provides a solid foundation for further research in EEG-based emotion recognition and shows how deep sequential learning techniques can support the creation of intelligent systems capable of recognizing and reacting to human emotional states.
APPENDIX 1
Algorithm
Input: Pre-processed EEG time-series data
Output: Predicted emotional class labels
Steps:
3.1) LSTM
3.2) GRU
3.3) DNN
APPENDIX 2
Working Model: Step-by-Step

1) LSTM (one time step)
EEG input x_t = 0.5; previous hidden state h_{t-1} = 0.1; previous cell state C_{t-1} = 0.2; weights = 1, biases = 0.
Forget gate: f_t = σ(W_f·[h_{t-1}, x_t] + b_f) = σ(1·(0.1 + 0.5)) = σ(0.6) ≈ 0.645
Input gate: i_t = σ(W_i·[h_{t-1}, x_t] + b_i) = σ(0.6) ≈ 0.645
Candidate: C̃_t = tanh(W_C·[h_{t-1}, x_t] + b_C) = tanh(0.6) ≈ 0.537
Cell state: C_t = f_t·C_{t-1} + i_t·C̃_t = 0.645·0.2 + 0.645·0.537 ≈ 0.475
Output gate: o_t = σ(W_o·[h_{t-1}, x_t] + b_o) = σ(0.6) ≈ 0.645
Hidden state: h_t = o_t·tanh(C_t) = 0.645·tanh(0.475) ≈ 0.285

2) GRU
Update gate: z_t = σ(W_z·[h_{t-1}, x_t] + b_z) = σ(0.6) ≈ 0.645
Reset gate: r_t = σ(W_r·[h_{t-1}, x_t] + b_r) = σ(0.6) ≈ 0.645
Candidate hidden: h̃_t = tanh(W_h·[r_t·h_{t-1}, x_t] + b_h) = tanh(1·(0.645·0.1 + 0.5)) = tanh(0.565) ≈ 0.511
New hidden: h_t = (1 − z_t)·h_{t-1} + z_t·h̃_t = (1 − 0.645)·0.1 + 0.645·0.511 ≈ 0.365

3) DNN
Input: x = 0.5
Hidden layer (ReLU activation): a^(l) = σ(W^(l) a^(l−1) + b^(l)) = σ(1·0.5 + 0) = 0.5
Output raw: z_i = 1·0.5 + 0 = 0.5 for each output node
Softmax for 3 classes: softmax = exp(0.5) / [3·exp(0.5)] = 1/3 ≈ 0.333
Cross-entropy if true label = Neutral: L = −∑_{i=1}^{C} y_i log(ŷ_i) = −log(0.333) ≈ 1.10

Summary of results:
LSTM: h_t ≈ 0.285, C_t ≈ 0.475
GRU: h_t ≈ 0.365
DNN: output probabilities ≈ [0.333, 0.333, 0.333], loss ≈ 1.10
APPENDIX 3
Table A1. Experimental setup and training configuration
| Dataset Detail | Significance in EEG-Based Emotion Research |
|---|---|
| Dataset name (e.g., DEAP, DREAMER, SEED) | Ensures transparency, facilitates reproducibility, and identifies the benchmark used for comparison. |
| Number of participants | Determines statistical power, inter-subject variability, and generalizability of the study findings. |
| Number of trials or emotional stimuli | Indicates emotional diversity and robustness of the experimental design. |
| Number of EEG channels | Defines the spatial resolution and richness of neural information available for model learning. |
| EEG acquisition hardware | Reflects the quality, reliability, and configuration of the EEG recording system. |
| Sampling frequency (original and down-sampled) | Affects temporal resolution, signal fidelity, and frequency-domain feature extraction accuracy. |
| Environmental conditions (lighting, noise, impedance) | Establishes controlled recording conditions and reduces external influences on EEG activity. |
1. Training Strategy and Hyperparameter Selection
Training schedule: Early stopping with a patience of 10 epochs based on validation loss
Rationale for Hyperparameters:
Hyperparameters were selected through preliminary grid-search experiments using a small validation subset. The chosen configuration achieved optimal stability and convergence across all three architectures.
2. Dataset Splitting Strategy
To maintain an unbiased evaluation, the dataset was split as follows:
• 70% for training
• 15% for validation
• 15% for testing
The split was carried out at the subject level to ensure that EEG segments from any individual participant did not appear across multiple subsets.
3. Cross-Validation for Result Stability
To validate robustness, a subject-independent cross-validation procedure was additionally performed. In each fold, data from one group of participants were held out for testing while the models were trained on the remaining subjects.
This procedure confirmed that the results were stable and not tied to a specific participant distribution.
Table A2. Architectures and design rationales of deep learning models (LSTM, GRU, and DNN)
| Model | Description | Rationale |
|---|---|---|
| LSTM | 2 LSTM layers (128, 64 units); dropout 0.3 after each layer; Dense (64, ReLU); softmax output | Designed to model long-term temporal patterns in EEG data while keeping computational demands at a practical level. |
| GRU | 2 GRU layers (128, 64 units); dropout 0.3; Dense (64, ReLU); softmax output | Offers similar temporal modeling to LSTM but with fewer parameters, improving computational efficiency for real-time applications. |
| DNN | Dense layers (256, 128, 64 neurons); ReLU activations; dropout 0.4; softmax output | Extracts high-level nonlinear EEG features; dropout helps reduce overfitting and enhances generalization. |
Table A3. Performance metrics
| Method | Accuracy (%) | Remarks |
|---|---|---|
| CNN–LSTM Hybrid | 94–97 | High performance but computationally heavy |
| GCN + Bi-LSTM | 95–98 | Requires EEG graph construction and higher complexity |
| Attention-based Deep Model | 96–98 | Superior feature learning but large parameter count |
| LSTM | ≈97 | Comparable performance with simpler design |
| GRU | ≈97 | Lightweight and efficient for real-time use |
| DNN | 98 | Strong performance for static representations |
[1] Abadi, M., Barham, P., Chen, J., Chen, Z., et al. (2016). TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, pp. 265-283.
[2] Koelstra, S., Muhl, C., Soleymani, M., Lee, J.S., Yazdani, A., Ebrahimi, T., Pun, T., Nijholt, A., Patras, I. (2011). DEAP: A database for emotion analysis using physiological signals. IEEE Transactions on Affective Computing, 3(1): 18-31. https://doi.org/10.1109/T-AFFC.2011.15
[3] Hochreiter, S., Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8): 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
[4] Zheng, W.L., Zhu, J.Y., Lu, B.L. (2018). Identifying stable patterns over time for emotion recognition from EEG. IEEE Transactions on Affective Computing, 9(3): 317-329. https://doi.org/10.1109/TAFFC.2017.2652663
[5] Lin, Y.P., Chen, C.H., Wang, T.P., Jung, T.P. (2010). EEG-based emotion recognition in music listening. IEEE Transactions on Biomedical Engineering, 57(7): 1798-1806. https://doi.org/10.1109/TBME.2010.2048568
[6] Zhang, Y., Xu, P., Guo, D., Yao, D. (2013). Multivariate phase synchronization analysis of EEG for emotion recognition. IEEE Transactions on Biomedical Engineering, 60(10): 2831-2839. https://doi.org/10.1109/TBME.2013.2264531
[7] Bashivan, P., Bidelman, G.M., Yeasin, M. (2019). Spectrotemporal dynamics of the EEG during working memory encoding. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 27(4): 655-664. https://doi.org/10.1109/TNSRE.2019.2894679
[8] Tripathi, S., Acharya, S., Sharma, R., Mittal, S., Bhattacharya, S. (2017). Using deep and convolutional neural networks for accurate emotion classification on DEAP data. In Proceedings of the AAAI Conference on Artificial Intelligence, 31(2): 4746-4752. https://doi.org/10.1609/aaai.v31i2.19105
[9] Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., Lance, B.J. (2018). EEGNet: A compact convolutional neural network for EEG-based brain-computer interfaces. Journal of Neural Engineering, 15(5): 056013. https://doi.org/10.1088/1741-2552/aace8c
[10] Jenke, R., Peer, A., Buss, M. (2014). Feature extraction and selection for emotion recognition from EEG. IEEE Transactions on Affective Computing, 5(3): 327-339. https://doi.org/10.1109/TAFFC.2014.2339834
[11] Yin, Z., Wang, Y., Liu, L., Zhang, W., Zhang, J. (2017). Cross-subject EEG feature selection for emotion recognition using transfer recursive feature elimination. Frontiers in Neurorobotics, 11: 19. https://doi.org/10.3389/fnbot.2017.00019
[12] Alarcao, S.M., Fonseca, M.J. (2017). Emotions recognition using EEG signals: A survey. IEEE Transactions on Affective Computing, 10(3): 374-393. https://doi.org/10.1109/TAFFC.2017.2714671
[13] Liu, Y., Sourina, O., Nguyen, M.K. (2010). Real-time EEG-based human emotion recognition and visualization. In 2010 International Conference on Cyberworlds, Singapore, Singapore, pp. 262-269. https://doi.org/10.1109/CW.2010.37
[14] Murugappan, M., Rizon, M., Nagarajan, R., Yaacob, S., Hazry, D., Zunaidi, I. (2008). Time-frequency analysis of EEG signals for human emotion detection. In 4th Kuala Lumpur International Conference on Biomedical Engineering 2008: BIOMED 2008, Kuala Lumpur, Malaysia, pp. 262-265. https://doi.org/10.1007/978-3-540-69139-6_68
[15] Chao, H., Dong, L., Liu, Y., Lu, B. (2019). Emotion recognition from multiband EEG signals using CapsNet. Sensors, 19(9): 2212. https://doi.org/10.3390/s19092212
[16] Li, M., Xu, M., Zong, Y., Gu, Z., Yu, Z. (2021). A graph convolutional neural network for emotion recognition using EEG signals. IEEE Transactions on Affective Computing, 12(4): 1234-1245. https://doi.org/10.1109/TAFFC.2020.3038325
[17] Li, M., Xu, H., Liu, X., Lu, S. (2018). Emotion recognition from multichannel EEG signals using K-nearest neighbor classification. Technology and Health Care, 26(1_suppl): 509-519. https://doi.org/10.3233/THC-174836
[18] Katsigiannis, S., Ramzan, N. (2017). DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices. IEEE Journal of Biomedical and Health Informatics, 22(1): 98-107. https://doi.org/10.1109/JBHI.2017.2688239
[19] Zheng, W.L., Lu, B.L. (2015). Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development, 7(3): 162-175. https://doi.org/10.1109/TAMD.2015.2431497
[20] Xu, M., Wei, L., Zhang, Y., Zong, Y. (2019). Cross-subject emotion recognition using EEG signals with hybrid neural networks. IEEE Access, 7: 156920-156929. https://doi.org/10.1109/ACCESS.2019.2936011
[21] Yang, Y., Wu, Q., Fu, Y., Chen, X. (2019). Continuous convolutional neural network with 3D input for EEG-based emotion recognition. Frontiers in Computational Neuroscience, 13: 52. https://doi.org/10.3389/fncom.2019.00052
[22] Zhao, M., He, L. (2020). Deep learning in the EEG diagnosis of depression. Journal of Neural Engineering, 17(3): 036014. https://doi.org/10.1088/1741-2552/ab7635
[23] Li, Y., Zheng, W.L., Cui, Z., Zong, Y. (2021). EEG emotion recognition using transfer learning. IEEE Transactions on Affective Computing, 12(4): 929-940. https://doi.org/10.1109/TAFFC.2021.3093040
[24] Wang, X.W., Nie, D., Lu, B.L. (2014). Emotional state classification from EEG data using machine learning approach. Neurocomputing, 129: 94-106. https://doi.org/10.1016/j.neucom.2013.06.046
[25] Zhang, W., Lu, G. (2019). EEG-based emotion recognition using convolutional neural networks and BiLSTM. IEEE Access, 7: 112504-112514. https://doi.org/10.1109/ACCESS.2019.2938036
[26] Ma, J., Zhang, Y., Zheng, W.L., Lu, B.L. (2017). Emotion recognition using multimodal physiological signals. IEEE Transactions on Affective Computing, 8(3): 405-416. https://doi.org/10.1109/TAFFC.2015.2511965
[27] Chen, J., Jiang, D., Zhang, Y. (2019). A common spatial pattern and wavelet packet decomposition combined method for EEG-based emotion recognition. Journal of Advanced Computational Intelligence and Intelligent Informatics, 23(2): 274-281. https://doi.org/10.20965/jaciii.2019.p0274
[28] Li, J., Qiu, S., Du, C., Wang, Y., He, H. (2022). Domain adaptation for EEG emotion recognition using deep adversarial neural networks. IEEE Transactions on Affective Computing, 13(3): 1278-1290. https://doi.org/10.1109/TAFFC.2021.3053862
[29] Zhong, P., Wang, D., Miao, C. (2023). EEG-based emotion recognition using regularized graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 34(6): 3126-3139. https://doi.org/10.1109/TNNLS.2021.3121911
[30] Zhang, Z., Chen, J., Yin, Z., Wang, Y. (2024). Transformer-based spatio-temporal feature learning for EEG emotion recognition. Information Fusion, 99: 101873. https://doi.org/10.1016/j.inffus.2023.101873