© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Lie detection refers to identifying when one person acts in such a way that another person comes to believe something that is incorrect. It plays a sensitive role in various fields, including national security, law enforcement, and psychology, and has therefore received considerable interest lately. In this research, deep learning algorithms with a new dataset and protocol are employed to automatically detect truth from electroencephalography (EEG) data. The experiment used the OpenBCI Ultracortex "Mark IV" EEG headset, which acquired 14 channels of EEG data from ten participants. The acquired signal was pre-processed and then fed individually into three classifiers (MLP, LSTM, and CNN) in order to distinguish between truthful and deceptive statements in the EEG data and to select the model with the best performance. The proposed approach is non-invasive, effective, and has low time complexity, and is consequently appropriate for real-time applications. To carry out the experiment on EEG signals for deceit detection, a novel video-based dataset and protocol were created. In addition, we compared the outcomes of our method with an existing dataset, the Dryad dataset, which uses an image-based protocol. The proposed system is evaluated using accuracy, F1 score, recall, and precision. According to the test results, the CNN achieves the highest accuracy: 99.96% on our dataset and 99.36% on the Dryad dataset. Finally, the proposed system compares favourably with existing algorithms presented in the literature and is precise, scalable, and fault-tolerant.
electroencephalogram, deep learning algorithms, lie detection, convolution neural network, long short-term memory, multilayer perceptron
Deception detection is a modern technology used to combat criminality. Traditional methods, such as polygraph tests that measure physiological parameters like heart rate and respiratory rate, are influenced by fear or tension, and skilled criminals are able to manipulate these parameters. As a result, polygraph test results are neither legally admissible nor credible. There is therefore a need for an enhanced lie detection technique that overcomes these restrictions, such as one based on EEG. Newer methods like forensic electroencephalogram (EEG)-based lie detection have shown promise in detecting deception [1]. EEG methods provide a non-intrusive alternative to traditional methods for detecting lying. EEG data acquired by scalp electrodes reflect brain activity and are private, sensitive, and difficult to steal, recreate, or control. Analysing EEG data allows us to better understand brain impulses across different moods and activities [2]. The development of lie detection systems using EEG signal processing is a recent direction in neuro-technology and forensic psychology. The growth of crime rates has raised the importance of studies on lie detection that employ EEG signals. However, only a few studies are currently available because of limited datasets and the recency of developments in the field [3].
1.1 Related work
Recently, several researchers have worked to enhance lie detection systems using distinct methods. The study [4] presented a deep learning algorithm for automated deceit detection from EEG data using a convolutional neural network (CNN), with fourteen-channel EEG signals as input for classifying subjects as truthful or lying. The article [5] proposes a 16-channel EEG system for lie detection that uses the SMOTE technique to handle imbalanced data together with machine learning techniques. The system uses classifiers such as SVM, DT, LR, RM, and KNN, and the SVM achieved an impressive accuracy on the EEG dataset. The research [6] proposed a system to enhance BCI-based lie detection using bio-signals from five individuals. Data were acquired from 16 EEG channels and analysed using a variety of methods; the ensemble deep learning model was found to be the most effective. The article [7] presents a novel method for classifying EEG signals into guilty and honest states using fraud identification tests. The method uses the wavelet packet transform for feature extraction and linear discriminant analysis for classification, achieving a high accuracy of 91.67%. The article [8] presents a method for identifying deceit from brain EEG signals using a Deep Neural Network (DNN) for binary classification of truth and lie. The study uses all 16 channels and the WPT technique to extract features, and it reveals varied performance across channels, which points to future enhancement. The research [9] developed a deceit identification system using EEG data, optimizing performance through binary particle swarm optimization and support vector machine hyper-parameter optimization while eliminating noisy channels. The paper [10] introduces a novel ICA based on ASD, which enhances SNR by reducing noise and removing ocular artefacts, and distinguishes guilty subjects from innocent ones using machine learning techniques such as LDA, SVM, KNN, and BPNN.
The paper [11] presents an efficient lie detection system using ELM, STFT, and BBAT optimization approaches. It captures brain EEG signals using a 16-channel data acquisition system and identifies guilty and innocent individuals using P300 responses. This method outperforms several advanced lie detection systems. The paper [12] presents a deep learning framework for lie detection tests that uses pre-processing techniques to extract temporal feature maps from EEG images. This model, which combines a cascading attention model with a deep learning CNN (V-TAM), achieved optimal prediction accuracy and time efficiency. The literature review indicates a lack of research on real-life settings that use eyewitness investigation in EEG analysis for deceit detection. Furthermore, EEG data processing has accuracy and efficiency issues, and the lack of modern technology makes real-time application even more difficult. It is also noticeable that the use of deep learning approaches is not common. There is therefore potential to use deep learning techniques to improve system performance. The contributions of this paper are as follows. First, it introduces a new dataset and protocol based on questioning eyewitnesses about a store robbery seen in a video, which makes the setting more realistic. Second, it applies several deep learning algorithms to enhance accuracy and evaluates the models and protocol using accuracy, F1 score, recall, and precision. Lastly, it compares the performance of the method on a public dataset, Dryad, to ensure the system's reliability and accuracy. The paper is organized as follows: Section 1 introduces lie detection and discusses previous studies. Section 2 describes the experiment design and the datasets used in our work, and discusses the research methodology. Section 3 presents the findings of the research. Section 4 concludes the paper.
2.1 Participants
A new experiment was conducted at the Department of Biomedical Engineering, Al-Khwarizmi College of Engineering, University of Baghdad, to detect deception using EEG signals. The study involved 10 participants between the ages of 18 and 23, with no medical record of any psychological disorder. Additionally, to guarantee participant safety and a high-quality EEG signal, we excluded those with skin diseases or dense hair. All participants were undergraduate students enrolled at the same university. They were recruited via campus-wide advertisements within the department, as well as through an intensive search for individuals inside the college who expressed an interest in taking part in the study. The recruitment process lasted two months, beginning in mid-December 2023 and concluding in mid-February 2024. All participants had normal or corrected vision and provided a written informed consent form (ICF) before data recording. There was no remarkable variation in gender or age between the two groups. Each subject was recorded in two sessions: one while lying and one while telling the truth. Two conditions were therefore analysed: truth and lie. Participants were rewarded with a gift for their collaboration after the experiment.
2.2 EEG data acquisition
The study used an OpenBCI Ultracortex "Mark IV" EEG headset with 16 electrodes for data acquisition. The OpenBCI Ultracortex Mark IV is a research device widely used in applications such as brain-computer interface development, neuroscience research, and education. It is not a medical-grade device, and its signal quality may be limited in comparison with medical devices [14]. Two electrodes were discarded because they had issues and received a poor signal. The remaining 14 electrode positions were FP1, FP2, F4, F8, F7, C4, C3, T8, T7, P8, P4, P3, O1, and O2, according to the International 10-20 electrode system, as shown in Figure 1. The reference electrodes were attached to the earlobes. The EEG signal was recorded with a sampling frequency of 125 Hz.
Figure 1. The 3-D printed hardware configuration for data collection and the electrode placements in accordance with the International 10-20 standard [13]
2.3 Dryad protocol
The Dryad dataset [15] was recorded using a standard three-stimuli protocol to investigate the detection of stolen jewels. Subjects were separated into two groups: guilty and innocent. Six distinct jewels were prepared, and their images were used as stimuli during detection. Each subject was given a safe containing one or two jewels and was instructed to open the safe and memorize the details of the objects. The guilty group was instructed to steal one object; the other object served as the T stimulus, and the remaining four images served as the I stimuli. The innocent group did not steal anything, and for this group the T stimulus was the innocent object. The subjects were then instructed to perform the detection test while facing a video screen, with each item presented randomly for 0.5 seconds and 30 iterations per session. Each session lasted around 5 minutes and included 2 minutes of resting time. The inter-stimulus interval was 1.6 seconds. Each participant was instructed to complete five sessions, with one push button given to each subject. The guilty group was instructed to press the "Yes" and "No" buttons when faced with known and unknown items, respectively, while the innocent group responded truthfully to all stimuli [15]. Figure 2 illustrates the recording process, including the periods between the images, the questions, and the end of the session.
Figure 2. Sequence of the stimuli scheme of the Dryad dataset [15]
2.4 Experiment protocol
The protocol was designed based on the Comparison Question Test (CQT) technique. A short film of a theft from a store, published on the YouTube platform [16], was used. The video was divided into three short clips, each featuring a different scenario excerpted from the video. After watching each of the three clips, the subject was asked four questions related to the clip and answered YES or NO. In the first session, the subject was asked to tell the truth. The second session, the lying session, began after all three clips and their questions had been completed in the first session. The three clips and the questions were repeated, but this time the subject was asked to give answers contrary to the truth, so that the signals recorded while lying and while telling the truth could be compared for each subject. To minimize recording time and ensure that only the most significant signal was captured, the signal for each video clip and each set of questions was recorded independently. The protocol is designed to fit the case of eyewitnesses, in order to find out whether what they say is the truth or not. In addition, in most thefts that take place currently, the criminal's face is masked; during an investigation, a suspect may, for example, be shown a picture of the place of the theft while the brain signal is recorded to determine whether he is telling the truth or not. Accordingly, subjects were shown a film simulating a theft taking place in a store and subsequently responded to specific questions about the depicted events. Similar to what is seen in legal or forensic situations, this design replicates the conditions of recalling and reporting information under stress.
The use of this video-based approach improves the ecological validity of the protocol in comparison with traditional protocols, which frequently depend on contrived activities or abstract questions that fail to capture the intricacies of real-world deception and memory retrieval. Through the use of a realistic and engaging situation, the suggested methodology is intended to evoke more authentic cognitive and emotional reactions from participants. This methodology not only enhances the applicability of the EXP data analysis findings but also makes our dataset more suitable for real-world applications in detecting lying. The data were recorded using two computers. The first displayed the video to the subject, with a distance of approximately 65 cm between the screen and the subject. The second monitored the brain signal to ensure the integrity of the electrodes and monitored the recording process, as in Figure 3. To clarify the recording mechanism and its timing, Figure 4 shows the periods between the video, the questions, and the end of the session.
Figure 3. Experimental setup used for the lie detection test
Figure 4. Sequence of the stimuli scheme of our protocol
The main goal of the proposed work is to create a deep learning model that can accurately detect whether someone is telling the truth or lying. The workflow of the model is illustrated in Figure 5. The method uses convolutional neural network (CNN), long short-term memory (LSTM), and multilayer perceptron (MLP) models. These models were selected for their capacity to automatically extract intricate characteristics from EEG signals: CNNs are well suited to capturing spatial patterns in the data, LSTM models excel at modelling temporal dependencies, and MLPs are effective at modelling non-linear interactions. In contrast to conventional machine learning models that depend on manually designed features, deep learning models learn features directly from the data, which can result in better classification performance. A tuning algorithm is used to select the optimal model and hyper-parameters, and the resulting models are then trained on EEG data for lie detection. The system is divided into four parts: the pre-processing stage, the classification stage, the training stage, and the testing stage. For the pre-processing stage, EEG signals were recorded for 30 seconds during a question-and-answer session. These signals come from 14 EEG channels, from FP1 to O2. Frequency-domain filtering is then applied to enhance classification. The raw data from the 14 EEG channels are shown in Figure 6. The selected channels (FP1, FP2, F4, F8, F7, C4, C3, T8, T7, P8, P4, P3, O1, and O2) are displayed in different colours, with the x-axis representing time and the y-axis showing the EEG signal amplitude.
Figure 5. Proposed procedure of lie detection system from EEG signal
Figure 6. The OpenBCI GUI during real-time recording of the raw EEG signal
3.1 Pre-processing
Developing a trustworthy lie detection system using EEG signal processing poses several challenges because EEG is inherently noisy and brain activity patterns can be distorted. EEG data must therefore go through several pre-processing steps before analysis. The data cleaning process involves removing unwanted channels from the EEG dataset. These non-EEG channels, which record eye movements, muscle activity, and heart activity, are often used for monitoring and identifying artefacts. However, they can introduce complexity and interference when examining brain activity. Non-EEG channels are identified by their labels and removed, so that only relevant EEG signals are retained, which simplifies the dataset and improves clarity. The removal was performed using Python's pandas library. Signals can also be filtered within specified frequency bands to help with interpretation and improve the analysis of EEG data. Figure 7 shows the distribution of the data and its density across channels during lie and truth. In this study, the experiment dataset underwent pre-processing steps that included applying a Fast Fourier Transform (FFT) filter with a pass band of 0.1–30 Hz. This range was chosen because it is the frequency range most commonly associated with mental tasks [17]. Each recording was sampled at 192 Hz. The data were then normalised with StandardScaler and labelled. Figure 8 shows the signal before and after the pre-processing steps.
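The filtering and normalisation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `fft_bandpass` and `standardize` are hypothetical, the filter is implemented by simply zeroing FFT bins outside 0.1–30 Hz, the sampling rate is passed in as a parameter (125 Hz is used for the synthetic demo), and the per-channel z-score mirrors what scikit-learn's StandardScaler computes by default (zero mean, unit population variance).

```python
import numpy as np

def fft_bandpass(signal, fs, low=0.1, high=30.0):
    """Zero all FFT bins outside [low, high] Hz and transform back."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    keep = (freqs >= low) & (freqs <= high)
    return np.fft.irfft(spectrum * keep, n=len(signal))

def standardize(data):
    """Per-channel z-score; data has shape (samples, channels)."""
    return (data - data.mean(axis=0)) / data.std(axis=0)

# Synthetic demo (assumed values): 30 s of signal at 125 Hz.
fs = 125
t = np.arange(30 * fs) / fs
clean = np.sin(2 * np.pi * 10 * t)                       # 10 Hz, inside the band
noisy = clean + 0.5 * np.sin(2 * np.pi * 50 * t) + 2.0   # 50 Hz interference + DC offset

filtered = fft_bandpass(noisy, fs)                       # 50 Hz and DC are removed
channels = np.column_stack([filtered, 3.0 * filtered + 1.0])
scaled = standardize(channels)                           # each column: mean 0, std 1
```

A hard spectral mask like this is the simplest form of FFT filtering; on real recordings it can introduce ringing at segment edges, so the exact filter design remains a choice for the practitioner.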
3.2 Classification step
The classification stage is essential because it determines the system's performance. It uses the characteristics of signals, such as EEG signals, to predict the class of the data. In the classification stage of our study, we used CNN, LSTM, and MLP models.
3.2.1 Convolutional neural networks (CNN)
This article employs a convolutional neural network (CNN) architecture for classification. The network uses two convolutional blocks, with batch normalization and dropout regularization to prevent overfitting. The first convolutional layer has 128 filters with a kernel size of 3, and the second convolutional layer has 192 filters with a lower dropout rate of 0.1. The feature maps are down-sampled using max pooling layers after each block. The convolutional blocks are followed by fully connected layers, the first of which has 256 units with the Scaled Exponential Linear Unit (SELU) activation function and dropout. The output layer consists of a single neuron without an activation function and is designed for binary classification. The model is trained with the Adam optimizer, a batch size of 128, and a learning rate of 0.001. The model's hyper-parameters were optimised using a tuning algorithm, and early stopping was applied to prevent overfitting. Furthermore, 5-fold cross-validation was employed to evaluate the robustness of the model's performance.
3.2.2 Long short-term memory (LSTM)
This research also uses the Long Short-Term Memory (LSTM) deep learning model for classification, with the number of layers and the dropout rates set via hyper-parameter optimization using a tuning algorithm. The search yielded the following hyper-parameters: a first LSTM layer with 160 units and a dropout rate of 0.3, followed by two additional LSTM layers with 23 and 90 units, respectively. A final LSTM layer with 224 units and a dropout rate of 0.4 is followed by a fully connected layer with 160 units and a dropout rate of 0.1. The model is trained with the Adam optimizer, a learning rate of 0.001, and a batch size of 256. The classification threshold is set at 0.5, and early stopping is used during training to avoid overfitting and improve generalisation by halting training if the validation loss does not improve over a given number of epochs. The model's robustness is assessed by 5-fold cross-validation, which shows its performance across several data subsets.
3.2.3 Multi-layer perceptron (MLP)
This study also used a Multi-Layer Perceptron (MLP) for classification, with an adaptable architecture and hyper-parameter tuning for better performance. The architecture consists of several dense layers whose number of units, activation function, and dropout rate are configurable. The best structure was determined through a systematic hyper-parameter search, with a first dense layer of 64 units activated by the exponential linear unit (ELU) function and a dropout rate of 0.4. Subsequent dense layers were built with increasing numbers of units and various activation functions, with dropout to moderate overfitting. For stability and convergence, batch normalisation is used after the dense layers, and the model is trained with the Adam optimizer using a relatively high learning rate of 0.01 to handle the complexity of the data. The model was trained with a batch size of 64 to balance computational efficiency and performance, and 5-fold cross-validation was used to ensure a complete assessment across multiple data subsets. Early stopping reduces overfitting and encourages convergence to the best solution by halting training if the validation loss does not improve across epochs.
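The early-stopping rule used for all three models can be illustrated with a short sketch. The text does not state the patience or the minimum improvement, so `patience` and `min_delta` below are assumed parameters, the function name `early_stopping_epoch` is hypothetical, and the validation losses are made-up numbers used only to show the mechanism.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # new best validation loss: reset the counter
            wait = 0
        else:
            wait += 1        # no improvement in this epoch
            if wait >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran for all epochs

# Illustrative (made-up) validation losses for seven epochs.
losses = [0.70, 0.52, 0.41, 0.43, 0.42, 0.44, 0.40]
stop = early_stopping_epoch(losses, patience=3)
```

In this example the best loss so far is reached at epoch 2 and the next three epochs do not improve on it, so training halts at epoch 5 and never sees the lower loss at epoch 6. A larger patience trades extra training time for a lower risk of stopping too early.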
3.3 Hyper parameter tuning
The Hyperband technique was used in this research because it explores the hyperparameter space efficiently through a bandit-based strategy for hyperparameter optimization. Hyperband operates over a series of "brackets", in each of which a set of hyperparameter configurations is used to train the model. During each iteration, the model's performance is assessed using a designated measure, such as accuracy or F1 score. The best-performing configurations are kept, and the set of candidates is narrowed to focus on the most favourable configurations. Table 1 shows the Hyperband procedure, which uses two nested "for" loops [18]. The outer loop iterates over the brackets, with s running from smax down to 0, whereas the inner loop runs successive halving within a bracket for i from 0 to s. During each inner iteration, the budget allocated to each surviving configuration is multiplied by η while the number of configurations is reduced by a factor of η. Hyperband distributes the budget equally among the brackets, each of which uses approximately B total resources [18].
Figure 7. The density of the raw channels of our data for truth and lie
Figure 8. Before and after preprocessing steps of the channel (0.1-30) Hz
Table 1. The Hyperband optimization method algorithm [18]
Algorithm 1: Hyperband algorithm for hyperparameter optimization
Input: $R$, $\eta$ (default $\eta=3$)
Initialization: $s_{max}=\lfloor \log_{\eta}(R) \rfloor$, $B=(s_{max}+1)R$
for $s \in \{s_{max}, s_{max}-1, \ldots, 0\}$ do
    $n=\left\lceil \frac{B}{R} \frac{\eta^{s}}{(s+1)} \right\rceil$, $r=R\eta^{-s}$
    // begin SuccessiveHalving with $(n, r)$ inner loop
    $T$ = get_hyperparameter_configuration($n$)
    for $i \in \{0, \ldots, s\}$ do
        $n_i=\lfloor n\eta^{-i} \rfloor$
        $r_i=r\eta^{i}$
        $L=\{$run_then_return_val_loss($t$, $r_i$) $: t \in T\}$
        $T$ = top_k($T$, $L$, $\lfloor n_i/\eta \rfloor$)
    end
end
return configuration with the smallest intermediate loss seen so far
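To make the bracket structure of Algorithm 1 concrete, the sketch below computes only the schedule (how many configurations $n_i$ are trained and with what budget $r_i$ in each round), without training any model. The function name `hyperband_schedule` is hypothetical, $R$ is assumed to be a positive integer, and $R=81$ is an illustrative value rather than the setting used in this study.

```python
def hyperband_schedule(R, eta=3):
    """Return {s: [(n_i, r_i), ...]} for every Hyperband bracket s."""
    # s_max = floor(log_eta(R)), computed with integers to avoid float error.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    B = (s_max + 1) * R                      # budget per bracket
    schedule = {}
    for s in range(s_max, -1, -1):           # outer loop over brackets
        # n = ceil((B / R) * eta^s / (s + 1)), via integer ceiling division.
        n = -(-(B * eta ** s) // (R * (s + 1)))
        r = R / eta ** s                     # initial budget per configuration
        rounds = []
        for i in range(s + 1):               # inner successive-halving loop
            n_i = n // eta ** i              # floor(n * eta^-i)
            r_i = r * eta ** i
            rounds.append((n_i, r_i))
        schedule[s] = rounds
    return schedule

schedule = hyperband_schedule(R=81, eta=3)
for s, rounds in schedule.items():
    print(f"bracket s={s}: {rounds}")
```

The most aggressive bracket (the largest s) starts many configurations on a tiny budget and prunes hard, while the bracket s=0 trains a few configurations on the full budget R. Running all brackets hedges between these two extremes.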
3.4 Cross validation
Cross-validation is a technique applied to mitigate the issue of overfitting in deep learning algorithms. The method evaluates model performance by dividing the entire dataset into k folds and testing over these splits. In each round, the held-out fold is used for testing, whilst the remaining k-1 folds are used for training. The folds are rotated so that each one is used once for testing and k-1 times for training. The final performance metrics are obtained by averaging the k values from the test folds. This ensures that k distinct test sets are used for model evaluation, which approximates performance on unseen data and helps detect overfitting [19]. For the present study, a 5-fold cross-validation approach was applied, as seen in Figure 9.
Figure 9. The 5-fold cross-validation strategy [19]
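The fold rotation described above can be sketched in a few lines of plain Python. This is an illustrative helper, not the code used in the study: the name `k_fold_indices`, the shuffling seed, and the sample count of 20 are all assumptions chosen to keep the example small.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]    # k roughly equal-sized folds
    splits = []
    for test_id in range(k):
        test = folds[test_id]                    # held-out fold for this round
        train = [idx for j, fold in enumerate(folds) if j != test_id
                 for idx in fold]                # remaining k-1 folds
        splits.append((train, test))
    return splits

splits = k_fold_indices(n_samples=20, k=5)
for round_id, (train, test) in enumerate(splits):
    print(f"round {round_id}: {len(train)} train / {len(test)} test samples")
```

Each sample appears in exactly one test fold, so averaging the k test scores uses every sample once for evaluation.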
3.5 Evaluation metrics
Evaluation metrics are of the utmost importance for assessing the performance of deep learning algorithms in tasks such as classification and regression. These metrics include quantitative statistics like recall, accuracy, and F1 score. The confusion matrix provides a useful tool for assessing a classifier by comparing the predicted and actual outcomes. A thorough comprehension of classification requires an understanding of four fundamental concepts: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). A true positive (TP) occurs when both the predicted and actual values are positive, and a true negative (TN) occurs when both are negative. When the actual value is negative but the predicted value is positive, this is known as a false positive (FP). When the predicted value is negative but the actual value is positive, this is known as a false negative (FN) [20]. The standard measure of classification performance is accuracy, which quantifies the percentage of samples that are predicted correctly and can be calculated using the equation below.
${Accuracy}=\frac{T P+T N}{T P+T N+F P+F N}$ (1)
Precision: the ratio of all positive predictions that are predicted accurately.
${Precision}=\frac{T P}{T P+F P}$ (2)
Recall: the proportion of all real positive observations that are correctly predicted.
${Recall}=\frac{T P}{T P+F N}$ (3)
The F1 score, the harmonic mean of precision and recall, summarizes classification performance in a single value.
${F1\text{-}Score}=2 \times \frac{{Precision} \times {Recall}}{{Precision}+{Recall}}$ (4)
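The four metrics can be computed directly from the confusion-matrix counts, with accuracy taken as the fraction of correctly classified samples, (TP + TN) divided by the total. The sketch below is a minimal illustration; the function name `classification_metrics` is hypothetical and the counts in the example are made up for demonstration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # correct predictions / all samples
    precision = tp / (tp + fp)                   # correct among predicted positives
    recall = tp / (tp + fn)                      # correct among actual positives
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative (made-up) counts: 20 samples, 3 of them misclassified.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is 17/20 and the recall is 8/10, while precision is 8/9 because only one negative sample was flagged as positive.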
This research generated a dataset called "experiment" (EXP) for the purpose of detecting lies using EEG signals. The dataset comprises 14 acquired EEG channels, and the lie detection analyses were carried out on it. In addition, we used the public Dryad dataset, which also has 14 channels. EXP used a video-based protocol, whereas Dryad used a picture-based protocol. The proposed model classifies EEG signals as truthful or deceptive statements and is implemented in Python 3.11.5. The model was evaluated with 5-fold cross-validation and achieved high fold accuracies, ranging from 99.52% to 99.93%, as shown in Figure 10. This procedure divides the dataset into five equal-sized folds, with the model trained on four folds and tested on the remaining fold, repeated five times. The findings indicate consistently high performance across all folds, which suggests robustness and generalizability. The 5-fold cross-validation helps reduce concerns such as overfitting and offers an estimate of the model's performance on unseen data. These high accuracy values increase confidence in the model's ability to predict outcomes on new data samples.
The performance of the CNN, LSTM, and MLP architectures for classifying statements as truth or lie is illustrated in Figure 11, which shows accuracy, F1 score, recall, and precision for the EXP and Dryad datasets. The experimental findings indicate that all of the models performed well, with high precision, recall, F1 score, and accuracy. The CNN model consistently achieved the highest results across all criteria, with the LSTM model closely following. The MLP model performed adequately, although it yielded somewhat lower results than the other two models. The Dryad dataset yielded marginally lower metric scores than our experimental dataset. The Convolutional Neural Network (CNN) achieved a validation accuracy of around 99.96% on our EEG experiment dataset and 99.36% on the Dryad dataset. The LSTM model achieved a validation accuracy of around 99.87% on our EEG dataset and 99.64% on the Dryad dataset. The MLP achieved a validation accuracy of more than 98.99% on our EEG dataset and 97.67% on the Dryad dataset. These results are attributed to the recording techniques, careful data handling, and the distinctive methodology. Both datasets give promising results, with our experiment showing higher scores, which we attribute to the quality of the data. Nevertheless, the suitability of the models for other situations could be limited. More participants are required to enhance system dependability and enable real-time execution. Figure 12 shows the confusion matrix for the EEG deceit detection dataset. The testing process involved 30,000 samples in total, with 15,000 samples per category. Out of these samples, 12 were misclassified. The figure also displays the average training and validation accuracy and loss of the CNN model in the lie detection experiment.
Figure 10. Accuracy results of the 5-fold cross-validation over 20 epochs
Figure 11. Evaluation metrics for the MLP, CNN, and LSTM deep learning models
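The reported confusion-matrix totals can be checked against the headline CNN accuracy with simple arithmetic. In the sketch below, the sample counts (30,000 in total, 15,000 per class, 12 misclassified) come from the text. The split of the 12 errors into 8 false negatives and 4 false positives is an assumption: it is not stated in the text but is one split that reproduces the CNN precision (0.999733) and recall (0.999467) reported in the conclusion, and which class counts as "positive" is likewise assumed.

```python
# Figures stated in the text.
total = 30_000
per_class = 15_000
misclassified = 12

accuracy = (total - misclassified) / total

# Assumed split of the 12 errors (not given in the text).
fn, fp = 8, 4
tp = per_class - fn          # positives classified correctly
tn = per_class - fp          # negatives classified correctly

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)   # equivalent to the harmonic-mean form

print(f"accuracy={accuracy:.4f} precision={precision:.6f} "
      f"recall={recall:.6f} f1={f1:.4f}")
```

Under this assumed split the accuracy rounds to 0.9996, which matches the reported 99.96%, and the rounded precision, recall, and F1 score agree with the values quoted for the experiment dataset.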
Table 2. Summary of the comparison between earlier research and our research
NO. | Author | Number of Subjects | Method | Number of Channels | Accuracy
1 | Baghel et al. [4] | 30 | CNN | 14 | 84.44%
2 | Ramesh and Edla [5] | 9 | SVM | 16 | 95.64%
3 | Khalil et al. [6] | 5 | Ensemble | 16 | 86.0%
4 | Dodia et al. [7] | 20 | LDA | 16 | 91.67%
5 | Edla et al. [8] | 30 | DNN | 16 | 95%
6 | Boddu et al. [9] | 10 | PSO-SVM | 16 | 96.45%
7 | Haider et al. [10] | 15 | KNN | 14 | 97.9%
8 | Dodia et al. [11] | 20 | ELM | 16 | 88.3%
9 | AlArfaj et al. [12] | 931 | V-TAM | 32 | 98.5%
10 | Proposed method | 10 | CNN | 14 | 99.96%
Figure 12. Confusion matrix for the CNN model, with training and validation accuracy and loss
Table 2 summarizes existing studies on detecting deceit using EEG signals. The table includes information on the number of participants, the number of EEG channels, the classification methods, and the outcomes. Its objective is to compare studies and emphasize the distinguishing features of this research relative to others in the literature. The accuracy obtained in this experiment is higher than that reported by the earlier EEG-based lie detection studies listed. Table 2 shows how different machine learning models perform in lie detection systems. Among them, the proposed Convolutional Neural Network (CNN) achieved the highest accuracy, 99.96%, with only 14 channels. This outcome demonstrates how well CNNs capture complex patterns in the data, which makes them valuable tools for tasks requiring delicate analysis. When the classifiers trained on the private EXP dataset were compared with those trained on a publicly accessible dataset such as Dryad, the proposed EXP dataset yielded higher accuracy across all models. These results illustrate the effectiveness of the suggested method in capturing more subtle signal patterns. Furthermore, in comparison with previous research, our classifiers demonstrated higher accuracy, highlighting the value of our methodology and dataset.
This research uses a deep learning algorithm for automated lie identification from EEG signals, using 14 channel EEG signals for classification. A novel dataset and protocol that used video for a robbery from a large store have been developed to implement the experimental investigation on EEG untruth detection. Furthermore, we compared the constructed dataset to other datasets exist, such as the Dryad dataset, and we compared the protocol's created results to others in different metrics. Finally, compare the accuracy and other metrics of several deep learning classification algorithms such as the MLP, LSTM, and CNN models. The models CNN, LSTM, and MLP achieved high accuracy than other method it achieved 99.96%, and 99.87, 98.99% in our dataset respectively and 99.36%, 99.64%and and 97.67% on the Dryad dataset. Overall, the experiment findings show that all of the models performed well, with excellent precision, recall, F1 score, and accuracy values. The CNN model consistently produced highest findings across all metrics. It obtained high metrics in our Experiment Dataset Precision 0.999733, Recall 0.999467, F1-Score 0.9996, and Accuracy 0.9996, and in the Dryad Dataset Precision 0.998855, Recall 0.988333, F1-Score 0.993566, and Accuracy 0. 9936.The results above show that the data acquired was superior and more accurate in all metrics, indicating the efficiency of the acquisition data and protocol followed. This study presents a lot important developments, such as the creation of a novel experimental protocol that combines video stimuli with focused questions, the generation of an original dataset from 10 participants, and the evaluation of three distinct classifiers. Furthermore, this experiment used sophisticated deep learning methods including a hyperparameter tuning algorithm, early stopping to avoid overfitting, and 5-fold cross-validation to guarantee strong and dependable model performance. 
Hyperparameter tuning, early stopping, and cross-validation improve the precision and applicability of the proposed models. Models trained on the proposed EXP dataset achieved higher accuracy than those trained on the public Dryad dataset and than those reported in prior research, which highlights the value of the proposed methodology. These contributions extend EEG signal analysis for deception detection and provide a more efficient model for subsequent research. Future work could increase the number of participants to improve the real-time accuracy and reliability of the system.
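The way 5-fold cross-validation and early stopping fit together can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used in the study: `kfold_indices`, `EarlyStopping`, `epochs_run`, the patience value, and the validation-loss sequence in the usage note are all hypothetical, and the actual model training step is omitted.

```python
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k (train, validation) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder so that fold sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(indices[start:start + size])
        start += size
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience=3):
    """Return the number of epochs completed before early stopping triggers."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)
```

In use, each `(train, val)` pair from `kfold_indices` would drive one training run, with the per-epoch validation loss fed to `EarlyStopping`. For a made-up loss sequence `[0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]` and a patience of 3, `epochs_run` stops after epoch 6, since the loss last improved at epoch 3.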
Ethical approval. The EEG recordings used to construct the lie detection dataset, together with the procedures applied to the subjects during signal acquisition, were documented in the informed consent form (ICF). Every individual who took part in the research provided explicit informed consent.
We would like to express our sincere appreciation to everyone who helped us write this paper. Their assistance and guidance were crucial to its completion, and we are grateful for their support and encouragement. We also thank the University of Al-Iraqi and the Ministry of Higher Education and Scientific Research for their support and resources, which were essential to completing the research.
[1] Vicianova, M. (2015). Historical techniques of lie detection. Europe's Journal of Psychology, 11(3): 522-534. https://doi.org/10.5964/ejop.v11i3.919
[2] Siuly, S., Li, Y., Zhang, Y.C. (2016). EEG signal analysis and classification techniques and applications. Springer Cham. https://doi.org/10.1007/978-3-319-47653-7
[3] Nagale, T., Khandare, A. (2023). Comprehensive review of lie detection in subject based deceit identification. In International Conference on Intelligent Computing and Networking, Singapore, pp. 89-105. https://doi.org/10.1007/978-981-99-3177-4_7
[4] Baghel, N., Singh, D., Dutta, M.K., Burget, R., Myska, V. (2020). Truth identification from EEG signal by using convolution neural network: Lie detection. In 2020 43rd International Conference on Telecommunications and Signal Processing (TSP), Milan, Italy, pp. 550-553. https://doi.org/10.1109/TSP49548.2020.9163497
[5] Ramesh, M., Edla, D.R. (2022). Lie detection with the SMOTE technique and supervised machine learning algorithms. In International Conference on Machine Intelligence and Signal Processing, India, pp. 885-896. https://doi.org/10.1007/978-981-99-0047-3_74
[6] Khalil, M.A., Can, J., George, K. (2023). Deep learning applications in Brain computer interface based lie detection. In 2023 IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, pp. 189-192. https://doi.org/10.1109/CCWC57344.2023.10099109
[7] Dodia, S., Edla, D.R., Bablani, A., Ramesh, D., Kuppili, V. (2019). An efficient EEG based deceit identification test using wavelet packet transform and linear discriminant analysis. Journal of Neuroscience Methods, 314: 31-40. https://doi.org/10.1016/j.jneumeth.2019.01.007
[8] Edla, D.R., Dodia, S., Bablani, A., Kuppili, V. (2021). An efficient deep learning paradigm for deceit identification test on EEG signals. ACM Transactions on Management Information Systems (TMIS), 12(3): 1-20. https://doi.org/10.1145/3458791
[9] Boddu, V., Kodali, P. (2023). PSO-based optimization for EEG data and SVM for efficient deceit identification. Soft Computing-A Fusion of Foundations, Methodologies & Applications, 27(14): 9835-9843. https://doi.org/10.1007/s00500-023-08476-3
[10] Haider, S.K., Jiang, A., Jamshed, M.A., Pervaiz, H., Mumtaz, S. (2018). Performance enhancement in P300 ERP single trial by machine learning adaptive denoising mechanism. IEEE Networking Letters, 1(1): 26-29. https://doi.org/10.1109/lnet.2018.2883859
[11] Dodia, S., Edla, D.R., Bablani, A., Cheruku, R. (2020). Lie detection using extreme learning machine: A concealed information test based on short-time Fourier transform and binary bat optimization using a novel fitness function. Computational Intelligence, 36(2): 637-658. https://doi.org/10.1111/coin.12256
[12] AlArfaj, A.A., Mahmoud, H.A.H. (2022). A deep learning model for EEG-based lie detection test using spatial and temporal aspects. Computers, Materials & Continua, 73(3): 5655-5669. https://doi.org/10.32604/cmc.2022.031135
[13] LaRocco, J., Tahmina, Q., Lecian, S., Moore, J., Helbig, C., Gupta, S. (2023). Evaluation of an English language phoneme-based imagined speech brain computer interface with low-cost electroencephalography. Frontiers in Neuroinformatics, 17: 1306277. https://doi.org/10.3389/fninf.2023.1306277
[14] Knierim, M.T., Berger, C., Reali, P. (2021). Open-source concealed EEG data collection for Brain-computer-interfaces-neural observation through OpenBCI amplifiers with around-the-ear cEEGrid electrodes. Brain-Computer Interfaces, 8(4): 161-179. https://doi.org/10.1080/2326263X.2021.1972633
[15] Gao, J., Tian, H., Yang, Y., Yu, X., Li, C., Rao, N. (2014). A novel algorithm to enhance P300 in single trials: Application to lie detection using F-score and SVM. PLoS One, 9(11): e109700. https://doi.org/10.1371/journal.pone.0109700
[16] Films, S. (2022). Thief’s first heist goes terribly wrong. YouTube. https://www.youtube.com/watch?v=DCbkptWz5lo&list=PPSV.
[17] Rosenfeld, J.P., Soskins, M., Bosh, G., Ryan, A. (2004). Simple, effective countermeasures to P300-based tests of detection of concealed information. Psychophysiology, 41(2): 205-219. https://doi.org/10.1111/j.1469-8986.2004.00158.x
[18] Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A., Talwalkar, A. (2018). Hyperband: A novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research, 18(185): 1-52.
[19] Mekruksavanich, S., Jitpattanakul, A. (2023). Effective detection of epileptic seizures through EEG signals using deep learning approaches. Machine Learning and Knowledge Extraction, 5(4): 1937-1952. https://doi.org/10.3390/make5040094
[20] Zhang, X., Yao, L. (2021). Deep Learning for EEG-Based Brain–Computer Interfaces: Representations, Algorithms and Applications. World Scientific, Singapore. https://doi.org/10.1142/q0282