© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
This study presents the first integration of EfficientNet with time-distributed layers and K-Nearest Neighbors (KNN) classification for real-time driver drowsiness detection through sequential image processing. Unlike existing approaches that analyze individual frames in isolation, our methodology uniquely captures temporal patterns in facial expressions by combining EfficientNet's compound scaling properties for feature extraction with KNN's effectiveness in sequential pattern classification. The proposed EfficientNet-KNN architecture processes 5-second video segments through time-distributed layers, reducing feature dimensionality from 1,228,800 to 1,280 while preserving critical temporal information. Preprocessing incorporates facial detection using Haar Cascade filters and Global Average Pooling for computational optimization. Experimental validation on the ULg Multimodality Drowsiness Database (DROZY) demonstrates superior performance across varying dataset proportions (25%, 50%, 75%, and 100%). The integrated model achieved 99.52% accuracy with 0.55-second execution time on the complete dataset, representing a 49.9× improvement in computational efficiency compared to baseline KNN approaches while maintaining higher accuracy. Statistical t-test analysis confirmed significant performance differences (p<0.05), with Cohen's d indicating large effect sizes for both accuracy and speed improvements. However, evaluation remains limited to controlled laboratory conditions with potential demographic bias, requiring comprehensive real-world validation across diverse driving environments and population groups. The framework contributes a computationally efficient drowsiness detection system suitable for real-time automotive applications, balancing high accuracy with minimal resource requirements to enhance driver safety systems.
driver drowsiness detection, deep learning, EfficientNet, K-Nearest Neighbors, sequence processing, real-time detection, temporal analysis
Driver drowsiness represents a significant risk factor in road safety, causing momentary attention lapses that can lead to severe accidents [1, 2]. Recent studies indicate that drowsiness-related incidents account for a substantial proportion of traffic accidents globally, with significant economic and human costs [3]. This phenomenon is particularly prevalent during prolonged driving in monotonous traffic conditions, where drivers often underestimate their fatigue levels and continue operating vehicles despite increasing impairment [4]. As highlighted by Kumar and Tarei [4], unsafe driving behaviors significantly contribute to road accidents, with driver drowsiness being a key contributing factor.
Technological interventions, particularly those leveraging artificial intelligence and computer vision, present promising solutions to this critical safety challenge [5, 6]. The application of deep learning techniques for automated drowsiness detection represents a proactive approach to enhancing road safety through early warning systems [7]. Recent advances in this field have demonstrated remarkable potential, with CNN-based approaches achieving accuracy rates exceeding 95% in controlled environments [8]. These AI-powered systems analyze data from in-vehicle cameras to detect preliminary signs of drowsiness, enabling timely alerts and preventive measures. As demonstrated by Gwak et al. [6], ensemble machine learning approaches using hybrid sensing have shown particular promise in early drowsiness detection.
Current state-of-the-art approaches in drowsiness detection demonstrate varying levels of success. Deep learning-based methods, such as Magán et al.'s [9] ADAS system, show the potential of analyzing temporal image sequences, though achieving only moderate accuracy (65%). More advanced CNN implementations, as shown by JR et al. [8], have achieved 95.3% accuracy in controlled environments. Physiological signal-based approaches, exemplified by Ayatollahi et al.'s [10] ECG analysis, offer alternative detection mechanisms but face practical implementation challenges in vehicular environments. While these studies advance the field, they often prioritize accuracy over computational efficiency, limiting their practical deployment in real-time automotive systems.
Despite these advances, current drowsiness detection systems face several critical limitations. First, many approaches analyze individual frames in isolation, failing to capture the progressive nature of fatigue manifestation in facial expressions over time, as noted in recent temporal analysis studies [11, 12]. Second, existing solutions often require substantial computational resources, making real-time deployment challenging in resource-constrained automotive environments [13]. Third, the trade-off between detection accuracy and processing speed remains inadequately addressed, particularly for sequential image analysis. These limitations are further complicated by variations in environmental conditions and individual differences in drowsiness expression patterns [14], highlighting the need for more robust and efficient detection approaches.
This paper addresses these challenges through three primary contributions: 1) Development of a novel architectural integration combining EfficientNet's feature extraction capabilities with KNN classification for optimal drowsiness detection, building upon recent advances in efficient deep learning architectures [15, 16]; 2) Implementation of an efficient sequence processing approach that captures temporal patterns in facial expressions while maintaining minimal computational overhead, addressing the limitations identified in previous temporal analysis studies [17, 18]; 3) Introduction of a comprehensive evaluation framework demonstrating significant improvements in both accuracy (99.52%) and computational efficiency (49.9× faster than baseline approaches).
Our approach uniquely leverages EfficientNet's compound scaling properties for feature extraction while utilizing KNN's effectiveness in classification, particularly for sequential data patterns indicative of drowsiness progression.
The remainder of this paper is organized as follows: Section 2 provides a comprehensive review of drowsiness detection approaches, highlighting the evolution of methodologies in this field. Section 3 details our technical approach, including the integration of EfficientNet feature extraction with KNN classification and our sequence processing methodology. Section 4 presents experimental results demonstrating substantial improvements in both accuracy and computational efficiency. Section 5 discusses the implications of the findings for automotive safety systems, while Section 6 concludes the study and highlights avenues for future research.
Vision-based drowsiness detection systems utilizing Convolutional Neural Networks (CNNs) have emerged as a prominent solution for enhancing road safety by identifying driver fatigue through visual cues. These systems typically analyze facial features, particularly around the eyes and mouth, to detect signs of drowsiness, such as eye closure and yawning. The effectiveness of CNN-based approaches in this domain is underscored by their ability to achieve high accuracy rates, often exceeding 90% in various studies. For instance, a study by Adhithyaa et al. [19] highlights that CNNs have become a state-of-the-art method for detecting drowsiness, despite challenges posed by variations in lighting and facial expressions. Similarly, Florez et al. [20] reported that their CNN-based approach achieved an impressive accuracy of 99.71% in detecting drowsiness through real-time eye state identification. Other studies, such as that by Jahan et al. [21], emphasize the potential of deep learning techniques to automate drowsiness detection, significantly improving the chances of preventing accidents. However, while CNNs demonstrate remarkable accuracy, they also come with limitations. One significant challenge is their computational intensity, which can hinder real-time application, especially on mobile platforms. Sowmyashree and Sangeetha noted that although CNNs provide good accuracy, their computational demands can be a barrier to implementation in resource-constrained environments [22]. Additionally, the accuracy of these systems can be influenced by individual differences among drivers, such as facial structure and the presence of eyewear, which can obstruct the detection process [14]. Furthermore, variations in environmental conditions, such as lighting and camera angles, can also affect the performance of CNN-based drowsiness detection systems [13].
Vision-based drowsiness detection systems using MobileNet architectures have gained traction due to their lightweight nature and efficiency in real-time applications. MobileNet, a family of Convolutional Neural Networks (CNNs), is designed to perform well on mobile and edge devices, making it suitable for drowsiness detection systems that require quick processing without sacrificing accuracy. MobileNet's feature extraction capabilities are particularly beneficial for detecting drowsiness through facial analysis. The architecture utilizes depthwise separable convolutions, which significantly reduce the number of parameters compared to traditional CNNs while maintaining competitive performance. For instance, Kim et al. [23] demonstrated that their lightweight driver monitoring system based on Multi-Task MobileNets achieved an accuracy of 95% in detecting drowsiness by analyzing eye movements and facial expressions. This efficiency is crucial for deployment in vehicles where computational resources may be limited. Another study by Phan et al. [24] highlighted the effectiveness of MobileNet-V2 in detecting driver drowsiness, achieving an accuracy of 93.7% [24]. The authors emphasized the model's ability to analyze video frames in real-time, allowing for immediate feedback to drivers. This capability is essential for practical applications, as timely alerts can significantly reduce the risk of accidents caused by drowsiness. Despite these advantages, there are notable limitations associated with using MobileNet for drowsiness detection. One significant challenge is the model's sensitivity to variations in lighting conditions and facial occlusions, such as sunglasses or masks. As noted by Lee et al. [25], traditional CNNs, including MobileNet, may struggle to maintain accuracy under these conditions, potentially leading to false negatives in drowsiness detection. 
Furthermore, while MobileNet is designed to be lightweight, the trade-off often results in a reduction in feature extraction depth, which can impact the model's ability to capture subtle signs of drowsiness [13].
Sequential analysis in drowsiness detection using temporal feature extraction methods has become increasingly significant in enhancing the accuracy and reliability of driver monitoring systems. This approach focuses on analyzing sequences of data over time, allowing for the detection of patterns that indicate drowsiness, such as eye closure duration, blink frequency, and head movements. Temporal feature extraction methods can leverage various deep learning architectures, including Long Short-Term Memory (LSTM) networks and CNNs, to process video frames and extract meaningful features that correlate with driver alertness. The work by Saif and Rasyid highlighted the use of CNN architectures to analyze video clips of drivers, achieving superior accuracy in identifying drowsiness compared to traditional methods that rely solely on static images [11]. However, there are limitations associated with temporal feature extraction methods in drowsiness detection. One significant challenge is the computational complexity involved in processing video sequences in real-time. As noted by Wijnands et al. [12], maintaining high accuracy while ensuring low latency can be difficult, particularly in resource-constrained environments such as vehicles. Additionally, variations in lighting conditions and occlusions can affect the quality of the input data, leading to potential inaccuracies in detection.
Here, our research makes a novel contribution by integrating video sequence-based detection with a pre-processing segmentation method before windowing. The architectural combination of EfficientNet and KNN in sequence processing enhances learning performance and improves drowsiness detection accuracy. The uniqueness of our approach lies in its effective integration of video sequence techniques, segmentation strategies, and efficient learning models to significantly boost detection performance.
The primary objective of our research methodology was to develop an algorithm capable of achieving optimal performance in drowsiness detection while maintaining a compact model size and rapid response time. We focus on analyzing the driver's facial features as the central factor in detection, with an emphasis on efficiency to deliver practical solutions for real-world implementation.
Figure 1 illustrates our research framework, encompassing data acquisition and pre-processing, data separation, model design using EfficientNet and KNN approaches in sequence processing, validation using k-fold cross-validation, and comprehensive model evaluation.
Figure 1. Research framework for drowsiness detection
3.1 Dataset and data acquisition
This research utilizes the ULg Multimodality Drowsiness Database (DROZY database) as the primary data source [26]. This multimodal dataset comprises EEG signals, facial expression videos, and subjective assessments using the Karolinska Sleepiness Scale (KSS). The dataset contains 36 videos: 14 depicting non-drowsy conditions and 22 showing drowsy states. Each video is approximately 10 minutes in duration, with a frame resolution of 512×424 pixels, stored in mp4 format at frame rates ranging from 15 to 30 frames per second (fps). These videos were carefully selected to provide diversity and accurately represent drivers' drowsy states, enabling effective detection of various fatigue-related facial expressions.
3.2 Data preprocessing
In the initial preprocessing stage, each video is segmented into 5-second intervals using a windowing approach [27]. This timeframe was selected because microsleep episodes (brief involuntary sleep occurrences) typically manifest within this duration. The 5-second window was chosen on the basis of neurophysiological evidence and automotive safety requirements rather than arbitrarily, and is supported by three converging factors: 1) clinical studies demonstrate that microsleep events typically occur within 1-15 seconds, with peak frequency around 3-8 seconds [28]; 2) drowsiness-related facial changes (progressive eyelid closure, reduced blink frequency, head nodding) manifest over 3-7 second intervals before reaching critical fatigue states [29]; 3) driver response capability deteriorates significantly within the 4-6 seconds preceding sleep onset, making this window critical for intervention timing [30].
Thus, for each 5-second segment, we capture 25 consecutive image frames to document the driver's facial expressions throughout the period, providing diverse facial cues for model training.
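The windowing step can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the 25 frames are taken at evenly spaced positions inside each 5-second window and that an incomplete trailing window is dropped, neither of which is stated in the text. The function name `window_frame_indices` and its parameters are hypothetical.

```python
def window_frame_indices(total_frames, fps, window_sec=5, frames_per_window=25):
    """Return, for each full window, the indices of the frames to keep.

    Assumptions (not from the paper): frames are evenly spaced inside the
    window, and an incomplete trailing window is discarded.
    """
    window_len = int(round(fps * window_sec))      # frames spanned by one window
    if window_len < frames_per_window:
        raise ValueError("window is shorter than the number of frames requested")
    step = window_len / frames_per_window          # spacing between kept frames
    windows = []
    for start in range(0, total_frames - window_len + 1, window_len):
        windows.append([start + int(i * step) for i in range(frames_per_window)])
    return windows


# Example: a 10-minute video at 30 fps gives 120 windows of 25 frames each.
windows = window_frame_indices(total_frames=10 * 60 * 30, fps=30)
print(len(windows), len(windows[0]), windows[0][:3])
```

At 30 fps every sixth frame is kept, and at 15 fps every third frame, so both ends of the frame-rate range reported for the dataset produce 25 frames per segment under this assumption.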
Subsequently, each captured image is resized to 128×128 pixels [31, 32] to reduce computational complexity and improve processing efficiency while preserving essential visual information. The Haar Cascade method is then applied to detect faces in each image [33], ensuring the model focuses specifically on facial features relevant to drowsiness detection.
The Haar Cascade technique, based on Haar features, detects specific objects by analyzing light and shadow patterns in images. The method was selected for its effectiveness in real-time object detection [34]. Haar Cascade employs Haar filters, integral images, and cascade phases during the training process. The Haar filter can be visualized as a rectangular function with positive values in one half and negative values in the other, expressed as:
$H(x, y)=\begin{cases} 1, & \text{if the pixel } (x, y) \text{ is inside the eye area} \\ -1, & \text{otherwise} \end{cases}$ (1)
where, x and y represent pixel coordinates in the two-dimensional image space. In drowsiness detection, these coordinates help identify pixel locations in the eye area—a critical region for detecting fatigue indicators such as slow eye movements or closure.
The integral image (II) enables rapid computation of the sum of pixel intensities over any rectangular image region, using the cumulative formula:
$I I(x, y)=\sum_{i=0}^x \sum_{j=0}^y I(i, j)$ (2)
where, I(i,j) represents pixel intensity at position (i,j), and II(x,y) is the cumulative sum up to coordinates (x,y). With this representation, the intensity sum over any rectangle requires only four lookups, which accelerates the Haar Cascade detection process [35].
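Equation (2) and its use for evaluating a Haar-like filter can be illustrated with a short sketch. It is written in plain Python for clarity and is not the detector used in the study; the helper names (`integral_image`, `rect_sum`, `two_rect_haar`) and the toy image are assumptions made for illustration.

```python
def integral_image(img):
    """Cumulative sum of Eq. (2) for a 2-D list of intensities (indexed [row][col])."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of intensities in the inclusive rectangle, using at most four lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total


def two_rect_haar(ii, top, left, bottom, right):
    """Hypothetical two-rectangle Haar-like feature: left half (+1) minus right half (-1)."""
    mid = (left + right) // 2
    return rect_sum(ii, top, left, bottom, mid) - rect_sum(ii, top, mid + 1, bottom, right)


img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
print(ii[2][3])                       # sum of all pixels: 78
print(rect_sum(ii, 1, 1, 2, 2))       # 6 + 7 + 10 + 11 = 34
print(two_rect_haar(ii, 0, 0, 2, 3))  # 33 - 45 = -12
```

Because each rectangle sum costs a constant number of lookups regardless of its size, many such features can be evaluated per candidate window at low cost.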
Figure 2 shows visual representations of facial frames in the driver drowsiness detection process, while Figure 3 illustrates the results of applying the Haar Cascade method to the image data.
Following face detection, the processed image data is normalized by dividing pixel values by 255 [36], adjusting the pixel range from 0-255 to 0-1. This normalization standardizes the data to align with model training requirements, particularly for activation functions that operate optimally with inputs in specific ranges [37, 38].
Figure 2. Visual representations of facial frames in the driver drowsiness detection process (a) normal; (b) drowsiness
Figure 3. (a) Pre-processing visualizations and (b) Haar cascade method results on image data
The prepared dataset is then divided into training (70%) and testing (30%) sets [39], with label stratification to maintain consistent class distribution across subsets. Additional experiments were conducted with varying dataset proportions (100%, 75%, 50%, and 25% of the initial dataset) to investigate the impact of dataset size on model performance [40].
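A stratified 70/30 split of the kind described above can be sketched in a few lines. This is an illustrative stand-in, not the authors' implementation: the function name, the fixed seed, and the toy labels are assumptions. scikit-learn's `train_test_split` with its `stratify` argument provides comparable behavior.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.30, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)                          # shuffle within the class only
        n_test = round(len(idxs) * test_ratio)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (not the DROZY data): 40 awake (0) and 60 drowsy (1) segments.
labels = [0] * 40 + [1] * 60
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```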
3.3 Model deployment
Our study aims to enhance eye state classification accuracy (open versus closed), exceeding traditional detection systems. The research progresses through two model development strategies: first, implementing a K-Nearest Neighbors (KNN) model; second, employing EfficientNet for feature extraction followed by comprehensive model comparison.
3.3.1 Strategy I: Model construction with the K-Nearest Neighbors algorithm
The first strategy employs the KNN algorithm for drowsiness detection. KNN is a machine learning classification method that considers the majority class from a set of K-Nearest Neighbors [41, 42]. In our context, KNN identifies facial expression patterns associated with drowsiness by comparing similarities with previously acquired training data [43]. Figure 4 presents the complete architecture for this approach.
Figure 4. Classification process using Strategy 1: K-Nearest Neighbors (KNN)
KNN is expected to achieve high accuracy in recognizing specific drowsiness indicators such as closed eyes or slower facial movements. The algorithm classifies items based on their similarity to training data using the Euclidean distance formula:
$d(A, B)=\sqrt{\sum_{i=1}^n\left(A_i-B_i\right)^2}$ (3)
where, Aᵢ and Bᵢ are the i-th components of vectors A and B, and n is the number of features. For classification, KNN identifies the K closest training instances and assigns the class based on majority voting:
$y^{\prime}=\operatorname{argmax}_c\left(\sum_{i=1}^K \delta\left(c_i, c\right)\right)$ (4)
where, $\operatorname{argmax}_c$ returns the class c with the largest count among the K nearest neighbors, cᵢ is the class of the i-th neighbor, and δ is the Kronecker delta function (1 if cᵢ = c, 0 otherwise).
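Equations (3) and (4) translate directly into code. The sketch below is a brute-force illustration on toy two-dimensional features, not the implementation evaluated in this study; the function names and the sample points are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Eq. (3): Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Eq. (4): majority vote among the k training points closest to `query`."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D features (illustrative only): label 0 = awake, 1 = drowsy.
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_x, train_y, (0.15, 0.1)))  # 0
print(knn_predict(train_x, train_y, (1.0, 0.9)))   # 1
```

Every prediction computes a distance to every training vector, so the cost grows with both the number of training samples and the feature dimension n.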
For high-dimensional facial feature data, small K values are preferred because they preserve the local pattern sensitivity crucial for subtle drowsiness indicators, reduce computational overhead for real-time processing, and maintain discriminative power in sparse high-dimensional spaces [44].
3.3.2 Strategy II: Model construction with KNN and feature extraction using EfficientNet
Our second strategy integrates EfficientNetB0 for feature extraction with KNN for classification. EfficientNetB0 is utilized to extract meaningful features from facial image frames [15, 16].
EfficientNet as feature extraction. EfficientNet represents a family of convolutional neural network models that achieve state-of-the-art accuracy with significantly fewer parameters than traditional architectures. The model employs compound scaling to systematically balance network depth, width, and resolution dimensions. EfficientNetB0, the baseline architecture in this family, utilizes mobile inverted bottleneck convolution (MBConv) as its core building block, incorporating squeeze-and-excitation optimization.
In our implementation, EfficientNetB0 serves as a feature extractor that processes input images and produces rich feature representations. The model's efficient design allows it to capture complex patterns in facial features while maintaining computational efficiency—a crucial consideration for real-time drowsiness detection systems.
Time distributed layer. Since driver videos contain sequences of frames demonstrating temporal changes in facial expressions, we implement a "time-distributed" layer in the EfficientNetB0 architecture [17]. This specialized layer enables sequential feature extraction on each facial frame in the video, facilitating temporal understanding of facial expression evolution [18].
The time-distributed layer wraps the EfficientNetB0 base model, applying it to each temporal slice of input separately. This approach preserves the temporal relationship between consecutive frames while extracting spatial features from each frame. By processing the input as a sequence rather than independent frames, the model can detect temporal patterns indicative of drowsiness development, such as progressive eye closure or head position changes across multiple frames [45].
Global Average Pooling. Following feature extraction, Global Average Pooling (GAP) is applied to reduce the dimensionality of the extracted features [46]. GAP calculates the average feature values across the entire spatial domain of the image, producing a more condensed representation while preserving essential information [47].
In our EfficientNetB0 implementation, GAP is applied after feature extraction to reduce data complexity while retaining discriminative information relevant to drowsiness detection. This condensed feature representation serves as input for the subsequent KNN classification, effectively combining EfficientNetB0's feature extraction capabilities with KNN's classification strengths [48].
Figure 5 illustrates the complete architecture for this integrated approach.
Figure 5. Classification process using Strategy 2: K-Nearest Neighbors (KNN) with feature extraction using
Figure 6. Feature extraction visualization using 3-component PCA for awake and drowsy states
Feature extraction results with 3-component PCA. To visualize the effectiveness of our feature extraction process, we employed Principal Component Analysis (PCA) to reduce the high-dimensional feature space to three principal components. Figure 6 and Table 1 present the visualization of these components for both awake and drowsy states.
Figure 6 demonstrates the distribution of extracted features in 3D space after dimensionality reduction through PCA. The clear separation between awake (green) and drowsy (red) states validates the discriminative power of our feature extraction approach. This visualization confirms that the features extracted by EfficientNetB0 contain sufficient information to distinguish between the two classes.
Table 1 presents detailed statistical distributions of the extracted features for both classes (y=0 for awake, y=1 for drowsy). The statistical properties of each principal component demonstrate distinct characteristics between the two classes. For the first principal component (PC1), awake subjects show a mean of 2.63 compared to -1.69 for drowsy subjects. PC2 exhibits an inverse relationship with means of -1.03 for awake versus 0.66 for drowsy states. PC3 follows a similar pattern with means of -0.28 for awake versus 0.18 for drowsy states. The variability (standard deviation) is comparable between classes across all components, with slightly higher variance in the awake state for PC1 and PC2 and higher variance in the drowsy state for PC3. These statistical differences support the discriminative capability of our feature extraction methodology, providing quantitative evidence that the extracted features contain sufficient information to differentiate between drowsiness states.
Table 1. Statistical distribution
| Class | Statistic | PC1 | PC2 | PC3 |
|---|---|---|---|---|
| Y=0 (Awake) | Min | -33.224052 | -29.862001 | -25.142054 |
| Y=0 (Awake) | Max | 78.530464 | 55.671883 | 29.180292 |
| Y=0 (Awake) | Mean | 2.630844 | -1.031686 | -0.284525 |
| Y=0 (Awake) | Median | -2.788281 | -3.733640 | 1.641844 |
| Y=0 (Awake) | Std | 22.256229 | 15.719784 | 10.199729 |
| Y=0 (Awake) | Var | 495.339712 | 247.111608 | 104.034469 |
| Y=0 (Awake) | Mode | -10.447001 | 3.368059 | -0.692336 |
| Y=1 (Drowsy) | Min | -59.51278 | -41.645149 | -30.733538 |
| Y=1 (Drowsy) | Max | 65.059044 | 41.759651 | 45.720730 |
| Y=1 (Drowsy) | Mean | -1.692284 | 0.663629 | 0.183020 |
| Y=1 (Drowsy) | Median | -2.702865 | -1.614298 | -0.463261 |
| Y=1 (Drowsy) | Std | 22.000011 | 14.938714 | 12.677819 |
| Y=1 (Drowsy) | Var | 484.000480 | 223.165182 | 160.727093 |
| Y=1 (Drowsy) | Mode | -59.512978 | -41.645149 | -30.733538 |
Following model development for each strategy, we employed K-Fold cross-validation with stratification (K=3) to evaluate model performance. This approach divides the dataset into three alternating subsets while maintaining balanced label distribution in each fold, providing robust assessment of model generalizability.
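The fold construction can be sketched as below. This is a simplified illustration under stated assumptions: samples of each class are dealt round-robin into the folds without shuffling, and the function name and toy labels are hypothetical. scikit-learn's `StratifiedKFold` is a commonly used implementation of the same idea.

```python
from collections import defaultdict


def stratified_kfold(labels, k=3):
    """Yield (train_idx, test_idx) pairs; each class is dealt round-robin across k folds."""
    folds = [[] for _ in range(k)]
    seen = defaultdict(int)
    for idx, label in enumerate(labels):
        folds[seen[label] % k].append(idx)
        seen[label] += 1
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


labels = [0] * 6 + [1] * 9   # toy labels, not the DROZY data
splits = list(stratified_kfold(labels, k=3))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

Each fold serves once as the test set, and every fold keeps the 6:9 class ratio of the toy labels.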
3.4 Model evaluation
We employed multiple performance metrics to evaluate our drowsiness detection models. Accuracy measures the algorithm's overall correctness in classifying the entire dataset [49]. Precision indicates the model's ability to generate accurate positive predictions of drowsiness while avoiding false positives [50]. Recall (sensitivity) assesses the model's capacity to identify all genuine instances of drowsiness. The F1 Score balances precision and recall, offering comprehensive insight into model performance [50].
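For reference, these metrics follow the standard definitions below, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives. The formulas are not printed in the source and are assumed here to be the conventional ones:

$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}, \quad \text{Precision}=\frac{TP}{TP+FP}, \quad \text{Recall}=\frac{TP}{TP+FN}, \quad F1=\frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision}+\text{Recall}}$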
Additionally, execution time metrics are critical for drowsiness detection systems intended for real-time applications. The model's ability to generate predictions promptly ensures timely response to driver fatigue, potentially preventing accidents.
This section presents the experimental findings from our driver drowsiness detection methodology, evaluating both proposed strategies under various configurations.
4.1 Results of Strategy I: Model construction with the K-Nearest Neighbors algorithm
Strategy I employed KNN with original features of dimension 25×128×128×3 (1,228,800 features), where each sample comprised 25 facial frames at 128×128 resolution with 3 RGB channels. While this extensive feature set captures detailed facial expression patterns, it presents computational challenges due to high dimensionality.
The initial phase involved determining the optimal K value for the KNN algorithm through three-fold cross-validation on a 25% dataset partition. Table 2 presents the accuracy metrics for K=3, K=5, and K=7 across each validation fold.
As Table 2 shows, K=3 achieved the highest mean test accuracy at 96.7%, indicating robust classification performance for facial expression patterns indicative of driver drowsiness. This high accuracy establishes a reliable foundation for further model refinement and deployment.
Table 2. Results of determining the optimal k parameters in the KNN algorithm
| K | Fold 1 | Fold 2 | Fold 3 | Mean Test Score |
|---|---|---|---|---|
| 3 | 0.967 | 0.963 | 0.976 | 0.967 |
| 5 | 0.955 | 0.963 | 0.963 | 0.961 |
| 7 | 0.947 | 0.955 | 0.947 | 0.950 |
4.2 Results of Strategy II: Integration of K-Nearest Neighbors with EfficientNet feature extraction
Strategy II implemented feature extraction using EfficientNet before KNN classification. Table 3 presents the architecture of this feature extractor network, processing 25 frames of 128×128 pixel facial images with 3 color channels.
The time-distributed layer applies EfficientNetB0 to each frame in turn, yielding a feature map of dimension (4, 4, 1280) per frame and a tensor of dimension (25, 4, 4, 1280) per segment. The GlobalAveragePooling3D layer then averages the feature values across the temporal and spatial dimensions, producing a condensed feature vector of length 1280. This represents a significant dimensionality reduction compared to Strategy I's 1,228,800 features.
The total parameter count for this model is 4,049,571 (4,007,548 trainable, 42,023 non-trainable), demonstrating EfficientNet's ability to create rich feature representations with computational efficiency. The dimensionality reduction from 1,228,800 to 1,280 features while maintaining discriminative power highlights the effectiveness of this approach for drowsiness detection.
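The dimensionality figures can be checked with a short worked calculation. The symbols $F$ (the time-distributed output) and $z$ (the pooled feature vector) are introduced here for illustration, and the pooling expression assumes the standard behavior of a 3D global average pooling layer applied to the (25, 4, 4, 1280) tensor listed in Table 3:

$25 \times 128 \times 128 \times 3=1{,}228{,}800$

$z_c=\frac{1}{25 \cdot 4 \cdot 4} \sum_{t=1}^{25} \sum_{h=1}^{4} \sum_{w=1}^{4} F_{t, h, w, c}, \quad c=1, \ldots, 1280$

$\frac{1{,}228{,}800}{1{,}280}=960$

Under this reading, each 5-second segment is represented by 960 times fewer values in Strategy II than in Strategy I.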
Table 3. Architecture of the EfficientNet feature extractor network

| Layer (Type) | Output Shape | Parameters |
|---|---|---|
| input (InputLayer) | (None, 25, 128, 128, 3) | 0 |
| time_distributed (TimeDistributed) | (None, 25, 4, 4, 1280) | 4,049,571 |
| global_average_pooling3d (GlobalAveragePooling3D) | (None, 1280) | 0 |
| Total parameters | | 4,049,571 |
| Trainable parameters | | 4,007,548 |
| Non-trainable parameters | | 42,023 |
4.3 Comparative analysis: Performance evaluation of Strategies I and II
Table 4 presents a comprehensive comparison of both strategies, KNN and EfficientNet + KNN (EFF+KNN), across varying dataset percentages (25%, 50%, 75%, and 100%). Performance metrics include accuracy, precision, recall, F1-score, and execution time in seconds.
Table 4. Comparison of the performance of Strategy Models 1 and 2 based on dataset composition in driver drowsiness detection
| Model | Performance Metrics | 25% | 50% | 75% | 100% |
|---|---|---|---|---|---|
| KNN | Accuracy | 0.9810 | 0.9858 | 0.990 | 0.993 |
| KNN | Precision | 0.9788 | 0.9841 | 0.990 | 0.992 |
| KNN | Recall | 0.9815 | 0.9861 | 0.989 | 0.993 |
| KNN | F1-Score | 0.9801 | 0.9851 | 0.990 | 0.993 |
| KNN | Time (Sec) | 6.81 | 14.29 | 20.73 | 27.43 |
| EFF + KNN | Accuracy | 0.9842 | 0.9921 | 0.9926 | 0.9952 |
| EFF + KNN | Precision | 0.9841 | 0.9907 | 0.9920 | 0.9957 |
| EFF + KNN | Recall | 0.9826 | 0.9928 | 0.9925 | 0.9943 |
| EFF + KNN | F1-Score | 0.9833 | 0.9917 | 0.9923 | 0.9950 |
| EFF + KNN | Time (Sec) | 0.21 | 0.41 | 0.45 | 0.55 |
The results demonstrate that the EFF+KNN model with 100% dataset utilization achieved the highest accuracy at 99.52%, with corresponding precision of 99.57%, recall of 99.43%, and F1-score of 99.50%. Figure 7 visually represents this performance comparison, illustrating the superior metrics of the EFF+KNN approach across all dataset configurations.
Notably, the execution time for the EFF+KNN model on the complete dataset was only 0.55 seconds, compared to 27.43 seconds for the KNN model—a 49.9× improvement in computational efficiency while maintaining superior accuracy. This significant execution time reduction is crucial for real-time drowsiness detection systems where timely alerts can prevent accidents.
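The speed-up factor follows directly from the execution times in Table 4, and the same ratio can be computed for every dataset proportion. The snippet below is a small arithmetic check; the variable names are illustrative.

```python
knn_time = {25: 6.81, 50: 14.29, 75: 20.73, 100: 27.43}      # seconds, Table 4
eff_knn_time = {25: 0.21, 50: 0.41, 75: 0.45, 100: 0.55}     # seconds, Table 4

# Ratio of KNN time to EFF+KNN time at each dataset proportion.
speedup = {p: knn_time[p] / eff_knn_time[p] for p in knn_time}
for p, s in speedup.items():
    print(f"{p}% of data: {s:.1f}x faster")
```

The ratio grows from about 32× at 25% of the data to about 50× on the complete dataset, so the advantage of the compact 1,280-dimensional features widens as the dataset grows.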
Figure 8 presents the confusion matrix for the optimal EFF+KNN model with 100% dataset utilization. The matrix reveals:
- True Negatives (TN): 489 data points correctly classified as Awake
- False Positives (FP): 1 data point incorrectly classified as Drowsy
- False Negatives (FN): 5 data points incorrectly classified as Awake
- True Positives (TP): 768 data points correctly classified as Drowsy
Figure 7. Comparison of the performance of Strategy Models 1 and 2 based on dataset composition in driver
Figure 8. Confusion matrix for optimal EFF+KNN with 100% dataset utilization
The model achieved 99.80% precision and 98.99% recall for the Awake class, with an F1-score of 99.39%. For the Drowsy class, the model demonstrated 99.35% precision and 99.87% recall, with an F1-score of 99.61%. These metrics confirm the model's exceptional ability to discriminate between awake and drowsy states, fulfilling the primary objective of reliable drowsiness detection.
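The overall accuracy and the per-class F1-scores can be recomputed from the four confusion-matrix counts. The sketch below restricts itself to these two metrics because both remain the same if the FP and FN counts are exchanged, so they do not depend on which axis of the matrix holds the predictions; the variable names are illustrative.

```python
tn, fp, fn, tp = 489, 1, 5, 768   # counts listed for Figure 8

total = tn + fp + fn + tp
accuracy = (tp + tn) / total

# F1 = 2*TP / (2*TP + FP + FN), which is symmetric in FP and FN.
f1_drowsy = 2 * tp / (2 * tp + fp + fn)
f1_awake = 2 * tn / (2 * tn + fp + fn)   # Awake treated as the positive class

print(f"accuracy  = {accuracy:.4f}")   # 0.9952
print(f"F1 awake  = {f1_awake:.4f}")   # 0.9939
print(f"F1 drowsy = {f1_drowsy:.4f}")  # 0.9961
```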
Moreover, as shown in Table 5, the proposed method (EFF+KNN) achieved higher accuracy than other studies using DROZY: Lin et al. [51] reached 95.79% with a multi-aware graph convolutional network (MAGCN), Pahariya et al. [52] obtained 86% using MobileNetV2 with transfer learning, and Wunan et al. [53] obtained 98.5% using EfficientNetB0. Only the recall reported for MAGCN (99.56%) is marginally higher than that of EFF+KNN (99.43%).
Table 5. Comparison with other researchers using DROZY
| Author | Method | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| Ours | EFF+KNN | 99.52 | 99.57 | 99.43 |
| [51] | MAGCN | 95.79 | 95.66 | 99.56 |
| [52] | MobileNet | 86 | - | - |
| [53] | EfficientNetB0 | 98.5 | - | - |
4.4 Statistical significance of sequential image data processing
Table 6 presents t-test results comparing the KNN and EFF+KNN approaches. With t-statistic values of -3.839 for accuracy and 3.894 for execution time (both with p-values < 0.05), the performance differences between the models are statistically significant.
Table 6. Statistical t-test between Strategy 1 and Strategy 2
| Performance Metrics | t-Statistic | Significance (p-value) |
|---|---|---|
| Accuracy | -3.839 | 0.031 |
| Time | 3.894 | 0.030 |
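The execution-time row of Table 6 is consistent with a paired t-test over the four dataset proportions. The sketch below is an illustration under stated assumptions, not necessarily the authors' exact procedure: it assumes the KNN timings are 6.81, 14.29, 20.73, and 27.43 s, pairs them with the EFF+KNN timings, and uses the closed-form CDF of Student's t distribution that holds specifically for 3 degrees of freedom. All function names are hypothetical.

```python
import math
from statistics import mean, stdev

# Execution times (s) at 25%, 50%, 75%, 100% of the dataset.
# Assumption: KNN_TIME is the "Time (Sec)" row of the KNN-only results.
KNN_TIME = [6.81, 14.29, 20.73, 27.43]
EFF_KNN_TIME = [0.21, 0.41, 0.45, 0.55]


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def two_sided_p_df3(t):
    """Two-sided p-value for Student's t with exactly 3 degrees of freedom."""
    x = abs(t) / math.sqrt(3)
    cdf = 0.5 + (x / (1 + x * x) + math.atan(x)) / math.pi
    return 2 * (1 - cdf)


t_stat, df = paired_t(KNN_TIME, EFF_KNN_TIME)
p_value = two_sided_p_df3(t_stat)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Under these assumptions the statistic comes out at about 3.89 with a two-sided p-value of about 0.030, in line with the Time row of Table 6. With only four pairs, the test has 3 degrees of freedom, so the result should be read with that small sample size in mind.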
The superior performance of the EFF+KNN model can be attributed to the effectiveness of sequential or time-distributed layers in EfficientNetB0. These layers enable sequential feature extraction across multiple facial frames, enhancing the model's awareness of temporal changes in facial expressions. This temporal context is critical for drowsiness detection, as fatigue typically manifests as progressive changes in facial features over time rather than instantaneous transformations.
We also conducted a further statistical analysis of the experimental data, summarized in Table 7, comparing KNN and EFF+KNN performance across the four dataset proportions (25%, 50%, 75%, 100%).
Table 7. Statistical analysis on accuracy and execution time
| Metrics | KNN (mean ± std) | EFF+KNN (mean ± std) | P-value | Cohen’s d | 95% CI | Effect Size |
|---|---|---|---|---|---|---|
| Accuracy | 0.9875±0.0051 | 0.9910±0.0048 | 0.031* | 0.72 | [0.0008, 0.0062] | Medium |
| Execution Time (s) | 17.32±8.41 | 0.41±0.14 | 0.030* | 2.89 | [11.23, 22.59] | Large |
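The EFF+KNN summary statistics in Table 7 can be reproduced from the four per-proportion results reported earlier. The sketch below assumes the sample standard deviation (n − 1 in the denominator) is what the table reports; the helper name is illustrative.

```python
from statistics import mean, stdev

# EFF+KNN results at 25%, 50%, 75%, and 100% of the dataset.
EFF_KNN_ACCURACY = [0.9842, 0.9921, 0.9926, 0.9952]
EFF_KNN_TIME = [0.21, 0.41, 0.45, 0.55]


def mean_std(values):
    """Return the mean and the sample standard deviation (n - 1)."""
    return mean(values), stdev(values)


acc_mean, acc_std = mean_std(EFF_KNN_ACCURACY)
time_mean, time_std = mean_std(EFF_KNN_TIME)
print(f"accuracy: {acc_mean:.4f} +/- {acc_std:.4f}")
print(f"time (s): {time_mean:.3f} +/- {time_std:.3f}")
```

Under that assumption the accuracy mean is about 0.9910 with a standard deviation of about 0.0047, and the mean time is about 0.405 s with a standard deviation of about 0.14 s, closely matching the EFF+KNN column.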
Our experimental results demonstrate that combining EfficientNet for feature extraction with KNN for classification (EFF+KNN) produces superior performance in driver drowsiness detection compared to using KNN alone. Several key factors contribute to this enhanced performance:
5.1 Model architecture and feature representation
The significant improvement in both accuracy and execution time for the EFF+KNN model can be attributed to EfficientNet's ability to extract highly discriminative features while reducing dimensionality. The original feature space of 1,228,800 dimensions (25×128×128×3) was reduced to 1,280 dimensions through EfficientNet's architecture, resulting in a more compact yet informative representation.
This dimensionality reduction directly impacts execution time, with the EFF+KNN model processing data approximately 50 times faster than the KNN-only approach. Such efficiency is crucial for real-time drowsiness detection systems where prompt alerts can prevent accidents. The statistical significance of these performance differences, confirmed through t-tests (p < 0.05), validates the superiority of the combined approach.
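The link between feature dimensionality and KNN cost can be made concrete with a short calculation. The sketch below assumes a brute-force nearest-neighbour search, in which each query computes one distance per stored training vector; the training-set size `N_TRAIN` is a hypothetical value chosen only for illustration.

```python
from math import prod

RAW_SHAPE = (25, 128, 128, 3)  # frames x height x width x channels
FEATURE_DIM = 1280             # EfficientNet feature vector length
N_TRAIN = 1000                 # hypothetical number of stored training samples


def distance_ops(n_train, dim):
    """Multiply-adds for one brute-force Euclidean query: one per dimension per stored sample."""
    return n_train * dim


raw_dim = prod(RAW_SHAPE)
reduction = raw_dim / FEATURE_DIM

print(f"raw dimensionality: {raw_dim:,}")
print(f"reduction factor:   {reduction:.0f}x")
print(f"ops per query, raw: {distance_ops(N_TRAIN, raw_dim):,}")
print(f"ops per query, EFF: {distance_ops(N_TRAIN, FEATURE_DIM):,}")
```

The distance computation alone shrinks by a factor of 960. The measured end-to-end speed-up of about 50× is smaller, presumably because the reported timings also include costs that do not scale with the KNN input dimension.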
5.2 Temporal context benefits
The integration of time-distributed layers in our EFF+KNN model enables sequential processing of facial frames, capturing temporal relationships that are essential for drowsiness detection. This temporal awareness allows the model to recognize progressive patterns such as gradual eye closure, head nodding, or delayed blink recovery—subtle indicators of increasing fatigue that might be missed in frame-by-frame analysis.
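The idea of a time-distributed layer can be illustrated without any deep-learning framework: the same feature extractor is applied independently to every frame of the sequence, and the per-frame outputs are then combined. In the minimal sketch below, `toy_extractor` is a hypothetical stand-in for the CNN, the four-pixel "frames" are made-up values, and averaging over the time axis is an assumption used purely for illustration, since the exact way per-frame features are merged into one sequence descriptor is not detailed here.

```python
from statistics import fmean


def time_distributed(extractor, frames):
    """Apply the same extractor independently to each frame of a sequence."""
    return [extractor(frame) for frame in frames]


def temporal_average(per_frame_features):
    """Average the feature vectors over the time axis into one vector."""
    return [fmean(column) for column in zip(*per_frame_features)]


def toy_extractor(frame):
    """Hypothetical stand-in for a CNN: mean and max pixel intensity."""
    return [fmean(frame), max(frame)]


# Four made-up frames whose intensity falls over time (e.g. an eye closing).
frames = [
    [0.8, 0.8, 0.8, 0.8],
    [0.6, 0.6, 0.6, 0.6],
    [0.4, 0.4, 0.4, 0.4],
    [0.2, 0.2, 0.2, 0.2],
]

per_frame = time_distributed(toy_extractor, frames)
sequence_vector = temporal_average(per_frame)
print(per_frame)
print(sequence_vector)
```

The per-frame list preserves the order in which features change, whereas a plain average keeps only aggregate information about the sequence, so the choice of temporal aggregation determines how much of the progression is retained.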
The confusion matrix results further support this benefit, with high precision and recall for both classes (Awake and Drowsy). With only six misclassifications among 1,263 test samples, the model keeps both false positives and false negatives low, which indicates robust performance on the DROZY data; robustness under real-world driving conditions remains to be validated (see Section 5.5).
5.3 Dataset size implications
Our experiments across varying dataset proportions (25% to 100%) reveal interesting insights about model scalability. Both models showed performance improvements with increased data availability, but the EFF+KNN model demonstrated stronger performance even with limited data. This suggests that the feature extraction capabilities of EfficientNet effectively capture essential drowsiness indicators even from smaller datasets, making it particularly valuable for applications where data collection might be constrained.
5.4 Practical applications
The high accuracy (99.52%) and rapid execution time (0.55 seconds) of our EFF+KNN model have significant implications for practical deployment in vehicles. The model's compact size and computational efficiency make it suitable for integration with limited-resource hardware in automotive systems, potentially enabling widespread adoption of drowsiness detection technology.
The minimal false positive rate is particularly important, as frequent false alarms could lead to driver complacency or system disregard. Conversely, the high detection rate for genuine drowsiness ensures timely interventions when needed, potentially preventing fatigue-related accidents.
5.5 Limitations and considerations
Despite the impressive performance achieved, several critical limitations warrant consideration for transparent reporting and future research guidance. The current study is primarily constrained by demographic bias inherent in the ULg DROZY database, which predominantly comprises Caucasian European participants with limited age diversity (mean age ~23 years), potentially affecting model generalizability across diverse ethnic populations and age groups. This demographic homogeneity may limit detection accuracy for individuals with different facial structures, eyelid morphologies, and cultural variations in fatigue expression patterns that are common in global automotive applications. Additionally, the controlled laboratory evaluation environment may not fully represent real-world driving conditions, including dynamic lighting variations, environmental factors such as vehicle vibrations and weather conditions, and the absence of actual driving stress that could influence fatigue manifestation patterns. The model's reliance on facial features makes it vulnerable to common occlusions such as sunglasses, face masks, or hair coverage, while the single-modal vision-only approach excludes potentially valuable drowsiness indicators from steering patterns, physiological signals, or voice analysis.
To address these limitations, several mitigation strategies are proposed: (1) Demographic diversity enhancement through collaboration with international research institutions to develop multi-ethnic datasets spanning diverse age ranges and professional driver populations, coupled with transfer learning approaches for demographic-specific model adaptation; (2) Environmental robustness improvement via comprehensive data augmentation strategies including synthetic lighting condition generation, weather effect simulation, and multi-environmental field validation across different geographic locations and vehicle configurations; (3) Technical enhancement through multimodal integration incorporating steering pattern analysis, physiological monitoring, and adaptive temporal window sizing for individual-specific fatigue progression modeling; and (4) Validation robustness enhancement using independent cross-cultural datasets and longitudinal studies to assess seasonal variations and individual adaptation patterns. While these limitations must be considered when interpreting results, the study provides valuable contributions in demonstrating the first systematic evaluation of EfficientNet-KNN integration for drowsiness detection, establishing a statistical framework for method comparison, and achieving computational efficiency suitable for real-time automotive applications that inform future development of more robust, inclusive, and practical drowsiness detection systems.
This research presents a novel approach to driver drowsiness detection that integrates EfficientNet for feature extraction with KNN for classification of sequential image data. The methodology achieves 99.52% accuracy with an execution time of 0.55 seconds on the complete DROZY dataset, outperforming the KNN-only baseline in both accuracy and speed.
The key contributions of this study include: 1) Development of an efficient drowsiness detection algorithm balancing high accuracy with minimal computational requirements; 2) Effective utilization of sequence processing techniques to capture temporal patterns in facial expressions; 3) Statistical validation of performance improvements through robust experimental design; 4) Practical implementation considerations for real-world automotive applications.
The EfficientNet + KNN (EFF+KNN) model's ability to accurately classify drowsiness states while maintaining computational efficiency addresses a critical gap in existing drowsiness detection systems, potentially enabling more widespread adoption of this safety technology.
Future research will focus on several technical advances needed to translate these laboratory results into practical automotive deployments. We plan to pursue edge computing optimization as an immediate priority, developing a TinyEfficientNet architecture with knowledge distillation to achieve compact model sizes suitable for automotive Electronic Control Units (ECUs). This implementation will involve creating teacher-student distillation frameworks using EfficientNetB0 as the teacher and TinyEfficientNet with reduced width and depth multipliers as the student, targeting high knowledge retention with significant size reduction. We also intend to explore post-training quantization to further reduce the memory footprint while maintaining acceptable accuracy, coupled with ONNX Runtime optimization to enable cross-platform deployment on ARM Cortex and Qualcomm Snapdragon automotive processors. Finally, we plan comprehensive real-time inference benchmarking on actual automotive hardware across various temperature conditions.
This research is supported by Universitas Dian Nuswantoro, specifically by the Research Center for Intelligent Distributed Surveillance and Security, with a particular emphasis on Artificial Intelligence Studies in Smart Society.
[1] Jannusch, T., Shannon, D., Völler, M., Murphy, F., Mullins, M. (2021). Cars and distraction: How to address the limits of Driver Monitoring Systems and improve safety benefits using evidence from German young drivers. Technology in Society, 66: 101628. https://doi.org/10.1016/j.techsoc.2021.101628
[2] Brown, I.D. (2019). Methodological issues in driver fatigue research. In Fatigue and Driving, pp. 155-166. Routledge. https://doi.org/10.1201/9780203756140-17
[3] Chantith, C., Permpoonwiwat, C.K., Hamaide, B. (2021). Measure of productivity loss due to road traffic accidents in Thailand. IATSS Research, 45(1): 131-136. https://doi.org/10.1016/j.iatssr.2020.07.001
[4] Kumar Gangadhari, R., Kumar Tarei, P. (2021). Qualitative investigation of the influential factors behind unsafe trucking behaviors in India. Transportation Research Record, 2675(1): 67-78. https://doi.org/10.1177/0361198120964724
[5] Yousif, M.T., Sadullah, A.F.M., Kassim, K.A.A. (2020). A review of behavioural issues contribution to motorcycle safety. IATSS Research, 44(2): 142-154. https://doi.org/10.1016/j.iatssr.2019.12.001
[6] Gwak, J., Hirao, A., Shino, M. (2020). An investigation of early detection of driver drowsiness using ensemble machine learning based on hybrid sensing. Applied Sciences, 10(8): 2890. https://doi.org/10.3390/app10082890
[7] Bakker, B., Zabłocki, B., Baker, A., Riethmeister, V., Marx, B., Iyer, G., Anund, A., Ahlström, C. (2021). A multi-stage, multi-feature machine learning approach to detect driver sleepiness in naturalistic road driving conditions. IEEE Transactions on Intelligent Transportation Systems, 23(5): 4791-4800. https://doi.org/10.1109/TITS.2021.3090272
[8] JR, D.K., Harish, N., Priyadharsini, K., Gowtham, S., Gokulraj, G. (2022). Machine learning based drowsiness detection in classrooms. In 2022 International Conference on Edge Computing and Applications (ICECAA), Tamilnadu, India, pp. 1186-1191. https://doi.org/10.1109/ICECAA55415.2022.9936550
[9] Magán, E., Sesmero, M.P., Alonso-Weber, J.M., Sanchis, A. (2022). Driver drowsiness detection by applying deep learning techniques to sequences of images. Applied Sciences, 12(3): 1145. https://doi.org/10.3390/app12031145
[10] Ayatollahi, A., Afrakhteh, S., Soltani, F., Saleh, E. (2023). Sleep apnea detection from ECG signal using deep CNN-based structures. Evolving Systems, 14(2): 191-206. https://doi.org/10.1007/s12530-022-09445-1
[11] Saif, A.F.M.S., Mahayuddin, Z.R. (2020). Robust drowsiness detection for vehicle driver using deep convolutional neural network. International Journal of Advanced Computer Science and Applications, 11(10): 343-350. https://doi.org/10.14569/IJACSA.2020.0111043
[12] Wijnands, J.S., Thompson, J., Nice, K.A., Aschwanden, G.D.P.A., Stevenson, M. (2020). Real-time monitoring of driver drowsiness on mobile platforms using 3D neural networks. Neural Computing and Applications, 32(13): 9731-9743. https://doi.org/10.1007/s00521-019-04506-0
[13] Safarov, F., Akhmedov, F., Abdusalomov, A.B., Nasimov, R., Cho, Y.I. (2023). Real-time deep learning-based drowsiness detection: Leveraging computer-vision and eye-blink analyses for enhanced road safety. Sensors, 23(14): 6459. https://doi.org/10.3390/s23146459
[14] Minhas, R., Peker, N.Y., Hakkoz, M.A., Arbatli, S., Celik, Y., Erdem, C.E., Semiz, B., Peker, Y. (2024). Association of visual-based signals with electroencephalography patterns in enhancing the drowsiness detection in drivers with obstructive sleep apnea. Sensors, 24(8): 2625. https://doi.org/10.3390/s24082625
[15] Khan, T., Choi, G., Lee, S. (2023). EFFNet-CA: An efficient driver distraction detection based on multiscale features extractions and channel attention mechanism. Sensors, 23(8): 3835. https://doi.org/10.3390/s23083835
[16] Ravi, V., Acharya, V., Alazab, M. (2023). A multichannel EfficientNet deep learning-based stacking ensemble approach for lung disease detection using chest X-ray images. Cluster Computing, 26(2): 1181-1203. https://doi.org/10.1007/s10586-022-03664-6
[17] Mao, K.N., Zhang, W., Wang, D.B., Li, A., et al. (2022). Prediction of depression severity based on the prosodic and semantic features with bidirectional LSTM and time distributed CNN. IEEE Transactions on Affective Computing, 14(3): 2251-2265. https://doi.org/10.1109/TAFFC.2022.3154332
[18] Vaquerizo-Villar, F., Gutiérrez-Tobal, G.C., Calvo, E., Álvarez, D., Kheirandish-Gozal, L., Del Campo, F., Gozal, D., Hornero, R. (2023). An explainable deep-learning model to stage sleep states in children and propose novel EEG-related patterns in sleep apnea. Computers in Biology and Medicine, 165: 107419. https://doi.org/10.1016/j.compbiomed.2023.107419
[19] Adhithyaa, N., Tamilarasi, A., Sivabalaselvamani, D., Rahunathan, L. (2023). Face positioned driver drowsiness detection using multistage adaptive 3D convolutional neural network. Information Technology and Control, 52(3): 713-730. https://doi.org/10.5755/j01.itc.52.3.33719
[20] Florez, R., Palomino-Quispe, F., Coaquira-Castillo, R.J., Herrera-Levano, J.C., Paixão, T., Alvarez, A.B. (2023). A CNN-based approach for driver drowsiness detection by real-time eye state identification. Applied Sciences, 13(13): 7849. https://doi.org/10.3390/app13137849
[21] Jahan, I., Uddin, K.M.A., Murad, S.A., Miah, M.S.U., Khan, T.Z., Masud, M., Aljahdali, S., Bairagi, A.K. (2023). 4D: A real-time driver drowsiness detector using deep learning. Electronics, 12(1): 235. https://doi.org/10.3390/electronics12010235
[22] Sowmyashree, P., Sangeetha, J. (2023). Multistage end-to-end driver drowsiness alerting system. International Journal of Advanced Computer Science and Applications, 14(4): 464-473. https://doi.org/10.14569/IJACSA.2023.0140452
[23] Kim, W., Jung, W.S., Choi, H.K. (2019). Lightweight driver monitoring system based on multi-task mobilenets. Sensors, 19(14): 3200. https://doi.org/10.3390/s19143200
[24] Phan, A.C., Nguyen, N.H.Q., Trieu, T.N., Phan, T.C. (2021). An efficient approach for detecting driver drowsiness based on deep learning. Applied Sciences, 11(18): 8441. https://doi.org/10.3390/app11188441
[25] Lee, J., Woo, S., Moon, C. (2024). 3D-CNN method for drowsy driving detection based on driving pattern recognition. Electronics, 13(17): 3388. https://doi.org/10.3390/electronics13173388
[26] Massoz, Q., Langohr, T., François, C., Verly, J.G. (2016). The ULg multimodality drowsiness database (called DROZY) and examples of use. In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA, pp. 1-7. https://doi.org/10.1109/WACV.2016.7477715
[27] Zhang, X., Han, L.X., Han, L.H., Zhu, L. (2020). How well do deep learning-based methods for land cover classification and object detection perform on high resolution remote sensing imagery? Remote Sensing, 12(3): 417. https://doi.org/10.3390/rs12030417
[28] Lal, S.K.L., Craig, A. (2001). A critical review of the psychophysiology of driver fatigue. Biological Psychology, 55(3): 173-194. https://doi.org/10.1016/S0301-0511(00)00085-5
[29] Wierwille, W.W., Ellsworth, L.A. (1994). Evaluation of driver drowsiness by trained raters. Accident Analysis & Prevention, 26(5): 571-581. https://doi.org/10.1016/0001-4575(94)90019-1
[30] Horne, J.A., Reyner, L.A. (1996). Counteracting driver sleepiness: Effects of napping, caffeine, and placebo. Psychophysiology, 33(3): 306-309. https://doi.org/10.1111/j.1469-8986.1996.tb00428.x
[31] Delgado-Centeno, J.I., Sanchez-Cuevas, P.J., Martínez, C., Olivares-Mendez, M.A. (2021). Enhancing lunar reconnaissance orbiter images via multi-frame super resolution for future robotic space missions. IEEE Robotics and Automation Letters, 6(4): 7721-7727. https://doi.org/10.1109/LRA.2021.3097510
[32] Kumar, R., Zhang, X.S., Khan, R.U., Ahad, I., Kumar, J. (2018). Malicious code detection based on image processing using deep learning. In 2018 International Conference on Computing and Artificial Intelligence, Chengdu, China, pp. 81-85. https://doi.org/10.1145/3194452.3194459
[33] Gangopadhyay, I., Chatterjee, A., Das, I. (2019). Face detection and expression recognition using Haar cascade classifier and Fisherface algorithm. In Recent Trends in Signal and Image Processing: Proceedings of ISSIP 2018, pp. 1-11. https://doi.org/10.1007/978-981-13-6783-0_1
[34] Jain, A., Garg, G. (2020). Gun detection with model and type recognition using Haar cascade classifier. In 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, pp. 419-423. https://doi.org/10.1109/ICSSIT48917.2020.9214211
[35] Pollicelli, D., Coscarella, M., Delrieux, C. (2020). RoI detection and segmentation algorithms for marine mammals photo-identification. Ecological Informatics, 56: 101038. https://doi.org/10.1016/j.ecoinf.2019.101038
[36] Pei, X.L., Zhao, Y.H., Chen, L.W., Guo, Q.W., Duan, Z.Q., Pan, Y., Hou, H. (2023). Robustness of machine learning to color, size change, normalization, and image enhancement on micrograph datasets with large sample differences. Materials & Design, 232: 112086. https://doi.org/10.1016/j.matdes.2023.112086
[37] Huang, L., Qin, J., Zhou, Y., Zhu, F., Liu, L., Shao, L. (2023). Normalization techniques in training DNNs: Methodology, analysis and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(8): 10173-10196. https://doi.org/10.1109/TPAMI.2023.3250241
[38] Li, B.Y., Wu, F.L., Lim, S.N., Belongie, S., Weinberger, K.Q. (2021). On feature normalization and data augmentation. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, pp. 12378-12387. https://doi.org/10.1109/CVPR46437.2021.01220
[39] Joseph, V.R., Vakayil, A. (2022). SPlit: An optimal method for data splitting. Technometrics, 64(2): 166-176. https://doi.org/10.1080/00401706.2021.1921037
[40] Cavieres, R., Barraza, R., Estay, D., Bilbao, J., Valdivia-Lefort, P. (2022). Automatic soiling and partial shading assessment on PV modules through RGB images analysis. Applied Energy, 306: 117964. https://doi.org/10.1016/j.apenergy.2021.117964
[41] Boateng, E.Y., Otoo, J., Abaye, D.A. (2020). Basic tenets of classification algorithms K-Nearest-Neighbor, support vector machine, random forest and neural network: A review. Journal of Data Analysis and Information Processing, 8(4): 341-357. https://doi.org/10.4236/jdaip.2020.84020
[42] Muljono, Andono, P.N., Wulandari, S.A., Al Azies, H., Naufal, M. (2024). Tempo recognition of kendhang instruments using hybrid feature extraction. Journal of Applied Science and Engineering, 27(3): 2179-2192. https://doi.org/10.6180/jase.202403_27(3).0004
[43] Uddin, S., Haque, I., Lu, H.H., Moni, M.A., Gide, E. (2022). Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Scientific Reports, 12(1): 6256. https://doi.org/10.1038/s41598-022-10358-x
[44] Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U. (1999). When is “nearest neighbor” meaningful? In Database Theory - ICDT'99: 7th International Conference, Jerusalem, Israel, pp. 217-235. https://doi.org/10.1007/3-540-49257-7_15
[45] Hultman, M., Johansson, I., Lindqvist, F., Ahlström, C. (2021). Driver sleepiness detection with deep neural networks using electrophysiological data. Physiological Measurement, 42(3): 034001. https://doi.org/10.1088/1361-6579/abe91e
[46] Khan, M.S., Alam, K.N., Dhruba, A.R., Zunair, H., Mohammed, N. (2022). Knowledge distillation approach towards melanoma detection. Computers in Biology and Medicine, 146: 105581. https://doi.org/10.1016/j.compbiomed.2022.105581
[47] Chen, D., Wang, Z.L., Wang, J., Shi, L., Zhang, M.K., Zhou, Y.M. (2023). Detection of distracted driving via edge artificial intelligence. Computers and Electrical Engineering, 111: 108951. https://doi.org/10.1016/j.compeleceng.2023.108951
[48] Ansari, S., Du, H.P., Naghdy, F., Stirling, D. (2022). Automatic driver cognitive fatigue detection based on upper body posture variations. Expert Systems with Applications, 203: 117568. https://doi.org/10.1016/j.eswa.2022.117568
[49] Bernadó-Mansilla, E., Garrell-Guiu, J.M. (2003). Accuracy-based learning classifier systems: Models, analysis and applications to classification tasks. Evolutionary Computation, 11(3): 209-238. https://doi.org/10.1162/106365603322365289
[50] Sofaer, H.R., Hoeting, J.A., Jarnevich, C.S. (2019). The area under the precision-recall curve as a performance metric for rare binary events. Methods in Ecology and Evolution, 10(4): 565-577. https://doi.org/10.1111/2041-210X.13140
[51] Lin, L., Wang, S., Yang, J.C., Wei, F. (2024). A multi-aware graph convolutional network for driver drowsiness detection. Knowledge-Based Systems, 305: 112643. https://doi.org/10.1016/j.knosys.2024.112643
[52] Pahariya, S., Vats, P., Suchitra, S. (2024). Driver drowsiness detection using MobileNetV2 with transfer learning approach. In 2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS), Chennai, India, pp. 1-6. https://doi.org/10.1109/ADICS58448.2024.10533606
[53] Wunan, T.D., Jappy, P.C., Aurelia, S., Edbert, I.S., Suhartono, D. (2024). Driver drowsiness detection using NasNet mobile, MobileNetV2, and EfficientNetB0. In 2024 IEEE International Conference on Artificial Intelligence and Mechatronics Systems (AIMS), Bandung, Indonesia, pp. 1-4. https://doi.org/10.1109/AIMS61812.2024.10512773