Auditory Evoked Potential-Based Hearing Loss Level Recognition Using Fully Convolutional Neural Networks

Ramzi Maalmi, Amine Ben Slama, Hanene Sahli, Hedi Trabelsi

University of Tunis El Manar, ISTMT, LRBTM, LR13ES07, Tunis 1006, Tunisia

University of Tunis, ENSIT, SIME, Montfleury 1008, Tunisia

Corresponding Author Email:
7 June 2021
12 December 2021
30 April 2022



Hearing loss is one of the most common disabilities in adults and is confirmed by the auditory evoked potential (AEP) exam. In routine examination of patients, this technique provides limited medical information from the feedback response. Body movements, measuring equipment and low-frequency noise are external factors that can cause misinterpretation. In the clinical workflow, AEP signals are classified manually by experts in order to determine the hearing loss level. To improve diagnostic accuracy, a fully convolutional neural network methodology is proposed for reliable automated hearing loss analysis. The proposed approach was validated on AEP recordings from 494 confirmed hearing loss cases and 177 normal subjects, acquired under different auditory stimulus intensities (20 dB, 50, 60... and 80 dB). The classification method can substantially reduce the labor-intensive workload of ear, nose and throat (ENT) doctors by applying the pertinent analysis strategy for each hearing loss level, and it significantly increases auditory diagnosis performance, which opens the way to computerized ENT assessment. Compared to state-of-the-art methods, the technique achieves a higher accuracy rate in recognizing hearing loss level classes.


computer vision, hearing perception loss, auditory evoked potential (AEP), auditory brainstem responses (ABRs), fully-convolutional neural networks

1. Introduction

The auditory evoked potential (AEP) is an electrical signal produced by the human or animal nervous system when an external stimulus is presented. Evoked potentials (EPs) are generated after a sensory stimulus, such as the auditory evoked potential (AEP) or the visual auditory evoked potential (VAEP). This work focuses on the analysis of AEP signals, which are produced when an external sound travels to the brain. Research on AEP signals currently supports the early assessment of hearing loss [1]. Hearing loss is the most frequent sensory disability throughout the world. More than 275 million people around the world are affected by some kind of hearing loss, associated with different causes [2].

AEP signals consist of both positive and negative peaks, characterized by parameters such as latency, amplitude and behavioural correlation [3]. Depending on amplitude and latency, auditory brainstem responses (ABR) can be divided into periods of short (0-12 ms), middle (8-50 ms) and long (50-300 ms) latency [3].

Many researchers have focused on evaluating the ABR waveform, the early period (10-12 ms) of the AEP.

The AEP waveform contains different peaks (I, II, III, V); the predominant presence of peak V essentially determines the existence of hearing perception [4, 5].

Different methods have been proposed to identify peak V automatically, using a matched filter [6], a spectral technique [7] or a wavelet method [8].

Classifying the AEP wave peaks raises several difficulties. The structure of the signal and its distinct peaks must be understood, so averaging the AEP signal becomes difficult and time consuming. It is also rather difficult to separate the related peaks III and V. Moreover, it is a demanding task to explore and recognize the AEP peak latencies in an abnormal hearing dataset across sound intensities varying from 20 dB to 100 dB (see Figure 1), where abnormalities can be caused by auditory nerve pathology. Furthermore, it has been reported that detecting the amplitude of peak V is complex when the stimulation intensity is below 30 dB, because peak V is then no longer obvious [9].

Numerous research works [10, 11] have addressed the measurement of the hearing perception level, which is affected by different auditory disorders. To date, this assessment remains a difficult task for audiologists. As mentioned above, the information about the hearing state in the AEP signal lies in its amplitude and its characteristic wave peaks. However, assessment of the AEP waveform is often based on an individual evaluation of different factors in the presence of irregularities of the brainstem response.

Furthermore, the development of automatic methods that categorize AEP signals according to different levels of hearing perception can bring serious improvement to hearing diagnostics. The main aim of such algorithms is to recognize the different hearing levels through precise measurements, assisting clinicians in the diagnosis process [12].

The accuracy of a classification method relies on the nature of the filtering, feature extraction and classification techniques used in the AEP analysis process. In other words, the extracted features depend on the filtering and discrimination techniques applied, and they must be highly accurate to assess the presence or absence of the different hearing loss levels.

Figure 1. An example of an AEP signal

In addition, current algorithms [13, 14] for automated hearing loss level identification use the AEP peaks as clinical features. According to the literature [15], the examination of clinical information is significant for the analysis process. However, different methods can be error prone and cannot provide appropriate performance because of the different groups of hearing perception level [16, 17].

New automatic diagnostic methods are likely to become available, and they need greater capability for hearing loss detection. Computer-aided diagnosis (CAD) systems are intended to support the ENT doctor in visually screening the AEP signal in order to avoid misdiagnosis.

The use of an automatic strategy based on deep learning algorithms allows for a consistent judgment process. In this work, a method is presented that reduces the time consumed by existing methods [18] and improves the experimental results related to monitoring the patient's status.

The proposed work addresses the classification task by applying the fully convolutional neural network (F-CNN) technique. A fast classification is carried out to distinguish between four classes: normal (NL) and three hearing loss levels (mild, moderate, severe) in abnormal data. All signals are collected using the AEP technique.

This paper is structured as follows: Section 2 presents hearing loss level classification using the CNN method based on AEP signals. Results and discussion are given in Section 3, statistical measures in Section 4 and conclusions in Section 5.

2. Classification Using Fully Convolutional Neural Networks

In this work, an AEP machine is employed [19]. The datasets were collected at the Charles Nicolle Hospital of Tunis using Nihon Kohden equipment. The AEP study was carried out in a semi-dark room with quiet surroundings. The participants were asked to avoid needless movement and to remove all metallic ornaments. AEP recordings were obtained with a Nihon Kohden Neuropack using blinking clicks in each ear at a 10 Hz sampling rate. Responses were obtained for each ear separately [20] to illustrate the cause of hearing loss. The AEP results were extracted for the latencies of waves I, II, III and V and for the Inter-Peak Latencies (IPL) [21] (see Figure 2).

The AEP signal consists of a sequence of peaks (I–VII) that follow a stimulus. It was recorded by two electrodes fixed on the mastoid of each ear and one on the forehead. Waves I and II are generated by the auditory nerve, whereas the later peaks are due to the electrical activity of nuclei at higher levels of the ascending auditory pathway in the brainstem. Wave III is presumed to be generated in the cochlear nucleus. The proposed technique is evaluated using 671 subjects: 154, 156 and 214 are confirmed to have, respectively, the three kinds of hearing loss level, and 177 are normal.
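As a minimal sketch of how inter-peak latencies can be derived from absolute wave latencies, the snippet below computes the conventional I-III, III-V and I-V intervals. The function name, the choice of intervals and the example latencies are illustrative assumptions, not values or code from the study.

```python
def inter_peak_latencies(latencies_ms):
    """Return the I-III, III-V and I-V inter-peak latencies in milliseconds.

    `latencies_ms` maps wave labels ("I", "III", "V") to absolute latencies.
    """
    return {
        "I-III": latencies_ms["III"] - latencies_ms["I"],
        "III-V": latencies_ms["V"] - latencies_ms["III"],
        "I-V": latencies_ms["V"] - latencies_ms["I"],
    }


# Hypothetical latencies for one ear (illustrative values, not study data).
example = {"I": 1.5, "III": 3.5, "V": 5.5}
ipl = inter_peak_latencies(example)
print(ipl)
```

Each interval is simply the difference between two absolute latencies, so the I-V interval always equals the sum of the I-III and III-V intervals.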

Figure 2. The proposed methodology

The proposed technique contains two steps: (a) AEP feature collection and (b) data discrimination using the F-CNN approach. Deep learning is applied to divide the data into four categories: mild (m), moderate (M), severe (S) and normal (NL).

The first task of the classification approach was presented in our previous works [22, 23]. An extensive description is given in [23], where many details are provided to obtain competent signal classification performance. The procedure proposed in this work is divided into two main phases: (a) feature extraction and (b) data discrimination using the F-CNN method to differentiate subjects affected by different types of hearing loss. The proposed work thus intends to create an automatic method for early hearing loss recognition using the AEP waveform.

2.1 From back propagation (BP) to deep learning

Deep learning (DL) is a recent field within machine learning (ML). Its methods build models with hierarchical representations of the input data. The higher-level representations correspond to abstract concepts, defined as non-linear compositions of the lower-level representations. For discrimination tasks, these representations are more robust to the irrelevant variations (e.g. noise) that are often present in the input data, and they emphasize the descriptive factors that are significant for classification. Recently, deep learning methods have been applied to medical signal analysis with promising clinical results in various applications, including prediction of Alzheimer's disease [24], Parkinson's disorders [25] and vestibular disorder identification [26]. Some of these studies used DL models to recognize diverse patterns in patient dataset characteristics.

Table 1. Parameters of the proposed F-CNN method


Columns: Layer 0, Layer 1, Layer 2, Layer 3, Layer 4, Layer 5, Layer 6

Layer Name: Input feature vectors (the amplitude, Inter-Peak Latencies (IPL)); Batch Normal

Output shape: 1×(4 outputs)

Other Layer Parameter: Activation=ReLU, Strides=3; Pooling size=2, strides=3, ReLU, strides=1; Dropout Rate=0.2


Figure 3. Illustration of the 6-layer F-CNN method

DL models [27] are divided into a range of sub-types that use different training procedures, such as deep neural networks, recurrent neural networks, fully convolutional neural networks (F-CNNs) [28, 29] and deep belief networks [30].

2.2 Classification procedure

In the F-CNN structure, the first layers extract the feature vectors, whereas the output layers produce the final categorization; the main parameters are established during the training, validation and test steps. The proposed F-CNN structure for hearing loss level detection is presented in Table 1. At the input layer of the CNN model, the 1D convolution is applied to 6 vector values representing the amplitude and the Inter-Peak Latencies (IPL). First, the activations are normalized by the batch normalization layer. Feature maps are then formed in the max pooling layer from the maximum values of the output features of the previous layers. Reducing the feature size in this way is essential for reducing the computation time of the F-CNN method. The max pooling outputs are used as input to the flatten layer. The resulting feature vectors are then passed to the densely connected layer (512 units). In the last layer of the network, a softmax layer produces the output classes. A dropout rate of 0.2 (Table 1) is applied as regularization to help the network settle on the correct class. The 6-layer F-CNN network is applied to classify AEP signals into three hearing loss levels and normal cases (see Figure 3).

The F-CNN method is applied to discriminate between normal subjects (NL) and patients affected by different hearing loss levels. In diverse classification works [31], the F-CNN has delivered high-quality data analysis. Given the resemblance between AEP signals, the F-CNN learns from the input features the most pertinent parameters for a suitable representation of the data classes (1×4 output).

Here, the F-CNN learns the filter values using the back-propagation technique. This method was chosen on the basis of the resulting mean squared error (MSE) rates.

In this work, a cross-validation experiment is applied to choose the pertinent F-CNN structure. Each iteration uses four folds for training and one fold for validation.
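The layer sequence described above (1D convolution, ReLU, max pooling, flatten, dense layer, softmax) can be sketched as a plain-Python forward pass. This is a toy illustration under stated assumptions, not the authors' network: the kernel count, kernel size, strides, random weights and the 6-value input are hypothetical, the 512-unit dense layer is collapsed into a single 4-output layer, and batch normalization and dropout are omitted because only inference is shown.

```python
import math
import random


def conv1d(x, kernels, stride=1):
    """Valid 1D convolution (cross-correlation) of x with each kernel."""
    k = len(kernels[0])
    return [
        [sum(w * x[i + j] for j, w in enumerate(kernel))
         for i in range(0, len(x) - k + 1, stride)]
        for kernel in kernels
    ]


def relu(maps):
    """Clamp negative activations to zero in every feature map."""
    return [[max(0.0, v) for v in m] for m in maps]


def max_pool(maps, size=2, stride=2):
    """Keep the maximum of each window in every feature map."""
    return [
        [max(m[i:i + size]) for i in range(0, len(m) - size + 1, stride)]
        for m in maps
    ]


def flatten(maps):
    """Concatenate all feature maps into one vector."""
    return [v for m in maps for v in m]


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over the class scores."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["NL", "mild", "moderate", "severe"]

rng = random.Random(0)  # fixed seed so the toy weights are reproducible
kernels = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
features = [0.4, 0.3, 0.5, 2.1, 1.9, 4.0]  # hypothetical 6-value input vector

pooled = max_pool(relu(conv1d(features, kernels)))
flat = flatten(pooled)
w_out = [[rng.uniform(-1, 1) for _ in flat] for _ in CLASSES]
b_out = [0.0] * len(CLASSES)
probs = softmax(dense(flat, w_out, b_out))
predicted = CLASSES[probs.index(max(probs))]
print(len(flat), [round(p, 3) for p in probs], predicted)
```

With a 6-value input, two kernels of size 3 and a pooling window of 2, the flattened vector has 4 entries, which shows how pooling shrinks the feature size before the dense layer.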
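For illustration, the mean squared error between a one-hot class label and a 4-class network output can be computed as below. The text does not state exactly how the MSE was evaluated, so the one-hot encoding and the example values are assumptions.

```python
def mean_squared_error(target, output):
    """Average of the squared differences between target and output."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


# Hypothetical example: the true class is the second of four classes.
target = [0.0, 1.0, 0.0, 0.0]
output = [0.1, 0.7, 0.1, 0.1]
mse = mean_squared_error(target, output)
print(mse)
```

A lower value indicates that the predicted class probabilities are closer to the true label.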
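The four-plus-one fold arrangement can be sketched as a simple 5-fold index generator. The shuffling seed and the helper name are hypothetical, and the sketch is not stratified by class, which the actual experiment may have been.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation


# 671 subjects as in the study; each iteration holds out one fold.
splits = list(k_fold_indices(671, k=5))
for fold_number, (train_idx, val_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(val_idx))
```

Every subject appears in exactly one validation fold, so the five validation errors together cover the whole dataset.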

3. Results and Discussion

For the automatic analysis of AEP waveform features, the database contains 671 subjects, with signal durations between 1000 and 3000 samples, and is used to analyze the optimized F-CNN structure. In order to categorize hearing loss level classes against normal data, 60%, 20% and 20% of the data in each sub-dataset were used for the training, validation and testing stages, respectively (see Table 2).

To show the efficiency of the proposed method, various CAD methods are tested and validated.

From the cross-validation results in Table 4, it is clear that the F-CNN classifier is more suitable than the PCA-SVM and PCA-MNN methods in terms of error rate on the AEP datasets.

The hyperparameters of each classifier considered for automatic hearing loss diagnosis were obtained after numerous experiments. The classification parameters listed in Table 3 are the combination that gave the best accuracy across all experiments. The principal component analysis (PCA) pre-processing blocks are used to reduce the number of input parameters of the networks. Table 4 shows the classification results obtained by the discrimination methods PCA-MNN and PCA-SVM [32]; these results are less suitable for the analysis of different hearing loss levels than those of the F-CNN [33].
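A per-class 60/20/20 partition can be sketched as follows, here applied to the 177 normal subjects. The rounding rule, the seed and the use of integer subject identifiers are assumptions made for the example.

```python
import random


def split_class(sample_ids, train=0.6, val=0.2, seed=0):
    """Shuffle one class and cut it into training, validation and test sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    # Whatever remains after training and validation goes to the test set.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


# The 177 normal subjects, identified here by hypothetical integer ids.
train_ids, val_ids, test_ids = split_class(range(177))
print(len(train_ids), len(val_ids), len(test_ids))
```

Applying the same function to each hearing loss class keeps the class proportions similar across the three stages.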

Table 2. AEP signals used for the hearing loss level identification


AEP Dataset

After several experiments, we find that the DL technique based on the F-CNN classifier is effective for hearing loss level categorization. The F-CNN scheme yields a promising class recognition rate. As shown in Figure 4, the strategy provides considerable labelling results in terms of the statistical measurements accuracy, specificity and sensitivity, with averages of 91.29%, 91.78% and 92.14%, respectively.

The employed cross correlation algorithm significantly improves the classification results of the AEP signal, as shown in the ROC curves (see Figure 5); in particular, for the CNN classifier the AUC is improved to 89.4%.

An effective classification system for AEP signals is thus established using the F-CNN technique. The proposed CAD system offers adequate classification results and clearly distinguishes hearing loss levels from healthy subjects. It achieved the highest recognition accuracy compared with other works [34].

With these results, we can conclude that the classification system can be employed consistently to help audiologists in cases that require further attention.
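The three measures can be derived from a confusion matrix in a one-vs-rest fashion, as in the sketch below. The confusion matrix is invented for illustration and does not reproduce the reported results; the function name is hypothetical.

```python
def one_vs_rest_metrics(confusion, labels):
    """Overall accuracy plus per-class sensitivity and specificity.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    report = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                  # missed cases of this class
        fp = sum(row[i] for row in confusion) - tp   # other classes labelled as it
        tn = total - tp - fn - fp
        report[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return report


labels = ["NL", "mild", "moderate", "severe"]
# Hypothetical confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [9, 1, 0, 0],
    [1, 8, 1, 0],
    [0, 1, 8, 1],
    [0, 0, 1, 9],
]
report = one_vs_rest_metrics(confusion, labels)
print(report["accuracy"], report["NL"])
```

Averaging the per-class sensitivities and specificities gives summary figures comparable in kind to the averages quoted above.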
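As a rough sketch of what the AUC measures, the snippet below computes it for a binary (one class versus the rest) problem as the fraction of positive-negative pairs that the classifier ranks correctly. The labels and scores are hypothetical, and this pairwise formulation is one standard way to obtain the area under the ROC curve, not necessarily the procedure used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for six subjects (1 = hearing loss, 0 = normal).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 1.0 means every positive case receives a higher score than every negative case, while 0.5 corresponds to random ranking.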

Table 3. The hyperparameters for all used classifiers



Column headers: Hidden layer, Neuron in hidden layer, Training method, Epoch number

*Note: CG: Conjugate Gradient algorithm; RBF: Gaussian radial basis kernel function; SBP: standard back-propagation

Table 4. Results of validation error (%) of the F-CNN classifier, PCA-MNN and PCA-SVM, using cross-validation method tested on the AEP data

Column headers: Used technique, Fold 1, Fold 2, Fold 3, Fold 4, Fold 5

Figure 4. The F-CNN classification performance versus training and test results

Figure 5. Validation results of the F-CNN technique using ROC curves

4. Statistical Measures

Extracting a dataset for routine analysis is a demanding task, since all the information needed to diagnose the causes and conditions of the actual hearing loss must be recorded. In practice, our study is mainly concerned with improving the diagnosis from AEP signals.

In this work, statistical measures are applied to the hearing loss diagnosis. The agreement between measurements can be used to rank the included data, provide supplementary information and reduce processing time.

Figure 6 shows the correlation coefficients between physical and pathological status. The dataset shows a notable correlation for the wave I and wave V peaks and for the I-V latency.

First, a positive relation is observed between the clinical data and age: the older the subject, the higher the degree of hearing loss tends to be. In addition, the three derived features correlate with the same physical characteristic, with averages of about 53%, 51% and 45% for the wave I peak, the wave V peak and the I-V latency, respectively.

Figure 6. Resulting statistical coefficients between amplitude and peak wave status
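A correlation coefficient of this kind can be sketched with the Pearson formula below. The text does not name the coefficient that was used, so Pearson's r is an assumption, and the ages and wave V latencies are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical subject ages (years) and wave V latencies (ms).
ages = [25, 35, 45, 55, 65]
wave_v_latency_ms = [5.6, 5.7, 5.9, 6.0, 6.3]
r = pearson_r(ages, wave_v_latency_ms)
print(round(r, 3))
```

Values close to 1 indicate that the feature increases with age, values close to 0 indicate no linear relation.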

5. Conclusions

In this paper, a method using auditory evoked potentials is proposed to classify data affected by hearing loss. The F-CNN technique is applied to the classification of AEP signals and categorizes hearing loss with an accuracy rate of 92%. Compared to the recent state of the art, the results show that the strategy is well suited to experimental use. Expert diagnostic techniques can build on the findings of this strategy. In future work, further attempts can be made to characterize different classes of hearing loss level. This work offers assistance for automatic audiology applications.


[1] Liberman, M.C., Kujawa, S.G. (2017). Cochlear synaptopathy in acquired sensorineural hearing loss: Manifestations and mechanisms. Hearing Research, 349: 138-147.

[2] Chadambuka, A., Mususa, F., Muteti, S. (2013). Prevalence of noise induced hearing loss among employees at a mining industry in Zimbabwe. African Health Sciences, 13(4): 899-906.

[3] Mehraei, G., Hickox, A.E., Bharadwaj, H.M., Goldberg, H., Verhulst, S., Liberman, M.C., Shinn-Cunningham, B.G. (2016). Auditory brainstem response latency in noise as a marker of cochlear synaptopathy. Journal of Neuroscience, 36(13): 3755-3764.

[4] Bradley, A.P., Wilson, W.J. (2004). Automated analysis of the auditory brainstem response. In Proceedings of the 2004 Intelligent Sensors, Sensor Networks and Information Processing Conference, pp. 541-545.

[5] Guest, H., Munro, K.J., Prendergast, G., Millman, R.E., Plack, C.J. (2018). Impaired speech perception in noise with a normal audiogram: No evidence for cochlear synaptopathy and no relation to lifetime noise exposure. Hearing Research, 364: 142-151.

[6] Price, C.N., Alain, C., Bidelman, G.M. (2019). Auditory-frontal channeling in α and β bands is altered by age-related hearing loss and relates to speech perception in noise. Neuroscience, 423: 18-28.

[7] Harkrider, A.W., Plyler, P.N., Hedrick, M.S. (2005). Effects of age and spectral shaping on perception and neural representation of stop consonant stimuli. Clinical Neurophysiology, 116(9): 2153-2164.

[8] Causevic, E., Causevic, E., Wickerhauser, M.V. (2006). Fast wavelet estimation of weak bio-signals using novel algorithms for generating multiple additional data frames. U.S. Patent No. 7,054,454. Washington, DC: U.S. Patent and Trademark Office. 

[9] Paulraj, M.P., Subramaniam, K., Hema, C.R. (2015). Classification of hearing perception level using auditory evoked potentials. Jurnal Teknologi, 77(28): 73-78.

[10] Holman, J.A., Drummond, A., Hughes, S.E., Naylor, G. (2019). Hearing impairment and daily-life fatigue: A qualitative study. International Journal of Audiology, 58(7): 408-416.

[11] Tas, A., Yagiz, R., Tas, M., Esme, M., Uzun, C., Karasalihoglu, A.R. (2007). Evaluation of hearing in children with autism by using TEOAE and ABR. Autism, 11(1): 73-79.

[12] Lasak, J.M., Allen, P., McVay, T., Lewis, D. (2014). Hearing loss: Diagnosis and management. Primary Care: Clinics in Office Practice, 41(1): 19-31.

[13] Nisar, S., Tariq, M., Adeel, A., Gogate, M., Hussain, A. (2019). Cognitively inspired feature extraction and speech recognition for automated hearing loss testing. Cognitive Computation, 11(4): 489-502.

[14] Vlaming, M.S., MacKinnon, R.C., Jansen, M., Moore, D.R. (2014). Automated screening for high-frequency hearing loss. Ear and Hearing, 35(6): 667.

[15] Carpenter, M.G., Campos, J.L. (2020). The effects of hearing loss on balance: A critical review. Ear and Hearing, 41(Suppl 1): 107S-119S.

[16] de Boer, J.N., Linszen, M.M., de Vries, J., Schutte, M.J., Begemann, M.J., Heringa, S.M., Sommer, I.E.C. (2019). Auditory hallucinations, top-down processing and language perception: A general population study. Psychological Medicine, 49(16): 2772-2780.

[17] Hornsby, B.W., Kipp, A.M. (2016). Subjective ratings of fatigue and vigor in adults with hearing loss are driven by perceived hearing difficulties not degree of hearing loss. Ear and Hearing, 37(1): e1.

[18] Viscaino, M., Maass, J.C., Delano, P.H., Torrente, M., Stott, C., Auat Cheein, F. (2020). Computer-aided diagnosis of external and middle ear conditions: A machine learning approach. Plos One, 15(3): e0229226.

[19] Popov, V.V., Supin, A.Y., Nechaev, D.I., Lemazina, A.A., Sysueva, E.V. (2019). Position of an acoustic window in a beluga whale: Computation based on auditory evoked potential latencies. The Journal of the Acoustical Society of America, 145(6): 3578-3585.

[20] Telmesani, L.M., Said, N.M. (2016). Electrically evoked compound action potential (ECAP) in cochlear implant children: Changes in auditory nerve response in first year of cochlear implant use. International Journal of Pediatric Otorhinolaryngology, 82: 28-33.

[21] Mansour, Y., Altaher, W., Kulesza Jr, R.J. (2019). Characterization of the human central nucleus of the inferior colliculus. Hearing Research, 377: 234-246.

[22] Mouelhi, A., Ben Slama, A., Marrakchi, J., Trabelsi, H., Sayadi, M., Labidi, S. (2020). Sparse classification of discriminant nystagmus features using combined video-oculography tests and pupil tracking for common vestibular disorder recognition. Computer Methods in Biomechanics and Biomedical Engineering, 24(4): 400-418.

[23] Slama, A.B., Mouelhi, A., Manoubi, S., ben Salah, M., Trabelsi, H., Sayadi, M., Fnaiech, F. (2018). An enhanced approach for vestibular disorder assessment. In 2018 IEEE 4th Middle East Conference on Biomedical Engineering (MECBME), pp. 243-246.

[24] Rossini, P.M., Di Iorio, R., Vecchio, F., Anfossi, M., Babiloni, C., Bozzali, M., Dubois, B. (2020). Early diagnosis of Alzheimer’s disease: The role of biomarkers including advanced EEG signal analysis. Report from the IFCN-sponsored panel of experts. Clinical Neurophysiology, 131(6): 1287-1310.

[25] Gurrala, V., Yarlagadda, P., Koppireddi, P. (2021). Detection of sleep apnea based on the analysis of sleep stages data using single channel EEG. Traitement du Signal, 38(2): 431-436.

[26] Ben Slama, A., Mouelhi, A., Sahli, H., Zeraii, A., Marrakchi, J., Trabelsi, H. (2020). A deep convolutional neural network for automated vestibular disorder classification using VNG analysis. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 8(3): 334-342.

[27] Hershey, S., Chaudhuri, S., Ellis, D.P., Gemmeke, J.F., Jansen, A., Moore, R.C., Wilson, K. (2017). CNN architectures for large-scale audio classification. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 131-135.

[28] Torti, E., Fontanella, A., Musci, M., Blago, N., Pau, D., Leporati, F., Piastra, M. (2019). Embedding recurrent neural networks in wearable systems for real-time fall detection. Microprocessors and Microsystems, 71: 102895.

[29] Ervural, S., Ceylan, M. (2021). Convolutional neural networks-based approach to detect neonatal respiratory system anomalies with limited thermal image. Traitement du Signal, 38(2): 437-442.

[30] Kaur, M., Singh, D. (2019). Fusion of medical images using deep belief networks. Cluster Computing, 23: 1439-1453.

[31] Wang, X. (2021). Recognition and positioning of container lock holes for intelligent handling terminal based on convolutional neural network. Traitement du Signal, 38(2): 467-472.

[32] Ben Slama, A., Sahli, H., Mouelhi, A., Marrakchi, J., Boukriba, S., Trabelsi, H., Sayadi, M. (2020). Hybrid clustering system using Nystagmus parameters discrimination for vestibular disorder diagnosis. Journal of X-Ray Science and Technology, 28(5): 923-938.

[33] Kiranyaz, S., Ince, T., Abdeljaber, O., Avci, O., Gabbouj, M. (2019). 1-d convolutional neural networks for signal processing applications. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8360-8364.

[34] Ibrahim, I.A., Ting, H.N., Moghavvemi, M. (2019). Formulation of a novel classification indices for classification of human hearing abilities according to cortical auditory event potential signals. Arabian Journal for Science and Engineering, 44(8): 7133-7147.