Multimodal Deep Learning Framework Using Transformer and LSTM Models for Suicide Risk Detection from Social Media and Voice Data

Shwetha Sadashiva* Koratagere Gangadharasastry Manjunath

Department of Computer Science and Engineering, Siddaganga Institute of Technology, Tumkur 572103, India

Corresponding Author Email: shwethachandan2889@gmail.com

Page: 1445-1459 | DOI: https://doi.org/10.18280/ijsse.150712

Received: 11 May 2025 | Revised: 16 June 2025 | Accepted: 15 July 2025 | Available online: 31 July 2025

© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Mental health problems among young people are a growing concern worldwide, and suicidal ideation and behavior pose serious risks that often go unnoticed until the consequences are irreversible. Conventional detection methods, such as clinical interviews, psychological tests, and self-report surveys, suffer from high subjectivity, poor scalability, and delayed response. This study proposes a multimodal deep learning framework that identifies suicidal tendencies by analyzing digital indicators drawn from social media text, speech audio, and facial expressions. The system combines Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Transformer models with Natural Language Processing (NLP) methods to identify temporal and contextual patterns associated with suicide risk. A comparative evaluation was performed on the Social Media Suicide Risk Dataset, which contains both linguistic and acoustic features. Of the models evaluated, the Transformer performed best, with an accuracy of 94.2% and good precision, recall, and F1-Score values. This reflects the model's ability to recognize subtle, high-risk behavioral indicators across modalities. By supporting real-time, customized digital intervention, the system offers a scalable tool that mental health professionals, educational institutions, and social networks can use to identify suicide risk early and help those in danger, especially young people.

Keywords: 

multimodal fusion, mental health monitoring, personalized intervention, suicide risk prediction, social media analysis, transformer models

1. Introduction

Mental health problems, specifically suicidal ideation and behavior, have become a major global public health issue, most notably among adolescents and young adults. Early intervention is crucial, but detection remains difficult. Conventional methods such as psychological evaluation, clinical interviews, and self-report questionnaires are generally limited by subjectivity, delayed diagnosis, and poor scalability. They are also insensitive to the subtle emotional expressions and behavioral cues that individuals may express indirectly, most notably online. To overcome these limitations, this work presents a multimodal deep learning system that detects suicidal tendencies from a combination of text, speech, and facial data. It uses several current models: CNN-based models to identify spatial features in facial expressions, LSTM networks to process sequential audio and textual data, and Transformer-based models for contextual understanding. This integration enables the system to recognize intricate emotional and behavioral patterns across modalities. It is tested on real datasets to illustrate its accuracy, effectiveness, and promise for scalable, real-time suicide risk monitoring. By using deep learning and Natural Language Processing, this research aims to contribute to intelligent, early-warning mental wellness support systems [1].

Recent advancements have seen deep learning techniques, especially Long Short-Term Memory (LSTM) networks and Natural Language Processing (NLP), employed extensively to analyze complex sequential data and identify patterns of suicidal ideation effectively. Trends have also evolved toward multimodal approaches, integrating text analysis, speech recognition, facial gesture interpretation, and behavioral data analytics, enhancing prediction accuracy and real-time responsiveness [2, 3].

The applications of these advanced deep learning methods span numerous areas including healthcare facilities, educational institutions, online social platforms, and public health initiatives. By enabling continuous monitoring and personalized intervention strategies, these technologies provide significant benefits by enhancing timely responses, scaling support services, and proactively managing mental health risks, thereby safeguarding the mental health of the younger generation [4].

1.1 Research gaps

Mental health issues, particularly suicidal ideation and behaviors, have increasingly become a critical concern, especially among the younger population. Early detection of suicidal patterns remains challenging due to their subtle and often hidden nature. Leveraging advanced technology, particularly deep learning, provides a novel approach to addressing this pressing issue [5]. Conventional methods for detecting suicidal tendencies primarily rely on clinical assessments, manual interventions, and self-report questionnaires. However, these methods face significant drawbacks such as subjectivity, limited scalability, and delayed intervention capabilities. The dependence on manual processes and subjective interpretations reduces their effectiveness in proactive, continuous monitoring.

Recent trends demonstrate significant advancements in deep learning, notably through the application of Long Short-Term Memory (LSTM) networks and Natural Language Processing (NLP). These technologies effectively analyze complex sequential data from diverse sources such as social media, text communications, and speech, enabling accurate and timely identification of mental health risks [6]. Applications of these advanced deep learning systems span multiple sectors including healthcare, education, social media platforms, and public health services. They offer real-time monitoring capabilities, personalized interventions, and scalable preventive strategies. These innovations help in delivering proactive mental health services, significantly enhancing the effectiveness of mental health management for the younger generation [7]. However, existing research presents gaps, including limited integration of multimodal data, inadequate real-time analysis capabilities, insufficient interpretability of deep learning models, and unresolved ethical and privacy concerns related to personal data handling. Addressing these gaps is essential for developing robust, scalable, and ethically sound mental health solutions [8].

1.2 Objectives

  • To develop and apply a multimodal deep learning architecture that combines textual, audio, and visual inputs to identify suicidal inclinations at an early stage.

  • To analyze the performance of single and ensemble deep learning models such as the Transformer, Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) in handling multimodal inputs.

  • To increase prediction accuracy through an amalgamation of sentiment analysis and detection of behavioral patterns by applying Natural Language Processing (NLP) methods and attention mechanisms.

  • To bring about quantifiable improvements in evaluation metrics like Accuracy, Precision, Recall, and F1-Score, so that the system is validated in terms of reliability and effectiveness against various data sources.

  • To enable the development of a real-time and scalable suicide risk detection system to assist in active monitoring by healthcare professionals, educators, and online platforms.

2. Structured Approach for Assessing and Managing Suicide Risk

Figure 1 depicts a structured clinical process for identifying and addressing suicide risk. It starts with a referral from Primary Care or Mental Health Specialty services. Referred patients receive a Brief Screening and Severity Assessment for suicide risk and depression. If suicidal ideation is found, a Detailed Suicide Risk Assessment is performed [9].

Figure 1. Flowchart of suicide risk assessment and intervention process

The assessment informs a Safety Plan, supplemented by therapeutic measures such as Psychotherapy, Psychotropic Medication, or Electroconvulsive Therapy (ECT) [10]. Once clinical stabilization is achieved, the patient moves on to Follow-up care. Where the risk is extreme, the process includes Discharge planning after inpatient treatment. The process is designed to provide continuity of care, with an emphasis on early detection and holistic intervention.

2.1 Machine learning and deep learning approaches for suicidal detection

Xie et al. [11] proposed a multimodal approach integrating facial gesture recognition, voice pattern recognition, and text analysis within an Android mobile application to detect suicidal tendencies. Their innovation lies in integrating diverse modalities to provide early warnings directly to an individual's close contacts. However, limitations include potential privacy concerns and dependency on users' frequent interactions with the application. Shilpa et al. [12] investigated temporal patterns of suicidal ideation and behavior on Twitter. Their method identifies specific risk factors and time-sensitive features, contributing practical insights to public health monitoring and timely intervention. Nevertheless, the approach relies primarily on Twitter data, which limits its generalizability to other social platforms or offline behaviors. Muhammad et al. [13] presented a general framework for post-centric classification of suicidal expression on social media, focusing on Twitter posts. Their key innovation was addressing the variability in posting frequency among users. The drawback is the exclusion of multimedia data and informal language analysis, which may limit comprehensive identification. Machine learning models have been increasingly used for suicide detection by integrating multiple modalities such as facial expressions, text patterns, and voice signals [14]. These approaches leverage deep learning architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for classification. In its simplest form, the risk score is a weighted sum of the input features, as given in Eq. (1).

$y=f(X)=\sum_{i=1}^n w_i x_i+b$               (1)

where, $y$ is the suicide risk score, $X=(x_1, \ldots, x_n)$ is the input feature set, $w_i$ are the weights, and $b$ is the bias.
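As a minimal sketch of Eq. (1), the weighted sum can be computed directly. The function name `risk_score` and the feature and weight values below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def risk_score(features, weights, bias=0.0):
    """Eq. (1): y = sum_i w_i * x_i + b."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical, hand-picked values (not from the paper).
x = [0.8, 0.1, 0.5]   # e.g. normalized text, voice, and facial cues
w = [1.5, -0.5, 2.0]  # one weight per feature
print(risk_score(x, w, bias=-1.0))
```

The score is unbounded, so in practice it would usually be passed through a squashing function before being read as a probability.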

2.2 Natural Language Processing for suicide risk prediction

Ye et al. [15] leveraged NLP algorithms like SVM, logistic regression, and CNN for identifying suicidal risks through social media text analysis, achieving high accuracy. Their significant contribution is in demonstrating strong predictive performance. A limitation, however, remains their reliance solely on textual data, neglecting other potential suicide-indicative signals. Ara et al. [16] introduced a novel technique utilizing explanations of predicted outcomes for analyzing suicidal behaviors in longitudinal social media data. The innovative aspect was the use of Layer Integrated Gradients for model interpretability, aiding in preliminary screenings without extensive computational resources. A limitation is the necessity for accurately labeled training datasets, which may be challenging to obtain.

Algarni et al. [17] introduced NLP techniques, specifically employing LSTM and Random Forest classifiers, using curated Reddit data from "SuicideWatch" and "depression" forums. Their contribution included addressing dataset scarcity. Nevertheless, the generalization capability across other platforms and demographics remains untested.

Natural Language Processing (NLP) techniques are used to extract insights from textual data on social media platforms and other digital communication channels. Algorithms such as Support Vector Machines (SVMs) and deep learning-based classifiers analyze linguistic cues associated with suicidal ideation. The probability of suicide risk given the input features can be modeled with the logistic function in Eq. (2).

$P(y \mid X)=\frac{e^{W^T X+b}}{1+e^{W^T X+b}}$               (2)

where, P(y|X) denotes the probability of suicidal risk, and $W^T X+b$ represents the linear transformation of input features.
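Eq. (2) is the logistic (sigmoid) function applied to a linear score. The sketch below evaluates it in a numerically stable way; the names `sigmoid` and `risk_probability` are illustrative assumptions, and the weights would normally come from training rather than being set by hand.

```python
import math


def sigmoid(z):
    """Logistic function e^z / (1 + e^z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def risk_probability(features, weights, bias=0.0):
    """Eq. (2): P(y | X) = sigmoid(W^T X + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)
```

Splitting on the sign of `z` keeps `math.exp` from receiving a large positive argument, which would otherwise raise `OverflowError`.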

Figure 2. Performance comparison of suicidal detection models

Pushpalatha et al. [18] demonstrated an AI- and NLP-based chatbot that promotes mental well-being through customized therapy interactions and activity recommendations. It combines multilingual support with web scraping to increase user participation and improve response accuracy. Hybrid models are used to deliver culturally sensitive and emotionally intelligent mental health care. Its drawbacks include the difficulty of keeping conversations on topic across multiple languages and of maintaining data consistency when incorporating web content.

Figure 2 depicts the relative performance of the four suicidal detection models—ML Only, DL Only, Multimodal CNN-RNN, and the Proposed System—on the measures of Accuracy, Precision, Recall, and F1-Score. The proposed system performs better than conventional methods through the use of multimodal inputs and deep learning techniques [19].

2.3 Multilingual sentiment aggregation

Eq. (3) merges sentiment analysis results from different NLP engines to improve the chatbot’s ability to interpret multilingual emotional expressions [20].

$\mathrm{S}=\left(\mathrm{S}_1+\mathrm{S}_2+\ldots+\mathrm{S}_{\mathrm{n}}\right) / \mathrm{n}$           (3)

where, S is the average sentiment score, S₁ to Sₙ are scores from NLP engines, and n is the total number of sources.

Huang and Cao [21] presented an AI-based small-data platform named CURATE.AI as an optimization tool for dosing of tacrolimus in pediatric liver transplant recipients. The system leverages limited daily patient data for individualization of immunosuppressant dosing for improved therapeutic uniformity. Using six AI models applied retrospectively, the clinical investigation found the optimal AI for dynamic dose optimization. One of its limitations is its reliance on small datasets, which could influence its generalizability and make its performance conditional upon their validation in various populations as well as clinical environments.

2.4 Personalized dose calculation

Eq. (4) helps determine the precise tacrolimus dose required for pediatric liver transplant patients by balancing the target drug concentration with the patient's unique sensitivity profile, enabling real-time adjustments through the CURATE.AI system [22].

$D_t=\theta \times\left(C_{\text{target}} / S_t\right)$            (4)

where, $D_t$ is the dosage at time $t$, $C_{\text{target}}$ is the target drug concentration, $S_t$ is the patient sensitivity score, and $\theta$ is a pharmacokinetic constant.
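A worked instance of Eq. (4) makes the proportionality explicit. The function name `tacrolimus_dose` and all numbers below are arbitrary placeholders chosen so the arithmetic is easy to follow; they are not clinical values and do not come from CURATE.AI.

```python
def tacrolimus_dose(theta, c_target, sensitivity):
    """Eq. (4): D_t = theta * (C_target / S_t)."""
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    return theta * (c_target / sensitivity)


# Placeholder inputs: theta = 0.5, target concentration = 8.0, sensitivity = 2.0
dose = tacrolimus_dose(theta=0.5, c_target=8.0, sensitivity=2.0)
print(dose)  # 0.5 * (8.0 / 2.0) = 2.0
```

Under this formula, doubling the sensitivity score halves the computed dose for the same target concentration.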

Onesimu et al. [23] performed a survey of student experiences in shifting from online learning to onsite learning after COVID-19. The novelty of their work is in determining student preferences, stressors, and challenges in learning modes, suggesting a mix of both as an optimized mode. The work provides insights regarding educational delivery management, health management, and time management. The drawbacks of the work include the subjective nature of survey responses as well as the challenge of standardization of the mix in diverse institutions.

2.5 Hybrid learning preference score

Eq. (5) evaluates students' preference for hybrid learning models by averaging their satisfaction levels in three core areas—lecture engagement, workload handling, and time management—highlighting key factors for post-pandemic educational design [24].

$P_h=\left(L_s+W_s+T_s\right) / 3$           (5)

where, Pₕ is the preference score, Lₛ is lecture satisfaction, Wₛ is perceived workload comfort, and Tₛ is time management efficiency.

Li et al. [25] have researched incorporating artificial intelligence in personalized STEM learning. The effort bridges gaps like the absence of interdisciplinary instruction and limited teacher knowledge through the introduction of an AI-STEM integration framework that can facilitate cognitive learning as well as contextual learning. The approach promotes demand-driven, experiential STEM learning with AI as an accelerator. One of the major drawbacks is the presence of institutionally embedded resistance as well as limited infrastructure in some institutions, which makes its large-scale implementation difficult [26].

3. Integrated Flowchart for Suicide Risk Detection, Assessment, and Intervention

Figure 3 shows an integrated suicide risk assessment and treatment model. The process starts with the Patient, who goes through an Initial Screening in order to Determine Suicide Risk. On detection of risk, the patient goes through Suicide Risk Assessment. Depending on the severity of the risk, the individual is either referred for Hospitalization or for Outpatient Management [27].

Figure 3. Comprehensive suicide risk evaluation and care pathway

After Hospitalization, the patient proceeds through a Discharge step and is then offered ongoing Follow-up care to monitor recovery and detect relapse. The approach routes patients dynamically according to the evaluated risk and combines inpatient and outpatient support for holistic management of mental health [28].

3.1 AI-STEM integration index

The geometric mean in Eq. (6) measures the depth of AI integration into STEM curricula by combining its cognitive impact, adaptability to learners, and relevance to real-world tasks [29].

$\mathrm{I}_{\mathrm{ai}}=\left(\mathrm{C}_{\mathrm{d}} \times \mathrm{A}_{\mathrm{s}} \times \mathrm{T}_{\mathrm{r}}\right)^{1 / 3}$            (6)

where, Iₐᵢ is the integration index, Cd is cognitive development contribution, Aₛ is adaptability score, and Tᵣ is task relevance.
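The choice of a geometric rather than arithmetic mean in Eq. (6) matters when one component is weak. The sketch below, with the hypothetical function name `integration_index` and made-up scores, shows that a zero in any component drives the index to zero, whereas an arithmetic mean would not.

```python
import math


def integration_index(cognitive, adaptability, task_relevance):
    """Eq. (6): cube root of the product of the three component scores."""
    scores = (cognitive, adaptability, task_relevance)
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    return math.prod(scores) ** (1.0 / 3.0)


# Made-up scores: strong on two components, zero task relevance.
scores = (0.9, 0.9, 0.0)
print(integration_index(*scores))  # geometric mean collapses to 0.0
print(sum(scores) / len(scores))   # arithmetic mean stays well above zero
```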

Wang and Liu [30] analyzed older adults' preferences for social robots versus virtual agents in health monitoring. The study measures how the user interface affects trust and involvement in long-term, home-based healthcare solutions. They concluded that user preferences depend on technological affinity, living environment, and personality. The downsides are that such personalized preferences make it hard to deploy a single solution broadly, and that social robots carry a high integration cost.

3.2 Data preprocessing and feature extraction

Text Data (Social Media Posts):

  • Cleaning: Posts were cleaned by removing URLs, hashtags, emojis, punctuation, and stop words.

  • Tokenization & Lemmatization: Cleaned text was tokenized and lemmatized using the NLTK library.

  • Vectorization: Text was vectorized using Word2Vec embeddings, which give dense representations that capture semantic relationships, and TF-IDF weights, which give sparse term-importance features.

  • Padding: To ensure uniform input length for LSTM and Transformer models, sequences were padded to a fixed size.

Audio Data (Speech Recordings):

  • Noise Removal: Background noise was reduced using spectral gating techniques.

  • Segmentation: Audio files were segmented into smaller time frames (e.g., 3–5 seconds) to isolate speech segments.

  • Feature Extraction: Acoustic features such as MFCC (Mel Frequency Cepstral Coefficients), pitch, and energy were extracted using the Librosa library.

  • Normalization: Features were scaled to a standard range to improve convergence during training.

Visual Data (Facial Expression):

  • Frame Extraction: Key frames were extracted from video clips at consistent intervals.

  • Face Detection: Faces were detected using Haar Cascade or Dlib-based detectors, and the facial regions were cropped.

  • Feature Mapping: Facial features were encoded using OpenFace or a CNN-based encoder.

  • Resizing and Normalization: All images were resized to 224×224 and pixel values normalized between 0 and 1.
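The text cleaning, padding, and normalization steps listed above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the authors' pipeline: the regular expressions, the tiny stop-word set, and the helper names `clean_post`, `pad_sequence`, and `min_max_scale` are all hypothetical, and lemmatization, embedding, and audio or image processing are omitted.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # punctuation, digits, emojis
STOP_WORDS = {"a", "an", "the", "and", "to", "of", "is", "i"}  # illustrative only


def clean_post(text):
    """Lowercase, strip URLs/hashtags/symbols, and drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def pad_sequence(ids, max_len, pad_id=0):
    """Truncate or right-pad a list of token ids to exactly max_len."""
    return ids[:max_len] + [pad_id] * max(0, max_len - len(ids))


def min_max_scale(values):
    """Scale values linearly into [0, 1]; constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(clean_post("I feel so alone... #sad https://t.co/xyz"))
print(pad_sequence([5, 6, 7], max_len=5))
print(min_max_scale([2, 4, 6]))
```

The same `min_max_scale` idea covers both the acoustic feature scaling and the pixel normalization to the range 0 to 1, although real pipelines would typically use array libraries for this.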

3.3 Technology affinity score

Eq. (7) quantifies a user’s affinity toward either a social robot or a virtual agent for healthcare monitoring by assessing their interaction preferences, perceived personalization, and available spatial environment [31].

$T_a=\left(I_r \times P_r\right) / S_c$            (7)

where, Tₐ is the technology affinity score, Iᵣ is interest in interaction, Pᵣ is personalization rating, and Sc is spatial constraint.

Semantha et al. [32] presented how AI is used in enhancing communication of patients in healthcare systems. The innovation is based on AI-driven chatbots and virtual assistants that facilitate increased real-time access, scheduling, as well as multilingual support. This enhances patient responsiveness as well as satisfaction while eliminating language barriers. The challenge, though, is in attaining subtle emotional comprehension as well as sustaining ethical AI practice in sensitive health communication.

Figure 4 compares five models (Sentiment Aggregation, Dose Optimization, Hybrid Learning Score, AI-STEM Integration, and Technology Affinity) on Accuracy, Precision, Recall, and F1-Score. The dose optimization and AI-STEM models show higher performance, reflecting their suitability for personalized healthcare and adaptive learning environments.

Figure 4. Performance analysis of NLP-based suicidal risk prediction models

3.4 Query response relevance

An NLP-based scoring function ranks the relevance of chatbot responses in healthcare by summing the weighted significance of individual input tokens, improving the clarity and usefulness of AI-generated answers [33], as given in Eq. (8).

$\mathrm{Q}_{\mathrm{r}}=\sum\left(\mathrm{x}_{\mathrm{i}} \times \mathrm{w}_{\mathrm{i}}\right)$          (8)

where, Qᵣ is the response relevance score, xᵢ is token i from user input, and wᵢ is its contextual importance weight.

Bian et al. [34] have suggested an innovative Convolutional Neural Network (CNN) for speech emotion recognition (SER) with the aim of improved human-machine interaction. The proposed model efficiently distinguishes seven emotional states of speech signals from different datasets with 88.76% accuracy. The innovation is highly useful in areas like mental health tracking and personalized voice assistants. The system's limitation is that it can be sensitive to noise in the environment and diverse accents, which may degrade emotion detection accuracy in real environments [35].

3.5 Emotion classification via CNN

This classification function applies a softmax operation to CNN-derived features to identify emotional states from speech inputs, enabling real-time emotion recognition in smart systems [36], as given in Eq. (9).

$\mathrm{E}_{\mathrm{c}}=\operatorname{softmax}\left(\mathrm{W}_{\mathrm{c}} \times \mathrm{F}+\mathrm{b}_{\mathrm{c}}\right)$           (9)

where, Ec is the emotion probability vector, Wc is the convolution weight matrix, F is the feature set, and bc is the bias vector.
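Eq. (9) can be sketched as a linear layer followed by a softmax. In the illustration below the weight matrix is a plain list of rows, one per emotion class, and the names `softmax` and `emotion_probabilities` as well as the toy values are assumptions; the feature vector would in practice come from the CNN.

```python
import math


def softmax(logits):
    """Softmax with the maximum subtracted first for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def emotion_probabilities(weights, features, bias):
    """Eq. (9): E_c = softmax(W_c F + b_c), one row of W_c per class."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy example: three classes, two features (values are arbitrary).
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
F = [1.0, 2.0]
b = [0.0, 0.0, 0.0]
probs = emotion_probabilities(W, F, b)
print(probs, sum(probs))
```

The outputs are non-negative and sum to one, so the largest entry can be read as the predicted emotion class.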

Ji et al. [37] examined the introduction of AI-driven avatars known as digital humans in learning environments as personalized learning experience providers. As opposed to common chatbots, digital humans mimic human feelings and react in real-time, offering individualized learning support. This increases student motivation and inclusivity in various styles of learning. While its potential is great, the main limitation is the technological as well as cost investment involved in large-scale deployment in under-resourced schools.

3.6 Avatar emotional match score

Eq. (10) evaluates how well a digital human avatar responds to users by combining emotional context, interaction content, and timing, supporting personalized engagement in virtual classrooms or workplaces [38].

$\mathrm{R}_{\mathrm{a}}=\alpha \mathrm{E}_{\mathrm{u}}+\beta \mathrm{C}_{\mathrm{i}}+\gamma \mathrm{T}_{\mathrm{r}}$            (10)

where, Rₐ is the avatar response alignment score, Eᵤ is the emotional similarity, Cᵢ is content relevance, Tᵣ is timing appropriateness, and α, β, γ are tuned weights.

Liu et al. [39] created "StartleMart," a game of treatment and diagnosis for PTSD, which utilizes stress detection through skin conductance in order to profile the psychological health of patients in the course of playing. The innovation resides in the integration of affective computing with game-based exposure therapy, providing an interactive and innovative solution for evaluating mental health. The efficacy of the method, though, is conditional on physiological sensor accuracy as well as the ability to translate game-based stress reaction into clinically useful data [40].

3.7 Game-based stress correlation

The Pearson correlation in Eq. (11) is used to examine the relationship between a player's physiological stress (e.g., skin conductance) and in-game traumatic events, helping to identify PTSD patterns [41].

$Sc_{\text{orr}}=\operatorname{Cov}\left(G_s, S_e\right) /\left(\sigma_{G_s} \times \sigma_{S_e}\right)$            (11)

where, $Sc_{\text{orr}}$ is the correlation score, $G_s$ is the galvanic skin signal, $S_e$ is the stress intensity of an event, and $\sigma_{G_s}$ and $\sigma_{S_e}$ are their standard deviations.
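A minimal sketch of the Pearson correlation in Eq. (11), computed from paired samples, is shown below. The function name `pearson` and the sample values are hypothetical; the common 1/n factors in the covariance and standard deviations cancel, so they are left out.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: Cov(x, y) / (sigma_x * sigma_y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Made-up paired readings: skin conductance vs. event stress intensity.
skin = [0.2, 0.4, 0.5, 0.9]
stress = [1.0, 2.0, 2.0, 4.0]
print(pearson(skin, stress))
```

Values near +1 would indicate that skin conductance rises with event intensity, values near 0 no linear relationship, and values near -1 an inverse relationship.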

4. ELMo-Based Hierarchical Architecture for Suicide Risk Prediction

Figure 5 shows a hierarchical deep learning approach that appraises suicide risk from users' written posts. The approach begins with word-level inputs from multiple posts [42]. Each word is mapped to its contextual representation in the form of ELMo embeddings.

Zhu et al. [43] introduced "MindWard," a machine learning-based web tool for emotional enhancement that analyzes users' behavior from questionnaire responses. Using Azure Studio's features, the system generates personalized feedback and recommendations for emotional enhancement. The system is intended to aid personal development by identifying behavioral patterns. Its reliance on user honesty and cooperation in self-reports, though, could limit the accuracy and extent of the emotional analysis.

Figure 5. Text processing pipeline for suicide risk detection using deep embedding and averaging

4.1 Wellness score aggregation

The weighted sum in Eq. (12) estimates emotional well-being by aggregating user responses from a digital questionnaire, with weights reflecting the significance of each item [44].

$\mathrm{E}_{\mathrm{W}}=\sum\left(\mathrm{R}_{\mathrm{i}} \times \mathrm{W}_{\mathrm{i}}\right)$          (12)

where, Ew is the overall emotional wellness score, Rᵢ is the response rating for question i, and Wᵢ is the corresponding weight.

Hao et al. [45] suggested a conversational AI system that helps streamline healthcare activities with automation of appointment scheduling, symptom triage, and patient tracking. Based on the RASA framework and external APIs, the system solves healthcare inefficiencies while reducing bottlenecks in healthcare. Although the approach is novel in its elimination of administrative tasks as well as in offering telecare, issues involve guaranteeing the privacy of the collected data, integration with current hospital systems, as well as the management of emergency situations safely [46].

4.2 AI efficiency gain

The efficiency metric in Eq. (13) quantifies how much faster an AI-based conversational agent performs healthcare operations than traditional manual methods [47].

$E_{a i}=\left(T_c-T_a\right) / T_c$               (13)

where, Eₐᵢ is the AI efficiency score, Tc is the time taken by conventional systems, and Tₐ is the time taken using AI support.
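As a worked instance of Eq. (13), the relative time saving can be computed as follows. The function name `efficiency_gain` and the timings are hypothetical examples, not measurements from the cited system.

```python
def efficiency_gain(t_conventional, t_ai):
    """Eq. (13): E_ai = (T_c - T_a) / T_c, the fraction of time saved."""
    if t_conventional <= 0:
        raise ValueError("conventional time must be positive")
    return (t_conventional - t_ai) / t_conventional


# Hypothetical task: 20 minutes manually, 5 minutes with AI support.
print(efficiency_gain(20.0, 5.0))  # (20 - 5) / 20 = 0.75
```

A result of 0.75 means three quarters of the conventional time is saved; the value becomes negative if the AI-supported workflow is slower than the manual one.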

Figure 6 compares five models (Sentiment Aggregation, Dose Optimization, Hybrid Learning Score, AI-STEM Integration, and Technology Affinity) in a line graph of Accuracy, Precision, Recall, and F1-Score [48]. The dose optimization and AI-STEM models perform better, indicating their applicability to personalized health care and adaptive learning systems [49].

Figure 6. Performance analysis of AI-driven models in healthcare and mental wellness

5. Multimodal and Clinical Data-Based Suicide Prediction

Zhang and Yang [50] presented predictive modeling of suicidal tendencies by integrating clinical, psychosocial, and biological markers. Their innovation included comprehensive integration of clinical and psychosocial data. However, their approach requires extensive data preprocessing and relies significantly on clinical data availability, limiting its generalizability.

Wang et al. [51] applied LSTM models combined with Global Vector Spaces (GVS) word embeddings to identify suicidal tendencies from Twitter posts. The technique excelled in capturing linguistic subtleties indicative of suicidal tendencies, achieving high accuracy. However, its limitation includes a dependency on linguistic cues only, potentially missing non-verbal behavioral markers. Xu et al. [52] explored the use of structured and unstructured Electronic Health Records (EHRs) employing Random Forest, gradient boosting trees, and LSTM for predicting mental health crises among individuals with depression. Their work uniquely combines structured and unstructured clinical data for prediction. A drawback is the reliance on comprehensive clinical datasets, possibly limiting broader applicability. Zheng et al. [53] analyzed multimodal biomarkers (facial, vocal, linguistic, cardiovascular) extracted from remote interviews to detect psychiatric conditions. This multimodal analysis is their core innovation, providing robust diagnostic capabilities remotely. The limitation lies in moderate performance of individual modalities independently.

Multimodal approaches integrate textual, vocal, and physiological signals from clinical and real-time data sources [54]. Models such as Long Short-Term Memory (LSTM) networks are often used to analyze temporal dependencies in suicide-related data. The recurrent hidden-state update underlying such models is given in Eq. (14).

$h_t=\sigma\left(W_h h_{t-1}+W_x x_t+b\right)$               (14)

where, $h_t$ is the hidden state at time $t$, $x_t$ is the input, $W_h$ and $W_x$ are weight matrices, $b$ is the bias, and $\sigma$ is the activation function.
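The update in Eq. (14) can be sketched as a single recurrent step applied across a sequence. Note that, as written, Eq. (14) is a simple recurrent update; a full LSTM cell adds input, forget, and output gates and a separate cell state, which are not shown here. The helper names (`matvec`, `recurrent_step`, `run_sequence`), the choice of the logistic function for σ, and the toy weights are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def recurrent_step(h_prev, x_t, w_h, w_x, b):
    """Eq. (14): h_t = sigma(W_h h_{t-1} + W_x x_t + b)."""
    rec = matvec(w_h, h_prev)
    inp = matvec(w_x, x_t)
    return [sigmoid(r + i + bi) for r, i, bi in zip(rec, inp, b)]


def run_sequence(xs, w_h, w_x, b):
    """Apply the step over a sequence, starting from a zero hidden state."""
    h = [0.0] * len(b)
    for x_t in xs:
        h = recurrent_step(h, x_t, w_h, w_x, b)
    return h


# Toy setup: one hidden unit, one input feature, two time steps.
print(run_sequence([[1.0], [2.0]], w_h=[[0.5]], w_x=[[1.0]], b=[0.0]))
```

Because each step feeds the previous hidden state back in, the final state depends on the order of the inputs, which is what lets such models capture temporal patterns.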

6. Emerging Digital Technologies for Mental Health Monitoring

Upadhyay et al. [55] introduced the ChAMP system for identifying early childhood mental health disorders using mobile data collection and digital phenotyping. Their innovative method facilitates accessible, objective mental health evaluations in young children. A potential limitation is moderate accuracy (70–73%), indicating room for improvement in model performance. Almahmoud et al. [56] explored factors influencing university students' acceptance of digital mental health tools, highlighting technology self-efficacy and digital alliance as key mediators. Their innovative contribution provides insights into user adoption dynamics. However, the practical effectiveness in long-term engagement remains uncertain [57].

Lee et al. [58] developed a dialogue system based on the Digital Twin concept to detect early signs of mental illnesses, offering personalized feedback. This innovative approach emphasizes real-time, personalized assessment. Yet, a limitation is relatively moderate accuracy (69%), indicating room for improvement. Tlachac and Heinz identified communication profiles through mobile data from crowdsourced participants to study their relationship with depression and anxiety levels. Their descriptive modeling sheds light on behavioral patterns relevant for mental health diagnostics. Nevertheless, the generalizability of crowdsourced samples to clinical populations poses a significant limitation [59].

Bhattacharyya et al. [60] examined risks and opportunities of digital psychiatry applications across sectors such as education, employment, and financial services. Their innovation emphasizes ethical considerations essential to public health. However, actual implementation effectiveness outside controlled environments requires further validation.

Alimour et al. [61] systematically explored metaverse technologies supporting mental health interventions, mapping innovations to global health goals. Their comprehensive mapping is significant for future applications. However, addressing ethical, legal, and access-related challenges is essential for broader acceptance. Patias et al. [62] conducted comparative analysis of NLP and deep learning techniques like BERT, RoBERTa, and XLNet, demonstrating efficacy in diagnosing mental health conditions through social media data. Their innovation lies in high accuracy from advanced NLP models. Nevertheless, limitations relate to specificity and generalizability across varied contexts.

Vuyyuru et al. [63] introduced digital worker systems within Cyber-Physical-Social Systems (CPSS) for effective management of mental health and performance in manufacturing environments. Their innovation includes leveraging intelligent methods for comprehensive worker management. However, generalization beyond manufacturing sectors might be limited. Lee [64] presented the COTIDIANA dataset, a smartphone-collected data resource focusing on mobility, finger dexterity, and mental health among individuals with Rheumatic and Musculoskeletal Diseases (RMDs). Their innovative dataset provides valuable metrics for passive monitoring. The limitation includes small sample size, potentially affecting broader applicability and validation.

Digital technologies, including metaverse applications, digital twins, and IoT-based health monitoring systems, are being adopted for proactive mental health management [65]. AI-based tools analyze behavioural and biometric data to assess and predict suicidal tendencies; the variability of these data is expressed by Eq. (15).

$D_{\text {risk }}=\frac{1}{n} \sum_{i=1}^n\left(x_i-\mu\right)^2$                (15)

where $D_{\text{risk}}$ represents the variance in mental health patterns, $x_i$ are individual data points, $\mu$ is their mean, and $n$ is the number of data points.
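Eq. (15) is the population variance, which can be sketched in a few lines. The function name and the daily mood scores below are hypothetical and serve only to show that a more erratic signal yields a larger $D_{\text{risk}}$.

```python
def risk_variance(values):
    """Population variance D_risk = (1/n) * sum((x_i - mu)^2), as in Eq. (15)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one data point is required")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

# Hypothetical daily mood scores on a 1-10 scale.
stable = [6, 6, 7, 6, 6]
volatile = [2, 9, 3, 8, 1]

d_stable = risk_variance(stable)
d_volatile = risk_variance(volatile)
```

Here `d_volatile` is much larger than `d_stable`, although variance alone says nothing about the direction of change, so it would likely be combined with other indicators in practice.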

Venugopal et al. [66] developed the Biopsychosocial AI-Driven Digital Twin (BADT) approach, which integrates AI with biological, psychological, and social information to create real-time digital twins of patients. The system offers predictive analytics for early diagnosis and simulated treatment in mental health care, providing a path toward personalized psychiatry [67]. Its main limitations are the reliance on wearable-device data as model input and unresolved concerns about privacy and ethical fairness.

Figure 7 outlines the performance measures of different AI-based models with clinical, biometric, and digital communication inputs [68]. LSTM, digital twins, advanced NLP, and smartphone-based systems are compared in terms of accuracy, precision, recall, and F1-Score. Of these, LSTM-based and advanced NLP models have the highest prediction accuracy, whereas systems such as ChAMP and digital twins achieve more moderate accuracy.

Figure 7. Performance evaluation of multimodal and digital technologies in mental health monitoring

6.1 Patient condition forecasting

Eq. (16) is used in the BADT framework to compute a patient’s overall mental health condition by linearly combining biological, psychological, and social factors, enabling real-time digital twin updates for personalized care [69].

$C_t=\alpha B_t+\beta P_t+\gamma S_t$                 (16)

where $C_t$ is the condition score at time $t$, $B_t$ is the biological input, $P_t$ is the psychological input, $S_t$ is the social input, and $\alpha$, $\beta$, $\gamma$ are the respective weights.
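The linear combination in Eq. (16) can be sketched as follows. The default weights and the sample readings are hypothetical; the source does not state how $\alpha$, $\beta$, and $\gamma$ are chosen, so the values here only illustrate the update.

```python
def condition_score(biological, psychological, social,
                    alpha=0.4, beta=0.4, gamma=0.2):
    """C_t = alpha*B_t + beta*P_t + gamma*S_t, as in Eq. (16).

    The default weights are illustrative assumptions, not published values.
    """
    return alpha * biological + beta * psychological + gamma * social

# Hypothetical normalized inputs (B_t, P_t, S_t) at three successive time steps.
readings = [(0.2, 0.3, 0.1), (0.4, 0.6, 0.3), (0.7, 0.8, 0.5)]
scores = [condition_score(b_t, p_t, s_t) for b_t, p_t, s_t in readings]
```

Recomputing the score at every time step is what allows a digital twin to track the patient's condition as new inputs arrive.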

Oguntimilehin et al. [70] introduced Large Language Models (LLMs) in the Erasmus+ me_HeLi-D project to promote adolescent mental health literacy and awareness of diversity. The LLMs provide individualized, real-time, and culturally sensitive assistance in mental health education. However, maintaining data protection, reducing algorithmic bias, and the frequent need to retrain models are major drawbacks.

6.2 LLM adaptability score

The adaptability score measures how effectively a large language model (LLM) adapts to user input in mental health learning environments by evaluating the weighted relevance of multiple response options, as given by Eq. (17).

$A_s=\left(\sum_i w_i \times r_i\right) / N$            (17)

where $A_s$ is the adaptability score, $w_i$ is the weight for response $i$, $r_i$ is the relevance of response $i$, and $N$ is the total number of responses.
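A short sketch of Eq. (17) is given below. The function name, the weights, and the relevance values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def adaptability_score(weights, relevances):
    """A_s = (sum_i w_i * r_i) / N, as in Eq. (17), with N the number of responses."""
    if len(weights) != len(relevances):
        raise ValueError("weights and relevances must have the same length")
    n = len(relevances)
    if n == 0:
        raise ValueError("at least one response is required")
    return sum(w * r for w, r in zip(weights, relevances)) / n

# Hypothetical example: three candidate responses to one user prompt.
weights = [1.0, 0.5, 0.5]
relevances = [0.9, 0.6, 0.2]
a_s = adaptability_score(weights, relevances)
```

Because Eq. (17) divides by $N$ rather than by the sum of the weights, the score is not a normalized weighted mean, so scores are only comparable when the same weighting scheme is used.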

Raut et al. [71] created an empathetic chatbot using Natural Language Processing (NLP) and sentiment analysis for personalized virtual support. The system can manage such conditions as anxiety, depression, and stress through an ability to read the user’s emotions [72]. The main disadvantage is its limited ability in decoding complex or ambiguous emotional inputs, which can lead to decreased therapeutic efficacy.

6.3 Sentiment probability computation

In chatbot design, Eq. (18) is used to compute the probability of each possible emotion class, allowing the system to deliver contextually sensitive and emotionally accurate replies [73].

$\mathrm{P}=\operatorname{softmax}(\mathrm{Wx}+\mathrm{b})$             (18)

where $\mathrm{P}$ is the emotion class probability vector, $\mathrm{W}$ is the weight matrix, $\mathrm{x}$ is the input feature vector, and $\mathrm{b}$ is the bias term.
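Eq. (18) can be illustrated with a small pure-Python sketch. The emotion labels, weights, and feature vector are hypothetical; a deployed chatbot would learn $\mathrm{W}$ and $\mathrm{b}$ from data. The softmax subtracts the maximum score before exponentiating, a common numerical-stability measure that does not change the result.

```python
import math

def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def emotion_probabilities(W, x, b):
    """P = softmax(W x + b), as in Eq. (18)."""
    logits = [sum(w * v for w, v in zip(row, x)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)

EMOTIONS = ["calm", "anxious", "sad"]          # hypothetical class labels
W = [[0.2, -0.5], [0.8, 0.1], [-0.3, 0.9]]     # one row of weights per class
b = [0.0, 0.1, -0.1]
x = [1.0, 2.0]                                 # hypothetical feature vector

P = emotion_probabilities(W, x, b)
predicted = EMOTIONS[P.index(max(P))]          # class with the highest probability
```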

Kadao and Balkrishna [74] proposed a healthcare digital twin reference platform that integrates real-world and simulated data to forecast stress risk in emotionally demanding jobs. The innovation is its multilayered architecture, which individualizes intervention through virtual modeling. However, the platform requires accurate synchronization of multimodal data and may be difficult to deploy in practice because of its architectural complexity.

6.4 Stress risk estimation

Eq. (19) combines emotional, behavioral, and environmental indicators to predict the likelihood of stress in professionals, forming a core part of the real-time stress-monitoring digital twin system [75].

$\mathrm{R}_{\mathrm{s}}=\delta(\mathrm{E}+\mathrm{M}+\mathrm{B})$            (19)

where $\mathrm{R}_{\mathrm{s}}$ is the calculated stress score, $\mathrm{E}$ is the environmental factor index, $\mathrm{M}$ is the measured emotion, $\mathrm{B}$ is the behavior observation score, and $\delta$ is a normalization constant.

Hakani et al. [76] developed a wearable stress-tracking smart band with context-aware sensor integration for monitoring physiological signals such as skin resistance and pulse rate. Targeted at adolescents, the band supports real-time mental health evaluation and management. Its limitations are the dependence on user compliance and susceptibility to environmental interference, both of which could affect sensor accuracy and data reliability.

6.5 Wearable sensor fusion score

The fusion score gives a single stress indicator by averaging physiological signals from multiple wearable sensors, as given by Eq. (20).

$\mathrm{S}_{\mathrm{f}}=(\mathrm{HR}+\mathrm{GSR}+\mathrm{Temp}) / 3$               (20)

where $\mathrm{S}_{\mathrm{f}}$ is the fused stress signal, $\mathrm{HR}$ is heart rate, $\mathrm{GSR}$ is galvanic skin response, and $\mathrm{Temp}$ is skin temperature.
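Heart rate, skin conductance, and temperature are measured in different units, so averaging raw readings as in Eq. (20) would let the largest-valued signal dominate. The sketch below therefore assumes each reading is first min-max scaled to [0, 1]; this normalization step and the sensor ranges are assumptions added for illustration and are not described in the source.

```python
def min_max(value, low, high):
    """Scale value into [0, 1] for an assumed sensor range, clamping outliers."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (value - low) / (high - low)))

# Hypothetical sensor ranges; real devices would need calibrated bounds.
RANGES = {
    "hr": (50.0, 150.0),    # heart rate, beats per minute
    "gsr": (0.0, 20.0),     # galvanic skin response, microsiemens
    "temp": (30.0, 38.0),   # skin temperature, degrees Celsius
}

def fused_stress(hr, gsr, temp):
    """S_f = (HR + GSR + Temp) / 3 on normalized readings, after Eq. (20)."""
    parts = [
        min_max(hr, *RANGES["hr"]),
        min_max(gsr, *RANGES["gsr"]),
        min_max(temp, *RANGES["temp"]),
    ]
    return sum(parts) / 3.0

score = fused_stress(hr=100.0, gsr=10.0, temp=34.0)
```

An unweighted mean treats all three sensors as equally informative; a weighted variant would be a natural extension if some signals prove more reliable than others.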

Klaas et al. [77] developed an AI-based mental health chatbot that uses deep learning and Natural Language Processing methods, namely LSTM and Bi-LSTM, to generate context-specific responses. Its approach preprocesses text data, extracts features, and offers empathetic responses based on the user's emotional state. Although Bi-LSTM was better at keyword extraction, LSTM was chosen because it produced responses more effectively [78, 79]. The major limitation is the heavy computational requirement and complexity of Bi-LSTM, which resulted in elevated validation loss and poorer conversational output.

6.6 LSTM-based context generation

Eq. (21) updates the chatbot's memory state in LSTM-based models, helping it preserve conversational flow and respond appropriately to emotional cues in ongoing dialogues.

$h_t=\tanh \left(W_x x_t+W_h h_{t-1}+b\right)$             (21)

where $h_t$ is the current hidden state vector, $x_t$ is the input at time $t$, $W_x$ and $W_h$ are weight matrices, $h_{t-1}$ is the previous hidden state, and $b$ is the bias.

Tan et al. [80] launched "Soulmind," an anxiety detection chatbot with real-time intervention. The innovation entails personalized support mechanisms that accurately identify anxiety as low, moderate, or high with an 85% accuracy rate, offering users coping mechanisms as well as professional resources. Although the system guarantees anonymity and promotes early mental health help, its drawbacks are the possibility of misclassification from model bias as well as the requirement for regular internet connectivity and user digital proficiency.

6.7 Softmax-based anxiety level detection

This function calculates the probability of each anxiety level (low, moderate, high) using softmax classification, as given by Eq. (22).

$A_i=\exp \left(z_i\right) / \sum_j \exp \left(z_j\right)$          (22)

where $A_i$ is the probability for class $i$, $z_i$ is the score for class $i$, and the denominator is the sum of exponentials over all classes $j$.
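As a worked illustration of Eq. (22), assume hypothetical class scores $z=(1, 2, 3)$ for the low, moderate, and high anxiety levels (these values are invented for the example and do not come from the cited system). The denominator is

$\sum_j \exp \left(z_j\right)=e^1+e^2+e^3 \approx 2.718+7.389+20.086=30.193$

and the class probabilities follow as

$A_{\text{low}} \approx 2.718 / 30.193 \approx 0.090, \quad A_{\text{moderate}} \approx 7.389 / 30.193 \approx 0.245, \quad A_{\text{high}} \approx 20.086 / 30.193 \approx 0.665$

The three probabilities sum to 1, and the class with the largest score receives the largest probability, so this input would be labeled as high anxiety.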

Jayawardena et al. [81] examined the use of GPT-3-powered chatbots to increase worker engagement in virtual and blended work environments. The chatbots communicate with workers much as human coworkers would, providing services related to HR policies, well-being, and training, and collecting feedback. The method promotes inclusivity, immediate help, and individualized professional development. The key disadvantages are the reliance on data protection mechanisms and the potential for misinterpreting emotional signals, which could reduce the depth of human relationships in corporate communication [82].

6.8 Employee chatbot engagement score

Eq. (23) calculates overall employee engagement during interactions with a GPT-3 powered chatbot by weighting interaction frequency against response quality.

$E=\left(\sum\left(f_i \times r_i\right)\right) / n$          (23)

where $E$ is the engagement index, $f_i$ is the interaction frequency, $r_i$ is the responsiveness score, and $n$ is the number of interactions.

Jiaen et al. [83] created "Revivify," an extensive depression detection and management system that scans user tweets and health questionnaire responses for anxiety and depression levels. Using machine learning algorithms such as Random Forest and Latent Dirichlet Allocation, the system classifies mental health status into nine categories and offers users advice and helpline links. Although it is an inexpensive and proactive digital aid, the system depends on social media usage and text-based sentiment analysis, which could introduce biases and miss psychological subtleties.

6.9 Depression score via ensemble learning

This ensemble method aggregates predictions from multiple classifiers to generate a single depression risk score from user tweets and DASS responses, as given by Eq. (24).

$\mathrm{D}=\mathrm{w}_1 \mathrm{~N}_1+\mathrm{w}_2 \mathrm{~N}_2+\mathrm{w}_3 \mathrm{~N}_3$           (24)

where $\mathrm{D}$ is the final depression score, $\mathrm{N}_1$ is the output of a neural network, $\mathrm{N}_2$ of an LDA model, and $\mathrm{N}_3$ of a random forest model, and $\mathrm{w}_1$, $\mathrm{w}_2$, $\mathrm{w}_3$ are the respective weights.

Figure 8. Performance comparison of AI-based mental health chatbots and monitoring models

Figure 8 is a bar graph comparing the performance of different models for mental health prediction, chatbot usage, stress estimation, and personalized emotion detection. The depression ensemble model and the LSTM chatbot models have superior F1-Score and accuracy values, which shows their efficiency in real-time mental health evaluation and support systems.

7. Result and Discussion

The experimental comparison evaluated three models (CNN, LSTM, and Transformer) on multimodal data composed of social media text, voice data, and facial features. The Transformer achieved the best accuracy at 94.2%, followed by LSTM at 91.7% and CNN at 89.5%. The Transformer's advantage is attributed to its self-attention mechanism, which captures long-distance dependencies and context in text sequences. Unlike LSTM, which processes its input sequentially, the Transformer processes data in parallel, allowing it to model the complex language structure of social media posts more effectively. LSTM benefited from its ability to capture temporal dynamics, particularly in voice and text data. However, its sequential structure results in slower training and gradual loss of information over long sequences, which left it slightly behind the Transformer. CNN performed competitively at extracting spatial features from facial expressions and spectrograms. Its weakness lies in capturing contextual and temporal variation, which is essential for detecting subtle emotional signals in speech and text. The performance gap also reflects the complementary benefits of multimodal data fusion: models trained on combined modalities consistently outperformed single-modality models, supporting the hypothesis that diverse behavioral signals improve risk detection.
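The self-attention mechanism credited here for the Transformer's performance can be sketched as scaled dot-product attention. This is a minimal illustration under simplifying assumptions: queries, keys, and values are all taken to be the raw token embeddings (no learned projections, single head), and the toy embeddings are hypothetical. It is not the architecture used in the experiments.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over one sequence.

    Each argument is a list of vectors, one per token. Returns the attended
    vectors and the attention weight matrix (one row per query token).
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this token to every token, scaled by sqrt(d_k).
        scores = [sum(a * c for a, c in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of value vectors, computed per output dimension.
        outputs.append([sum(w * v[j] for w, v in zip(row, values))
                        for j in range(len(values[0]))])
    return outputs, weights

# Toy 3-token sequence with 2-dimensional embeddings (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(tokens, tokens, tokens)
```

Every token attends to every other token in a single step regardless of distance, and the rows of `attn` are independent of one another, which is why the computation parallelizes where a recurrent model must proceed step by step.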

Table 1 outlines the key experimental parameters used in the deep learning-based suicide risk detection system. It includes dataset details, feature extraction methods, model architectures, training strategies, evaluation metrics, and hyperparameters essential for optimizing performance.

Table 1. Experimental setup for suicide risk detection

| Sl. No. | Particular | Value |
|---|---|---|
| 1 | Dataset Used | Social Media Suicide Risk Dataset |
| 2 | Feature Extraction Method | TF-IDF, Word2Vec, Spectrogram Analysis |
| 3 | Model Architecture | CNN, LSTM, Transformer Models |
| 4 | Training Algorithm | Adam Optimizer, Stochastic Gradient Descent |
| 5 | Evaluation Metrics | Accuracy, Precision, Recall, F1-Score |
| 6 | Number of Layers | 4 (for CNN), 2 (for LSTM) |
| 7 | Activation Function | ReLU, Sigmoid |
| 8 | Learning Rate | 0.001 |
| 9 | Batch Size | 32 |
| 10 | Number of Epochs | 50 |

Figure 9. Performance comparison of deep learning models for suicide risk detection

Figure 10. Alternative evaluation metrics for deep learning models

Figure 9 illustrates the performance of CNN, LSTM, and Transformer models based on four evaluation metrics: Accuracy, Precision, Recall, and F1-Score. Transformer shows the highest values across all metrics, confirming its effectiveness in detecting suicidal patterns using multimodal data.
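For reference, the four metrics named above can be computed from binary predictions as sketched below. The labels are hypothetical toy data (1 for at-risk, 0 for not at-risk) and are unrelated to the reported results; the function name is likewise an illustrative choice.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-Score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions for eight posts.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In a screening setting, recall deserves particular attention because a false negative corresponds to an at-risk individual who goes undetected.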

Figure 10 compares CNN, LSTM, and Transformer models using Specificity, Sensitivity, AUC, and Loss. The Transformer model outperforms others, particularly in AUC and Loss, demonstrating its robustness in suicide risk detection.

Figure 11 presents the comparison of CNN, LSTM, and Transformer models based on training time, validation accuracy, overfitting score, and robustness. The Transformer model demonstrates strong robustness and accuracy with low overfitting.

Figure 11. Training efficiency and robustness evaluation

Figure 12 presents the average performance scores of CNN, LSTM, and Transformer models. The Transformer model exhibits the highest overall performance, highlighting its effectiveness in suicide risk detection tasks.

Figure 12. Overall performance comparison of deep learning models

8. Conclusion

This study introduces a multimodal deep learning framework for identifying suicidal behavior at an early stage by integrating text, audio, and facial expression data. It uses CNN, LSTM, and Transformer models, with the Transformer reaching the highest accuracy of 94.2% on the Social Media Suicide Risk Dataset. The results indicate that combining behavioral signals from multiple modalities substantially improves prediction performance and offers a scalable, data-driven way to monitor mental well-being risk. Despite this promise, the research has limitations. The framework currently relies on pre-collected datasets, which may not fully represent real-world suicidal behaviors across cultures and languages. The computational cost of the Transformer model may also hinder real-time execution on low-resource devices. In addition, the use of publicly available social network and audio data raises privacy and ethical issues that must be resolved before real-world use. Future research will broaden the dataset with real-time and multilingual data, incorporate privacy-preserving data handling techniques, and optimize the model for real-time, edge-based deployment. Explainable AI (XAI) methods will also be a focus, to improve the interpretability of predictions so that mental health professionals can make well-informed and transparent decisions.

Acknowledgments

This research work is partially supported by Siddaganga Institute of Technology, Tumkur, and JSSATEB AICTE IDEA Lab, Bengaluru. We sincerely acknowledge their invaluable support and encouragement, which motivated us to undertake this research and publish this paper. Their guidance and resources have significantly contributed to the successful completion of this work.

  References

[1] Wang, H., Huang, L., Yu, K., Song, T., Yang, H., Zhang, H. (2023). Deep-learning-based multiview RGBD sensor system for 3-D face point cloud registration. IEEE Sensors Letters, 7(5): 1-4. https://doi.org/10.1109/LSENS.2023.3267948

[2] Tan, P.X., Hoang, D.C., Nguyen, A.N., Nguyen, V.T., et al. (2024). Attention-based grasp detection with monocular depth estimation. IEEE Access, 12: 65041-65057. https://doi.org/10.1109/ACCESS.2024.3397718

[3] Venkatesh, D.Y., Mallikarjunaiah, K., Srikantaswamy, M. (2025). Efficient reconfigurable parallel switching for low-density parity-check encoding and decoding. IAES International Journal of Artificial Intelligence (IJ-AI), 14(1): 260-269.

[4] Laga, H., Jospin, L.V., Boussaid, F., Bennamoun, M. (2020). A survey on deep learning techniques for stereo-based depth estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(4): 1738-1764. https://doi.org/10.1109/TPAMI.2020.3032602

[5] Shankara, K.H., Srikantaswamy, M., Nagaraju, S. (2025). An efficient load-balancing in machine learning-based DC-DC conversion using renewable energy resources. IAES International Journal of Artificial Intelligence (IJ-AI), 14(1): 307-316.

[6] Honnegowda, J., Mallikarjunaiah, K., Srikantaswamy, M. (2025). Efficient reduction of computational complexity in video surveillance using hybrid machine learning for event recognition. IAES International Journal of Artificial Intelligence (IJ-AI), 14(1): 317-326. 

[7] Urs, P.M., Reddy, A.T.N., Mallikarjunaswamy, S., Lakshminarayan, U.M. (2025). An innovative IoT framework using machine learning for predicting information loss at the data link layer in smart networks. Engineering, Technology & Applied Science Research, 15(2): 20904-20911. https://doi.org/10.48084/etasr.9597

[8] Hu, J., Bao, C., Ozay, M., Fan, C., Gao, Q., Liu, H., Lam, T.L. (2022). Deep depth completion from extremely sparse data: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(7): 8244-8264. https://doi.org/10.1109/TPAMI.2022.3229090

[9] Ren, L., Lu, J., Feng, J., Zhou, J. (2019). Uniform and variational deep learning for RGB-D object recognition and person re-identification. IEEE Transactions on Image Processing, 28(10): 4970-4983. https://doi.org/10.1109/TIP.2019.2915655

[10] Han, Z., Yu, S., Lin, S.B., Zhou, D.X. (2020). Depth selection for deep ReLU nets in feature extraction and generalization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(4): 1853-1868. https://doi.org/10.1109/TPAMI.2020.3032422

[11] Xie, Z., Yu, X., Gao, X., Li, K., Shen, S. (2022). Recent advances in conventional and deep learning-based depth completion: A survey. IEEE Transactions on Neural Networks and Learning Systems, 35(3): 3395-3415. https://doi.org/10.1109/TNNLS.2022.3201534

[12] Shilpa, M., Yadav, R., Patil, A., Dadhich, L., Rathod, L.S. (2025). Game-theoretic optimization of national energy strategies using fbprophet forecasting. In 2025 International Conference on Intelligent and Cloud Computing (ICoICC), Bhubaneswar, India, pp. 1-5.

[13] Muhammad, K., Khan, S., Del Ser, J., De Albuquerque, V.H.C. (2020). Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey. IEEE Transactions on Neural Networks and Learning Systems, 32(2): 507-522. https://doi.org/10.1109/TNNLS.2020.2995800

[14] Charitha, M., Hosur, S., Srikantaswamy, M. (2025). Optimized BER reduction in wireless communication using a chaos-based CDSK modulation model. Mathematical Modelling of Engineering Problems, 12(2): 719-729. https://doi.org/10.18280/mmep.120234

[15] Ye, M., Shen, J., Lin, G., Xiang, T., Shao, L., Hoi, S.C. (2021). Deep learning for person re-identification: A survey and outlook. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(6): 2872-2893. https://doi.org/10.1109/TPAMI.2021.3054775

[16] Ara, A., Al-Rodhaan, M., Tian, Y., Al-Dhelaan, A. (2017). A secure privacy-preserving data aggregation scheme based on bilinear ElGamal cryptosystem for remote health monitoring systems. IEEE Access, 5: 12601-12617. https://doi.org/10.1109/ACCESS.2017.2716439

[17] Algarni, F., Khan, M.A., Alawad, W., Halima, N.B. (2023). P3S: Pertinent privacy-preserving scheme for remotely sensed environmental data in smart cities. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16: 5905-5918. https://doi.org/10.1109/JSTARS.2023.3288743

[18] Pushpalatha, V., Mallikarjuna, P.B., Mahendra, H.N., Subramoniam, S.R., Mallikarjunaswamy, S. (2025). Land use and land cover classification for change detection studies using convolutional neural network. Applied Computing and Geosciences, 25: 100227. https://doi.org/10.1016/j.acags.2025.100227

[19] Sukumar, P.G., Krishnaiah, M., Velluri, R., Satish, P., Nagaraju, S., Puttaswamy, N.G., Srikantaswamy, M. (2024). An efficient adaptive reconfigurable routing protocol for optimized data packet distribution in network on chips. International Journal of Electrical & Computer Engineering, 14(1): 305-314. https://doi.org/10.11591/ijece.v14i1.pp305-314

[20] Thazeen, S., Srikantaswamy, M. (2023). An efficient reconfigurable optimal source detection and beam allocation algorithm for signal subspace factorization. International Journal of Electrical and Computer Engineering (IJECE), 13(6): 6452-6465. https://doi.org/10.11591/ijece.v13i6.pp6452-6465

[21] Huang, Y., Cao, L. (2023). Privacy-preserving remote sensing image generation and classification with differentially private GANs. IEEE Sensors Journal, 23(18): 20805-20816. https://doi.org/10.1109/JSEN.2023.3267001

[22] Sathyanarayana, R., Ramaswamy, N.K., Srikantaswamy, M., Ramaswamy, R.K. (2024). An efficient unused integrated circuits detection algorithm for parallel scan architecture. International Journal of Electrical & Computer Engineering, 14(1): 469-478. https://doi.org/10.11591/ijece.v14i1.pp469-478

[23] Onesimu, J.A., Karthikeyan, J., Eunice, J., Pomplun, M., Dang, H. (2022). Privacy preserving attribute-focused anonymization scheme for healthcare data publishing. IEEE Access, 10: 86979-86997. https://doi.org/10.1109/ACCESS.2022.3199433

[24] Kumar, S.M., Velluri, R., Dayananda, P., Nagaraj, S., Srikantaswamy, M., Chandrappa, K.Y. (2023). An efficient detection and prediction of intrusion in smart grids using artificial neural networks. In International Conference on Data Science, Computation and Security, Singapore, pp. 505-515. https://doi.org/10.1007/978-981-97-0975-5_45

[25] Li, J., Yan, H., Zhang, Y. (2020). Identity-based privacy preserving remote data integrity checking for cloud storage. IEEE Systems Journal, 15(1): 577-585. https://doi.org/10.1109/JSYST.2020.2978146

[26] Vidyashree, K.N., Mallikarjunaswamy, S., Sharmila, N. (2025). Fault Tolerance in fog computing with architectures scheduling and optimization for IoT. In 2025 International Conference on Intelligent and Cloud Computing (ICoICC), Bhubaneswar, India, pp. 1-5.

[27] Usha, S.M., Kumar, D.M., Kavitha, M., Pavithra, G.S., et al. (2024). Intelligent fault detection and prediction in smart grids using supervised learning model. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), pp. 1-6. https://doi.org/10.1109/ICRASET63057.2024.10895864

[28] Deepak, B.L., Sabin, T.T., Darshana, A., Vindya, L., et al. (2024). Artificial intelligence-driven power management system for enhanced efficiency in smart grids. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), pp. 1-5. https://doi.org/10.1109/ICRASET63057.2024.10895109

[29] Shankara, K.H., Srikantaswamy, M., Nagaraju, S. (2024). A comprehensive study on DC-DC converter for equal current sharing and voltage stability in renewable energy resources. Journal Européen des Systèmes Automatisés, 57(2): 323-334. https://doi.org/10.18280/jesa.570202

[30] Wang, Y., Liu, Y. (2022). RC2PAS: Revocable certificateless conditional privacy-preserving authentication scheme in WBANs. IEEE Systems Journal, 16(4): 5675-5685. https://doi.org/10.1109/JSYST.2022.3152742

[31] Honnegowda, J., Mallikarjunaiah, K., Srikantaswamy, M. (2024). An efficient abnormal event detection system in video surveillance using deep learning-based reconfigurable autoencoder. Ingénierie des Systèmes d’Information, 29(2): 677-686. https://doi.org/10.18280/isi.290229

[32] Semantha, F.H., Azam, S., Shanmugam, B., Yeo, K.C., Beeravolu, A.R. (2021). A conceptual framework to ensure privacy in patient record management system. IEEE Access, 9: 165667-165689. https://doi.org/10.1109/ACCESS.2021.3134873

[33] Basavaraju, N.M., Mahadevaswamy, U.B., Mallikarjunaswamy, S. (2024). Design and implementation of crop yield prediction and fertilizer utilization using IoT and machine learning in smart agriculture systems. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-6. https://doi.org/10.1109/NMITCON62075.2024.10699184

[34] Bian, G., Zhang, R., Shao, B. (2022). Identity-based privacy preserving remote data integrity checking with a designated verifier. IEEE Access, 10: 40556-40570. https://doi.org/10.1109/ACCESS.2022.3166920

[35] Kavya, B.M., Mallikarjunaswamy, S., Sharmila, N., Shilpa, M., et al. (2024). An efficient machine learning-based power management system for smart grids using renewable energy resources. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-7. https://doi.org/10.1109/NMITCON62075.2024.10698819

[36] Kumar, S.M., Nagaraj, S., Veerabhadraswamy, P., Nanjundaswamy, M.H., Srikantaswamy, M., Chandratta, K.Y. (2023). An enhanced power management and prediction for smart grid using machine learning. In International Conference on Data Science, Computation and Security, Singapore, pp. 269-277. https://doi.org/10.1007/978-981-97-0975-5_24

[37] Ji, S., Gui, Z., Zhou, T., Yan, H., Shen, J. (2018). An efficient and certificateless conditional privacy-preserving authentication scheme for wireless body area networks big data services. IEEE Access, 6: 69603-69611. https://doi.org/10.1109/ACCESS.2018.2880898

[38] Sangeetha, K.N., Punya, H.R., Srujan, S.P., Sunil, P., et al. (2024). Pilot implementation of efficient automation in sericulture farms using Internet of Things (IoT). In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-5. https://doi.org/10.1109/NMITCON62075.2024.10698940

[39] Liu, F., Wang, D., Xu, Z. (2021). Privacy-preserving travel time prediction with uncertainty using GPS trace data. IEEE Transactions on Mobile Computing, 22(1): 417-428. https://doi.org/10.1109/TMC.2021.3074865

[40] Kavitha, H.S., Mallikarjunaswamy, S., Sharmila, N. (2024). An optimized power management system for solar and wind energy using hybrid inverters and machine learning. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-6. https://doi.org/10.1109/NMITCON62075.2024.10698831

[41] Sharanya, U.G., Birabbi, K.M., Sahana, B.H., Kumar, D.M., Sharmila, N., Mallikarjunaswamy, S. (2024). Design and implementation of IoT-based water quality and leakage monitoring system for urban water systems using machine learning algorithms. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-5. https://doi.org/10.1109/NMITCON62075.2024.10698922

[42] Savitha, A.C., DSouza, S., Shanthakumar, S., Pavan, S., et al. (2024). Renewable energy-based smart agriculture systems for climate change prediction and impact mitigation. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-7. https://doi.org/10.1109/NMITCON62075.2024.10698969

[43] Zhu, J., Wu, J., Bashir, A.K., Pan, Q., Yang, W. (2023). Privacy-preserving federated learning of remote sensing image classification with dishonest majority. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16: 4685-4698. https://doi.org/10.1109/JSTARS.2023.3276781

[44] Kumar, N., Shivakumarswamy, P.M., Nikhil, N., Moger, N., et al. (2024). Optimal renewable energy wireless power management system for electric vehicles using predictive analytics. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-6. https://doi.org/10.1109/NMITCON62075.2024.10698816

[45] Hao, Z., Zhong, S., Yu, N. (2011). A privacy-preserving remote data integrity checking protocol with data dynamics and public verifiability. IEEE Transactions on Knowledge and Data Engineering, 23(9): 1432-1437. https://doi.org/10.1109/TKDE.2011.62

[46] Shilpa, M., Ravi, P., Sharmila, N., Mallikarjunaswamy, S., et al. (2024). Enhancing crop yield and growth prediction using IoT-based smart irrigation systems and machine learning algorithms. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-5. https://doi.org/10.1109/NMITCON62075.2024.10698825

[47] Gangadharaswamy, H.B., Kumar, N., Sharmila, N., Shilpa, M., et al. (2024). An efficient and intelligent IoT-based security model for enhanced protection using motion detection and cloud storage optimization. In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), Bengaluru, India, pp. 1-6. https://doi.org/10.1109/NMITCON62075.2024.10699278

[48] Anu, H., Rathnakara, S., Mallikarjunaswamy, S. (2025). Enhanced ECG signal classification with CNN-LSTM networks using aquila optimization. Engineering, Technology & Applied Science Research, 15(3): 23461-23466. https://doi.org/10.48084/etasr.10492

[49] Manjunatha, S., Swetha, M.D., Rashmi, S., Vinoth Kumar, V., Mallikarjuna Swamy, S. (2024). Convolutional neural network-based image tamper detection with error level analysis. In 2024 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bangalore, India, pp. 1-7. https://doi.org/10.1109/IITCEE59897.2024.10467563

[50] Zhang, J., Yang, H. (2023). A privacy-preserving remote heart rate abnormality monitoring system. IEEE Access, 11: 97089-97098. https://doi.org/10.1109/ACCESS.2023.3312549

[51] Wang, T., Zheng, Z., Rehmani, M.H., Yao, S., Huo, Z. (2018). Privacy preservation in big data from the communication perspective—A survey. IEEE Communications Surveys & Tutorials, 21(1): 753-778. https://doi.org/10.1109/COMST.2018.2865107

[52] Xu, J., Xue, K., Li, S., Tian, H., Hong, J., Hong, P., Yu, N. (2019). Healthchain: A blockchain-based privacy preserving scheme for large-scale health data. IEEE Internet of Things Journal, 6(5): 8770-8781. https://doi.org/10.1109/JIOT.2019.2923525

[53] Zheng, Z., Wang, T., Bashir, A.K., Alazab, M., Mumtaz, S., Wang, X. (2021). A decentralized mechanism based on differential privacy for privacy-preserving computation in smart grid. IEEE Transactions on Computers, 71(11): 2915-2926. https://doi.org/10.1109/TC.2021.3130402

[54] Dayananda, P., Srikantaswamy, M., Nagaraju, S., Nanjundaswamy, M.H. (2024). A machine learning-based smart grid protection and control framework using Kalman filters for enhanced power management. Journal Européen des Systèmes Automatisés, 57(3): 639-651. https://doi.org/10.18280/jesa.570301

[55] Upadhyay, D., Gaikwad, N., Zaman, M., Sampalli, S. (2022). Investigating the avalanche effect of various cryptographically secure hash functions and hash-based applications. IEEE Access, 10: 112472-112486. https://doi.org/10.1109/ACCESS.2022.3215778

[56] Almahmoud, A., Damiani, E., Otrok, H. (2022). Hash-comb: A hierarchical distance-preserving multi-hash data representation for collaborative analytics. IEEE Access, 10: 34393-34403. https://doi.org/10.1109/ACCESS.2022.3158934

[57] Sheela, S., Jyothi, S., Latha, A.P., Ganesh, H.J., Komala, M., et al. (2024). Automated land cover classification in urban environments with deep learning-based semantic segmentation. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, Mandya, India, pp. 1-7. https://doi.org/10.1109/ICRASET63057.2024.10895689

[58] Lee, W.K., Jang, K., Song, G., Kim, H., Hwang, S.O., Seo, H. (2022). Efficient implementation of lightweight hash functions on GPU and quantum computers for IoT applications. IEEE Access, 10: 59661-59674. https://doi.org/10.1109/ACCESS.2022.3179970

[59] Rekha, V., Sharmila, N., Komala, M., Pavithra, G.S., Mallikarjunaswamy, S., et al. (2024). Hybrid edge-cloud approach for renewable energy management using deep learning with predictive analytics. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, Mandya, India, pp. 1-7. https://doi.org/10.1109/ICRASET63057.2024.10895726

[60] Bhattacharyya, R., Nandi, M., Raychaudhuri, A. (2023). Subversion resilient hashing: Efficient constructions and modular proofs for crooked indifferentiability. IEEE Transactions on Information Theory, 69(5): 3302-3315. https://doi.org/10.1109/TIT.2023.3238115

[61] Alimour, S.A., Alrabeei, M. (2024). A novel model for digital twins in mental health: The biopsychosocial AI-driven digital twin (BADT) framework. In 2024 11th International Conference on Software Defined Systems (SDS), Gran Canaria, Spain, pp. 6-10. https://doi.org/10.1109/SDS64317.2024.10883917

[62] Patias, I., Miteva, D., Peltekova, E., Wright, M., Gasteiger-Klicpera, B. (2024). Leveraging large language models to enhance mental health literacy and diversity awareness in adolescents: The me_HeLi-D project. In 2024 8th International Symposium on Innovative Approaches in Smart Technologies (ISAS), İstanbul, Turkiye, pp. 1-5. https://doi.org/10.1109/ISAS64331.2024.10845582

[63] Vuyyuru, A., Praveena, T.L., Sharma, A., Yelagandula, M., Nelli, S. (2024). Mental health therapist chatbot using NLP. In 2024 International Conference on Artificial Intelligence and Quantum Computation-Based Sensor Application (ICAIQSA), Nagpur, India, pp. 1-6. https://doi.org/10.1109/ICAIQSA64000.2024.10882362

[64] Lee, K.Y. (2024). Medical healthcare digital twin reference platform. In 2024 Fifteenth International Conference on Ubiquitous and Future Networks (ICUFN), Budapest, Hungary, pp. 597-599. https://doi.org/10.1109/ICUFN61752.2024.10624966

[65] Sheela, S., Latha, A.P., Jyothi, S., Vidyarani, H.J., et al. (2024). Enhancing stockpile management through deep learning with a focus on demand forecasting and inventory optimization. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, Mandya, India, pp. 1-6. https://doi.org/10.1109/ICRASET63057.2024.10895608

[66] Venugopal, D., Kanna, A.R., Sasiramakrishnan, R., Umayal, S. (2024). Wearable stress monitoring band using context aware sensor fusion technology. In 2024 3rd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, pp. 1678-1684. https://doi.org/10.1109/ICAAIC60222.2024.10575629

[67] Kavitha, H.S., Anu, H., Komala, M., Pavithra, G.S., et al. (2024). Smart grid solar tracking optimization using deep reinforcement learning algorithm. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, Mandya, India, pp. 1-6. https://doi.org/10.1109/ICRASET63057.2024.10895647

[68] Jyothi, H., Komala, M., Mallikarjunaswamy, S. (2024). Advancing event recognition in videos using hybrid deep learning model. In 2024 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, Mandya, India, pp. 1-6. https://doi.org/10.1109/ICRASET63057.2024.10895392

[69] Mahendra, H.N., Pushpalatha, V., Mallikarjunaswamy, S., Subramoniam, S.R., Rao, A.S., Sharmila, N. (2024). LULC change detection analysis of Chamarajanagar district, Karnataka state, India using CNN-based deep learning method. Advances in Space Research, 74(12): 6384-6408. https://doi.org/10.1016/j.asr.2024.07.066

[70] Oguntimilehin, A., Saka, A.F., Toyin, O., Obamiyi, S.E., et al. (2024). Mental health chatbot using deep learning and natural language processing. In 2024 IEEE 5th International Conference on Electro-Computing Technologies for Humanity (NIGERCON), Ado Ekiti, Nigeria, pp. 1-5. https://doi.org/10.1109/NIGERCON62786.2024.10927008

[71] Raut, N., Kalambe, B., Salvi, A., Kothari, H., Pulgam, N. (2024). Soulmind: Mental health chatbot with anxiety prediction. In 2024 2nd International Conference on Emerging Trends in Engineering and Medical Sciences (ICETEMS), Nagpur, India, pp. 75-81. https://doi.org/10.1109/ICETEMS64039.2024.10965104

[72] Mahendra, H.N., Mallikarjunaswamy, S., Subramoniam, S.R. (2023). An assessment of vegetation cover of Mysuru City, Karnataka State, India, using deep convolutional neural networks. Environmental Monitoring and Assessment, 195(4): 526. https://doi.org/10.1007/s10661-023-11140-w

[73] Sheela, S., Naveen, K.B., Basavaraju, N.M., Kumar, D.M., Krishnaiah, M., Mallikarjunaswamy, S. (2023). An efficient vehicle to vehicle communication system using intelligent transportation system. In 2023 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, India, pp. 1-6. https://doi.org/10.1109/ICRASET59632.2023.10420043

[74] Kadao, A.K., Balkrishna, S.M. (2025). Using CHATBOTS powered by GPT-3 to increase staff engagement. In 2025 International Conference on Intelligent Control, Computing and Communications (IC3), Mathura, India, pp. 999-1005. https://doi.org/10.1109/IC363308.2025.10957238

[75] Kavya, B.M., Sharmila, N., Naveen, K.B., Mallikarjunaswamy, S., Manu, K.S., Manjunatha, S. (2023). A machine learning based smart grid for home power management using cloud-edge computing system. In 2023 International Conference on Recent Advances in Science and Engineering Technology (ICRASET), B G Nagara, India, pp. 1-6. https://doi.org/10.1109/ICRASET59632.2023.10419952

[76] Hakani, R., Patil, S., Patil, S., Jhunjhunwala, S., Deulkar, K. (2022). Revivify: A depression detection and control system using tweets and automated chatbot. In 2022 IEEE World Conference on Applied Intelligence and Computing (AIC), pp. 796-801. https://doi.org/10.1109/AIC55036.2022.9848978

[77] Klaas, S., Wijendra, R.H., Imthiaz, M.I., Wijayarathna, P.P., Karunasena, A., Kmlp, W. (2023). Mental illness aiding through machine learning and soft-computing. In 2023 5th International Conference on Advancements in Computing (ICAC), Colombo, Sri Lanka, pp. 862-867. https://doi.org/10.1109/ICAC60630.2023.10417189

[78] Umashankar, M.L., Mallikarjunaswamy, S., Sharmila, N., Kumar, D.M., Nataraj, K.R. (2023). A survey on IoT protocol in real-time applications and its architectures. In ICDSMLA 2021: Proceedings of the 3rd International Conference on Data Science, Machine Learning and Applications, Singapore, pp. 119-130. https://doi.org/10.1007/978-981-19-5936-3_12

[79] Pooja, S., Mallikarjunaswamy, S., Sharmila, N. (2023). Image region driven prior selection for image deblurring. Multimedia Tools and Applications, 82(16): 24181-24202. https://doi.org/10.1007/s11042-023-14335-y

[80] Tan, S.B., Kumar, K.S., Truong, A.T.L., Tan, L.W.J., et al. (2023). Comparing the performance of multiple small-data personalized tacrolimus dosing models for pediatric liver transplant: A retrospective study. In 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Sydney, Australia, pp. 1-4. https://doi.org/10.1109/EMBC40787.2023.10341002

[81] Jayawardena, A., Kahandawa, G., Hewawasam, H., Piyathilaka, L., Sul, J. (2024). Navigating the new normal: Student perspectives on transitioning from online to face-to-face learning after COVID-19 lockdowns. In 2024 IEEE World Engineering Education Conference (EDUNINE), Guatemala City, Guatemala, pp. 1-6. https://doi.org/10.1109/EDUNINE60625.2024.10500610

[82] Venkatesh, D.Y., Mallikarjunaiah, K., Srikantaswamy, M. (2023). A high-throughput reconfigurable LDPC codec for wide band digital communications. Journal Européen des Systèmes Automatisés, 56(4): 529-538. https://doi.org/10.18280/jesa.560402

[83] Jiaen, M., Xiaodi, C. (2024). The logic, framework, and path of artificial intelligence applied to personalized STEM instruction. In 2024 4th International Conference on Educational Technology (ICET), Wuhan, China, pp. 423-427. https://doi.org/10.1109/ICET62460.2024.10869017