eprintid: 26964
rev_number: 8
eprint_status: archive
userid: 2
dir: disk0/00/02/69/64
datestamp: 2026-01-29 23:30:10
lastmod: 2026-01-29 23:30:12
status_changed: 2026-01-29 23:30:10
type: article
metadata_visibility: show
creators_name: KINA, Erol
creators_name: Choi, Jin-Ghoo
creators_name: Ishaq, Abid
creators_name: Shafique, Rahman
creators_name: Gracia Villar, Mónica
creators_name: Silva Alvarado, Eduardo René
creators_name: Diez, Isabel de la Torre
creators_name: Ashraf, Imran
creators_id:
creators_id:
creators_id:
creators_id:
creators_id: monica.gracia@uneatlantico.es
creators_id: eduardo.silva@funiber.org
creators_id:
creators_id:
title: Suicide Ideation Detection Using Social Media Data and Ensemble Machine Learning Model
ispublished: pub
subjects: uneat_eng
divisions: uneatlantico_produccion_cientifica
divisions: uninimx_produccion_cientifica
divisions: uninipr_produccion_cientifica
divisions: unic_produccion_cientifica
divisions: uniromana_produccion_cientifica
full_text_status: public
keywords: Suicide ideation; machine learning; feature extraction; ensemble learning; feature fusion
abstract: Identifying the emotional state of individuals has useful applications, particularly to reduce the risk of suicide. Users’ thoughts on social media platforms can be used to find cues on the emotional state of individuals. Clinical approaches to suicide ideation detection primarily rely on evaluation by psychologists, medical experts, etc., which is time-consuming and requires medical expertise. Machine learning approaches have shown potential in automating suicide detection. In this regard, this study presents a soft voting ensemble model (SVEM) by leveraging random forest, logistic regression, and stochastic gradient descent classifiers using soft voting. In addition, for the robust training of SVEM, a hybrid feature engineering approach is proposed that combines term frequency-inverse document frequency and the bag of words.
For experimental evaluation, “Suicide Watch” and “Depression” subreddits on the Reddit platform are used. Results indicate that the proposed SVEM model achieves an accuracy of 94%, better than existing approaches. The model also shows robust performance concerning precision, recall, and F1, each with a 0.93 score. ERT and deep learning models are also used, and performance comparison with these models indicates better performance of the SVEM model. Gated recurrent unit, long short-term memory, and recurrent neural network have an accuracy of 92% while the convolutional neural network obtains an accuracy of 91%. SVEM’s computational complexity is also low compared to deep learning models. Further, this study highlights the importance of explainability in healthcare applications such as suicidal ideation detection, where the use of LIME provides valuable insights into the contribution of different features. In addition, k-fold cross-validation further validates the performance of the proposed approach.
date: 2026-01
publication: International Journal of Computational Intelligence Systems
id_number: doi:10.1007/S44196-025-01123-9
refereed: TRUE
issn: 1875-6883
official_url: http://doi.org/10.1007/S44196-025-01123-9
access: open
language: en
citation: Article. Subjects > Engineering. Universidad Europea del Atlántico > Research > Scientific Production. Universidad Internacional Iberoamericana México > Research > Scientific Production. Universidad Internacional Iberoamericana Puerto Rico > Research > Articles and books. Universidad Internacional do Cuanza > Research > Scientific Production. Universidad de La Romana > Research > Scientific Production. Open. English.
metadata KINA, Erol; Choi, Jin-Ghoo; Ishaq, Abid; Shafique, Rahman; Gracia Villar, Mónica; Silva Alvarado, Eduardo René; Diez, Isabel de la Torre and Ashraf, Imran mail UNSPECIFIED, UNSPECIFIED, UNSPECIFIED, UNSPECIFIED, monica.gracia@uneatlantico.es, eduardo.silva@funiber.org, UNSPECIFIED, UNSPECIFIED (2026) Suicide Ideation Detection Using Social Media Data and Ensemble Machine Learning Model. International Journal of Computational Intelligence Systems. ISSN 1875-6883
document_url: http://repositorio.unib.org/id/eprint/26964/1/s44196-025-01123-9_reference.pdf
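
The abstract names two techniques, feature fusion of TF-IDF with bag of words and soft voting over three classifiers. The Python sketch below illustrates both in isolation. It is not the authors' implementation: the record does not give their preprocessing, hyperparameters, or voting weights, so the smoothed IDF formula, the equal model weights, the toy documents, and all function names (`fit_idf`, `fuse_features`, `soft_vote`) are assumptions made for illustration. The per-model probabilities are hypothetical placeholders rather than outputs of trained random forest, logistic regression, or SGD models.

```python
import math
from collections import Counter


def fit_idf(docs, vocab):
    """Smoothed inverse document frequency per term (an assumed variant)."""
    n_docs = len(docs)
    idf = {}
    for term in vocab:
        df = sum(1 for doc in docs if term in doc)  # documents containing term
        idf[term] = math.log((1 + n_docs) / (1 + df)) + 1.0
    return idf


def bow_vector(tokens, vocab):
    """Bag of words: raw term counts in vocabulary order."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocab]


def tfidf_vector(tokens, vocab, idf):
    """TF-IDF: relative term frequency scaled by IDF."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [(counts[term] / total) * idf[term] for term in vocab]


def fuse_features(tokens, vocab, idf):
    """Feature fusion by concatenating the TF-IDF and bag-of-words vectors."""
    return tfidf_vector(tokens, vocab, idf) + bow_vector(tokens, vocab)


def soft_vote(prob_rows):
    """Average class probabilities across models (equal weights assumed)
    and return the winning class index with the averaged probabilities."""
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    avg = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg


# Toy tokenised documents (hypothetical, not from the Reddit dataset).
docs = [
    ["sleep", "badly", "tired"],
    ["sleep", "well", "rested", "today"],
]
vocab = sorted({term for doc in docs for term in doc})
idf = fit_idf(docs, vocab)
features = fuse_features(docs[0], vocab, idf)

# Hypothetical class probabilities from three models for one post.
# Two of the three models lean towards class 1, yet the averaged
# probability favours class 0, which is where soft and hard voting differ.
label, avg = soft_vote([[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]])
```

With a library such as scikit-learn, the same idea is usually expressed with `VotingClassifier(voting="soft")`, which requires every base estimator to expose class probabilities.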