A Reliable and Efficient Approach to Suicidal Ideation Detection in a Low Resource Language

Jahangir, Hussen

dc.contributor.author	Jahangir, Hussen
dc.date.accessioned	2026-03-31T04:00:44Z
dc.date.available	2026-03-31T04:00:44Z
dc.date.issued	2025-01-12
dc.identifier.uri	http://suspace.su.edu.bd/handle/123456789/2618
dc.description.abstract	Suicide is a persistent and devastating global public health problem, and it calls for scalable, proactive early detection methods that go beyond conventional clinical frameworks. Despite remarkable computational progress in high-resource languages such as English, the Bangla (Bengali) speaking population, estimated at 250 to 290 million people worldwide, remains severely underrepresented. The gap stems from data scarcity, limited linguistic resources, and Bangla's rich morphology, which hinders standard Natural Language Processing (NLP) methods. This research addresses the gap by developing, evaluating, and rigorously validating an accurate, efficient, and operationally robust Bangla suicide risk classification system that works on user-generated digital text and is applicable in low-resource healthcare environments. Using a high-quality, clinically annotated corpus, the study shows empirically that character n-gram TF-IDF vectorization is the most effective feature engineering method, outperforming word-level embeddings because it copes better with data sparsity. Extensive benchmarking across thirteen Machine Learning (ML) and Deep Learning (DL) models exposes a critical Deployment Paradox: a trade-off between predictive performance and computational cost. The Bi-directional Long Short-Term Memory (BiLSTM) model achieved the best safety performance (Recall: 0.9280, 92 False Negatives), but at a latency of 5.23 seconds, which makes it impractical for real-time triage. By contrast, the lightweight RidgeClassifier (RC) with the same feature representation obtained a comparable Recall of 0.9170 (106 False Negatives) with near-zero latency (0.001 seconds), making it the Optimal Deployable Triage System for large-scale real-time intervention. 
This work highlights that interpretable and computationally efficient ML models can outperform state-of-the-art DL architectures in real-world deployment scenarios. In addition, it promotes ethical deployment through interpretable feature weights and Dynamic Threshold Tuning (Human-in-the-Loop), which lets operators adjust system sensitivity as available resources change, with the aim of providing a sustainable, safe, and effective suicide prevention tool for Bangla-speaking populations worldwide.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Sonargaon University	en_US
dc.relation.ispartofseries	;CSE-250273
dc.subject	A Reliable and Efficient Approach to Suicidal Ideation Detection in a Low Resource Language	en_US
dc.title	A Reliable and Efficient Approach to Suicidal Ideation Detection in a Low Resource Language	en_US
dc.type	Thesis	en_US
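
The abstract names two techniques that a short example may make more concrete: character n-gram TF-IDF features and threshold tuning aimed at a target recall. The following is a minimal standard-library sketch, not the thesis implementation. The n-gram range (2 to 4), the smoothed IDF formula, and the function names char_ngrams, tfidf_vectors, recall_at, and tune_threshold are all assumptions made for illustration; the record does not state the settings actually used.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Return every character n-gram of length n_min..n_max.

    Python strings index by Unicode code point, so Bangla text is split
    per code point. The 2..4 range is an assumed, illustrative default.
    """
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n_min=2, n_max=4):
    """Build sparse, L2-normalised TF-IDF dicts, one per document.

    Uses a smoothed IDF, ln((1 + N) / (1 + df)) + 1, chosen here as an
    assumption; other weighting schemes are equally plausible.
    """
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    df = Counter(g for c in counts for g in c)  # document frequency per n-gram
    n_docs = len(docs)
    vectors = []
    for c in counts:
        vec = {g: tf * (math.log((1 + n_docs) / (1 + df[g])) + 1.0)
               for g, tf in c.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({g: v / norm for g, v in vec.items()})
    return vectors


def recall_at(scores, labels, threshold):
    """Recall = TP / (TP + FN) when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn) if tp + fn else 0.0


def tune_threshold(scores, labels, target_recall):
    """Return the highest threshold whose recall meets target_recall.

    Lowering the threshold can only keep or raise recall, so the first
    candidate (scanning from high to low) that meets the target is the
    strictest one that does.
    """
    for t in sorted(set(scores), reverse=True):
        if recall_at(scores, labels, t) >= target_recall:
            return t
    return min(scores)  # reached only if no positive labels exist
```

In a human-in-the-loop setting, a reviewer team with spare capacity could request a higher target_recall (lower threshold, more flagged posts), and a constrained team a lower one. The scores would come from whatever classifier is deployed, for example the decision values of a linear model trained on these features.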

Files in this item

Name: CSE- 250273.pdf
Size: 1.932Mb
Format: PDF


