Evaluating Prompt Engineering Techniques for Low Resource Language NLP Tasks: A Case Study on Bangla Emotion Recognition

Al Amin

dc.contributor.author	Al Amin
dc.date.accessioned	2026-03-30T05:41:46Z
dc.date.available	2026-03-30T05:41:46Z
dc.date.issued	2025-02-12
dc.identifier.uri	http://suspace.su.edu.bd/handle/123456789/2612
dc.description.abstract	Emotion detection in low-resource languages remains an underexplored area in natural language processing (NLP). This study develops a unified framework for detecting emotions in Bangla, Banglish (code-mixed Bangla-English), English, and multilingual texts using a diverse set of language models, including Bangla-specialized transformers, code-mixed models, and instruction-tuned multilingual LLMs. By integrating the DU-BEC and BTEd datasets with lexicon-guided augmentation from EmoLex-BN, the framework provides robust supervision across six canonical emotions. A modular pipeline automates preprocessing, synthetic augmentation, and model-agnostic training, enabling systematic comparison across monolingual, code-mixed, and multilingual settings. Experimental results demonstrate that a traditional machine learning approach (TF-IDF + Logistic Regression) achieves the best performance, with a macro F1-score of 0.357, significantly outperforming fine-tuned transformers (0.088–0.106 F1) and direct LLM prompting (0.000 F1). A novel translation-based LLM approach achieved an F1-score of 0.232, representing the first successful zero-shot emotion classification for Bangla without labeled training data. Bangla-native transformers excel in supervised in-domain tasks, code-mixed models outperform in Banglish contexts, and multilingual LLMs achieve strong zero-shot cross-lingual generalization when combined with translation pipelines. This work establishes the first comprehensive benchmark for emotion detection across Bangla, Banglish, and multilingual texts, providing reproducible pipelines, datasets, and evaluation metrics that advance low-resource and cross-lingual affective computing. The findings demonstrate that increased model complexity does not guarantee better performance under severe data constraints, and that simple, well-designed supervised methods remain highly effective for low-resource language NLP.
Keywords: Bangla NLP, Emotion Detection, Code-Mixed Language, Low-Resource Language, Large Language Models, Prompt Engineering, Cross-Lingual Transfer, Lexicon-Based Augmentation, DU-BEC, BTEd, EmoLex-BN, Multi-label Classification, Traditional Machine Learning.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Sonargaon University	en_US
dc.relation.ispartofseries	;CSE-250266
dc.subject	Evaluating Prompt Engineering Techniques for Low Resource Language NLP Tasks: A Case Study on Bangla Emotion Recognition	en_US
dc.title	Evaluating Prompt Engineering Techniques for Low Resource Language NLP Tasks: A Case Study on Bangla Emotion Recognition	en_US
dc.type	Thesis	en_US

Files in this item

Name: CSE- 250266.pdf
Size: 1.821Mb
Format: PDF

This item appears in the following Collection(s)

2021 - 2025 [184]