dc.description.abstract | Emotion recognition has garnered significant attention in fields such as mental health, human-computer interaction, and personalized services. This research explores a multimodal approach to emotion recognition that integrates facial expression analysis and speech prosody to achieve a more accurate and context-sensitive understanding of human emotions. A distinctive aspect of this study is the creation of a custom video dataset designed specifically for facial expression recognition, capturing a wide range of emotional states under varied real-world conditions. In parallel, speech emotion detection is performed on publicly available audio datasets, analyzing features such as pitch, tone, and rhythm to discern vocally expressed emotions.
Facial expression recognition is based on Convolutional Neural Networks (CNNs), which extract visual features from the video data, while emotional cues in speech are analyzed using Long Short-Term Memory (LSTM) networks. By combining these modalities, the research addresses limitations commonly faced by unimodal systems, such as degraded performance in noisy environments or when faces are occluded.
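For illustration only, the following is a minimal late-fusion sketch of a CNN + LSTM pipeline of the kind described above, written in Python with Keras. The input shapes, audio feature choice, and number of emotion classes are assumptions made for the example and are not taken from the thesis.

# Hypothetical sketch of a late-fusion CNN + LSTM emotion classifier.
# Shapes and class count below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 7  # assumed set of emotion categories

# Visual branch: CNN over a single face frame (48x48 grayscale assumed).
face_in = layers.Input(shape=(48, 48, 1), name="face_frame")
x = layers.Conv2D(32, 3, activation="relu")(face_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
face_feat = layers.Dense(128, activation="relu")(x)

# Audio branch: LSTM over a sequence of prosodic/spectral feature vectors
# (100 time steps x 40 features assumed, e.g. MFCC-style frames).
audio_in = layers.Input(shape=(100, 40), name="speech_features")
audio_feat = layers.LSTM(128)(audio_in)

# Late fusion: concatenate the modality embeddings and classify.
fused = layers.Concatenate()([face_feat, audio_feat])
fused = layers.Dense(64, activation="relu")(fused)
out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = Model(inputs=[face_in, audio_in], outputs=out)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])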
The findings demonstrate that integrating facial and auditory data significantly improves emotion classification accuracy, particularly in real-time applications. This research advances the field of affective computing by highlighting the complementary strengths of visual and auditory emotion cues, with practical implications for customer service, virtual assistants, and mental health diagnostics. | en_US |