    SUSpace Home / Faculty of Science and Engineering / Department of Electrical and Electronics Engineering (EEE) / 2021 - 2025 / View Item

    Speech Emotion Recognition Using Machine Learning Technique.

    View/Open
    EEE-200189.pdf (6.141Mb)
    Date
    2020-02-06
    Author
    Haque, Md. Enamul
    Islam, Md. Shakirul
    Ahmed, Kawser
    Abstract
    Speech emotion recognition is a challenging problem, partly because it is not clear which features are effective for the task. In this thesis, we present a comparative study of speech emotion recognition (SER) systems. Theoretical definitions, the categorization of affective states, and the modalities of emotion expression are presented. To carry out this study, we performed the pre-processing necessary for emotion recognition from speech data in an SER system based on Multi-Layer Perceptron (MLP) classifiers, generating training and testing datasets that contain the emotions Neutral, Calm, Happy, Sad, Angry, Fearful, Disgust and Surprised. The MLP classifiers are then used in the classification stage to predict the emotion. Mel-Frequency Cepstral Coefficient (MFCC), Chroma, and Mel features are extracted from the speech signals and used to train the MLP classifiers with the stochastic L-BFGS algorithm. The Bangla and RAVDESS databases are used as the experimental datasets. This study shows that the classifiers achieve an accuracy of 53.89% on the RAVDESS database and 45.83% on the Bangla database when speaker normalization (SN) and feature selection are applied to the features. The demand for machines that can interact with their users through speech is growing. For example, four of the world's largest IT companies (Amazon, Apple, Google and Microsoft) are developing intelligent personal assistants that are able to communicate through speech. In this thesis, we have investigated the effect of feature extraction when classifying emotions in speech using an Artificial Neural Network (ANN). We used "kernels" on Kaggle to extract sets of features from recorded audio and compared the MLP classification accuracy of these sets across eight classes of emotions. We used a single ANN architecture so that each feature set could be compared fairly.
The ANN architecture was developed through an experimental approach, and Python was used for the implementation. In recent years, work requiring human-machine interaction, such as speech recognition and emotion recognition from speech, has been increasing. Beyond speech recognition itself, features of the conversation, such as melody, emotion and chunking, are also studied. Research has shown that meaningful results can be reached using prosodic features of speech. In addition, a Confusion Matrix (CM) technique is used to evaluate the performance of these classifiers. The proposed system is tested on the RAVDESS and Bangla databases and has achieved a prediction rate of 70.89%.
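    The pipeline the abstract describes (per-utterance feature vectors fed to an MLP trained with L-BFGS, evaluated with a confusion matrix) can be sketched as follows. This is a minimal illustration, not the thesis code: the MFCC/Chroma/Mel extraction step is replaced here by random feature vectors of a plausible size, and scikit-learn's MLPClassifier is assumed as the MLP implementation, so the numbers it prints are illustrative only.

    ```python
    # Sketch of the SER pipeline: feature vectors -> MLP (L-BFGS) -> confusion matrix.
    # The real MFCC/Chroma/Mel extraction is stubbed out with random data.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import confusion_matrix

    EMOTIONS = ["neutral", "calm", "happy", "sad",
                "angry", "fearful", "disgust", "surprised"]

    rng = np.random.default_rng(0)
    n_utterances, n_features = 200, 180   # e.g. 40 MFCC + 12 chroma + 128 mel bands
    X = rng.normal(size=(n_utterances, n_features))
    y = rng.integers(0, len(EMOTIONS), size=n_utterances)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # One fixed MLP architecture, trained with the L-BFGS solver.
    clf = MLPClassifier(hidden_layer_sizes=(300,), solver="lbfgs",
                        max_iter=500, random_state=0)
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)

    # Confusion matrix over all eight emotion classes.
    cm = confusion_matrix(y_test, pred, labels=list(range(len(EMOTIONS))))
    print(f"test accuracy: {(pred == y_test).mean():.2%}")
    ```

    With real features, the random data above would be replaced by per-utterance vectors pooled from the extracted MFCC, Chroma and Mel frames; the rest of the pipeline is unchanged.
    
    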
    URI
    http://suspace.su.edu.bd/handle/123456789/1132
    Collections
    • 2021 - 2025 [152]

    Copyright © 2022-2025 Library Home | Sonargaon University
    Contact Us | Send Feedback