Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Year of publication: 2020
Volume: 11
Issue: 4
DOI: 10.14569/IJACSA.2020.01104104
Publisher: Science and Information Society (SAI)
Abstract: Speech emotion recognition is one of the most active areas of research in affective computing and social signal processing. However, most research is directed towards a select group of languages such as English, German, and French, mainly because datasets are lacking for other languages. Languages for which publicly available datasets are scarce are called low-resource languages. In the recent past, there has been a concerted effort within the research community to create and introduce emotion recognition datasets for low-resource languages. To this end, we introduce in this paper the Urdu-Sindhi Speech Emotion Corpus, a novel dataset consisting of 1,435 speech recordings for two widely spoken languages of South Asia, namely Urdu and Sindhi. Furthermore, we trained machine learning models to establish a baseline for classification performance, measured in terms of unweighted average recall (UAR). We report that the best performing model for the Urdu language achieved a UAR of 65.00% on the validation partition and a UAR of 56.96% on the test partition. Meanwhile, the model for the Sindhi language achieved UARs of 66.50% and 55.29% on the validation and test partitions, respectively. This classification performance is considerably better than the chance-level UAR of 16.67%. The dataset can be accessed via https://zenodo.org/record/3685274.
Keywords: Speech emotion recognition; affective computing; social signal processing
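
Note on the metric: the abstract reports unweighted average recall (UAR), which is the mean of the per-class recalls with every class counted equally, regardless of how many samples it has. The minimal sketch below illustrates that definition only; it is not the authors' evaluation code. The chance level of 16.67% equals 1/6, which suggests six emotion classes, but that is an inference from the abstract, and the class names and sample counts used here are hypothetical.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls, weighting every class equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical six-class label set (assumption: not taken from the record).
labels = ["anger", "happiness", "sadness", "neutral", "fear", "disgust"]
y_true = [label for label in labels for _ in range(10)]

# A classifier that always predicts one class gets recall 1 on that class
# and 0 on the other five, so its UAR is 1/6.
y_constant = ["anger"] * len(y_true)
chance_uar = unweighted_average_recall(y_true, y_constant)
print(f"constant-prediction UAR: {chance_uar:.4f}")
```

Because each class contributes equally, a model cannot raise its UAR by favoring a majority class, which is why UAR is commonly preferred over plain accuracy when class sizes differ.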