Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Publication year: 2018
Volume: 18
Issue: 1
Pages: 141-148
Publisher: International Journal of Computer Science and Network Security
Abstract: This work is a precursor to a larger task that requires discourse-based sentiment analysis on Roman Urdu datasets. To perform this task, we first needed to collect a large Roman Urdu corpus from social media websites. Next, we cleaned the raw data, lexically normalized it to obtain a standard representation of words, performed POS tagging so that the words could be tokenized meaningfully, and finally identified the presence or absence of a discourse element. Having completed these tasks, we are now ready to perform neural-network-based sentiment analysis on the Roman Urdu dataset, taking discourse into consideration, as future work.
Keywords: Natural Language Processing; POS Tagging; Discourse units; Roman Urdu Data