Article Information

  • Title: Performing Natural Language Processing on Roman Urdu Datasets
  • Authors: Zareen Sharf; Dr. Saif Ur Rahman
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Publication year: 2018
  • Volume: 18
  • Issue: 1
  • Pages: 141-148
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: This work is a precursor to a larger task: discourse-based sentiment analysis on Roman Urdu datasets. To perform this task, we first needed to collect a large Roman Urdu corpus from social media websites. Next, we cleaned the raw data, lexically normalized it so that words have a standard representation, performed POS tagging so that the words could be tokenized meaningfully, and finally identified the presence or absence of a discourse element. Having completed these tasks, we are now ready to perform neural-network-based sentiment analysis on the Roman Urdu dataset, taking discourse into consideration, as future work.
  • Keywords: Natural Language Processing; POS Tagging; Discourse units; Roman Urdu Data
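
The preprocessing steps named in the abstract (cleaning, lexical normalization, and detecting a discourse element) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the normalization lexicon, the discourse-marker list, and all function names are hypothetical placeholders chosen for illustration, and POS tagging is omitted because the record does not say which tagger was used.

```python
import re

# Hypothetical spelling-variant map; a real lexicon would be far larger.
NORMALIZATION_LEXICON = {"nahin": "nahi", "nai": "nahi", "achha": "acha"}

# Hypothetical discourse markers (roughly "but", "but", "because").
DISCOURSE_MARKERS = {"lekin", "magar", "kyunke"}


def clean(text):
    """Lowercase, strip URLs, mentions, hashtags, and non-letters, then split."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)       # drop mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # keep ASCII letters and whitespace
    return text.split()


def normalize(tokens):
    """Map each known spelling variant to its assumed canonical form."""
    return [NORMALIZATION_LEXICON.get(tok, tok) for tok in tokens]


def has_discourse_marker(tokens):
    """Report whether any token is one of the assumed discourse markers."""
    return any(tok in DISCOURSE_MARKERS for tok in tokens)


tokens = normalize(clean("Movie achha tha lekin ending nahin!! http://example.com"))
print(tokens, has_discourse_marker(tokens))
```

A dictionary lookup is the simplest possible normalizer; it only handles variants that have been listed in advance, so unseen spellings pass through unchanged.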