
Article information

  • Title: A Conceptual Data Modelling Framework for Context-Aware Text Classification
  • Authors: Nazia Tazeen; K. Sandhya Rani
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Year of publication: 2020
  • Volume: 11
  • Issue: 11
  • DOI: 10.14569/IJACSA.2020.0111116
  • Publisher: Science and Information Society (SAI)
  • Abstract: Data analytics has an interesting variant, termed diagnostic analytics, that aims to understand an entity's behavior by answering “why-type questions”. Such questions find applications in emotion classification, brand analysis, drug review modeling, customer complaint classification, etc. Labeled data form the core of any analytics problem, let alone diagnostic analytics; however, labeled data are not always available. In some cases, it is necessary to assign labels to unknown entities and understand their behavior. For such scenarios, the proposed model unites topic modeling and text classification techniques. This combined data model helps to solve diagnostic issues and obtain meaningful insights from data by treating the procedure as a classification problem. The proposed model uses Improved Latent Dirichlet Allocation for topic modeling and sentiment analysis to understand an entity's behavior, and represents it as an Improved Multinomial Naïve Bayesian data model to achieve automated classification. The model is tested on a drug review dataset obtained from the UCI repository. The health conditions and their associated drug names were extracted from the reviews, and sentiment scores were assigned. The sentiment scores reflected the behavior of various drugs for a particular health condition and were used to classify the drugs according to their quality. The proposed model's performance is compared with existing baseline models, and the reported results indicate that it performs better than the other models.
  • Keywords: Text classification; topic modeling; natural language processing; sentiment analysis; drug dataset; context-aware model; diagnostic analytics; feature extraction
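
The abstract describes assigning labels to unlabeled text from sentiment scores and then training a Multinomial Naïve Bayes classifier on those labels. The following is a minimal sketch of that general idea only, not the authors' method: it uses a standard Multinomial Naïve Bayes with Laplace smoothing rather than the "Improved" variant, omits the topic modeling step entirely, and relies on a hypothetical toy lexicon and made-up reviews. All names (POSITIVE, NEGATIVE, sentiment_label, the MultinomialNB class) are assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the paper's actual sentiment scoring is not described here.
POSITIVE = {"effective", "relief", "great", "helped"}
NEGATIVE = {"nausea", "worse", "useless", "pain"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def sentiment_label(text):
    """Derive a weak label from lexicon hits: 'good' if score >= 0, else 'poor'."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "good" if score >= 0 else "poor"


class MultinomialNB:
    """Standard Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(doc):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                logp += math.log((count + self.alpha) / (total_words + self.alpha * vocab_size))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Made-up reviews standing in for unlabeled drug reviews.
reviews = [
    "very effective and gave quick relief",
    "helped my headache great drug",
    "caused nausea and felt worse",
    "useless still in pain",
]
labels = [sentiment_label(r) for r in reviews]  # weak labels from sentiment
model = MultinomialNB().fit(reviews, labels)
prediction = model.predict("great relief")
```

In this sketch the sentiment step plays the role of the missing ground-truth labels, so the classifier can only be as good as the lexicon that produced them; a fuller pipeline would presumably group reviews by health condition and drug before scoring.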