
Article Information

  • Title: Unsupervised Clustering of Comments Written in Albanian Language
  • Authors: Mërgim H. HOTI; Jaumin AJDARI
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Publication year: 2021
  • Volume: 12
  • Issue: 8
  • DOI: 10.14569/IJACSA.2021.0120833
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: Social media and the communication that takes place on it have become very important for service providers, playing a key role in improving service quality and in decision making. Consumers usually discuss services in their local languages, which can make extracting useful knowledge difficult. Natural language processing (NLP) techniques help here, but each language has its own specifics and difficulties, and some languages, especially their local spoken varieties, are poorly supported by NLP techniques and methods. This paper addresses that problem for the Albanian language as spoken in Kosovo. Specifically, unsupervised clustering techniques are applied to a dataset of comments written in the local Kosovo variety of Albanian and collected from social media, in order to group the comments by topic of discussion. Different text feature extraction techniques (vectorization and others) and clustering algorithms (K-means, Spectral, Agglomerative, etc.) are used with the aim of identifying the techniques best suited to Albanian. The paper presents the results of the experiments and discusses what to use for Albanian and for similar or related languages with weak NLP support.
  • Keywords: Unsupervised clustering; k-means; spectral; agglomerative; vectorization; Albanian language