Journal: The International Arab Journal of Information Technology
Print ISSN: 1683-3198
Year: 2021
Volume: 18
Issue: 3
DOI: 10.34028/iajit/18/3A/8
Language: English
Publisher: Zarqa Private University
Abstract: The process of selecting the appropriate meaning of an ambiguous word according to its context is known as word sense disambiguation. In this research, we generate a number of Arabic sense inventories based on an unsupervised approach and different pre-trained embeddings, such as Aravec, Fasttext, and Arabic-News embeddings. The resulting inventories are evaluated to investigate their efficiency in Arabic word sense disambiguation and sentence similarity. The sense inventories are generated using an unsupervised approach based on a graph-based word sense induction algorithm. Results show that the Aravec-Twitter inventory achieves the best accuracy of 0.47 for 50 neighbors and an accuracy close to that of the Fasttext inventory for 200 neighbors, while it provides similar accuracy to the Arabic-News inventory for 100 neighbors. Replacing ambiguous words with their sense vectors is also tested for sentence similarity using all sense inventories, and the results show that the Aravec-Twitter sense inventory provides a better correlation value.
Keywords: Word sense induction; word sense disambiguation; Arabic text; sense inventory
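
The abstract describes graph-based word sense induction over the nearest neighbors of a word in a pre-trained embedding space, followed by disambiguation against the induced sense vectors. The record does not give the algorithm's details, so the following is only a minimal sketch under stated assumptions: senses are taken to be connected components of a neighbor-similarity graph, a sense vector is the mean of its cluster, and the toy 2-D English vectors, the function names, and the `edge_threshold` parameter are all hypothetical stand-ins for illustration, not the authors' method or data.

```python
import math

# Hypothetical toy embeddings; real inventories would use pre-trained vectors.
TOY_EMBEDDINGS = {
    "bank": [0.7, 0.7],
    "money": [1.0, 0.1],
    "loan": [0.9, 0.2],
    "river": [0.1, 1.0],
    "shore": [0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def induce_senses(target, embeddings, k=50, edge_threshold=0.5):
    """Return a list of (cluster_words, sense_vector) pairs for `target`."""
    # Step 1: take the k nearest neighbors of the target word.
    neighbors = sorted(
        (w for w in embeddings if w != target),
        key=lambda w: cosine(embeddings[target], embeddings[w]),
        reverse=True,
    )[:k]

    # Step 2: build a graph over the neighbors only (the target is left out),
    # linking two neighbors when they are similar to each other.
    adjacency = {w: set() for w in neighbors}
    for i, a in enumerate(neighbors):
        for b in neighbors[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= edge_threshold:
                adjacency[a].add(b)
                adjacency[b].add(a)

    # Step 3 (assumed clustering): each connected component is one sense.
    senses, seen = [], set()
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, cluster = [start], []
        while stack:
            node = stack.pop()
            cluster.append(node)
            for nxt in adjacency[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        vector = mean_vector([embeddings[w] for w in cluster])
        senses.append((sorted(cluster), vector))
    return senses


def disambiguate(context_words, senses, embeddings):
    """Pick the index of the sense whose vector is closest to the context mean."""
    known = [embeddings[w] for w in context_words if w in embeddings]
    if not known:
        return 0  # no usable context: fall back to the first sense
    context = mean_vector(known)
    return max(range(len(senses)), key=lambda i: cosine(context, senses[i][1]))


senses = induce_senses("bank", TOY_EMBEDDINGS, k=4, edge_threshold=0.8)
chosen = disambiguate(["river", "shore"], senses, TOY_EMBEDDINGS)
print(senses[chosen][0])
```

The `k` argument plays the role of the neighbor count that the abstract varies (50, 100, 200). For the sentence-similarity experiment, the chosen sense vector would replace the ambiguous word's original vector before sentence vectors are compared.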