Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Publication year: 2018
Volume: 9
Issue: 11
DOI: 10.14569/IJACSA.2018.091149
Publisher: Science and Information Society (SAI)
Abstract: Machine Translation, Information Retrieval and Knowledge Acquisition are the three main applications of Word Sense Disambiguation (WSD). The sense of a target word can be identified from a dictionary using a 'bag of words', i.e. the neighbours of the target word. A target word is a word that keeps the same spelling but carries different meanings, e.g. 'chair' or 'light'. In WSD, the key inputs are sentences and target words. Rather than being supplied, however, the target word should be detected automatically. If a sentence contains more than one target word, the filtration process requires further processing. In this study, a framework consisting of buzz words and query words has been developed to detect target words using the WordNet dictionary. Buzz words are defined as a 'bag of words' selected using POS tags, and query words are words that have multiple meanings. The proposed framework attempts to find the sense of the detected target word using its gloss and examples that contain buzz words. The approach is semi-supervised: 266 words with multiple meanings were labelled from various sources and then used within an unsupervised procedure to detect the target word and its sense (meaning). In experiments on a dataset of 300 hotel reviews, 100% of the target words in each sentence were detected, with 84% related to the sense of each sentence or phrase.
Keywords: Word sense disambiguation; machine translation; information retrieval and knowledge acquisition; target word; WordNet; bag of words
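
The pipeline outlined in the abstract (detect target words from a list of query words, collect the surrounding buzz words, then pick the sense whose gloss and examples share the most buzz words) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `QUERY_WORDS`, `SENSES`, `STOPWORDS`, `detect_targets`, `buzz_words` and `pick_sense` are hypothetical, the tiny sense inventory is a hand-written stand-in for WordNet, and a stopword filter is swapped in for the POS-tag selection that the abstract mentions.

```python
import re

# Hypothetical stand-in for the labelled list of multi-meaning query words.
QUERY_WORDS = {"chair", "light"}

# Hypothetical toy sense inventory: one gloss-plus-example string per sense.
# A real system would read glosses and examples from WordNet instead.
SENSES = {
    "light": {
        "illumination": "electromagnetic radiation that makes things visible; "
                        "the lamp gave a bright light in the dark room",
        "not_heavy": "of little weight; the bag was light and easy to carry",
    },
    "chair": {
        "furniture": "a seat for one person with a back; "
                     "he sat on a wooden chair in the room",
        "chairperson": "the officer who presides at a meeting; "
                       "the chair opened the committee meeting",
    },
}

# Assumption: a crude stopword filter replaces POS-tag based selection.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "was", "is", "and", "to",
             "at", "for", "with", "that", "who", "he", "she", "it", "very"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def detect_targets(sentence):
    """Return every token of the sentence that is a known query word."""
    return [tok for tok in tokenize(sentence) if tok in QUERY_WORDS]


def buzz_words(sentence, target):
    """Return the content-word neighbours of the target in the sentence."""
    return {tok for tok in tokenize(sentence)
            if tok not in STOPWORDS and tok != target}


def pick_sense(sentence, target):
    """Pick the sense whose gloss and example overlap most with the buzz words."""
    context = buzz_words(sentence, target)
    scores = {label: len(context & set(tokenize(text)))
              for label, text in SENSES[target].items()}
    # Ties resolve to the first sense listed in the inventory.
    return max(scores, key=scores.get)
```

Calling `detect_targets` on a sentence and then `pick_sense` for each detected word mirrors the two stages described above. A sentence with several target words simply yields several calls, which is where the additional filtering mentioned in the abstract would be needed.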