Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)
Print ISSN: 2302-9293
Publication year: 2016
Volume: 14
Issue: 2A
Pages: 372-378
DOI: 10.12928/telkomnika.v14i2A.4355
Language: English
Publisher: Universitas Ahmad Dahlan
Abstract: Microblog posts are short, contain few words, and often use non-standard vocabulary, which makes traditional hot-topic identification methods ineffective. To address this problem, a method for discovering microblog hot topics based on the speed of increase is proposed. First, the preprocessed microblog posts are divided into windows of equal size, the frequency of each word in each window is counted, and the result is expressed as a time sequence of two-tuples. Next, the slope of increase of each word between every two adjacent windows is calculated to find the words whose frequency rises fastest. Then, the growth rates of the number of users and the number of microblog posts related to each such word are calculated to decide whether the word is a hot topic term. Finally, hot topics are produced by clustering the hot topic terms. Experiments verify the feasibility of the method; the results show that it improves identification efficiency and lowers the omission rate and the fall-out rate, so that microblog hot topics can be discovered effectively and promptly.
Keywords: Topic identification; Two-tuple sequence of time; Hot topic of microblog; Analysis on public sentiment
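
Illustrative sketch: the first two steps described in the abstract (per-window word counts expressed as two-tuples, then slopes between adjacent windows) might look roughly like the Python below. This is not the authors' implementation. The post layout `(timestamp, user, tokens)`, the use of fixed-length time windows, the function names, and the `min_slope` threshold are all assumptions made for illustration; the user and post growth check and the clustering step are not covered.

```python
from collections import Counter


def word_series(posts, window_len):
    """Map each word to a list of (window_index, frequency) two-tuples.

    posts: list of (timestamp, user, tokens); this layout is hypothetical.
    window_len: length of each time window, in the same unit as timestamp.
    """
    if not posts:
        return {}
    t0 = min(t for t, _, _ in posts)
    n_windows = max(int((t - t0) // window_len) for t, _, _ in posts) + 1
    counts = [Counter() for _ in range(n_windows)]
    for t, _user, tokens in posts:
        counts[int((t - t0) // window_len)].update(tokens)
    vocab = set().union(*counts)
    # A word absent from a window gets frequency 0 in that window.
    return {w: [(i, c[w]) for i, c in enumerate(counts)] for w in vocab}


def slopes(series):
    """Frequency change per window between each pair of adjacent windows."""
    return [(f2 - f1) / (i2 - i1)
            for (i1, f1), (i2, f2) in zip(series, series[1:])]


def fast_rising_words(posts, window_len, min_slope):
    """Words whose slope reaches min_slope (assumed threshold) at least once."""
    return {w for w, s in word_series(posts, window_len).items()
            if any(k >= min_slope for k in slopes(s))}
```

Under these assumptions, the words returned by `fast_rising_words` would be only candidates; the abstract's later steps would still have to confirm them against user and post growth before clustering.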