
Article Information

  • Title: A Word Extraction Method from Newspaper Articles Based on Time Information for Event Sequence Mining
  • Authors: Tomomichi Tada; Koji Iwanuma; Hidetomo Nabeshima
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Year: 2009
  • Volume: 24
  • Issue: 6
  • Pages: 488-493
  • DOI: 10.1527/tjsai.24.488
  • Publisher: The Japanese Society for Artificial Intelligence
  • Abstract: This paper presents a new method for extracting important words from newspaper articles based on time-sequence information. Such word extraction plays an important role in event sequence mining. TF-IDF is a well-known method for ranking the importance of words in a document. However, TF-IDF does not consider the time information embedded in sequential textual data, which is characteristic of newspapers. We propose a new word-extraction method, called TF-IDayF, which takes time-sequence information into account and can extract important or characteristic words that express sequential events. TF-IDayF does not rely on the so-called burst phenomenon of topic-word occurrences, which has been studied by many researchers. The method is quite simple, yet effective and easy to compute in sequential text mining. We evaluate the proposed method through several experiments from three points of view: semantic, statistical, and data mining.
  • Keywords: word extraction; event sequence mining; TF-IDF; newspaper article
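
The abstract contrasts classic TF-IDF with a time-aware variant but does not give the TF-IDayF formula. The sketch below shows standard TF-IDF next to one hypothetical reading of a day-based variant, in which the inverse term counts the distinct days on which a word appears rather than the documents that contain it. The function names, the `(day, tokens)` input shape, and the day-based formula itself are assumptions made for illustration and may differ from the definition in the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Classic TF-IDF. docs is a list of token lists.

    Returns one {word: score} dict per document, using
    score = term count * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each word once per document
    return [
        {w: c * math.log(n_docs / doc_freq[w]) for w, c in Counter(tokens).items()}
        for tokens in docs
    ]


def tf_idayf_sketch(articles):
    """Hypothetical day-based variant (an assumption, not the paper's formula).

    articles is a list of (day, tokens) pairs. The inverse term uses the
    number of distinct days on which a word occurs instead of the number
    of documents that contain it.
    """
    all_days = {day for day, _ in articles}
    days_with_word = {}
    for day, tokens in articles:
        for w in set(tokens):
            days_with_word.setdefault(w, set()).add(day)
    n_days = len(all_days)
    return [
        {
            w: c * math.log(n_days / len(days_with_word[w]))
            for w, c in Counter(tokens).items()
        }
        for _, tokens in articles
    ]


# Toy data: three articles spread over two days (invented for illustration).
articles = [
    ("2009-01-01", ["quake", "quake", "news"]),
    ("2009-01-01", ["quake", "news"]),
    ("2009-01-02", ["news", "vote"]),
]
doc_scores = tf_idf([tokens for _, tokens in articles])
day_scores = tf_idayf_sketch(articles)
```

In this toy data, "news" appears on every day and scores zero under both weightings, while "quake" is concentrated on a single day and is weighted more heavily by the day-based variant than by document-level TF-IDF.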