Article Information

  • Title: Named Entity Recognition in Vietnamese documents
  • Authors: Tri Tran Q.; Thao Pham T. X.; Hung Ngo Q.
  • Journal: Progress in Informatics
  • Print ISSN: 1349-8614
  • Online ISSN: 1349-8606
  • Year: 2007
  • Issue: 4
  • Pages: 5-13
  • DOI: 10.2201/NiiPi.2007.4.2
  • Publisher: National Institute of Informatics
  • Abstract: Named Entity Recognition (NER) aims to classify words in a document into pre-defined target entity classes and is now considered to be fundamental for many natural language processing tasks such as information retrieval, machine translation, information extraction and question answering. This paper presents the results of an experiment in which a Support Vector Machine (SVM) based NER model is applied to the Vietnamese language. Though this state-of-the-art machine learning method has been widely applied to NER in several well-studied languages, this is the first time this method has been applied to Vietnamese. In a comparison against Conditional Random Fields (CRFs), the SVM model was shown to outperform CRF by optimizing its feature window size, obtaining an overall F-score of 87.75. The paper also presents a detailed discussion about the characteristics of the Vietnamese language and provides an analysis of the factors which influence performance in this task.
  • Keywords: Named Entity Recognition (NER); Support Vector Machine (SVM); text mining