Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2014
Volume: 5
Issue: 6
Pages: 8316-8320
Publisher: TechScience Publications
Abstract: In this paper, we develop a text-mining system by integrating methods from Information Extraction (IE) and data mining (Knowledge Discovery from Databases, or KDD). Because it builds on existing IE and KDD techniques, such a system can be developed relatively rapidly and evaluated on the text corpora already used for testing IE systems. We present a general text-mining framework called MRAR, which employs an IE module to transform natural-language documents into structured data and a KDD module to discover prediction rules from the extracted data. Experimental results on inducing prediction rules and ranked association rules from natural-language texts demonstrate that MRAR learns more accurate rules than previous methods for these tasks. We also present an approach that uses rules mined from the extracted data to improve the accuracy of information extraction; experimental results demonstrate that the discovered patterns can effectively improve the underlying IE method.
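
The two-stage pipeline the abstract describes (an IE step that turns text into structured records, a KDD step that mines rules from those records, and a feedback step that uses the rules to fill in missed extractions) can be illustrated with a minimal Python sketch. This is not the MRAR implementation, which the abstract does not detail: the regex slot patterns, the function names `extract`, `mine_rules`, and `apply_rules`, the thresholds, and the sample documents are all hypothetical, and the rule miner is a plain support/confidence counter over item pairs rather than the authors' algorithm.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical hand-written slot patterns; a real IE module would be far richer.
SLOT_PATTERNS = {
    "language": re.compile(r"\b(Java|Python|SQL)\b"),
    "area": re.compile(r"\b(databases|web)\b"),
}

def extract(doc):
    """Toy IE step: map free text to a set of (slot, value) items."""
    items = set()
    for slot, pattern in SLOT_PATTERNS.items():
        for value in pattern.findall(doc):
            items.add((slot, value))
    return frozenset(items)

def mine_rules(records, min_support=0.5, min_confidence=0.8):
    """Toy KDD step: single-antecedent rules A -> B, ranked by confidence."""
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        item_counts.update(rec)
        pair_counts.update(combinations(sorted(rec), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    # Rank by confidence, then support; remaining keys make the order deterministic.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules

def apply_rules(record, rules):
    """Feedback step (assumed form): add consequents whose antecedent was extracted."""
    predicted = set(record)
    for antecedent, consequent, _support, _confidence in rules:
        if antecedent in record:
            predicted.add(consequent)
    return frozenset(predicted)

# Hypothetical sample documents, invented for illustration only.
docs = [
    "Seeking a Java developer with databases experience.",
    "Java and SQL required; databases background preferred.",
    "Python programmer for web projects.",
    "SQL analyst, databases team.",
]
records = [extract(d) for d in docs]
rules = mine_rules(records)
augmented = apply_rules(extract("Java engineer wanted."), rules)
```

On these sample documents, the two rules that pass the thresholds predict the `area` slot from the `language` slot, and `apply_rules` uses the first of them to add a slot value that the toy extractor did not find in the new document.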