Abstract: This paper presents a new method for extracting concepts that can be used to improve text classification performance (precision and recall). The computation is divided into two layers. The bottom layer, called the document layer, extracts the concepts of a particular document; the upper layer, called the category layer, finds the description and subject concepts of a particular category. The implementation algorithm, which dramatically reduces the search space, is discussed in detail. Experiments on real-world data collected from Infor-Bank show that the approach outperforms traditional ones.
Keywords: text classification; concept extraction; characteristic term; association rule; algorithm