Abstract: Traditional classification methods often fail to capture the characteristics of complex problems, which leads to poor performance. In this paper, we propose a new framework named Partition based LAzy Classification (PLAC) to better characterize complex problems by dividing the training data space into smaller, easier-to-learn partitions. In PLAC, only the partition nearest to a new instance is used to train a local classifier, which is then used to classify that instance. Because the partitioning is performed based on information gain before a new instance is received, the resulting partitions are groups of similar instances, and the chance that the nearest instances of the new instance come from different regions by accident is reduced. Moreover, our method uses only one partition to make a prediction and employs a caching mechanism to avoid repeated work during classification, which improves efficiency. An extensive experimental evaluation on 40 real-world data sets shows that PLAC effectively improves the performance of base classifiers and outperforms existing mainstream ensemble methods.
Keywords: Classification; eager learning; lazy learning; data partitioning; ensemble learning.
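
The workflow summarized in the abstract (partition the training data by information gain ahead of time, route a new instance to its nearest partition, train a local classifier on that partition only, and cache the trained classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the nearest partition is found or which base classifier is used, so the sketch assumes that nearness is the squared Euclidean distance to a partition centroid and that the base classifier is a nearest-class-mean rule. The names `PLACSketch`, `partition`, `best_split`, and the `max_size` stopping parameter are hypothetical and introduced only for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(X, y):
    """Return (gain, feature, threshold) of the binary split with the highest information gain."""
    base, best = entropy(y), (0.0, None, None)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between two distinct values, so both sides are non-empty
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
            if gain > best[0]:
                best = (gain, f, t)
    return best

def partition(X, y, max_size):
    """Split recursively until a partition has at most max_size rows or no split has positive gain."""
    if len(y) <= max_size:
        return [(X, y)]
    _, f, t = best_split(X, y)
    if f is None:
        return [(X, y)]
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    parts = []
    for side in (left, right):
        parts += partition([r for r, _ in side], [lab for _, lab in side], max_size)
    return parts

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def dist2(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

class PLACSketch:
    def __init__(self, X, y, max_size=4):
        # Partitioning happens once, before any query arrives.
        self.parts = partition(X, y, max_size)
        # Assumption: a partition is represented by its centroid when routing a query.
        self.centroids = [mean(px) for px, _ in self.parts]
        self.cache = {}  # partition index -> trained local model

    def _train_local(self, i):
        # Assumed base classifier: one mean vector per class (nearest-class-mean).
        px, py = self.parts[i]
        by_class = {}
        for row, lab in zip(px, py):
            by_class.setdefault(lab, []).append(row)
        return {lab: mean(rows) for lab, rows in by_class.items()}

    def predict(self, x):
        # Lazy step: pick the nearest partition and train on it only if not cached yet.
        i = min(range(len(self.parts)), key=lambda j: dist2(x, self.centroids[j]))
        if i not in self.cache:
            self.cache[i] = self._train_local(i)
        model = self.cache[i]
        return min(model, key=lambda lab: dist2(x, model[lab]))

# Toy usage with two well-separated groups.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
clf = PLACSketch(X, y, max_size=4)
print(clf.predict([0.2, 0.3]), clf.predict([5.8, 5.1]))
```

In this sketch a local model is trained the first time a query lands in a partition and reused afterwards, which is one plausible reading of the caching mechanism mentioned in the abstract. Any base classifier could replace the nearest-class-mean rule without changing the routing or caching logic.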