Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Online ISSN: 1817-3195
Year: 2013
Volume: 48
Issue: 2
Pages: 741-748
Publisher: Journal of Theoretical and Applied
Abstract: Traditional semi-supervised clustering algorithms suffer from clustering deviation and have difficulty processing high-dimensional data. To address both problems, a semi-supervised clustering algorithm based on active learning is proposed. An active learning strategy is used to obtain a large number of pairwise constraints, which increases the proportion of prior knowledge. The constraint set is then used to project the data into a subspace, and an improved K-means algorithm clusters the data in that subspace. Because the data being clustered are low-dimensional and more prior knowledge is available, time efficiency can be maintained and the clustering deviation problem is also addressed. Experimental results show that active learning improves clustering performance and that the proposed algorithm outperforms the two other semi-supervised clustering algorithms used for comparison.
Keywords: Pairwise Constraints; Semi-Supervised Clustering; K-Means Algorithm; Active Learning
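
The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the general idea rather than the paper's algorithm. It assumes a COP-KMeans-style assignment step as a stand-in for the unspecified "improved K-means", a hypothetical `oracle` callback as a stand-in for the active learning query, and it omits the subspace projection step entirely. All function names are illustrative.

```python
from math import dist  # Python 3.8+


def query_constraints(pairs, oracle):
    """Ask a (hypothetical) oracle whether each pair (i, j), i != j,
    belongs to the same cluster. Returns (must_link, cannot_link)."""
    must, cannot = set(), set()
    for i, j in pairs:
        (must if oracle(i, j) else cannot).add(frozenset((i, j)))
    return must, cannot


def violates(i, c, assign, must, cannot):
    """True if placing point i in cluster c breaks a constraint
    against a point that has already been assigned this pass."""
    for pair in must:
        if i in pair:
            (j,) = pair - {i}
            if assign.get(j, c) != c:  # partner sits in another cluster
                return True
    for pair in cannot:
        if i in pair:
            (j,) = pair - {i}
            if assign.get(j) == c:  # partner already sits in cluster c
                return True
    return False


def constrained_kmeans(points, centers, must, cannot, iters=10):
    """K-means where each point takes the nearest *feasible* center."""
    centers = [tuple(c) for c in centers]
    assign = {}
    for _ in range(iters):
        new_assign = {}
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest.
            order = sorted(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_assign, must, cannot):
                    new_assign[i] = c
                    break
            else:
                raise ValueError(f"no feasible cluster for point {i}")
        if new_assign == assign:  # converged
            break
        assign = new_assign
        # Recompute each center as the mean of its members.
        for c in range(len(centers)):
            members = [points[i] for i, a in assign.items() if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return [assign[i] for i in range(len(points))], centers


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (2.4, 2.4)]
    truth = [0, 0, 1, 1, 1]  # toy ground truth, known only to the oracle
    must, cannot = query_constraints(
        [(2, 4), (0, 4)], lambda i, j: truth[i] == truth[j]
    )
    labels, _ = constrained_kmeans(points, [(0.0, 0.0), (5.0, 5.0)], must, cannot)
    print(labels)  # [0, 0, 1, 1, 1]
```

In this toy data the last point lies slightly closer to the first cluster, so unconstrained K-means with the same initial centers places it there. The two queried constraints move it to the second cluster, which illustrates how added pairwise constraints can correct a deviating assignment.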