Journal: Brain. Broad Research in Artificial Intelligence and Neuroscience
Print ISSN: 2067-3957
Year: 2012
Volume: 3
Issue: 3
Pages: 62-70
Language: English
Publisher: EduSoft publishing
Abstract: Clustering is an active research topic in data mining, and many methods have been proposed in the literature. Most of these methods are based on numerical attributes, although several recent proposals develop clustering methods that support mixed attributes. There are three basic groups of clustering methods: partitional methods, hierarchical methods and density-based methods. This paper proposes a hybrid clustering algorithm that combines the advantages of hierarchical clustering and fuzzy clustering techniques and handles mixed attributes. The proposed algorithm improves on the fuzzy algorithm by making it less dependent on initial parameters such as randomly chosen initial cluster centers. The number of clusters does not need to be known in advance; the algorithm determines it from the complexity of the cluster structure. The approach is organized in two phases: first, the data are divided into two clusters; then the worst cluster is identified and split. We demonstrate the effectiveness of the approach by evaluating it on three different datasets of linked data. Experimental results show that the proposed algorithm is suitable for link discovery between datasets of linked data, where clustering can decrease the number of comparisons needed before link discovery.
Keywords: Hierarchical method, Fuzzy Clustering, similarity measure, Linked Data
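
The two-phase procedure described in the abstract (divide the data into two clusters, then repeatedly find the worst cluster and split it) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption: the mixed-attribute dissimilarity (squared difference for numeric fields, 0/1 mismatch for categorical ones), the "worst cluster" criterion (largest within-cluster scatter), the stopping threshold `tolerance`, the far-apart seeding used in place of random initial centers, and all function names.

```python
from collections import defaultdict

def dissimilarity(a, b):
    """Assumed mixed measure: squared difference (numeric), 0/1 mismatch (categorical)."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += (x - y) ** 2
        else:
            total += 0.0 if x == y else 1.0
    return total

def prototype(points, weights):
    """Weighted mean per numeric field, weighted mode per categorical field."""
    proto = []
    for j in range(len(points[0])):
        column = [p[j] for p in points]
        if all(isinstance(v, (int, float)) for v in column):
            proto.append(sum(w * v for w, v in zip(weights, column)) / sum(weights))
        else:
            votes = defaultdict(float)
            for w, v in zip(weights, column):
                votes[v] += w
            proto.append(max(votes, key=votes.get))
    return tuple(proto)

def fuzzy_bisect(points, m=2.0, iters=50):
    """Fuzzy 2-means split; seeds are two far-apart points rather than random ones."""
    first = max(points, key=lambda p: dissimilarity(points[0], p))
    second = max(points, key=lambda p: dissimilarity(first, p))
    centers = [first, second]
    for _ in range(iters):
        memberships = []
        for p in points:
            # Clamp distances away from zero so memberships stay well defined.
            d = [max(dissimilarity(p, c), 1e-12) for c in centers]
            inv = [(1.0 / dk) ** (1.0 / (m - 1.0)) for dk in d]
            memberships.append([v / sum(inv) for v in inv])
        new_centers = [prototype(points, [u[k] ** m for u in memberships])
                       for k in range(2)]
        if new_centers == centers:
            break
        centers = new_centers
    # Harden the fuzzy memberships to obtain the two child clusters.
    halves = ([], [])
    for p, u in zip(points, memberships):
        halves[0 if u[0] >= u[1] else 1].append(p)
    return halves

def scatter(cluster):
    """Within-cluster scatter: total dissimilarity to the cluster prototype."""
    center = prototype(cluster, [1.0] * len(cluster))
    return sum(dissimilarity(p, center) for p in cluster)

def divisive_fuzzy(points, tolerance=1.0, max_clusters=10):
    """Split the worst cluster until every cluster is tighter than `tolerance`."""
    clusters = [list(points)]
    while len(clusters) < max_clusters:
        worst = max(clusters, key=scatter)
        if len(worst) < 2 or scatter(worst) <= tolerance:
            break
        left, right = fuzzy_bisect(worst)
        if not left or not right:
            break
        clusters.remove(worst)
        clusters += [left, right]
    return clusters

# Toy records with one numeric and one categorical attribute (made-up data).
records = [(1.0, "a"), (1.2, "a"), (0.9, "a"), (8.0, "b"), (8.3, "b"), (7.9, "b")]
clusters = divisive_fuzzy(records)
for c in clusters:
    print(c)
```

In this sketch the number of clusters is not supplied by the caller; it falls out of how many splits are needed before the scatter of the worst remaining cluster drops below `tolerance`, which is one plausible reading of "based on the complexity of cluster structure". For link discovery, candidate comparisons would then be restricted to records that land in the same cluster.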