Journal: IOP Conference Series: Earth and Environmental Science
Print ISSN: 1755-1307
Online ISSN: 1755-1315
Year: 2019
Volume: 252
Issue: 5
Pages: 1-16
DOI: 10.1088/1755-1315/252/5/052100
Publisher: IOP Publishing
Abstract: This paper proposes an improved K-Prototype algorithm, KP-IW, which builds on a controlled initialization process and attribute weighting to handle mixed data containing numeric and categorical attributes. Fixing the initialization of clustering and assigning weights to attributes are two common principles for improving such algorithms. However, few methods treat the numeric or categorical proportion as a new attribute. Doing so affects both the initialization result and the weight assigned to each attribute, because the density distribution of instances is calculated by combining every attribute with these two proportions instead of using the attributes alone. Further innovations for initialization and weighting include an auxiliary point, auxiliary clusters, and weightings that combine linear and exponential effects. Clustering evaluation scores for KP-IW, compared with those of other algorithms, indicate that the KP-IW algorithm is suitable.
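
Illustrative note: the abstract does not give the KP-IW formulas, so the sketch below shows only the generic idea it builds on, namely a K-Prototype style dissimilarity for mixed data with one weight per attribute. It is not the paper's method. The function names, the weight vectors `w_num` and `w_cat`, and the balancing factor `gamma` are assumptions chosen for illustration; the proportion attribute, auxiliary point, and auxiliary clusters mentioned in the abstract are not modeled here.

```python
from typing import List, Sequence, Tuple


def mixed_dissimilarity(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    proto_num: Sequence[float],
    proto_cat: Sequence[str],
    w_num: Sequence[float],
    w_cat: Sequence[float],
    gamma: float = 1.0,
) -> float:
    """Weighted mixed-type dissimilarity between an instance and a prototype.

    Numeric attributes contribute a weighted squared difference;
    categorical attributes contribute their weight on a mismatch.
    gamma (hypothetical default 1.0) balances the two parts.
    """
    numeric_part = sum(
        w * (a - b) ** 2 for w, a, b in zip(w_num, x_num, proto_num)
    )
    categorical_part = sum(
        w for w, a, b in zip(w_cat, x_cat, proto_cat) if a != b
    )
    return numeric_part + gamma * categorical_part


def assign_to_nearest(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    prototypes: List[Tuple[Sequence[float], Sequence[str]]],
    w_num: Sequence[float],
    w_cat: Sequence[float],
    gamma: float = 1.0,
) -> int:
    """Return the index of the prototype with the smallest dissimilarity."""
    distances = [
        mixed_dissimilarity(x_num, x_cat, p_num, p_cat, w_num, w_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return distances.index(min(distances))
```

In a weighting scheme of the kind the abstract describes, `w_num` and `w_cat` would be recomputed from the data instead of being fixed by hand; here they are plain inputs so the effect of each weight on the assignment is easy to inspect.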