Abstract: Internal clustering validation is a key issue in clustering applications, especially when external information is not available. Existing measures have limitations in different application circumstances, and internal validation of Boolean clustering in particular still has deficiencies. This paper proposes a new Clustering Validation index based on Type of Attributes for Boolean data (CVTAB). It evaluates clustering quality in terms of the Dissimilarity of two clusters for Boolean Data (DBD). The attributes in Boolean data are categorized into three types with respect to a set of objects: Type A (the attribute value is 1 for all objects in the set), Type O (the value is 0 for all objects in the set), and Type E (the value is not the same for all objects in the set). When two clusters are merged into one, DBD uses the numbers of attributes whose types change and the numbers of objects changed to measure the dissimilarity of the two clusters. CVTAB evaluates clustering quality without reference to external information.
Keywords: Clustering Validation index based on Type of Attributes for Boolean data (CVTAB); Dissimilarity for Boolean Data (DBD); internal clustering validation index; Boolean data; high-dimensional data
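The attribute typing described in the abstract can be illustrated with a minimal sketch. It assumes a cluster is represented as a list of equal-length 0/1 rows. The function names `attribute_types` and `changed_type_counts` are hypothetical and do not come from the paper. The sketch only labels attributes as Type A, O, or E and counts how many labels change when two clusters are merged. It does not reproduce the DBD or CVTAB formulas, which the abstract does not state.

```python
from typing import List, Tuple

Cluster = List[List[int]]  # assumed representation: one 0/1 row per object


def attribute_types(cluster: Cluster) -> List[str]:
    """Label each attribute (column) of a cluster as 'A', 'O', or 'E'.

    'A': value is 1 for every object, 'O': value is 0 for every object,
    'E': values differ among the objects.
    """
    types = []
    for column in zip(*cluster):  # iterate over attributes, not objects
        if all(v == 1 for v in column):
            types.append("A")
        elif all(v == 0 for v in column):
            types.append("O")
        else:
            types.append("E")
    return types


def changed_type_counts(c1: Cluster, c2: Cluster) -> Tuple[int, int]:
    """Count attributes whose type changes when c1 and c2 are merged.

    Returns one count per original cluster. This is only the raw count
    mentioned in the abstract, not the DBD measure itself.
    """
    merged_types = attribute_types(c1 + c2)
    changed_1 = sum(t != m for t, m in zip(attribute_types(c1), merged_types))
    changed_2 = sum(t != m for t, m in zip(attribute_types(c2), merged_types))
    return changed_1, changed_2


if __name__ == "__main__":
    c1 = [[1, 0, 1], [1, 0, 0]]
    c2 = [[1, 1, 0], [1, 1, 0]]
    print(attribute_types(c1))
    print(attribute_types(c2))
    print(changed_type_counts(c1, c2))
```

Iterating over columns with `zip(*cluster)` keeps the sketch dependency-free. For high-dimensional data, a NumPy array with column-wise reductions would be a natural substitute.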