Journal: International Journal on Computer Science and Engineering
Print ISSN: 2229-5631
Electronic ISSN: 0975-3397
Year: 2010
Volume: 2
Issue: 2
Pages: 126-131
Publisher: Engg Journals Publications
Abstract: Classification is one of the most efficient and widely used data mining techniques. In classification, decision trees can handle high-dimensional data, and their representation is intuitive and generally easy for humans to assimilate. The area under the receiver operating characteristic curve (AUC) is one of the recently adopted measures for evaluating the performance of a classifier. In this paper, we present two novel decision tree algorithms, namely C4.45 and C4.55, aimed at improving the AUC value over C4.5, a state-of-the-art decision tree algorithm. Empirical experiments conducted on 42 benchmark datasets strongly indicate that C4.45 and C4.55 significantly outperform C4.5 on the AUC value.
Keywords: Decision Trees; Information gain; Area under Curve; Gain ratio; Laplace Correction; Confidence Factor.