Article Information

  • Title: Accuracy and Diversity in Ensembles of Text Categorisers
  • Authors: Juan Jose Garcia Adeva; Ulises Cervino; Rafael A. Calvo
  • Journal: CLEI Electronic Journal
  • Print ISSN: 0717-5000
  • Year: 2005
  • Volume: 8
  • Issue: 2
  • Publisher: Centro Latinoamericano de Estudios en Informática
  • Abstract: Error-Correcting Output Codes (ECOC) ensembles of binary classifiers are used in Text Categorisation to improve accuracy while benefiting from learning algorithms that only support two classes. An accurate ensemble relies on the quality of its decomposition matrix, which in turn depends on the separation between the categories and the diversity of the dichotomies representing the binary classifiers. Important open questions include finding a good definition of diversity between two dichotomies and a way of combining all the pairwise diversity values into a single indicator, which we call the decomposition quality. In this work we introduce a new measure to estimate the diversity between two learners and compare it to the well-known Hamming distance. We also examine three functions to evaluate the decomposition quality. We present a set of experiments in which these measures and functions are tested on two distinct document corpora, with several configurations of each. The analysis of the results shows a weak relationship between ensemble accuracy and diversity.
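
The abstract's idea of scoring a decomposition matrix by pairwise diversity can be illustrated with a minimal sketch. It uses only the Hamming distance named in the abstract; the new measure introduced in the paper is not reproduced here. The matrix layout (rows are categories, columns are dichotomies), the function names, and the use of min and mean as combining functions are all assumptions made for illustration, since the abstract does not say which three functions the authors examine.

```python
from itertools import combinations
from statistics import mean


def hamming_distance(a, b):
    """Number of positions at which two dichotomies assign different labels."""
    if len(a) != len(b):
        raise ValueError("dichotomies must cover the same categories")
    return sum(x != y for x, y in zip(a, b))


def pairwise_diversities(matrix):
    """Hamming distance for every pair of columns (dichotomies).

    Assumed layout: one row per category, one column per binary classifier,
    entries +1 / -1.
    """
    columns = list(zip(*matrix))
    return [hamming_distance(c1, c2) for c1, c2 in combinations(columns, 2)]


def decomposition_quality(matrix, combine=min):
    """Collapse the pairwise values into one indicator.

    `combine` is a hypothetical stand-in (min, mean, ...) for the paper's
    combining functions, which the abstract does not specify.
    """
    return combine(pairwise_diversities(matrix))


# Illustrative 4-category, 3-dichotomy matrix (not taken from the paper).
matrix = [
    [+1, +1, +1],
    [+1, -1, -1],
    [-1, +1, -1],
    [-1, -1, +1],
]

worst_case = decomposition_quality(matrix, combine=min)
average = decomposition_quality(matrix, combine=mean)
```

Taking the minimum rewards matrices whose least diverse pair of dichotomies is still well separated, while the mean summarises overall spread; either could serve as a single indicator under the assumptions above.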
