Article information

  • Title: Attribute Selection via a Novel Interval Based Evaluation Algorithm: Applied on Real life data sets
  • Authors: Mostafa A. Salama ; Ghada Hassan
  • Journal: MATEC Web of Conferences
  • Electronic ISSN: 2261-236X
  • Publication year: 2016
  • Volume: 76
  • Pages: 1-7
  • DOI: 10.1051/matecconf/20167604030
  • Language: English
  • Publisher: EDP Sciences
  • Abstract: Real-life problems handled by machine learning involve data set attributes whose values take various forms, such as continuous and discrete. Discretization is an important pre-processing step because most attribute selection techniques assume discrete input values. This step can change the internal structure of the input attribute values with respect to the classification problem, so its quality directly affects the quality of the selected features. This work discusses the problems in current discretization techniques and proposes an attribute evaluation and selection technique that avoids them. Attributes are evaluated directly in their continuous form, without biasing their internal structure, and eliminating the discretization step reduces the computational cost. The basic insight of the proposed approach is the inverse relationship between class label distribution overlap and the relative information content of a given attribute. To test the validity of this assumption, a series of data sets was examined using several standard approaches, including our own implementation, and the approaches were ranked by overall classification accuracy. The results, at least with respect to the testing data sets deployed in this study, indicate that the proposed approach outperformed the other methods selected for evaluation. These results will be examined over a wider range of continuous-attribute data sets from nonmedical domains in order to investigate their robustness.
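
The abstract's central idea, that an attribute carries less information the more its class-wise value distributions overlap, can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes one simple reading of "interval based": each class is summarized by the [min, max] range of its values, and an attribute is scored by the fraction of samples that fall outside every other class's range. The function names `interval_overlap_score` and `rank_attributes` are hypothetical.

```python
from collections import defaultdict


def interval_overlap_score(values, labels):
    """Score one continuous attribute in [0, 1]; higher means less class overlap.

    Assumption (not from the paper): each class is summarized by the
    [min, max] interval of its values, with no discretization step.
    """
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and equally long")

    # Group the attribute values by class label.
    by_class = defaultdict(list)
    for v, y in zip(values, labels):
        by_class[y].append(v)

    # One [min, max] interval per class.
    intervals = {y: (min(vs), max(vs)) for y, vs in by_class.items()}

    # Count samples that lie inside the interval of at least one other class.
    overlapped = sum(
        1
        for v, y in zip(values, labels)
        if any(lo <= v <= hi for c, (lo, hi) in intervals.items() if c != y)
    )
    return 1.0 - overlapped / len(values)


def rank_attributes(columns, labels):
    """Rank attributes (name -> list of values) from least to most overlap."""
    scores = {name: interval_overlap_score(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: attribute "a" separates the two classes, attribute "b" does not.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "a": [1.0, 1.2, 1.1, 3.0, 3.2, 3.1],
    "b": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5],
}
ranking = rank_attributes(columns, labels)
print(ranking)
```

On this toy data, attribute "a" ranks first because its class intervals do not intersect, while four of the six values of "b" lie inside the other class's range. A min/max interval is sensitive to outliers, so a percentile-based interval may be a more robust choice in practice.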