Basic Article Information

  • Title: A Proposed Multi-Domain Approach for Automatic Classification of Text Documents
  • Authors: Abdelrahman M. Arab; Ahmed M. Gadallah; Akram Salah
  • Journal: International Journal on Soft Computing
  • Electronic ISSN: 2229-7103
  • Publication year: 2017
  • Volume: 8
  • Issue: 1
  • Page: 1
  • DOI: 10.5121/ijsc.2017.8101
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Classification is an important technique in information retrieval. Supervised classification suffers from limitations concerning the collection and labeling of the training dataset. Multi-domain classification requires multiple training datasets and classifiers, which is relatively difficult. This paper proposes an unsupervised classification system that can also handle the multi-domain classification problem. It is a multi-domain system in which each domain is represented by an ontology. A document is mapped onto each ontology based on the weights of the tokens they share, with the help of fuzzy sets, resulting in a mapping degree between the document and each domain. An experiment shows satisfactory classification results, with improved evaluation results for the proposed system compared to Apache Lucene.
  • Keywords: Information Retrieval; Ontology; Machine Learning; Document Classification; Fuzzy Sets
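
The mapping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no formulas, so the scoring rule here (weight of shared tokens divided by the total ontology weight, read as a fuzzy membership degree in [0, 1]) is an assumption, and the ontologies, token weights, and function names are all hypothetical.

```python
import re

# Hypothetical domain "ontologies", reduced here to token -> weight maps.
# A real ontology would also carry concepts and relations; this is a stand-in.
ONTOLOGIES = {
    "medicine": {"patient": 1.0, "drug": 1.0, "dose": 0.5, "clinical": 0.5},
    "finance": {"stock": 1.0, "market": 1.0, "bond": 1.0, "price": 1.0},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mapping_degree(doc_tokens, ontology):
    """Assumed scoring rule: the share of the ontology's total weight that is
    carried by tokens also present in the document (a value in [0, 1])."""
    total = sum(ontology.values())
    if total == 0:
        return 0.0
    shared = set(doc_tokens) & set(ontology)
    return sum(ontology[token] for token in shared) / total


def classify(text, ontologies):
    """Return a mapping degree for every domain, without any training data."""
    tokens = tokenize(text)
    return {name: mapping_degree(tokens, onto) for name, onto in ontologies.items()}


degrees = classify("The patient received a low dose of the drug.", ONTOLOGIES)
best_domain = max(degrees, key=degrees.get)
print(degrees, best_domain)
```

Because every domain receives its own degree, a document can belong partially to several domains at once, which is the property that fuzzy sets bring to the approach; picking the maximum is only one possible way to turn the degrees into a single label.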