
Article Information

  • Title: An Efficient Classification Approach for the XML Documents
  • Authors: Navya sree.Yarramsetti ; G.Siva Nageswara Rao
  • Journal: International Journal of Computer Trends and Technology
  • Electronic ISSN: 2231-2803
  • Publication year: 2013
  • Volume: 4
  • Issue: 3-2
  • Publisher: Seventh Sense Research Group
  • Abstract: Extensible Markup Language (XML) has become a standard format for data representation on the Internet. An XML document typically organizes textual data according to a predefined logical structure. Because of this inherent structure, conventional text classification methods cannot be applied to XML documents directly. In this paper, we address learning with XML documents from three major research areas. First, knowledge representation: we use a method based on a typed higher-order logic formalism, where the main focus is how to represent an XML document with higher-order logic terms that capture both its content and its structure. Second, symbolic machine learning: we introduce a new decision-tree learning algorithm driven by the precision/recall breakeven point (PRDT) for the XML document classification problem. The precision/recall heuristic is used because XML documents have strong connections with text documents. Finally, semi-supervised learning: we present an algorithm based on PRDT and the co-training framework. Preliminary results show that the framework produces comprehensible theories and attains good performance with both machine learning techniques.
  • Keywords: precision/recall; co-training; machine learning; knowledge representation; semi-supervised learning
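
Note on the precision/recall breakeven point: the abstract names this measure but does not define it. As a minimal sketch, assuming documents are ranked by a classifier score and labeled 1 (relevant) or 0 (not relevant), the breakeven point can be read off at the cutoff where the number of retrieved documents equals the number of relevant ones, because precision and recall then share the same denominator. The function name and inputs below are hypothetical and illustrate only the measure itself, not the PRDT algorithm from the paper.

```python
from typing import Sequence


def precision_recall_breakeven(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the value at which precision equals recall for a ranked list.

    Assumption (not from the paper): labels are 1 for relevant documents and
    0 otherwise, and a higher score means the document is ranked earlier.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("at least one relevant document is required")

    # Rank documents from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Retrieve exactly n_pos documents: precision = TP / n_pos = recall.
    true_pos = sum(1 for _, y in ranked[:n_pos] if y == 1)
    return true_pos / n_pos


if __name__ == "__main__":
    example_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
    example_labels = [1, 0, 1, 1, 0]
    print(precision_recall_breakeven(example_scores, example_labels))
```

In the example, three documents are relevant, and two of the three highest-scoring documents are relevant, so precision and recall are both 2/3 at that cutoff.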