Article Information

  • Title: WEB DOCUMENT CLUSTERING THROUGH METAFILE GENERATION FOR DIGRAPH STRUCTURE USING DOCUMENT INDEX GRAPH
  • Authors: BUDI; SRI NURDIATI; BIB PARUHUM SILALAHI
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2014
  • Volume: 60
  • Issue: 1
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Clustering techniques are often used to group text documents. The document clustering process can be modeled and represented as a graph using the Document Index Graph (DIG) algorithm. This study implements the DIG algorithm to design the digraph structures used for graphical representation of the web document clustering process. The data used are the REUTERS-21578 documents. Testing is done by setting parameter values for the number of document groups to be processed and for the word-occurrence frequency limit. The analysis focuses on determining the frequency limits for occurrences of relevant words (inter-cluster) and of non-relevant words (intra-cluster) in the document clustering process. The digraph structure that best represents the document clustering process is achieved with an inter-cluster frequency value of 5 and an intra-cluster frequency value of 3, within 25 documents.
  • Keywords: Algorithm; Clustering; Digraph; Document Index Graph; Reuters Document