期刊名称:International Journal of Computer Science Issues
印刷版ISSN:1694-0784
电子版ISSN:1694-0814
出版年度:2011
卷号:8
期号:6
出版社:IJCSI Press
摘要:Feature reduction is an important process before documents classification. The classification performance is impact by the quality of the selected. A new semantic approach is presented using synonym merge to preserve features semantic and prevent important terms from being excluded. The resulting feature space were then processed with five feature selection methods, ID, TFIDF, CHI, IG and MI. experiment show that classification performance is increased after merging terms and yielding best performance for CHI and IG selection method. A promising classification technique is presented based on Dewey decimal classification system, which uses filtered indexes and three levels of classes from Dewey system to classify and label Arabic documents. The technique shows along with synonyms merge a promising result.
关键词:Dimension reduction; Arabic text Classification; synonyms.