Journal: Iranian Journal of Information Processing & Management
Print ISSN: 2251-8223
Electronic ISSN: 2251-8231
Publication year: 2014
Volume: 29
Issue: 2
Pages: 453-476
Publisher: Iranian Research Institute for Information and Technology
Abstract: Identifying the author of an electronic message is one of the main problems in text classification and natural language processing. The aim of this article is to determine the authors of 50 online messages (written by 50 potential customers, according to Amazon's website) using machine learning methods. To evaluate the effectiveness of the proposed method, its decisions were carefully tested and the results were compared with the performance of machine learning methods. In addition, when extracting features of the authors' writing styles for machine evaluation, we tried to maximize the number of features available for identifying a writer. Nearly 10,000 different features were therefore extracted from the entries, in four categories: lexical features, syntactic features, special features and structural features. In this study, the average accuracy of the proposed method reached 98.78.
Keywords: Identification of Authors; Machine Learning Methods; Characteristics of Writing Styles
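
The abstract names four feature categories but does not list the individual features. As a rough illustration of what lexical, syntactic and structural style features can look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `extract_style_features`, the short `FUNCTION_WORDS` list, and the specific features chosen are hypothetical and are not the article's feature set. The "special features" category is left out because the abstract does not describe it.

```python
import re
import string
from collections import Counter

# Hypothetical function-word list; the article's actual list is not given.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "it", "for")


def extract_style_features(text: str) -> dict:
    """Return a small dictionary of writing-style features for one message."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]

    # Guard against division by zero on empty input.
    n_chars = max(len(text), 1)
    n_words = max(len(words), 1)
    counts = Counter(words)

    features = {
        # Lexical: character- and word-level statistics.
        "lex_avg_word_len": sum(len(w) for w in words) / n_words,
        "lex_type_token_ratio": len(counts) / n_words,
        "lex_digit_ratio": sum(c.isdigit() for c in text) / n_chars,
        "lex_upper_ratio": sum(c.isupper() for c in text) / n_chars,
        # Syntactic: punctuation usage (function words are added below).
        "syn_punct_ratio": sum(c in string.punctuation for c in text) / n_chars,
        # Structural: layout of the message.
        "str_n_sentences": float(len(sentences)),
        "str_n_paragraphs": float(len(paragraphs)),
        "str_avg_sentence_len": len(words) / max(len(sentences), 1),
    }

    # Syntactic: relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        features["syn_fw_" + fw] = counts[fw] / n_words
    return features


if __name__ == "__main__":
    sample = "The cat sat. The dog ran!\n\nIt is late."
    for name, value in extract_style_features(sample).items():
        print(name, round(value, 3))
```

Feature dictionaries of this kind, computed per message, could then be turned into fixed-length vectors and passed to any standard classifier with the author as the label. Which classifier the article uses is not stated in the abstract.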