Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year of publication: 2011
Volume: 32
Issue: 2
Pages: 135-145
Publisher: Journal of Theoretical and Applied
Abstract: Plagiarism is a form of academic misconduct. It has increased rapidly because electronic documents and the Internet make data and information quick and easy to reach. The problem arises when content from a found document is reused illegally, without permission or citation; this is known as plagiarism. One of the major challenges is detecting plagiarism and illegal copying. This paper discusses a new representation method for text documents called text graph-based representation. The proposed method not only represents the content of a text document as a graph, but also captures the underlying semantic meaning in terms of the relationships among its concepts. This addresses a difficulty that traditional plagiarism detection systems face with complicated kinds of plagiarism, in which users reword the plagiarized part or replace some words with their synonyms. The experiments were carried out using the PAN-PC-09 standard plagiarism detection corpus. The results showed that the proposed method remarkably outperforms current methods for plagiarism detection.
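
The abstract does not spell out how the graph is built or compared, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that concepts are nodes, that two concepts are linked when they occur in the same sentence, and that synonyms are mapped to one shared concept before the graph is built. The `SYNONYMS` table, the `STOPWORDS` set, the function names, and the Jaccard comparison of edge sets are all hypothetical choices made for illustration.

```python
import re
from itertools import combinations

# Hypothetical toy synonym table: each word maps to a shared concept label.
# A real system would use a lexical resource; this one is an assumption.
SYNONYMS = {"quick": "fast", "rapid": "fast", "car": "automobile", "auto": "automobile"}

# Hypothetical minimal stopword list.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def concepts(sentence):
    """Return the sorted, de-duplicated concept labels found in one sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sorted({SYNONYMS.get(w, w) for w in words if w not in STOPWORDS})


def text_graph(text):
    """Represent a text as a set of undirected edges between co-occurring concepts."""
    edges = set()
    for sentence in re.split(r"[.!?]+", text):
        # Sorted input means each pair (x, y) has x < y, so edges are canonical.
        edges.update(combinations(concepts(sentence), 2))
    return edges


def graph_similarity(text_a, text_b):
    """Jaccard similarity of the two edge sets (0.0 when both graphs are empty)."""
    graph_a, graph_b = text_graph(text_a), text_graph(text_b)
    if not graph_a and not graph_b:
        return 0.0
    return len(graph_a & graph_b) / len(graph_a | graph_b)
```

Under these assumptions, `graph_similarity("The quick car won the race.", "The rapid automobile won the race.")` treats the two sentences as the same graph, because the reworded terms collapse onto the same concepts, whereas a plain word-overlap comparison would see different words.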