Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2020
Volume: 98
Issue: 22
Pages: 3602-3624
Publisher: Journal of Theoretical and Applied
Abstract: Information is stored in several forms, such as pictures, web pages, sound, and video, but 80% is stored as text. Quickly finding a specific text document depends entirely on how accurately the document's subject is classified together with a group of similar documents. This process is called document clustering. Recently, deep learning techniques have achieved distinguished results in solving the problems facing document clustering, such as complex semantics and high dimensionality. This paper presents a comprehensive review of document clustering and surveys recent work on document clustering using deep learning methods. The proposed taxonomy represents knowledge that helps researchers understand and follow up on previous work in this area and develop or create new methods. A comparative analysis is also made of the popular datasets, performance metrics, and deep learning frameworks and libraries used in deep learning-based document clustering.