Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Electronic ISSN: 0976-8491
Publication year: 2016
Volume: 7
Issue: 4
Pages: 82-86
Language: English
Publisher: Ayushmaan Technologies
Abstract: Measuring the similarity between documents is an active topic in data mining and information retrieval. Applications include sponsored search, query reformulation, and image retrieval. Standard text similarity measures perform poorly because of data sparsity and the lack of context. Document processing plays a vital role in data mining and web search, and in text processing the bag-of-words model is commonly used. Measuring the similarity between documents is a fundamental task in document processing and text classification. This paper proposes a new similarity measure. To compute the similarity between two documents with respect to a feature, the proposed measure considers three cases: (a) the feature appears in both documents, (b) the feature appears in only one document, and (c) the feature appears in neither document. In the first case, the similarity increases as the difference between the two documents' feature values decreases. In the second case, a fixed value is contributed to the similarity. In the last case, the feature makes no contribution to the similarity. The effectiveness of the measure is evaluated on several real-world data sets for document classification and clustering.
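
The three cases in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formula, so everything below is an assumption made for illustration: the function name `feature_similarity`, the `fixed_value` parameter, the per-feature score `1 / (1 + |a - b|)` for features present in both documents, and the averaging over features that appear in at least one document. It is not the authors' measure, only one possible instance of the case analysis.

```python
def feature_similarity(doc_a, doc_b, fixed_value=0.0):
    """Toy three-case similarity between two bag-of-words documents.

    doc_a, doc_b: dicts mapping a feature (term) to a non-negative weight.
    A missing key or a weight of 0 means the feature is absent.
    fixed_value: hypothetical constant used when a feature is in only one document.
    """
    present_a = {f for f, w in doc_a.items() if w > 0}
    present_b = {f for f, w in doc_b.items() if w > 0}

    # Case (c): features absent from both documents are never visited,
    # so they contribute nothing to the result.
    features = present_a | present_b
    if not features:
        return 0.0

    total = 0.0
    for f in features:
        if f in present_a and f in present_b:
            # Case (a): the score grows as the two feature values get closer.
            # This particular formula is an assumption, not the paper's.
            total += 1.0 / (1.0 + abs(doc_a[f] - doc_b[f]))
        else:
            # Case (b): the feature is in exactly one document,
            # so a fixed value is contributed.
            total += fixed_value
    return total / len(features)


if __name__ == "__main__":
    d1 = {"data": 3, "mining": 1}
    d2 = {"data": 2, "search": 1}
    print(feature_similarity(d1, d2))
```

With the default `fixed_value` of 0, two identical documents score 1.0 and two documents with no shared features score 0.0; a negative `fixed_value` would instead penalize features found in only one document.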