Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2020
Volume: 11
Issue: 4
DOI: 10.14569/IJACSA.2020.0110418
Publisher: Science and Information Society (SAI)
Abstract: In recent years, the number of dimensions in a dataset has posed greater challenges than the number of records. The main concern with high-dimensional data is the curse of dimensionality, which has drawn the attention of data miners. Anomaly detection based on local neighborhoods, such as the local outlier factor (LOF), is regarded as a state-of-the-art approach, but it fails on data with a large number of dimensions because of the curse of dimensionality. In this paper, we determine the effects of different distance functions on an unlabeled dataset when detecting outliers with a density-based approach. We also report findings on runtime and outlier score as the number of dimensions and the number of nearest neighbor points (min_pts) are varied. This analysis is also applicable to the domains of big data and data science.
Keywords: high-dimensional data; density-based anomaly detection; local outlier; outlier detection
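
Illustrative sketch (not part of the original record): the kind of experiment the abstract describes can be outlined as a small, standard-library-only local outlier factor computation with a pluggable distance function. This is a minimal sketch under stated assumptions, not the authors' code. The function names (`lof_scores`, `make_data`), the three distance functions chosen, and the synthetic data (a Gaussian cluster plus one planted far-away point) are all hypothetical choices made for illustration; the record does not say which distance functions or dataset the paper used.

```python
import math
import random
import time


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def lof_scores(points, min_pts, dist):
    """Return one local outlier factor per point (brute force, O(n^2))."""
    n = len(points)
    # Full pairwise distance matrix under the chosen distance function.
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])
        kd = d[i][order[min_pts - 1]]  # distance to the min_pts-th neighbor
        k_dist.append(kd)
        # The neighborhood keeps every point tied at the k-distance.
        neighbors.append([j for j in order if d[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d[i][j]) for j in neighbors[i]]
        mean_reach = sum(reach) / len(reach)
        # Guard against division by zero when points are duplicated.
        lrd.append(1.0 / max(mean_reach, 1e-12))

    # LOF: average neighbor density relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


def make_data(n, dims, seed=0):
    """Hypothetical test data: n Gaussian inliers plus one planted outlier (last)."""
    rng = random.Random(seed)
    pts = [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(n)]
    pts.append([8.0] * dims)
    return pts


if __name__ == "__main__":
    # Vary dimension size, min_pts, and distance function; report runtime
    # and the score of the planted outlier.
    for dims in (2, 10, 50):
        pts = make_data(60, dims)
        for min_pts in (5, 10):
            for dist in (euclidean, manhattan, chebyshev):
                start = time.perf_counter()
                scores = lof_scores(pts, min_pts, dist)
                elapsed = time.perf_counter() - start
                print(dims, min_pts, dist.__name__,
                      round(scores[-1], 2), round(elapsed, 4))
```

Scores near 1 indicate a point whose density matches that of its neighbors, while markedly larger scores indicate outliers. The brute-force distance matrix keeps the sketch short; a study at scale would more likely rely on an indexed nearest-neighbor search.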