Journal:International Journal of Hybrid Information Technology
Print ISSN:1738-9968
Publication year:2015
Volume:8
Issue:6
Pages:375-380
DOI:10.14257/ijhit.2015.8.6.36
Publisher:SERSC
Abstract:Analysis and clustering of very large-scale data sets has long been a complex problem. As the amount of data and the number of feature dimensions grow, it becomes increasingly difficult to compute results in a reasonable amount of time. The GPU (graphics processing unit) has attracted attention in the last few years for its ability to compute highly parallel and semi-parallel problems much faster than a traditional sequential processor. This paper explores the capability of the GPU combined with the MapReduce model. This highly scalable model for distributed programming can be scaled up to thousands of machines. It was developed by Google engineers Jeffrey Dean and Sanjay Ghemawat and has been implemented in many programming languages and frameworks, such as Apache Hadoop, Hive, and Pig. This paper focuses mainly on the Hadoop framework. The first two sections present the introduction and background. Section 3 shows the working mechanism of this combination. The paper then explores the frameworks available for implementing MapReduce on a GPU. Section 5 presents a comparative experiment on a GPU and a CPU, both implementing the MapReduce model. The paper ends with a conclusion.
Keywords:Graphical Processing Unit (GPU); Hadoop; Large Scale Analytics; MapReduce Model; Java Compute Unified Device Architecture (JCUDA)