Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Print ISSN: 2278-1323
Year: 2014
Volume: 3
Issue: 2
Pages: 354-356
Publisher: Shri Pannalal Research Institute of Technology
Abstract: The amount of digital information is growing rapidly. Most of this data will be "unstructured": complex data that is poorly suited to management by structured storage systems such as an RDBMS (Relational Database Management System). Unstructured data comes from many sources and takes many forms, such as web logs, text files, sensor readings, video, audio, and images. Complex unstructured data can hide important insights. Companies that can extract the important information from huge volumes of data can better control processes and costs, better predict demand, and build better products. Big data raises two main requirements: first, inexpensive, reliable storage; second, new tools for analyzing both unstructured and structured data. Hadoop is a powerful open-source software platform that addresses both requirements. It has two main components: HDFS and MapReduce. HDFS is used for storage, and MapReduce is a programming paradigm for distributed platforms, in which a cluster of networked computers works together and which is useful for solving certain classes of computing problems on such a cluster.
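The MapReduce paradigm named in the abstract can be illustrated with a minimal, single-process sketch of a word count. This is not Hadoop's API and is not taken from the paper; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the map and reduce steps would run in parallel on different machines over data stored in HDFS.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all values by key, as a framework would between map and reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse the grouped values for one key into a total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence within a single process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Illustrative input lines, standing in for unstructured text such as logs.
    print(word_count(["big data", "big cluster"]))
```

The split into independent map and reduce functions is what allows the work to be distributed: each map call depends only on its own input line, and each reduce call depends only on the values for its own key.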