Journal: International Journal of Innovative Research in Science, Engineering and Technology
Print ISSN: 2347-6710
Electronic ISSN: 2319-8753
Publication year: 2016
Volume: 5
Issue: 10
Page: 17583
DOI:10.15680/IJIRSET.2016.0510013
Publisher: S&S Publications
Abstract: In this research we analyse large data sets of student records in support of a smart environment. This persistent and growing body of student data is analysed using a batch analysis technique. Beyond the batch process, streaming data analysis is performed using a word-count program that runs on data from HDFS and on dynamically created data. To compute these strategies coherently we use a basic schema called the batch and streaming process. This architecture serves as an X-Platform on which many tools can be used for batch and stream analysis. We process the immutable data using Spark SQL, a query interface that bridges interactive processing and iterative operations. Real-time streaming data is processed using Spark Streaming. We evaluate preliminary results and an analysis report in which we compare performance on datasets and achieve low latency owing to the use of RDDs.
Keywords: Spark; RDD; Spark SQL; interactive; Spark Streaming; in-memory; Hive