Journal: International Journal of Innovative Research in Science, Engineering and Technology
Publisher: S&S Publications
Abstract: In this research we analyse large data sets of student records in support of a smart environment. This persistent and continually growing student data is analysed using a batch analysis technique. Beyond the batch process, streaming data analysis is performed with a word-count program that runs on data from HDFS and on dynamically created data. To combine these strategies coherently we use a simple scheme called the batch and streaming process. This architecture serves as a cross-platform framework (X-Platform) on which many tools can be used for batch and stream analysis. We process the immutable data using Spark SQL, a query interface that bridges interactive processing and iterative operations. Real-time streaming data is processed using Spark Streaming. We present preliminary results and an analysis report in which we compare performance across datasets, and we achieve low latency owing to the use of RDDs.
Keywords: Spark; RDD; Spark SQL; interactive; Spark Streaming; in-memory; Hive.