Study Of Scalable Data Mining Techniques For Big Data Analysis

Abstract

Big data refers to massive data sets whose large, varied, and complex structure makes them difficult to store, analyze, and visualize for further processing or results. The process of examining massive amounts of data to reveal hidden patterns and previously unknown correlations is called big data analytics. This information is useful to companies and organizations, helping them gain richer and deeper insights and an advantage over the competition. For this reason, big data implementations need to be analyzed and executed as accurately as possible. The thesis defines scalability and its two types, horizontal and vertical, and briefly explains several big data analysis tools: Hadoop, Spark, H2O, and Flink. It also gives a brief overview of data mining techniques, including supervised and unsupervised methods as well as regression analysis. The MapReduce technique is presented for analyzing IoT data. The thesis discusses the concept of cluster analysis and gives a full description of the K-means clustering algorithm based on the MapReduce programming model. The results obtained in this work show the efficiency of MapReduce for big data analysis, especially for IoT data.
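As a rough illustration of how K-means can be phrased in the MapReduce model, the following single-process Python sketch splits each iteration into a map step (assign every point to its nearest centroid) and a reduce step (average the points assigned to each centroid). It is not the thesis's implementation: the function names `map_phase`, `reduce_phase`, and `kmeans`, the sample data, and the in-memory execution are all assumptions made for illustration, and a real deployment would run these phases on a framework such as Hadoop.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, (point, 1)) for each point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, (p, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per key, then average into new centroids."""
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        if idx not in sums:
            sums[idx] = list(p)
        else:
            sums[idx] = [s + x for s, x in zip(sums[idx], p)]
        counts[idx] += n
    # Centroids that received no points keep their previous position.
    new_centroids = list(centroids)
    for idx, total in sums.items():
        new_centroids[idx] = tuple(s / counts[idx] for s in total)
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(iterations):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
result = kmeans(points, [(0, 0), (10, 10)])
```

In a distributed setting the map step would run in parallel over partitions of the data, and only the per-centroid sums and counts would cross the network, which is why the algorithm fits the model well.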

