Data analysis on big data: improving the map and shuffle phases in Hadoop Map Reduce
J.V.N. Lakshmi
International Journal of Data Analysis Techniques and Strategies, 2018, vol. 10, issue 3, 305-316
Abstract:
The data management has become a challenging issue for network centric applications which need to process large amount of datasets. System requires advanced tools to analyse these datasets. As an efficient parallel computing programming model Map Reduce and Hadoop are used for large-scale data analysis. However, Map Reduce still suffers with performance problems Map Reduce uses a shuffle phase individual shuffle service component with efficient I/O policy. The map phase requires an improvement in its performance as this phase's output acts as an input to the next phase. Its result reveals the efficiency, so map phase needs some intermediate check points which regularly monitor all the splits generated by intermediate phases. This acts as a barrier for effective resource utilisation. This paper implements shuffle as a service component to decrease the overall execution time of jobs, monitor map phase by skew handling and increase resource utilisation in a cluster.
Keywords: Map Reduce; Hadoop; shuffle; big data; data analytics; Hadoop distributed file system; HDFS; rack awareness; stragglers; light weight processing; OLAP; OLTP. (search for similar items in EconPapers)
Date: 2018
References: Add references at CitEc
Citations:
Downloads: (external link)
http://www.inderscience.com/link.php?id=94130 (text/html)
Access to full text is restricted to subscribers.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:10:y:2018:i:3:p:305-316
Access Statistics for this article
More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker ().