Big Data et Technologies de Stockage et de Traitement des Données Massives: Comprendre les bases de l’écosystème HADOOP (HDFS, MAPREDUCE, YARN, HIVE, HBASE, KAFKA et SPARK)
Big Data and Technologies of Storage and Processing of Massive Data: Understand the basics of the HADOOP ecosystem (HDFS, MAPREDUCE, YARN, HIVE, HBASE, KAFKA and SPARK)
Moussa Keita
MPRA Paper from University Library of Munich, Germany
Abstract:
Over the past decade, many technological solutions have been designed to meet the challenges of Big Data, chiefly the problem of storing and processing huge volumes of data generated at a continuous pace. Two major concepts are at the heart of these solutions: storage in a distributed architecture and parallelized processing. HADOOP is one of the first frameworks to implement this approach. In this document, we provide a general overview of the HADOOP framework, its main functionalities, and some of the technological layers that form its ecosystem. First, we present the basic components of HADOOP: HDFS, MAPREDUCE and YARN. Second, we present some tools for exploiting data stored in a HADOOP environment: HIVE, a query engine; HBASE, a distributed database; KAFKA, a tool for ingesting and integrating data streams; and SPARK, a parallelized data processing engine.
Keywords: Big data; data Science; Hadoop; HDFS; MAPREDUCE; YARN; Spark; Kafka; Hbase; java; python; scala
JEL-codes: C8
Date: 2021-10
New Economics Papers: this item is included in nep-big
Downloads:
https://mpra.ub.uni-muenchen.de/110334/1/MPRA_paper_110334.pdf original version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:pra:mprapa:110334