Distributed Parallel Architecture for "Big Data"
Catalin Boja, Adrian Pocovnicu and Lorena Batagan
Informatica Economica, 2012, vol. 16, issue 2, 116-127
Abstract:
This paper extends "Distributed Parallel Architecture for Storing and Processing Large Datasets", presented at the WSEAS SEPADS’12 conference in Cambridge. The original version covered the benefits of using a distributed parallel architecture to store and process large datasets. This paper analyzes the problem of storing, processing and retrieving meaningful insight from petabytes of data. It surveys current distributed and parallel data processing technologies and, based on them, proposes an architecture that can be used to solve the analyzed problem. This version puts more emphasis on distributed file systems and the ETL processes involved in a distributed environment.
Keywords: Large Dataset; Distributed; Parallel; Storage; Cluster; Cloud; MapReduce; Hadoop
Date: 2012
Download:
http://www.revistaie.ase.ro/content/62/12%20-%20Boja.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:aes:infoec:v:16:y:2012:i:2:p:116-127
Informatica Economica is currently edited by Ion Ivan
Informatica Economica, Academy of Economic Studies - Bucharest, Romania.
Bibliographic data for series maintained by Paul Pocatilu.