Econometric modeling of panel data using parallel computing with Apache Spark
Michał Bernardelli
Collegium of Economic Analysis Annals, 2016, issue 41, 189-202
The aim of this article is to present a method for determining fixed effects estimators using the MapReduce programming model as implemented in Apache Spark. Of the many known algorithms, two common approaches were exploited: the within transformation and the least squares dummy variables (LSDV) method. The efficiency of the computations was demonstrated by solving a specially crafted example on sample data. Based on theoretical analysis and computer experiments, it can be stated that Apache Spark is an efficient tool for modeling panel data, especially when it comes to Big Data.
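To illustrate the first of the two approaches the abstract names, here is a minimal, Spark-free sketch of the within transformation for a one-regressor fixed-effects model, written as map/reduce-style passes over (entity, x, y) records. This is not the paper's code: the function name, the toy data, and the plain-Python aggregation are assumptions for illustration; in Spark the same two passes would be groupBy/aggregate and map operations on an RDD or DataFrame. The LSDV alternative would instead regress on explicit entity dummies.

```python
from collections import defaultdict

def within_estimator(records):
    """records: iterable of (entity_id, x, y); returns the slope beta."""
    # Pass 1 (reduce-like): per-entity sums, then group means.
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # entity -> [sum_x, sum_y, n]
    for e, x, y in records:
        s = sums[e]
        s[0] += x; s[1] += y; s[2] += 1
    means = {e: (s[0] / s[2], s[1] / s[2]) for e, s in sums.items()}

    # Pass 2 (map-like): demean each observation by its entity means,
    # then accumulate the OLS cross-products on the demeaned data.
    sxx = sxy = 0.0
    for e, x, y in records:
        mx, my = means[e]
        sxx += (x - mx) ** 2
        sxy += (x - mx) * (y - my)
    return sxy / sxx

# Toy panel: entity-specific intercepts (1 and 5), common slope 2.
data = [("a", 1, 3), ("a", 2, 5), ("a", 3, 7),
        ("b", 1, 7), ("b", 2, 9), ("b", 3, 11)]
print(within_estimator(data))  # → 2.0, the fixed effects are swept out
```

Demeaning within each entity removes the entity-specific intercept, so a pooled OLS on the transformed data recovers the common slope; both passes parallelize naturally over partitions of the panel, which is the property the article exploits.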
Keywords: fixed effects estimator; panel data; Apache Spark; Big Data; MapReduce
JEL-codes: C13 C23 C51 C55
Full text: http://rocznikikae.sgh.waw.pl/p/roczniki_kae_z41_12.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:sgh:annals:i:41:y:2016:p:189-202
Collegium of Economic Analysis Annals is currently edited by Joanna Plebaniak, Beata Czarnacka-Chrobot
More articles in Collegium of Economic Analysis Annals from Warsaw School of Economics, Collegium of Economic Analysis.
Bibliographic data for the series maintained by Michał Bernardelli.