Research of Correlation Dependencies in Russian Household Data Using Data Mining Methods
Vasily Usachev,
Veronika Brus,
Lilia Voronova and
Elena Tarasenko
Additional contact information
Vasily Usachev: Moscow Technical University of Communications and Informatics
Veronika Brus: Moscow Technical University of Communications and Informatics
Lilia Voronova: Moscow Technical University of Communications and Informatics
Elena Tarasenko: HSE University
A chapter in Digitalization of Society, Economics and Management, 2022, pp 151-161 from Springer
Abstract:
The article is devoted to the study of big data using modern Data Mining tools. For the analysis, the authors use survey data from the Russian Monitoring of the Economic Situation and Health of the Population at the Higher School of Economics (RLMS HSE), conducted by the National Research University Higher School of Economics and Demoscope LLC with the participation of the University of North Carolina Population Center at Chapel Hill and the Federal Institute of Sociology Research Sociological Center of the Russian Academy of Sciences. The set under study contains data from surveys of households and individuals. For the study, the authors took household data for 2009 and 2019, each containing more than a thousand attributes organized into 12 information groups. For data preprocessing, the Python language and the PyCharm development environment were used. For basic analysis, the authors used the IBM SPSS Statistics 26 program, as well as the Cloudera CDH tools (Hue and Impala) from the Apache Hadoop distribution, which contains a set of modules for processing big data and machine learning. The search for dependencies based on Pearson's correlation coefficients was automated. The influence of settlement status and region of family residence on the availability of centralized utilities was then compared and visualized in detail for the beginning and end of the ten-year period.
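The automated search for Pearson dependencies mentioned in the abstract could look roughly like the following. This is a minimal sketch, not the authors' code: it assumes the household data has already been loaded into a pandas DataFrame, and the function name `strong_pearson_pairs`, the 0.7 threshold, and the column names in the toy example are all hypothetical.

```python
import numpy as np
import pandas as pd


def strong_pearson_pairs(df: pd.DataFrame, threshold: float = 0.7) -> pd.DataFrame:
    """Return attribute pairs whose absolute Pearson r is at least `threshold`.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    # Pearson correlation is only defined for numeric attributes.
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr(method="pearson")

    # Keep the strict upper triangle so each pair is reported once
    # and the trivial self-correlations on the diagonal are skipped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna().reset_index()
    pairs.columns = ["attr_a", "attr_b", "r"]

    # Filter by strength and order from strongest to weakest.
    strong = pairs[pairs["r"].abs() >= threshold]
    strong = strong.sort_values("r", key=lambda s: s.abs(), ascending=False)
    return strong.reset_index(drop=True)


# Toy data with made-up column names (not real RLMS HSE attributes).
toy = pd.DataFrame(
    {
        "settlement_status": [1, 1, 2, 2, 3, 3, 4, 4],
        "central_water": [1, 1, 1, 1, 0, 1, 0, 0],
        "central_heating": [1, 1, 1, 0, 0, 0, 0, 0],
        "household_size": [2, 5, 3, 1, 4, 2, 6, 3],
    }
)
print(strong_pearson_pairs(toy, threshold=0.7))
```

With more than a thousand attributes per survey wave, a filter of this kind narrows the full correlation matrix down to a short list of candidate pairs for closer inspection. Note that Pearson's coefficient captures only linear association, so categorical codes such as settlement status would need care in interpretation.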
Keywords: Data mining; RLMS HSE; Apache Hadoop
Date: 2022
Persistent link: https://EconPapers.repec.org/RePEc:spr:lnichp:978-3-030-94252-6_11
Ordering information: This item can be ordered from
http://www.springer.com/9783030942526
DOI: 10.1007/978-3-030-94252-6_11
Series: Lecture Notes in Information Systems and Organization, Springer