Research of Correlation Dependencies in Russian Household Data Using Data Mining Methods

Vasily Usachev, Veronika Brus, Lilia Voronova and Elena Tarasenko
Additional contact information
Vasily Usachev: Moscow Technical University of Communications and Informatics
Veronika Brus: Moscow Technical University of Communications and Informatics
Lilia Voronova: Moscow Technical University of Communications and Informatics
Elena Tarasenko: HSE University

A chapter in Digitalization of Society, Economics and Management, 2022, pp 151-161 from Springer

Abstract: The article is devoted to the study of big data using modern Data Mining tools. For the analysis, the authors use survey data from the Russian Monitoring of the Economic Situation and Health of the Population at the Higher School of Economics (RLMS HSE), conducted by the National Research University Higher School of Economics and Demoscope LLC with the participation of the University of North Carolina Population Center at Chapel Hill and the Federal Institute of Sociology Research Sociological Center of the Russian Academy of Sciences. The data set under study contains surveys of households and individuals. For the study, the authors took household data for 2009 and 2019, each containing more than a thousand attributes grouped into 12 information groups. Data preprocessing was done in Python using the PyCharm development environment. Basic analysis used IBM SPSS Statistics 26 as well as the Cloudera CDH tools (Hue and Impala) from the Apache Hadoop distribution, which contains a set of modules for processing big data and machine learning. The search for dependencies based on Pearson's correlation coefficients was automated, and the detailed dependencies of the availability of centralized utilities on settlement status and region of family residence were compared and visualized for the beginning and end of the ten-year period.
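
The automated search for Pearson-correlated attributes mentioned in the abstract could look roughly like the sketch below. It is only an illustration, not the authors' implementation: the column names, the toy values, the 0.7 threshold, and the helper names `pearson` and `strong_pairs` are all assumptions introduced here, and missing answers are assumed to be encoded as None.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # constant column: the coefficient is undefined
    return cov / sqrt(var_x * var_y)


def strong_pairs(table, threshold=0.7):
    """Return (col_a, col_b, r) for every column pair with |r| >= threshold.

    `table` maps a column name to a list of values; None marks a missing answer.
    The threshold is an illustrative choice, not a value from the chapter.
    """
    found = []
    for a, b in combinations(sorted(table), 2):
        # Keep only the rows where both attributes were answered.
        rows = [(x, y) for x, y in zip(table[a], table[b])
                if x is not None and y is not None]
        if len(rows) < 3:
            continue  # too few complete rows to say anything
        r = pearson([x for x, _ in rows], [y for _, y in rows])
        if r is not None and abs(r) >= threshold:
            found.append((a, b, r))
    # Strongest dependencies first.
    return sorted(found, key=lambda item: -abs(item[2]))


# Toy data with hypothetical attribute names (not real RLMS HSE variables).
households = {
    "has_central_water": [1, 1, 0, 1, 0, 0, 1, None],
    "has_central_sewage": [1, 1, 0, 1, 0, 0, 0, 1],
    "household_size": [2, 4, 3, 1, 5, 2, 3, 4],
}
pairs = strong_pairs(households, threshold=0.7)
for col_a, col_b, r in pairs:
    print(f"{col_a} ~ {col_b}: r = {r:.2f}")
```

With more than a thousand attributes per survey year, a pairwise scan of this kind yields several hundred thousand candidate pairs, so filtering by a threshold and sorting by strength is one plausible way to surface the dependencies worth visualizing.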

Keywords: Data mining; RLMS HSE; Apache Hadoop
Date: 2022

Persistent link: https://EconPapers.repec.org/RePEc:spr:lnichp:978-3-030-94252-6_11

Ordering information: This item can be ordered from
http://www.springer.com/9783030942526

DOI: 10.1007/978-3-030-94252-6_11

More chapters in Lecture Notes in Information Systems and Organization from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-04-01
Handle: RePEc:spr:lnichp:978-3-030-94252-6_11