Извлечение информации из редких событий в регрессионном анализе
Extracting information from rare events in regression analysis
Oleksiy Dushyn and Borys Dushyn
MPRA Paper from University Library of Munich, Germany
Abstract:
This paper investigates an important practical problem: extracting information from rare events in sparse, high-dimensional data when building a linear regression model. It analyzes the advantages and limitations of the linear regression methods used for high-dimensional problems. The main known methods were selected and tested on a real Tripadvisor.com dataset. The results show the importance of data aggregation based on hierarchical clustering, which allows information to be extracted from rare features by aggregating them according to the clustering tree. A comparative analysis of the main linear regression methods that use clustering aggregation was carried out.
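The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the toy word vectors, the count matrix `X`, the `min_support` threshold, the average-linkage choice, and the rule "split a tree node only if both children are frequent enough" are all assumptions made for illustration. The aggregated matrix would then be passed to a penalized linear regression such as Lasso, Ridge, or ElasticNet, which is not shown here.

```python
from math import dist

def build_tree(vectors):
    """Average-linkage agglomerative clustering over feature vectors.
    Returns the dendrogram as nested 2-tuples whose leaves are feature indices."""
    clusters = [(i, [i]) for i in range(len(vectors))]  # (subtree, member leaves)

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two sets of leaves.
        return sum(dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        p, q = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]][1], clusters[pq[1]][1]),
        )
        merged = ((clusters[p][0], clusters[q][0]), clusters[p][1] + clusters[q][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return clusters[0][0]

def leaves(node):
    """All feature indices below a tree node."""
    return [node] if isinstance(node, int) else leaves(node[0]) + leaves(node[1])

def support(X, cols):
    """Number of rows in which at least one of the given columns is non-zero."""
    return sum(1 for row in X if any(row[c] for c in cols))

def cut_tree(node, X, min_support):
    """Top-down cut: split a node only if both children are frequent enough
    (assumed rule); otherwise keep the whole subtree as one aggregated feature."""
    if isinstance(node, int):
        return [[node]]
    left, right = node
    if support(X, leaves(left)) >= min_support and support(X, leaves(right)) >= min_support:
        return cut_tree(left, X, min_support) + cut_tree(right, X, min_support)
    return [sorted(leaves(node))]

def aggregate(X, groups):
    """Sum the columns of each group into a single aggregated column."""
    return [[sum(row[c] for c in g) for g in groups] for row in X]

# Hypothetical toy data: four words with made-up 2-D embeddings.
words = ["great", "excellent", "dirty", "filthy"]
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.2, 0.8)]
# Word counts per review; "excellent" and "filthy" each occur in only one review.
X = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

groups = cut_tree(build_tree(vectors), X, min_support=2)
X_agg = aggregate(X, groups)
print([[words[c] for c in g] for g in groups])
print(X_agg)
```

In this toy case the rare words "excellent" and "filthy" are folded into their nearest neighbours in the tree, so the four sparse columns become two denser ones before any regression is fitted.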
Keywords: rare events; regression analysis; sparse data; high-dimensional data; Lasso; Ridge; ElasticNet; rare methods; text mining; semantic aggregation; hierarchical clustering; vector word representation
JEL-codes: C51 C63 C87
Date: 2024-02-03
New Economics Papers: this item is included in nep-mac
Downloads: (external link)
https://mpra.ub.uni-muenchen.de/120235/9/MPRA_paper_120235.pdf original version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:pra:mprapa:120235
More papers in MPRA Paper from University Library of Munich, Germany Ludwigstraße 33, D-80539 Munich, Germany. Contact information at EDIRC.
Bibliographic data for series maintained by Joachim Winter.