EconPapers    
Economics at your fingertips  
 

Bootstrapping binary GEV regressions for imbalanced datasets

Michele Rocca (), Marcella Niglio () and Marialuisa Restaino ()
Additional contact information
Michele Rocca: University of Salerno
Marcella Niglio: University of Salerno
Marialuisa Restaino: University of Salerno

Computational Statistics, 2024, vol. 39, issue 1, No 10, 213 pages

Abstract: Abstract This paper proposes and discusses a bootstrap scheme to make inferences when an imbalance in one of the levels of a binary variable affects both the dependent variable and some of the features. Specifically, the imbalance in the binary dependent variable is managed by adopting an asymmetric link function based on the quantile of the generalized extreme value (GEV) distribution, leading to a class of models called GEV regression. Within this framework, we propose using the fractional-random-weighted (FRW) bootstrap to obtain confidence intervals and implement a multiple testing procedure to identifying the set of relevant features. The main advantages of FRW bootstrap are as follows: (1) all observations belonging to the imbalanced class are always present in every bootstrap resample; (2) the bootstrap can be applied even when the complexity of the link function does not allow to easily compute second-order derivatives for the Hessian; (3) the bootstrap resampling scheme does not change whatever the link function is, and can be applied beyond the GEV link function used in this study. The performance of the FRW bootstrap in GEV regression modelling is evaluated using a detailed Monte Carlo simulation study, where the imbalance is present in the dependent variable and features. An application of the proposed methodology to a real dataset to analyze student churn in an Italian university is also discussed.

Keywords: GEV regression; FRW bootstrap; Imbalanced data; Variable selection (search for similar items in EconPapers)
Date: 2024
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s00180-023-01330-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:39:y:2024:i:1:d:10.1007_s00180-023-01330-y

Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2

DOI: 10.1007/s00180-023-01330-y

Access Statistics for this article

Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik

More articles in Computational Statistics from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-04-12
Handle: RePEc:spr:compst:v:39:y:2024:i:1:d:10.1007_s00180-023-01330-y