A powerful penalized multinomial logistic regression approach

Cornelia Fuetterer, Malte Nalenz, Thomas Augustin and Ruth M. Pfeiffer
Additional contact information
Cornelia Fuetterer: Technical University of Munich (TUM)
Malte Nalenz: Ludwig-Maximilians-University Munich
Thomas Augustin: Ludwig-Maximilians-University Munich
Ruth M. Pfeiffer: National Cancer Institute

Computational Statistics, 2025, vol. 40, issue 8, No 18, 4565-4587

Abstract: Penalized regression methods that shrink model coefficients are popular approaches for improving prediction and for variable selection in high-dimensional settings. We present a penalized (or regularized) regression approach for multinomial logistic models for categorical outcomes with a novel adaptive L1-type penalty term that incorporates weights based on intra- and inter-outcome category distances of each predictor. A predictor that has large between- and small within-outcome category distances is penalized less and has a higher likelihood of being selected for the final model. We propose and study three measures for weight calculation: an analysis of variance (ANOVA)-based measure and two indices used in clustering approaches. Our novel approach, which we term the discriminative power lasso (DP-lasso), thus combines elements of marginal screening with regularized regression methods. We studied the performance of DP-lasso and other published methods in simulations with varying numbers of outcome categories, numbers of predictors, strengths of associations and predictor correlation structures. For correlated predictors, the DP-lasso approach with ANOVA-based weights (DPan) resulted in much sparser models than other regularization approaches, especially in high-dimensional settings. When the number p of (correlated) predictors was much larger than the available sample size N, DPan had the highest true positive rate while maintaining low false positive rates for all simulation settings. Similarly, when $${p
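
The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that the ANOVA-based measure is the one-way F statistic of each predictor across outcome categories and that the penalty weight is its reciprocal, so that a weighted L1 penalty of the form lambda * sum_j w_j * |beta_j| shrinks discriminative predictors less. The abstract does not state the exact weight formula, so the function names `anova_f` and `penalty_weights` and the `1 / F` mapping are illustrative assumptions, not the authors' definition.

```python
from statistics import fmean


def anova_f(values, labels, eps=1e-12):
    """One-way ANOVA F statistic of one predictor across outcome categories."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = fmean(values)
    # Between-category sum of squares: spread of the category means.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-category sum of squares: spread inside each category.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    # eps guards against division by zero when a predictor is constant within categories.
    return ms_between / max(ms_within, eps)


def penalty_weights(columns, labels, eps=1e-12):
    """Assumed mapping (not from the paper): weight = 1 / F, so a large F gives a small penalty."""
    return [1.0 / (anova_f(col, labels) + eps) for col in columns]


# Toy data: three outcome categories, one separating predictor and one noise predictor.
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
x_informative = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9, 5.0, 5.1, 4.9]
x_noise = [2.0, 4.0, 3.0, 3.0, 2.0, 4.0, 4.0, 3.0, 2.0]

weights = penalty_weights([x_informative, x_noise], labels)
print(weights)
```

In this toy example the separating predictor receives a far smaller weight than the noise predictor, which is the behaviour the abstract describes for predictors with large between-category and small within-category distances. The weights would then be passed to any solver that supports per-coefficient L1 penalties.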

Keywords: Clustering; Penalized regression; Penalty weights; Polytomous logistic regression; Single-cell RNA sequencing data; Shrinkage; Variable selection
Date: 2025
Downloads: (external link)
http://link.springer.com/10.1007/s00180-025-01635-0 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:40:y:2025:i:8:d:10.1007_s00180-025-01635-0

Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2

DOI: 10.1007/s00180-025-01635-0

Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-10-26
Handle: RePEc:spr:compst:v:40:y:2025:i:8:d:10.1007_s00180-025-01635-0