Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory

Xu, Yunbei; Zeevi, Assaf

Yunbei Xu and Assaf Zeevi
Yunbei Xu: Decision, Risk, and Operations Division, Graduate School of Business, Columbia University, New York, New York 10027
Assaf Zeevi: Decision, Risk, and Operations Division, Graduate School of Business, Columbia University, New York, New York 10027

Mathematics of Operations Research, 2025, vol. 50, issue 1, 40-67

Abstract: We study problem-dependent rates, that is, generalization errors that scale near-optimally with the variance, effective loss, or gradient norms evaluated at the “best hypothesis.” We introduce a principled framework dubbed “uniform localized convergence” and characterize sharp problem-dependent rates for central statistical learning problems. From a methodological viewpoint, our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches. It also provides improvements and some level of unification in the study of localized complexities, one-sided uniform inequalities, and sample-based iterative algorithms. In the so-called “slow rate” regime, we provide the first (moment-penalized) estimator that achieves the optimal variance-dependent rate for general “rich” classes; we also establish an improved loss-dependent rate for standard empirical risk minimization. In the “fast rate” regime, we establish finite-sample, problem-dependent bounds that are comparable to precise asymptotics. In addition, we show that iterative algorithms such as gradient descent and first-order expectation maximization can achieve optimal generalization error in several representative problems across the areas of nonconvex learning, stochastic optimization, and learning with missing data.

Keywords: Primary: 68T01; Secondary: 60G07; problem-dependent generalization error bounds; statistical learning theory; uniform convergence and localization; nonconvex learning; stochastic optimization; iterative algorithms; expectation maximization; variance penalization
Date: 2025

Downloads:
http://dx.doi.org/10.1287/moor.2021.0076 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:50:y:2025:i:1:p:40-67
