Uncovering memorization effect in the presence of spurious correlations
Chenyu You (),
Haocheng Dai,
Yifei Min,
Jasjeet S. Sekhon,
Sarang Joshi and
James S. Duncan
Additional contact information
Chenyu You: Yale University
Haocheng Dai: University of Utah
Yifei Min: Yale University
Jasjeet S. Sekhon: Yale University
Sarang Joshi: University of Utah
James S. Duncan: Yale University
Nature Communications, 2025, vol. 16, issue 1, 1-8
Abstract:
Abstract Machine learning models often rely on simple spurious features – patterns in training data that correlate with targets but are not causally related to them, like image backgrounds in foreground classification. This reliance typically leads to imbalanced test performance across minority and majority groups. In this work, we take a closer look at the fundamental cause of such imbalanced performance through the lens of memorization, which refers to the ability to predict accurately on atypical examples (minority groups) in the training set but failing in achieving the same accuracy in the testing set. This paper systematically shows the ubiquitous existence of spurious features in a small set of neurons within the network, providing the first-ever evidence that memorization may contribute to imbalanced group performance. Through three experimental sources of converging empirical evidence, we find the property of a small subset of neurons or channels in memorizing minority group information. Inspired by these findings, we hypothesize that spurious memorization, concentrated within a small subset of neurons, plays a key role in driving imbalanced group performance. To further substantiate this hypothesis, we show that eliminating these unnecessary spurious memorization patterns via a novel framework during training can significantly affect the model performance on minority groups. Our experimental results across various architectures and benchmarks offer new insights on how neural networks encode core and spurious knowledge, laying the groundwork for future research in demystifying robustness to spurious correlation.
Date: 2025
References: Add references at CitEc
Citations:
Downloads: (external link)
https://www.nature.com/articles/s41467-025-61531-5 Abstract (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-61531-5
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-61531-5
Access Statistics for this article
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().