When should we expect non-decreasing returns from data in prediction tasks?
Maximilian Schaefer
Additional contact information
Maximilian Schaefer: LITEM - Laboratoire en Innovation, Technologies, Economie et Management (EA 7363) - UEVE - Université d'Évry-Val-d'Essonne - Université Paris-Saclay - IMT-BS - Institut Mines-Télécom Business School - IMT - Institut Mines-Télécom [Paris], IMT-BS - DEFI - Département Data analytics, Économie et Finances - IMT - Institut Mines-Télécom [Paris] - IMT-BS - Institut Mines-Télécom Business School - IMT - Institut Mines-Télécom [Paris]
Working Papers from HAL
Abstract:
This article studies how the prediction accuracy of a response variable changes as the number of predictors increases when all variables follow a multivariate normal distribution. Assuming that the correlations between variables are independently drawn, I show that adding variables leads to globally increasing returns to scale when the mean of the correlation distribution is zero. The speed of learning increases with the variance of the correlation distribution. I use simulations to study the more complex case of correlation distributions with a non-zero mean and find a pattern of decreasing returns followed by increasing returns to scale, provided the correlation distribution is not degenerate (that is, its variance is non-zero). In the degenerate case, globally decreasing returns emerge. I train a collaborative filtering algorithm using the MovieLens 1M dataset to analyze returns from adding variables in a more realistic setting and find globally increasing returns to scale across 2,000 variables. The results suggest significant scale advantages from additional variables in prediction tasks.
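The decreasing-returns case mentioned in the abstract can be illustrated with a small worked example. The sketch below is not taken from the paper. It assumes the simplest reading of a degenerate correlation distribution, namely that every pairwise correlation (between predictors, and between each predictor and the response) equals the same constant rho. Under joint normality with standardized variables, the population R^2 of the best linear predictor is r' R^{-1} r, where R is the predictor correlation matrix and r is the vector of predictor-response correlations. The function names (solve, r_squared, marginal_gains) are hypothetical and chosen for illustration only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rho, n):
    """Population R^2 with n predictors when every correlation equals rho.

    Assumption (illustrative, not from the paper): 0 < rho < 1, so the
    equicorrelation matrix is positive definite.
    """
    predictor_corr = [[1.0 if i == j else rho for j in range(n)] for i in range(n)]
    response_corr = [rho] * n
    weights = solve(predictor_corr, response_corr)  # R^{-1} r
    return sum(ri * wi for ri, wi in zip(response_corr, weights))  # r' R^{-1} r


def marginal_gains(rho, n_max):
    """Gain in R^2 from adding the k-th predictor, for k = 1..n_max."""
    values = [0.0] + [r_squared(rho, n) for n in range(1, n_max + 1)]
    return [values[k] - values[k - 1] for k in range(1, n_max + 1)]


if __name__ == "__main__":
    for k, gain in enumerate(marginal_gains(0.5, 6), start=1):
        print(f"predictor {k}: marginal R^2 gain = {gain:.4f}")
```

In this equicorrelated setting the result has the closed form n * rho^2 / (1 + (n - 1) * rho), so each additional predictor adds less than the previous one and R^2 approaches rho from below. Reproducing the increasing-returns pattern would require correlations with non-zero variance, which this sketch does not attempt because independently drawn correlations need not form a valid (positive semi-definite) correlation matrix without further construction.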
Keywords: Collaborative Filtering; Data as Barrier to Entry; Learning from Data; Increasing Returns to Scale
Date: 2025-03-06
Note: View the original document on HAL open archive server: https://hal.science/hal-05234918v1
Downloads:
https://hal.science/hal-05234918v1/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:wpaper:hal-05234918
DOI: 10.48550/arXiv.2503.03602