Attributing value in a data pooling setting for predictive modeling

Julie Moeyersoms, Brian D'Alessandro, Foster Provost and David Martens

Working Papers from University of Antwerp, Faculty of Business and Economics

Abstract: The rapid growth of data sources comes with numerous challenges. One of them is determining the value of the data. That is, when building prediction models based on different data sources, it is interesting to know how much each of the features has contributed to a specific prediction. This gives an idea of how the benefits created by the prediction model could be divided among the features responsible for it. The goal of this paper is to define, solve and evaluate a data attribution scheme for predictive modeling that is “fair”, where fairness is defined using concepts from game theory. We use two methods from different research fields to distribute the value both at the instance level and ultimately at the feature level: the (approximate) Shapley value and an explanation approach for high-dimensional data. Using a high-dimensional and sparse data set, consisting of website visits for each user, we show that: (i) the proposed methods make it possible to create a fair value distribution among a very large number of data sources (websites in this case) in a prediction model, and (ii) they explain twice as many instances for a given number of features as an approach that looks only at the high-coefficient features. Interestingly, (iii) although the proposed methods come from different sources and motivations, the two new alternatives provide strikingly similar rankings of important features and divisions of the revenues.
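The abstract names the (approximate) Shapley value but does not say how it is computed. As a rough illustration only, and not the authors' implementation, the sketch below estimates Shapley values by averaging each feature's marginal contribution over randomly sampled feature orderings, which is one common approximation. The function `approximate_shapley`, the toy value function `toy_value`, and the feature names and weights are all hypothetical and chosen only to make the example self-contained.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def approximate_shapley(
    features: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
    n_permutations: int = 2000,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate Shapley values by sampling random orderings of the features."""
    rng = random.Random(seed)
    totals = {f: 0.0 for f in features}
    order = list(features)
    for _ in range(n_permutations):
        rng.shuffle(order)
        coalition: FrozenSet[str] = frozenset()
        previous = value(coalition)
        for f in order:
            # Marginal contribution of f when it joins the features before it.
            coalition = coalition | {f}
            current = value(coalition)
            totals[f] += current - previous
            previous = current
    return {f: total / n_permutations for f, total in totals.items()}


# Hypothetical toy "game": the value of a set of visited websites is a
# weighted sum plus a bonus when site_a and site_b occur together.
WEIGHTS = {"site_a": 2.0, "site_b": 1.0, "site_c": 0.5}


def toy_value(coalition: FrozenSet[str]) -> float:
    score = sum(WEIGHTS[f] for f in coalition)
    if "site_a" in coalition and "site_b" in coalition:
        score += 1.0  # interaction bonus, split between the two sites
    return score


shares = approximate_shapley(list(WEIGHTS), toy_value)
total_value = toy_value(frozenset(WEIGHTS))
```

In this toy setting the estimated shares always sum to the value of the full feature set, which is the efficiency property that makes a Shapley-style division a candidate for a "fair" split. For a data set with a very large number of websites, the number of sampled orderings becomes the main cost, so the sample size here is an arbitrary choice.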

Pages: 39
Date: 2017-09

Downloads: (external link)
https://repository.uantwerpen.be/docman/irua/ab3df5/145523.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ant:wpaper:2017009

Bibliographic data for series maintained by Joeri Nys.

 
Page updated 2025-04-13
Handle: RePEc:ant:wpaper:2017009