Active Feature-Value Acquisition
Maytal Saar-Tsechansky,
Prem Melville and
Foster Provost
Additional contact information
Maytal Saar-Tsechansky: McCombs School of Business, University of Texas at Austin, Austin, Texas 78712
Prem Melville: IBM T.J. Watson Research Center, Yorktown Heights, New York 10598
Foster Provost: Stern School of Business, New York University, New York, New York 10012
Management Science, 2009, vol. 55, issue 4, 664-684
Abstract:
Most induction algorithms for building predictive models take as input training data in the form of feature vectors. Acquiring the values of features may be costly, and simply acquiring all values may be wasteful or prohibitively expensive. Active feature-value acquisition (AFA) selects features incrementally in an attempt to improve the predictive model most cost-effectively. This paper presents a framework for AFA based on estimating information value. Although straightforward in principle, estimations and approximations must be made to apply the framework in practice. We present an acquisition policy, sampled expected utility (SEU), that employs particular estimations to enable effective ranking of potential acquisitions in settings where relatively little information is available about the underlying domain. We then present experimental results showing that, compared with the policy of using representative sampling for feature acquisition, SEU reduces the cost of producing a model of a desired accuracy and exhibits consistent performance across domains. We also extend the framework to a more general modeling setting in which feature values as well as class labels are missing and are costly to acquire.
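The ranking idea in the abstract can be illustrated with a small sketch. The abstract does not give the estimators SEU uses, so everything here is an assumption made for illustration: the `Candidate` structure, the field names, and the hard-coded probabilities and gains are hypothetical, and the scoring rule is a generic "expected improvement per unit cost" computed over a random sample of candidate acquisitions, not necessarily the authors' exact formulation.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """One missing feature value that could be acquired (hypothetical structure)."""
    instance: int
    feature: str
    cost: float
    value_probs: dict  # estimated probability of each possible value
    value_gains: dict  # estimated model improvement if that value is observed


def expected_utility(c: Candidate) -> float:
    """Estimated model improvement per unit cost, averaged over possible values."""
    expected_gain = sum(p * c.value_gains[v] for v, p in c.value_probs.items())
    return expected_gain / c.cost


def rank_sampled(candidates, sample_size, rng):
    """Score only a random subset of the candidates and rank it, best first."""
    sample = rng.sample(candidates, min(sample_size, len(candidates)))
    return sorted(sample, key=expected_utility, reverse=True)


# Made-up numbers: in practice the probabilities and gains would be estimated
# from the data and the current model.
candidates = [
    Candidate(0, "income", 1.0, {"low": 0.5, "high": 0.5}, {"low": 0.00, "high": 0.04}),
    Candidate(1, "age", 2.0, {"young": 0.25, "old": 0.75}, {"young": 0.08, "old": 0.00}),
    Candidate(2, "region", 1.0, {"north": 0.5, "south": 0.5}, {"north": 0.01, "south": 0.01}),
]

ranking = rank_sampled(candidates, sample_size=3, rng=random.Random(0))
best = ranking[0]
```

Scoring a sample rather than every missing value keeps the per-step cost bounded when there are many candidate acquisitions; the sample size trades ranking quality against computation.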
Keywords: information acquisition; predictive modeling; active learning; active feature acquisition; data mining; machine learning; business intelligence; imputation; utility-based data mining
Date: 2009
Citations: 10 (in EconPapers)
Downloads:
http://dx.doi.org/10.1287/mnsc.1080.0952 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:55:y:2009:i:4:p:664-684