
A probability transducer and decision-theoretic augmentation for machine-learning classifiers

Kjetil Dyrland, Alexander Selvikvåg Lundervold and PierGianLuca Porta Mana
Additional contact information
Alexander Selvikvåg Lundervold: Western Norway University of Applied Sciences
PierGianLuca Porta Mana: HVL Western Norway University of Applied Sciences

No vct9y, OSF Preprints from Center for Open Science

Abstract: In a classification task from a set of features, one would ideally like to have the probability of the class conditional on the features. In many important cases, such a probability is computationally almost impossible to find. The primary idea of the present work is to calculate the probability of a class conditional not on the features, but on a trained classifying algorithm's output. Such a probability is easily calculated and provides an output-to-probability ‘transducer’ that can be applied to the algorithm's future outputs. In conjunction with problem-dependent utilities, the probabilities of the transducer allow one to make the optimal choice among the classes, or among a set of more general decisions, by means of expected-utility maximization. The combined procedure is a computationally cheap yet powerful ‘augmentation’ of the original classifier. This idea is demonstrated in a simplified drug-discovery problem with a highly imbalanced dataset. The augmentation leads to improved results, sometimes close to the theoretical maximum, for any set of problem-dependent utilities. The calculation of the transducer also provides, automatically: (i) a quantification of the uncertainty about the transducer itself; (ii) the expected utility of the augmented algorithm (including its uncertainty), which can be used for algorithm selection; (iii) the possibility of using the algorithm in a ‘generative mode’, useful if the training dataset is biased. It is argued that the optimality, flexibility, and uncertainty assessment provided by the transducer and augmentation are dearly needed for classification problems in fields such as medicine and drug discovery.
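
The two steps named in the abstract, estimating the probability of a class given the classifier's output and then choosing the decision with maximal expected utility, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the transducer is computed, so a simple smoothed histogram over binned outputs is swapped in here. The function names (`fit_transducer`, `transduce`, `decide`), the assumption that the classifier output is a single score in [0, 1], the bin count, the smoothing constant, and the example utility matrix are all hypothetical.

```python
import numpy as np

def fit_transducer(outputs, labels, n_classes, n_bins=10, prior=1.0):
    """Estimate P(class | binned classifier output) from held-out data.

    Assumption: outputs are scores in [0, 1]; `prior` is an additive
    smoothing count (a stand-in for a proper Bayesian treatment).
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bins = np.digitize(outputs, edges[1:-1])        # bin index 0 .. n_bins-1
    counts = np.full((n_bins, n_classes), prior, dtype=float)
    np.add.at(counts, (bins, labels), 1.0)          # tally class counts per bin
    probs = counts / counts.sum(axis=1, keepdims=True)
    return edges, probs

def transduce(outputs, edges, probs):
    """Map new classifier outputs to class probabilities."""
    return probs[np.digitize(outputs, edges[1:-1])]

def decide(class_probs, utilities):
    """Pick the decision with maximal expected utility.

    utilities[d, c] is the utility of decision d when the true class is c.
    """
    expected = class_probs @ utilities.T            # shape (n_samples, n_decisions)
    return expected.argmax(axis=1), expected

# Hypothetical held-out outputs and true labels (class 1 = "active")
cal_outputs = np.array([0.05, 0.10, 0.15, 0.20, 0.80, 0.85, 0.90, 0.95])
cal_labels = np.array([0, 0, 0, 1, 1, 1, 1, 0])
edges, probs = fit_transducer(cal_outputs, cal_labels, n_classes=2, n_bins=2)

# Hypothetical utilities: rows are decisions (0 = discard, 1 = test further)
utilities = np.array([[1.0, -5.0],
                      [-1.0, 5.0]])
class_probs = transduce(np.array([0.3, 0.7]), edges, probs)
decisions, expected = decide(class_probs, utilities)
```

With these made-up numbers, the output 0.3 falls in a bin where class 0 is the more probable class, yet the expected-utility rule still selects decision 1, because missing an active compound is penalized heavily. This is the sense in which the utilities, and not only the probabilities, determine the choice.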

Date: 2022-06-01
New Economics Papers: this item is included in nep-big, nep-cmp and nep-upt

Downloads: (external link)
https://osf.io/download/62971cb606863102ff729e57/


Persistent link: https://EconPapers.repec.org/RePEc:osf:osfxxx:vct9y

DOI: 10.31219/osf.io/vct9y



 
Page updated 2025-03-19
Handle: RePEc:osf:osfxxx:vct9y