
(Psycho-)analysis of benchmark experiments: A formal framework for investigating the relationship between data sets and learning algorithms

Manuel J.A. Eugster, Friedrich Leisch and Carolin Strobl

Computational Statistics & Data Analysis, 2014, vol. 71, issue C, 986-1000

Abstract: It is common knowledge that the performance of different learning algorithms depends on certain characteristics of the data—such as dimensionality, linear separability or sample size. However, formally investigating this relationship in an objective and reproducible way is not trivial. A new formal framework for describing the relationship between data set characteristics and the performance of different learning algorithms is proposed. The framework combines the advantages of benchmark experiments with the formal description of data set characteristics by means of statistical and information-theoretic measures and with the recursive partitioning of Bradley–Terry models for comparing the algorithms’ performances. The formal aspects of each component are introduced and illustrated by means of an artificial example. Its real-world usage is demonstrated with an application example consisting of thirteen widely-used data sets and six common learning algorithms. The Appendix provides information on the implementation and the usage of the framework within the R language.
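For orientation, the Bradley–Terry model named in the abstract assigns each algorithm a positive worth parameter, so that the probability that algorithm i outperforms algorithm j is worth_i / (worth_i + worth_j). The following is a minimal sketch of fitting one such model to pairwise win counts with the standard minorization-maximization update. It is not the authors' implementation: the paper's framework is implemented in R and additionally partitions the data recursively on data set characteristics, which this sketch omits. The function name `fit_bradley_terry` and the win counts are hypothetical and chosen only for illustration.

```python
def fit_bradley_terry(wins, n_iter=500, tol=1e-12):
    """Estimate Bradley-Terry worth parameters from a win-count matrix.

    wins[i][j] is the number of comparisons in which item i beat item j.
    Returns worths normalized to sum to 1. Assumes every item has at
    least one comparison, so no denominator is zero.
    """
    k = len(wins)
    worth = [1.0 / k] * k
    for _ in range(n_iter):
        updated = []
        for i in range(k):
            total_wins = sum(wins[i][j] for j in range(k) if j != i)
            # Each pair contributes (number of comparisons) / (sum of worths).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (worth[i] + worth[j])
                for j in range(k)
                if j != i
            )
            updated.append(total_wins / denom)
        scale = sum(updated)
        updated = [w / scale for w in updated]
        converged = max(abs(a - b) for a, b in zip(updated, worth)) < tol
        worth = updated
        if converged:
            break
    return worth


# Hypothetical pairwise results for three learning algorithms A, B, C,
# each pair compared on ten benchmark replications (made-up numbers).
example_wins = [
    [0, 7, 9],  # A beat B 7 times and C 9 times
    [3, 0, 6],  # B beat A 3 times and C 6 times
    [1, 4, 0],  # C beat A once and B 4 times
]
example_worth = fit_bradley_terry(example_wins)
```

In this sketch a larger worth means the algorithm is preferred more often in the pairwise comparisons; only the ratios between worths are identified, which is why they are normalized to sum to 1.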

Keywords: Benchmark experiments; Data set characterization; Recursive partitioning; Preference scaling; Bradley–Terry model
Date: 2014

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947313002946
Full text for ScienceDirect subscribers only.


Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:71:y:2014:i:c:p:986-1000

DOI: 10.1016/j.csda.2013.08.007


Computational Statistics & Data Analysis is currently edited by S.P. Azen


 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:71:y:2014:i:c:p:986-1000