
A benchmarking study of classification techniques for behavioral data

Sofie de Cnudde, David Martens, Theodoros Evgeniou and Foster Provost

Working Papers from University of Antwerp, Faculty of Business and Economics

Abstract: The predictive power of ubiquitous, big behavioral data has been emphasized by previous academic research. Its ultra-high-dimensional and sparse nature, however, poses significant challenges for state-of-the-art classification techniques. Moreover, no consensus exists regarding a feasible trade-off between classification performance and computational complexity. This work contributes in this direction through a systematic benchmarking study. Forty-three fine-grained behavioral data sets are analyzed with 11 classification techniques. Statistical performance comparisons enriched with learning curve analyses demonstrate two important findings. Firstly, an inherent AUC-time trade-off becomes clear, making the choice of an appropriate classifier dependent on time restrictions and data set characteristics. Logistic regression achieves the best AUC but takes the longest to run. L2 regularization also proves better than sparse L1 regularization. An attractive trade-off is found in a similarity-based technique called PSN. Secondly, the results illustrate that significant value lies in collecting and analyzing even more data, in both the instance and the feature dimension, in contrast to findings on traditional data. The results of this study provide guidance for researchers and practitioners in selecting appropriate classification techniques, sample sizes and data features, while also providing focus for scalable algorithm design in the face of large behavioral data.

Pages: 54
Date: 2017-04

Downloads: (external link)
https://repository.uantwerpen.be/docman/irua/f3979a/142910.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ant:wpaper:2017005

Bibliographic data for series maintained by Joeri Nys.

 
Page updated 2025-03-22
Handle: RePEc:ant:wpaper:2017005