Optimal classification scores based on multivariate marker transformations

Pablo Martínez-Camblor, Sonia Pérez-Fernández and Susana Díaz-Coto
Additional contact information
Pablo Martínez-Camblor: Geisel School of Medicine at Dartmouth
Sonia Pérez-Fernández: Oviedo University
Susana Díaz-Coto: Oviedo University

AStA Advances in Statistical Analysis, 2021, vol. 105, issue 4, No 3, 599 pages

Abstract: Modern science frequently involves the study of complex relationships among effects and factors. Flexible statistical tools are commonly used to visualize nonlinear associations. When the interest is in the discrimination capacity of a multivariate marker for a binary outcome, the theoretical transformation leading to the optimal results in terms of sensitivity and specificity has already been established. Knowing this function is particularly useful, not only to allocate items to groups, but also to understand the relationship between the multivariate marker and the outcome. In this paper, we explore the use of the multivariate kernel density estimator to approximate this transformation. Large-sample properties of the resulting estimator are outlined, while its finite-sample behavior is studied via Monte Carlo simulations. We consider six different bivariate and three additional higher-dimensional scenarios. The performance of the estimator is studied using four different tuning parameters computed automatically. In addition, a cross-validation algorithm is incorporated with the aim of reducing potential overfitting. The proposed methodology is applied to study the capacity of two molecular characteristics to predict the toxicity of some chemical products. Results suggest that smoothing techniques are promising, classical and simple statistical tools that can be used for a better understanding of some current scientific problems. However, the incorporation of additional machine learning techniques such as cross-validation is advisable in order to control the frequently overoptimistic results, especially in cases with small sample sizes. The function implementing the proposed methodology is provided as supplementary material.
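
As a rough illustration of the kind of procedure the abstract describes, the Python sketch below assumes that the optimal transformation is the likelihood ratio of the marker densities in the two groups (the standard Neyman-Pearson result), estimates each density with a Gaussian product-kernel estimator, and compares the apparent AUC with a cross-validated one. It is not the authors' supplementary function: every function name, the Scott's rule bandwidth, the 5-fold scheme and the simulated data are assumptions made here for illustration only.

```python
import numpy as np

def gaussian_kde_logpdf(train, points, bandwidth=None):
    """Log-density of a Gaussian product-kernel KDE.

    `train` has shape (n, d) and `points` has shape (m, d).
    """
    train = np.asarray(train, dtype=float)
    points = np.asarray(points, dtype=float)
    n, d = train.shape
    if bandwidth is None:
        # Scott's rule of thumb per coordinate (an assumption of this sketch,
        # not one of the tuning parameters studied in the paper).
        bandwidth = train.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
    bandwidth = np.broadcast_to(np.asarray(bandwidth, dtype=float), (d,))
    z = (points[:, None, :] - train[None, :, :]) / bandwidth   # (m, n, d)
    log_kernels = -0.5 * np.sum(z ** 2, axis=2)                # (m, n)
    log_norm = np.log(n) + np.sum(np.log(bandwidth)) + 0.5 * d * np.log(2 * np.pi)
    # Log-sum-exp over training points for numerical stability.
    mx = log_kernels.max(axis=1, keepdims=True)
    return mx[:, 0] + np.log(np.exp(log_kernels - mx).sum(axis=1)) - log_norm

def log_lr_score(x_neg, x_pos, points):
    """Estimated log likelihood ratio log f_pos(x) - log f_neg(x)."""
    return gaussian_kde_logpdf(x_pos, points) - gaussian_kde_logpdf(x_neg, points)

def auc(scores_neg, scores_pos):
    """Empirical AUC (Mann-Whitney statistic), ties counted as 1/2."""
    diff = np.asarray(scores_pos)[:, None] - np.asarray(scores_neg)[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def cross_validated_auc(x_neg, x_pos, n_folds=5, seed=0):
    """AUC of out-of-fold scores: each point is scored by KDEs fitted without it."""
    rng = np.random.default_rng(seed)
    folds_neg = np.array_split(rng.permutation(len(x_neg)), n_folds)
    folds_pos = np.array_split(rng.permutation(len(x_pos)), n_folds)
    s_neg = np.empty(len(x_neg))
    s_pos = np.empty(len(x_pos))
    for idx_neg, idx_pos in zip(folds_neg, folds_pos):
        train_neg = np.delete(x_neg, idx_neg, axis=0)
        train_pos = np.delete(x_pos, idx_pos, axis=0)
        s_neg[idx_neg] = log_lr_score(train_neg, train_pos, x_neg[idx_neg])
        s_pos[idx_pos] = log_lr_score(train_neg, train_pos, x_pos[idx_pos])
    return auc(s_neg, s_pos)

# Simulated bivariate marker (hypothetical data): same mean, different spread,
# so no linear combination of the two coordinates separates the groups.
rng = np.random.default_rng(1)
x_neg = rng.normal(0.0, 1.0, size=(200, 2))
x_pos = rng.normal(0.0, 2.0, size=(200, 2))

apparent = auc(log_lr_score(x_neg, x_pos, x_neg), log_lr_score(x_neg, x_pos, x_pos))
cv = cross_validated_auc(x_neg, x_pos)
print(f"apparent AUC: {apparent:.3f}, cross-validated AUC: {cv:.3f}")
```

In this sketch the apparent AUC scores each observation with densities that were fitted on that same observation, which is the source of the optimism the abstract warns about. The cross-validated version scores every point with estimators that never saw it.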

Keywords: Classification problem; Kernel density estimator; Multivariate marker; Optimal transformation; Receiver-operating characteristic (ROC) curve
Date: 2021

Downloads: (external link)
http://link.springer.com/10.1007/s10182-020-00388-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:alstar:v:105:y:2021:i:4:d:10.1007_s10182-020-00388-z

Ordering information: This journal article can be ordered from
http://www.springer. ... cs/journal/10182/PS2

DOI: 10.1007/s10182-020-00388-z

AStA Advances in Statistical Analysis is currently edited by Göran Kauermann and Yarema Okhrin

More articles in AStA Advances in Statistical Analysis from Springer, German Statistical Society
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:alstar:v:105:y:2021:i:4:d:10.1007_s10182-020-00388-z