A New Separation Index and Classification Techniques Based on Shannon Entropy

Jorge Navarro, Francesco Buono and Jorge M. Arevalillo
Additional contact information
Jorge Navarro: Universidad de Murcia
Francesco Buono: Università di Napoli Federico II
Jorge M. Arevalillo: UC3M-Santander Big Data Institute

Methodology and Computing in Applied Probability, 2023, vol. 25, issue 4, 1-24

Abstract: The purpose of this paper is to use Shannon entropy measures to develop classification techniques and an index that estimates the separation of the groups in a finite mixture model. These measures can be applied to machine learning techniques such as discriminant analysis, cluster analysis, and exploratory data analysis. If the number of groups is known and training samples are available from each group (supervised learning), the index is used to measure the separation of the groups, and some entropy measures are used to classify new individuals into one of these groups. If the number of groups is uncertain (unsupervised learning), the index can be used to determine the optimal number of groups from an entropy (information/uncertainty) criterion. It can also be used to determine the best variables for separating the groups. In all cases we assume absolutely continuous random variables and use the Shannon entropy based on the probability density function. Theoretical, parametric and non-parametric techniques are proposed to approximate these entropy measures in practice. An application to gene selection in a colon cancer discrimination study with a large number of variables is also provided.
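
As a rough illustration of the non-parametric route mentioned in the abstract, the sketch below estimates the Shannon entropy H(f) = -E[log f(X)] of a one-dimensional sample with a Gaussian kernel density estimate, then compares the entropy of the pooled data with the weighted average of the group entropies. This is not the index defined in the paper, whose exact form is not given in this record; the entropy gap used here is a standard quantity (a generalized Jensen-Shannon divergence, non-negative in theory) chosen only as an assumption for illustration. The function names, the normal-reference bandwidth, and the simulated data are all hypothetical.

```python
import math
import random
import statistics


def normal_reference_bandwidth(sample):
    """Normal-reference rule of thumb for a 1-D Gaussian kernel (assumed choice)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-1 / 5)


def kde_pdf(x, sample, h):
    """Gaussian kernel density estimate of the pdf at the point x."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)


def entropy_kde(sample):
    """Resubstitution estimate of H(f) = -E[log f(X)] with f replaced by its KDE."""
    h = normal_reference_bandwidth(sample)
    return -statistics.fmean(math.log(kde_pdf(x, sample, h)) for x in sample)


def entropy_gap(groups):
    """Entropy of the pooled sample minus the weighted mean of group entropies.

    Weights are the sample proportions. This is an illustrative separation
    measure, not the index proposed by the authors.
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    within = sum(len(g) / n * entropy_kde(g) for g in groups)
    return entropy_kde(pooled) - within


# Simulated groups (hypothetical data): one reference group, one overlapping
# group, and one well-separated group.
rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(300)]
near = [rng.gauss(0.5, 1.0) for _ in range(300)]
far = [rng.gauss(6.0, 1.0) for _ in range(300)]

gap_near = entropy_gap([a, near])
gap_far = entropy_gap([a, far])
print(f"overlapping groups: {gap_near:.3f}")
print(f"separated groups:   {gap_far:.3f}")
```

Under these assumptions the gap should be noticeably larger for the well-separated pair than for the overlapping pair. Because the entropies are estimated, the value for overlapping groups can be slightly negative, and the bandwidth chosen for the pooled sample affects the result.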

Keywords: Shannon entropy; Discriminant analysis; Cluster analysis; Kernel density estimation; Omic data; 62N05; 90B25
Date: 2023

Downloads: (external link)
http://link.springer.com/10.1007/s11009-023-10055-w Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:metcap:v:25:y:2023:i:4:d:10.1007_s11009-023-10055-w

Ordering information: This journal article can be ordered from
https://www.springer.com/journal/11009

DOI: 10.1007/s11009-023-10055-w

Methodology and Computing in Applied Probability is currently edited by Joseph Glaz

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:metcap:v:25:y:2023:i:4:d:10.1007_s11009-023-10055-w