Information Loss Due to the Data Reduction of Sample Data from Discrete Distributions
Maryam Moghimi and Herbert W. Corley
Additional contact information
Maryam Moghimi: Center on Stochastic Modeling, Optimization, and Statistics (COSMOS), the University of Texas at Arlington, Arlington, TX 76013, USA
Herbert W. Corley: Center on Stochastic Modeling, Optimization, and Statistics (COSMOS), the University of Texas at Arlington, Arlington, TX 76013, USA
Data, 2020, vol. 5, issue 3, 1-18
Abstract:
In this paper, we study the information lost when a real-valued statistic is used to reduce or summarize sample data from a discrete random variable with a one-dimensional parameter. We compare the probability that a random sample gives a particular data set to the probability of the statistic’s value for this data set. We focus on sufficient statistics for the parameter of interest and develop a general formula, independent of the parameter, for the Shannon information lost when a data sample is reduced to such a summary statistic. We also develop a measure of entropy for this lost information that depends only on the real-valued statistic and on neither the parameter nor the data. Our approach would also work for non-sufficient statistics, but the lost information and associated entropy would then involve the parameter. The method is applied to three well-known discrete distributions to illustrate its implementation.
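As a rough illustration of the comparison described in the abstract, the sketch below uses a Bernoulli sample with the sample sum as the sufficient statistic. It assumes, since the abstract does not give the formula, that the lost information is the Shannon information of the data set minus that of the statistic's value, in bits. The Bernoulli/binomial choice and all function names are assumptions made for illustration and are not taken from the paper.

```python
from math import comb, log2

def sample_probability(x, p):
    """Probability that an i.i.d. Bernoulli(p) sample equals the data set x."""
    t, n = sum(x), len(x)
    return p**t * (1 - p)**(n - t)

def statistic_probability(t, n, p):
    """Probability that the sample sum equals t, i.e. Binomial(n, p) at t."""
    return comb(n, t) * p**t * (1 - p)**(n - t)

def lost_information_bits(x, p):
    """Assumed definition: I(x) - I(t) = log2 P(T = t) - log2 P(X = x)."""
    t, n = sum(x), len(x)
    return log2(statistic_probability(t, n, p)) - log2(sample_probability(x, p))

x = [1, 0, 1, 1, 0]  # hypothetical data set with n = 5 and sum t = 3
for p in (0.2, 0.5, 0.9):
    # The parameter-dependent factors cancel, leaving log2 C(5, 3) each time.
    print(p, round(lost_information_bits(x, p), 4))
```

Under these assumptions the loss is log2 C(5, 3) = log2 10, about 3.32 bits, for every value of p, which matches the abstract's claim that the formula for a sufficient statistic does not involve the parameter.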
Keywords: data reduction; Shannon information; entropy; information loss
JEL-codes: C8 C80 C81 C82 C83
Date: 2020
Downloads:
https://www.mdpi.com/2306-5729/5/3/84/pdf (application/pdf)
https://www.mdpi.com/2306-5729/5/3/84/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:5:y:2020:i:3:p:84-:d:413006
Data is currently edited by Ms. Cecilia Yang
Bibliographic data for series maintained by MDPI Indexing Manager.