Model Selection Using Information Theory and the MDL Principle

Robert A. Stine
Robert A. Stine: University of Pennsylvania

Sociological Methods & Research, 2004, vol. 33, issue 2, 230-260

Abstract: Information theory offers a coherent, intuitive view of model selection. This perspective arises from thinking of a statistical model as a code, an algorithm for compressing data into a sequence of bits. The description length is the length of this code for the data plus the length of a description of the model itself. The length of the code for the data measures the fit of the model to the data, whereas the length of the code for the model measures its complexity. The minimum description length (MDL) principle picks the model with smallest description length, balancing fit versus complexity. Variations on MDL reproduce other well-known methods of model selection. Going further, information theory allows one to choose from among various types of models, permitting the comparison of tree-based models to regressions. A running example compares several models for the well-known Boston housing data.
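The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the article's method or its Boston housing example. It assumes a Gaussian linear model, synthetic data, and a crude two-part code in which each fitted coefficient costs (1/2) log2(n) bits, a choice that makes the criterion coincide with BIC up to constants. The names `description_length` and `select_degree` are hypothetical.

```python
import numpy as np


def description_length(y, X):
    """Crude two-part description length, in bits, of a Gaussian linear model.

    Assumption: constants shared by every candidate model are dropped, so
    only differences between models are meaningful.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    data_bits = 0.5 * n * np.log2(rss / n)  # code length for the data (fit)
    model_bits = 0.5 * k * np.log2(n)       # code length for the model (complexity)
    return data_bits + model_bits


def select_degree(x, y, max_degree=6):
    """Return the polynomial degree with the smallest description length."""
    lengths = {}
    for degree in range(max_degree + 1):
        # Columns 1, x, x^2, ..., x^degree
        X = np.vander(x, degree + 1, increasing=True)
        lengths[degree] = description_length(y, X)
    best = min(lengths, key=lengths.get)
    return best, lengths


# Synthetic data (an assumption for illustration): a quadratic plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.2, size=x.size)

best, lengths = select_degree(x, y)
for degree, bits in lengths.items():
    print(f"degree {degree}: {bits:8.1f} bits")
print("selected degree:", best)
```

Adding a coefficient always shortens the code for the data, but it is kept only when that saving exceeds the extra (1/2) log2(n) bits needed to describe it.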

Keywords: Akaike information criterion (AIC); Bayes information criterion (BIC); risk inflation criterion (RIC); cross-validation; model selection; stepwise regression; regression tree
Date: 2004

Downloads: (external link)
https://journals.sagepub.com/doi/10.1177/0049124103262064 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:sae:somere:v:33:y:2004:i:2:p:230-260

DOI: 10.1177/0049124103262064

Handle: RePEc:sae:somere:v:33:y:2004:i:2:p:230-260