Interpretable dimension reduction

Hugh Chipman and Hong Gu

Journal of Applied Statistics, 2005, vol. 32, issue 9, 969-987

Abstract: The analysis of high-dimensional data often begins with the identification of lower dimensional subspaces. Principal component analysis is a dimension reduction technique that identifies linear combinations of variables along which most variation occurs or which best “reconstruct” the original variables. For example, many temperature readings may be taken in a production process when in fact there are just a few underlying variables driving the process. A problem with principal components is that the linear combinations can seem quite arbitrary. To make them more interpretable, we introduce two classes of constraints. In the first, coefficients are constrained to equal a small number of values (homogeneity constraint). The second constraint attempts to set as many coefficients to zero as possible (sparsity constraint). The resultant interpretable directions are either calculated to be close to the original principal component directions, or calculated in a stepwise manner that may make the components more orthogonal. A small dataset on characteristics of cars is used to introduce the techniques. A more substantial data mining application is also given, illustrating the ability of the procedure to scale to a very large number of variables.
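The two constraint classes can be illustrated with a minimal sketch. This is not the authors' algorithm (the abstract does not give its details). It assumes a crude stand-in: compute an ordinary principal component direction, then either zero all but the largest coefficients (sparsity) or snap coefficients to the values -c, 0, +c (homogeneity), and check how close the result stays to the original direction. The function names, the `keep` and `threshold` parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def principal_directions(X, k):
    """Return the top-k principal component directions of X as columns."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are unit-length directions ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def sparsify(v, keep):
    """Zero all but the `keep` largest-magnitude coefficients, then renormalize."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-keep:]
    out[idx] = v[idx]
    return out / np.linalg.norm(out)

def homogenize(v, threshold):
    """Snap coefficients to {-c, 0, +c}: keep the sign where |v_i| > threshold."""
    out = np.where(np.abs(v) > threshold, np.sign(v), 0.0)
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out

def closeness(v, w):
    """Absolute cosine between two directions (1.0 means identical up to sign)."""
    return abs(v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Synthetic data (an assumption for illustration): six readings, the first
# three driven by one shared latent variable, the rest pure noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = rng.normal(scale=0.1, size=(200, 6))
X[:, :3] += latent

v = principal_directions(X, 1)[:, 0]   # ordinary first principal direction
s = sparsify(v, keep=3)                # sparsity-style simplification
h = homogenize(v, threshold=0.3)       # homogeneity-style simplification

print("original   :", np.round(v, 3))
print("sparse     :", np.round(s, 3), "closeness", round(closeness(v, s), 4))
print("homogeneous:", np.round(h, 3), "closeness", round(closeness(v, h), 4))
```

In this toy setting the simplified directions stay close to the original one while being easier to read: the homogeneous version is just an equally weighted combination of the three readings that share the latent driver.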

Keywords: Principal component; interpretable; homogeneity; sparsity; stepwise algorithm; dimension reduction; data mining
Date: 2005

Downloads: (external link)
http://www.tandfonline.com/doi/abs/10.1080/02664760500168648 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:japsta:v:32:y:2005:i:9:p:969-987

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/CJAS20

DOI: 10.1080/02664760500168648

Journal of Applied Statistics is currently edited by Robert Aykroyd

Bibliographic data for series maintained by Chris Longhurst.

 
Page updated 2025-03-20
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:969-987