Hole or grain? A Section Pursuit Index for Finding Hidden Structure in Multiple Dimensions
Ursula Laa, Dianne Cook, Andreas Buja and German Valencia
No 17/20, Monash Econometrics and Business Statistics Working Papers from Monash University, Department of Econometrics and Business Statistics
Abstract:
Multivariate data is often visualized using linear projections, produced by techniques such as principal component analysis, linear discriminant analysis, and projection pursuit. A problem with projections is that they obscure low and high density regions near the center of the distribution. Sections, or slices, can help to reveal them. This paper develops a section pursuit method, building on the extensive work in projection pursuit, to search for interesting slices of the data. Linear projections are used to define sections of the parameter space, and to calculate interestingness by comparing the distribution of observations inside and outside a section. By optimizing this index, it is possible to reveal features such as holes (low density) or grains (high density). The optimization is incorporated into a guided tour so that the search for structure can be dynamic. The approach can be useful for problems where data distributions depart from uniform or normal, as in visually exploring nonlinear manifolds and functions in multivariate space. Two applications of section pursuit are shown: exploring decision boundaries from classification models, and exploring subspaces induced by complex inequality conditions from a multiple-parameter model. The new methods are available in R, in the tourr package.
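Illustration: the inside-versus-outside comparison described in the abstract can be sketched in a few lines of Python. This is a simplified stand-in, not the paper's index, and it does not use the tourr package. The function names (`slice_index`, `sample_shell`), the radial binning, the slice thickness `h`, and the simulated data are all assumptions made for this example.

```python
import math
import random


def slice_index(points, a1, a2, h, n_bins=10, r_max=1.0):
    """Toy 'hole' index (hypothetical, not the paper's definition).

    Projects each point onto the plane spanned by the orthonormal vectors
    a1 and a2, treats points within orthogonal distance h of that plane as
    the slice, and sums, over radial bins of the projection, the excess of
    the outside-slice fraction over the inside-slice fraction.
    """
    inside = [0] * n_bins
    outside = [0] * n_bins
    for x in points:
        u = sum(xi * ai for xi, ai in zip(x, a1))  # first projected coordinate
        v = sum(xi * ai for xi, ai in zip(x, a2))  # second projected coordinate
        norm2 = sum(xi * xi for xi in x)
        # Orthogonal distance from x to the projection plane through the origin.
        dist = math.sqrt(max(norm2 - u * u - v * v, 0.0))
        # Radial bin of the projected point, clamped to the last bin.
        k = min(int(math.hypot(u, v) / r_max * n_bins), n_bins - 1)
        if dist < h:
            inside[k] += 1
        else:
            outside[k] += 1
    n_in, n_out = sum(inside), sum(outside)
    if n_in == 0 or n_out == 0:
        return 0.0
    return sum(max(o / n_out - i / n_in, 0.0) for i, o in zip(inside, outside))


def sample_shell(n, r_min, r_max, rng):
    """Rejection-sample n points in 3D with radius between r_min and r_max."""
    pts = []
    while len(pts) < n:
        x = [rng.uniform(-r_max, r_max) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in x))
        if r_min <= r <= r_max:
            pts.append(x)
    return pts


rng = random.Random(0)
hollow = sample_shell(3000, 0.8, 1.0, rng)  # ball with an empty core
solid = sample_shell(3000, 0.0, 1.0, rng)   # ball with no hole

a1, a2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]   # project onto the x-y plane
hollow_index = slice_index(hollow, a1, a2, h=0.1)
solid_index = slice_index(solid, a1, a2, h=0.1)
print("hollow:", hollow_index, "solid:", solid_index)
```

In the projection alone, both data sets fill a disk, so the hole is hidden; only the thin slice through the center shows the empty core. Because projecting a solid ball also piles points toward the center, the raw value is nonzero even without a hole, so this stand-in is best read as a relative comparison between data sets or projections.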
Keywords: multivariate data; dimension reduction; projection pursuit; statistical graphics; data visualization; exploratory data analysis; data science
Pages: 21
Date: 2020
New Economics Papers: this item is included in nep-ecm and nep-gen
Downloads: (external link)
https://www.monash.edu/business/ebs/research/publications/ebs/wp17-2020.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:msh:ebswps:2020-17
Ordering information: This working paper can be ordered from
http://business.mona ... -business-statistics
Series: Monash Econometrics and Business Statistics Working Papers, Monash University, Department of Econometrics and Business Statistics, PO Box 11E, Monash University, Victoria 3800, Australia.
Bibliographic data for series maintained by Professor Xibin Zhang.