Descriptive statistics of large data sets by scatter plots, an exploratory approach
W.J.J. Rey
Statistica Neerlandica, 1992, vol. 46, issue 4, 283-297
Abstract:
In the analysis of large tables of M variables on N observations, one is interested in the relations between the variables, and it is usual to inspect the M(M-1)/2 scatter plots of N points. The scatter plot approach relies on visual inspection and is to be preferred for detecting simple relations insofar as it is applicable, namely when M is small. Other approaches are needed for large values of M. We consider that only the relatively few scatter plots that present a ‘structure’ are of interest for an exploratory analysis; by ‘structure’, we mean a domain of especially high local density in the plot. Based on this concept, we propose a method built around two steps: the selection of the possibly interesting pairs of variables and the validation of the corresponding scatter plots. The selection of the pairs results from an algorithm based on a binary partitioning tree. The validation step ensures that only those scatter plots in which a structure is found are produced; the recognition of a structure is derived from a statistic based on the length of the Minimum Spanning Tree constructed on the N points of the candidate scatter plot. For illustration, we report on an industrial application where the method is routinely applied for exploratory purposes.
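The validation step described in the abstract can be illustrated with a small sketch. The abstract does not state the exact statistic, so what follows is an assumption and not the author's method: it computes the Euclidean Minimum Spanning Tree length of a scatter plot rescaled to the unit square, then divides it by the mean MST length of uniformly scattered reference sets of the same size. The function names `mst_length`, `rescale_unit_square` and `structure_score`, the uniform reference, and the parameters `n_ref` and `seed` are all hypothetical.

```python
import math
import random


def mst_length(points):
    """Total edge length of the Euclidean MST (Prim's algorithm, O(N^2))."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from each point to the tree
    best[0] = 0.0          # start the tree at the first point
    total = 0.0
    for _ in range(n):
        # Attach the point that is currently closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        ux, uy = points[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.hypot(ux - points[v][0], uy - points[v][1])
                if d < best[v]:
                    best[v] = d
    return total


def rescale_unit_square(points):
    """Map each coordinate to [0, 1] so the score does not depend on units."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def scale(v, lo, hi):
        return 0.5 if hi == lo else (v - lo) / (hi - lo)

    return [(scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys)))
            for x, y in points]


def structure_score(points, n_ref=50, seed=0):
    """Observed MST length divided by the mean MST length of uniform
    reference sets of the same size (an assumed, illustrative statistic)."""
    rng = random.Random(seed)
    n = len(points)
    observed = mst_length(rescale_unit_square(points))
    reference = [
        mst_length(rescale_unit_square(
            [(rng.random(), rng.random()) for _ in range(n)]))
        for _ in range(n_ref)
    ]
    return observed / (sum(reference) / n_ref)
```

Under these assumptions, a score well below 1 suggests that the points concentrate in regions of high local density, since a short MST means many points lie close together. With M variables, one could in principle score each of the M(M-1)/2 pairs produced by `itertools.combinations`; the paper instead preselects candidate pairs with a binary partitioning tree, which this sketch does not attempt to reproduce.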
Date: 1992
Downloads: (external link)
https://doi.org/10.1111/j.1467-9574.1992.tb01346.x
Persistent link: https://EconPapers.repec.org/RePEc:bla:stanee:v:46:y:1992:i:4:p:283-297
Statistica Neerlandica is currently edited by Miroslav Ristic, Marijtje van Duijn and Nan van Geloven
More articles in Statistica Neerlandica from Netherlands Society for Statistics and Operations Research