EconPapers    
Economics at your fingertips  
 

A new procedure to optimize the selection of groups in a classification tree: Applications for ecological data

Lionel Guidi, Frédéric Ibanez, Vincent Calcagno and Grégory Beaugrand

Ecological Modelling, 2009, vol. 220, issue 4, 451-461

Abstract: Agglomerative cluster analyses encompass many techniques, which have been widely used in various fields of science. In biology, and specifically ecology, datasets are generally highly variable and may contain outliers, which increase the difficulty to identify the number of clusters. Here we present a new criterion to determine statistically the optimal level of partition in a classification tree. The criterion robustness is tested against perturbated data (outliers) using an observation or variable with values randomly generated. The technique, called Random Simulation Test (RST), is tested on (1) the well-known Iris dataset [Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann. Eugenic. 7, 179–188], (2) simulated data with predetermined numbers of clusters following Milligan and Cooper [Milligan, G.W., Cooper, M.C., 1985. An examination of procedures for determining the number of clusters in a data set. Psychometrika 50, 159–179] and finally (3) is applied on real copepod communities data previously analyzed in Beaugrand et al. [Beaugrand, G., Ibanez, F., Lindley, J.A., Reid, P.C., 2002. Diversity of calanoid copepods in the North Atlantic and adjacent seas: species associations and biogeography. Mar. Ecol. Prog. Ser. 232, 179–195]. The technique is compared to several standard techniques. RST performed generally better than existing algorithms on simulated data and proved to be especially efficient with highly variable datasets.

Keywords: Classification; Clustering; Dendrogram; Stopping rule; Outliers (search for similar items in EconPapers)
Date: 2009
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0304380008005437
Full text for ScienceDirect subscribers only

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:ecomod:v:220:y:2009:i:4:p:451-461

DOI: 10.1016/j.ecolmodel.2008.11.006

Access Statistics for this article

Ecological Modelling is currently edited by Brian D. Fath

More articles in Ecological Modelling from Elsevier
Bibliographic data for series maintained by Catherine Liu ().

 
Page updated 2025-03-19
Handle: RePEc:eee:ecomod:v:220:y:2009:i:4:p:451-461