 

Factor probabilistic distance clustering (FPDC): a new clustering method

Cristina Tortora, Mireille Gettler Summa, Marina Marino and Francesco Palumbo
Additional contact information
Cristina Tortora: McMaster University
Mireille Gettler Summa: CEREMADE, Université Paris Dauphine
Marina Marino: University of Naples Federico II
Francesco Palumbo: University of Naples Federico II

Advances in Data Analysis and Classification, 2016, vol. 10, issue 4, No 3, 464 pages

Abstract: Factor clustering methods have been developed in recent years thanks to improvements in computational power. These methods perform a linear transformation of the data and a clustering of the transformed data, optimizing a common criterion. Probabilistic distance (PD)-clustering is an iterative, distribution-free, probabilistic clustering method. Factor PD-clustering (FPDC) is based on PD-clustering and involves a linear transformation of the original variables into a reduced number of orthogonal ones, using a criterion shared with PD-clustering. This paper demonstrates that Tucker3 decomposition can be used to accomplish this transformation. Factor PD-clustering alternates between Tucker3 decomposition and PD-clustering on the transformed data until convergence is achieved. This method can significantly improve the performance of the PD-clustering algorithm; large data sets can thus be partitioned into clusters with increased stability and robustness of the results. Real and simulated data sets are used to compare FPDC with its main competitors: it performs equally well when clusters are elliptically shaped and outperforms its competitors with non-Gaussian shaped clusters or noisy data.
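
The abstract describes PD-clustering only as iterative, distribution-free and probabilistic. As a rough illustration of that building block, the sketch below follows the principle commonly stated in the PD-clustering literature: for each point, the product of its membership probability and its distance to a cluster center is the same for every cluster. The function names, the Euclidean distance, the stopping rule and the optional `init` argument are assumptions made for this example and are not taken from the paper. The Tucker3 projection step that FPDC alternates with this procedure is not shown.

```python
import numpy as np

def pd_memberships(X, centers, eps=1e-12):
    """Return (p, d): membership probabilities and distances, both n x k.

    Assumed rule: p[i, k] * d[i, k] is constant over k for each point i,
    which gives p[i, k] proportional to 1 / d[i, k].
    """
    # Euclidean distances from every point to every center (eps avoids 0-division)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = 1.0 / d
    p = inv / inv.sum(axis=1, keepdims=True)
    return p, d

def pd_clustering(X, k, init=None, n_iter=200, tol=1e-8, seed=0):
    """Minimal PD-clustering sketch (hypothetical helper, not the authors' code)."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Assumption: start from k distinct data points chosen at random
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()

    for _ in range(n_iter):
        p, d = pd_memberships(X, centers)
        # Center update: weighted mean with weights u = p^2 / d
        u = p ** 2 / d
        new_centers = (u.T @ X) / u.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break

    p, _ = pd_memberships(X, centers)
    return centers, p
```

A hard partition can be read off with `p.argmax(axis=1)`. In FPDC as summarized above, `X` would be replaced at each outer iteration by the data projected onto the reduced set of orthogonal factors.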

Keywords: Factor clustering; Probabilistic distance clustering; Tucker3; k-means; 6207; 62H30
Date: 2016

Downloads: (external link)
http://link.springer.com/10.1007/s11634-015-0219-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.


Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:10:y:2016:i:4:d:10.1007_s11634-015-0219-5

Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2

DOI: 10.1007/s11634-015-0219-5


Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs

More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-04-12
Handle: RePEc:spr:advdac:v:10:y:2016:i:4:d:10.1007_s11634-015-0219-5