k-POD: A Method for k-Means Clustering of Missing Data

Chi, Jocelyn T.; Chi, Eric C.; Baraniuk, Richard G.

The American Statistician, 2016, vol. 70, issue 1, 91-99

Abstract: The k-means algorithm is often used in clustering applications, but its usage requires a complete data matrix. Missing data, however, are common in many applications. Mainstream approaches to clustering missing data reduce the missing data problem to a complete data formulation through either deletion or imputation, but these solutions may incur significant costs. Our k-POD method presents a simple extension of k-means clustering for missing data that works even when the missingness mechanism is unknown, when external information is unavailable, and when there is significant missingness in the data. [Received November 2014. Revised August 2015.]
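
The abstract does not spell out the procedure, so the following is only a rough sketch of one way to extend k-means to incomplete data. It assumes a scheme that alternates between running ordinary k-means on a filled-in matrix and refilling the missing entries from each row's assigned centroid. The function names (`kmeans`, `kpod_sketch`), the use of `None` for missing values, the column-mean starting fill, and the farthest-first initialization are all illustrative assumptions, not details taken from the article.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, centers, iters=50):
    """Plain Lloyd's k-means on complete data, started from the given centers."""
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sqdist(row, centers[j]))
                      for row in rows]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [rows[i] for i, lab in enumerate(new_labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


def kpod_sketch(data, k, outer_iters=20):
    """Cluster rows whose missing entries are None (illustrative sketch only).

    Assumes every column has at least one observed value.
    """
    n_cols = len(data[0])
    # Starting fill (an assumption): column means over the observed entries.
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [[col_means[j] if v is None else v for j, v in enumerate(row)]
              for row in data]
    # Deterministic farthest-first initialization (also an assumption).
    centers = [list(filled[0])]
    while len(centers) < k:
        far = max(filled, key=lambda r: min(sqdist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(outer_iters):
        new_labels, centers = kmeans(filled, k, centers)
        # Refill only the missing entries, using the assigned centroid;
        # observed entries are never altered.
        filled = [[centers[new_labels[i]][j] if v is None else v
                   for j, v in enumerate(row)]
                  for i, row in enumerate(data)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers, filled


# Toy data: two well-separated groups with some entries missing.
data = [
    [0.0, 0.1], [0.2, None], [None, 0.0],
    [10.0, 10.1], [10.2, None], [None, 9.9],
]
labels, centers, filled = kpod_sketch(data, 2)
print(labels)
```

In this sketch no rows are deleted and the fill values are revised on every pass, which contrasts with the one-off deletion or imputation approaches that the abstract describes as the mainstream alternatives.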

Date: 2016

Downloads: (external link)
http://hdl.handle.net/10.1080/00031305.2015.1086685 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:amstat:v:70:y:2016:i:1:p:91-99

DOI: 10.1080/00031305.2015.1086685
