k -POD: A Method for k -Means Clustering of Missing Data
Jocelyn T. Chi,
Eric C. Chi and
Richard G. Baraniuk
The American Statistician, 2016, vol. 70, issue 1, 91-99
Abstract:
The k -means algorithm is often used in clustering applications but its usage requires a complete data matrix. Missing data, however, are common in many applications. Mainstream approaches to clustering missing data reduce the missing data problem to a complete data formulation through either deletion or imputation but these solutions may incur significant costs. Our k -POD method presents a simple extension of k -means clustering for missing data that works even when the missingness mechanism is unknown, when external information is unavailable, and when there is significant missingness in the data.[Received November 2014. Revised August 2015.]
Date: 2016
References: View complete reference list from CitEc
Citations: View citations in EconPapers (4)
Downloads: (external link)
http://hdl.handle.net/10.1080/00031305.2015.1086685 (text/html)
Access to full text is restricted to subscribers.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:taf:amstat:v:70:y:2016:i:1:p:91-99
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/UTAS20
DOI: 10.1080/00031305.2015.1086685
Access Statistics for this article
The American Statistician is currently edited by Eric Sampson
More articles in The American Statistician from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst ().