*K-means and Cluster Models for Cancer Signatures

Kakushadze, Zura; Yu, Willie

Abstract: We present the *K-means clustering algorithm and source code, expanding the statistical clustering methods applied to quantitative finance in https://ssrn.com/abstract=2802753. *K-means is statistically deterministic and does not require specifying initial centers, etc. We apply *K-means to extract cancer signatures from genome data without using nonnegative matrix factorization (NMF). The computational cost of *K-means is a fraction of that of NMF. Using 1,389 published samples for 14 cancer types, we find that 3 cancers (liver cancer, lung cancer and renal cell carcinoma) stand out and do not have cluster-like structures. Two clusters have especially high within-cluster correlations with 11 other cancers, indicating common underlying structures. Our approach opens a novel avenue for studying such structures. *K-means is universal and can be applied in other fields. We discuss some potential applications in quantitative finance.

Date: 2017-03, Revised 2017-07
New Economics Papers: this item is included in nep-cmp and nep-hea

Published in Biomolecular Detection and Quantification 13 (2017) 7-31

Downloads: (external link)
https://arxiv.org/pdf/1703.00703 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:1703.00703
