Outcome‐guided sparse K‐means for disease subtype discovery via integrating phenotypic data with high‐dimensional transcriptomic data

Lingsong Meng, Dorina Avram, George Tseng and Zhiguang Huo

Journal of the Royal Statistical Society Series C, 2022, vol. 71, issue 2, 352-375

Abstract: The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from existing approaches are not necessarily associated with clinical outcomes. With rich clinical data available alongside omics data in modern epidemiology cohorts, there is an urgent need for an outcome‐guided clustering algorithm that fully integrates the phenotypic data with the high‐dimensional omics data. Hence, we extended a sparse K‐means method to an outcome‐guided sparse K‐means (GuidedSparseKmeans) method. A unified objective function was proposed, comprising (i) weighted K‐means to perform sample clustering; (ii) lasso regularization to perform gene selection from the high‐dimensional omics data; and (iii) incorporation of a phenotypic variable from the clinical dataset to facilitate biologically meaningful clustering results. By iteratively optimizing the objective function, we simultaneously obtain phenotype‐related sample clustering results and gene selection results. We demonstrated the superior performance of GuidedSparseKmeans by comparing it with existing clustering methods in simulations and in applications to high‐dimensional transcriptomic data from breast cancer and Alzheimer's disease. Our algorithm has been implemented in an R package, which is publicly available on GitHub (https://github.com/LingsongMeng/GuidedSparseKmeans).
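
The three components named in the abstract can be illustrated with a small sketch of one gene-weight update. This is not the authors' implementation (the R package linked above is the reference), and the abstract does not give the exact objective. The sketch assumes a per-gene score formed by adding a cluster-separation term and an outcome-association term scaled by a tuning parameter, then applies the standard lasso-style update used in sparse K-means: soft-threshold the scores and rescale so that the L1 norm of the weights stays within a bound. The names `separation`, `guidance`, `lam`, and `l1_bound` are hypothetical.

```python
import math

def soft_threshold(values, delta):
    """Shrink each value toward zero by delta; anything below delta becomes 0."""
    return [max(v - delta, 0.0) for v in values]

def l2_normalize(values):
    """Rescale to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm > 0 else list(values)

def update_gene_weights(separation, guidance, lam, l1_bound, iters=60):
    """Illustrative weight update (assumed form, not taken from the paper).

    Maximizes sum_j w_j * (separation_j + lam * guidance_j)
    subject to ||w||_2 <= 1, ||w||_1 <= l1_bound, and w_j >= 0.
    """
    # Assumed per-gene score: cluster separation plus outcome association.
    scores = [max(s + lam * g, 0.0) for s, g in zip(separation, guidance)]
    weights = l2_normalize(scores)
    if sum(weights) <= l1_bound:
        return weights  # L1 constraint already satisfied, no shrinkage needed
    # Binary search for the smallest threshold that meets the L1 bound.
    lo, hi = 0.0, max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(l2_normalize(soft_threshold(scores, mid))) > l1_bound:
            lo = mid
        else:
            hi = mid
    return l2_normalize(soft_threshold(scores, hi))

# Toy example with four genes (made-up numbers, for illustration only).
separation = [0.9, 0.8, 0.1, 0.05]   # how well each gene separates clusters
guidance = [0.7, 0.0, 0.6, 0.0]      # association of each gene with the outcome
weights = update_gene_weights(separation, guidance, lam=1.0, l1_bound=1.3)
print(weights)
```

In a full procedure this update would alternate with a weighted K-means step that re-clusters the samples using the current gene weights. Genes whose weight is driven to exactly zero are the ones dropped by the lasso-style constraint, so a smaller `l1_bound` selects fewer genes.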

Date: 2022

DOI: https://doi.org/10.1111/rssc.12536

Persistent link: https://EconPapers.repec.org/RePEc:bla:jorssc:v:71:y:2022:i:2:p:352-375

Journal of the Royal Statistical Society Series C is currently edited by R. Chandler and P. W. F. Smith

Handle: RePEc:bla:jorssc:v:71:y:2022:i:2:p:352-375