MapperPlus: Agnostic clustering of high-dimension data for precision medicine
Esha Datta,
Aditya Ballal,
Javier E López and
Leighton T Izu
PLOS Digital Health, 2023, vol. 2, issue 8, 1-16
Abstract:
One of the goals of precision medicine is to classify patients into subgroups that differ in their susceptibility and response to a disease, thereby enabling tailored treatments for each subgroup. Therefore, there is a great need to identify distinctive clusters of patients from patient data. There are three key challenges in patient stratification: 1) the unknown number of clusters, 2) the need for assessing cluster validity, and 3) the clinical interpretability. We developed MapperPlus, a novel unsupervised clustering pipeline that directly addresses these challenges. It extends the topological Mapper technique and blends it with two random-walk algorithms to automatically detect disjoint subgroups in patient data. We demonstrate that MapperPlus outperforms traditional agnostic clustering methods in key accuracy/performance metrics by testing its performance on publicly available medical and non-medical data sets. We also demonstrate the predictive power of MapperPlus in a medical dataset of pediatric stem cell transplant patients where the number of clusters is unknown. Here, MapperPlus stratifies the patient population into clusters with distinctive survival rates. The MapperPlus software is open-source and publicly available.
Author summary: The era of precision medicine represents a unique and exciting opportunity in transforming the way we treat patients. With the immense availability of biomedical data and new computational techniques, we are more able than ever to understand what makes a patient unique. Indeed, even for a single condition, we can recognize that there are heterogeneities within the patient population. Understanding these differences can and should influence the way we treat patients. Key to this process is patient stratification, which is the division of patient populations into clinically meaningful subgroups.
The goal of patient stratification is to capture the individuality of patients without becoming overly fine-grained. This is an exciting balancing act that engages both meaningful medical and mathematical questions. We develop the MapperPlus pipeline for patient stratification. This is an unsupervised learning pipeline that leverages the mathematical notion of topology to detect clusters within high-dimensional data. It is effective in many settings and we demonstrate, in particular, its efficacy in a precision medicine application.
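The abstract describes the pipeline only at a high level: a Mapper graph is built from the data, and the overlapping Mapper nodes are then resolved into disjoint subgroups. The sketch below is a toy illustration of that general idea, not the MapperPlus implementation. It assumes a one-dimensional lens, a simple distance-threshold clustering inside each cover interval, and, in place of the paper's random-walk algorithms (which are not detailed here), plain connected components of the Mapper graph. All function names, parameters, and data are hypothetical.

```python
from itertools import combinations
from math import dist


def threshold_clusters(indices, points, eps):
    """Group indices so that points within eps of each other end up linked."""
    clusters, unvisited = [], set(indices)
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(cluster)
    return clusters


def mapper_nodes(points, lens, n_intervals=5, overlap=0.3, eps=1.0):
    """Cover the lens range with overlapping intervals; cluster each preimage."""
    lo, hi = min(lens), max(lens)
    width = (hi - lo) / n_intervals
    pad = overlap * width  # assumed overlap rule: widen each interval by pad
    nodes = []
    for k in range(n_intervals):
        a, b = lo + k * width - pad, lo + (k + 1) * width + pad
        members = [i for i, v in enumerate(lens) if a <= v <= b]
        nodes.extend(threshold_clusters(members, points, eps))
    return nodes


def disjoint_labels(nodes, n_points):
    """Stand-in for the random-walk step: merge Mapper nodes that share points."""
    parent = list(range(len(nodes)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(range(len(nodes)), 2):
        if nodes[u] & nodes[v]:  # shared points form an edge in the Mapper graph
            parent[find(u)] = find(v)
    labels = [None] * n_points
    for u, members in enumerate(nodes):
        for i in members:
            labels[i] = find(u)
    return labels


# Hypothetical data: two well-separated groups of 2-D points.
points = [(0, 0), (0.8, 0), (1.6, 0), (2.4, 0), (3.2, 0),
          (10, 5), (10.8, 5), (11.6, 5)]
lens = [p[0] for p in points]  # lens = first coordinate (an assumption)
nodes = mapper_nodes(points, lens)
labels = disjoint_labels(nodes, len(points))
print(len(set(labels)), "disjoint clusters")
```

Connected components only separate groups that are entirely unlinked in the Mapper graph. A community-detection step such as the random-walk algorithms mentioned in the abstract can also split groups that are weakly connected, which this stand-in cannot do.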
Date: 2023
Downloads:
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000307 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 00307&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0000307
DOI: 10.1371/journal.pdig.0000307
More articles in PLOS Digital Health from Public Library of Science