Leveraging heterogeneity across multiple datasets increases cell-mixture deconvolution accuracy and reduces biological and technical biases

Vallania, Francesco; Tam, Andrew; Lofgren, Shane; Schaffert, Steven; Azad, Tej D.; Bongen, Erika; Haynes, Winston; Alsup, Meia; Alonso, Michael; Davis, Mark; Engleman, Edgar; Khatri, Purvesh

Francesco Vallania, Andrew Tam, Shane Lofgren, Steven Schaffert, Tej D. Azad, Erika Bongen, Winston Haynes, Meia Alsup, Michael Alonso, Mark Davis, Edgar Engleman and Purvesh Khatri
Additional contact information
Francesco Vallania: Institute for Immunity, Transplantation and Infection, Stanford University
Andrew Tam: Institute for Immunity, Transplantation and Infection, Stanford University
Shane Lofgren: Institute for Immunity, Transplantation and Infection, Stanford University
Steven Schaffert: Institute for Immunity, Transplantation and Infection, Stanford University
Tej D. Azad: Institute for Immunity, Transplantation and Infection, Stanford University
Erika Bongen: Institute for Immunity, Transplantation and Infection, Stanford University
Winston Haynes: Stanford University
Meia Alsup: Institute for Immunity, Transplantation and Infection, Stanford University
Michael Alonso: Stanford University
Mark Davis: Institute for Immunity, Transplantation and Infection, Stanford University
Edgar Engleman: Stanford University
Purvesh Khatri: Institute for Immunity, Transplantation and Infection, Stanford University

Nature Communications, 2018, vol. 9, issue 1, 1-8

Abstract: In silico quantification of cell proportions from mixed-cell transcriptomics data (deconvolution) requires a reference expression matrix, called a basis matrix. We hypothesize that matrices created using only healthy samples from a single microarray platform would introduce biological and technical biases in deconvolution. We show the presence of such biases in two existing matrices, IRIS and LM22, irrespective of deconvolution method. Here, we present immunoStates, a basis matrix built using 6160 samples with different disease states across 42 microarray platforms. We find that immunoStates significantly reduces biological and technical biases. Importantly, we find that the choice of deconvolution method has little or no effect once the basis matrix is chosen. We further show that cellular proportion estimates using immunoStates are consistently more correlated with measured proportions than those using IRIS and LM22, across all methods. Our results demonstrate the importance of incorporating biological and technical heterogeneity in a basis matrix for achieving consistently high accuracy.

Date: 2018

Downloads: (external link)
https://www.nature.com/articles/s41467-018-07242-6 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:9:y:2018:i:1:d:10.1038_s41467-018-07242-6

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-018-07242-6

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
