Multiplets in scRNA-seq data: Extent of the problem and efficacy of methods for removal

Dimitris Ttoouli and Daniel Hoffmann

PLOS ONE, 2025, vol. 20, issue 10, 1-24

Abstract: Multiplets (droplets that capture more than one cell) are a known artefact in droplet-based single-cell RNA sequencing (scRNA-seq), yet their prevalence and impact remain underestimated. In this study, we assess the frequency of multiplets across diverse publicly available datasets and evaluate how well commonly used detection tools identify them. Using cell hashing data to determine a lower bound of the true multiplet rate, we demonstrate that commonly used heuristic estimations systematically underestimate multiplet rates, and that existing tools, even with optimized parameters, detect only a small subset of cell-hashing multiplets. We further refine a Poisson-based model to estimate the true multiplet rate, revealing that actual rates can exceed heuristic predictions by more than twofold. Downstream analyses are significantly affected by multiplets: they are not confined to isolated clusters but are distributed throughout the transcriptional landscape, where they distort clustering and cell type annotation. In differential gene expression analysis, multiplets inflated artefactual signals while expected cell-type markers remained stable, leading to shifts in effect sizes and partial loss of significant genes despite high overall fold-change correlation. Using both quantitative and qualitative approaches, we visualize these effects and show that cell-hashing-informed multiplet removal eliminates artefactual clusters and improves annotation clarity, whereas removing only computationally detected multiplets fails to fully eliminate artefacts in the most common experimental contexts. Our findings confirm that multiplet contamination remains a pervasive and under-addressed issue in scRNA-seq analysis. Since most datasets lack multiplexing, researchers must often rely on heuristics and limited tools, leaving many multiplets unidentified. We advocate for more robust multiplet-detection strategies, including multimodal validation, to ensure more accurate and interpretable scRNA-seq results.
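
To make the abstract's reasoning concrete, the following is a minimal sketch of why cell hashing gives only a lower bound on the multiplet rate, and of how a Poisson loading model yields a multiplet rate. It is not the authors' refined model. The function names are hypothetical, and the assumptions are textbook simplifications: cells per droplet follow a Poisson distribution, multiplets are treated as doublets, and cells pair at random across hashed samples.

```python
import math


def poisson_multiplet_rate(lam: float) -> float:
    """Fraction of cell-containing droplets that hold two or more cells.

    Assumes the number of cells per droplet is Poisson(lam); this is a
    simplifying assumption, not the refined model from the article.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    p_empty = math.exp(-lam)          # P(0 cells)
    p_single = lam * math.exp(-lam)   # P(exactly 1 cell)
    return (1.0 - p_empty - p_single) / (1.0 - p_empty)


def correct_for_same_sample(observed_rate: float, sample_fractions: list[float]) -> float:
    """Scale a hashing-detected multiplet rate up to an estimated total rate.

    Hashing only reveals multiplets whose cells carry different sample tags.
    Assuming doublets only and random pairing, a doublet is invisible with
    probability sum(f_i ** 2), where f_i is the fraction of cells from sample i.
    """
    if not math.isclose(sum(sample_fractions), 1.0, abs_tol=1e-9):
        raise ValueError("sample_fractions must sum to 1")
    p_same = sum(f * f for f in sample_fractions)
    if p_same >= 1.0:
        raise ValueError("at least two samples are needed for hashing")
    return observed_rate / (1.0 - p_same)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    print(poisson_multiplet_rate(0.1))
    print(correct_for_same_sample(0.06, [0.25, 0.25, 0.25, 0.25]))
```

Under these assumptions, four equally sized hashed samples hide a quarter of all doublets, so an observed cross-sample rate of 6% would correspond to roughly 8% overall. With fewer or unevenly sized samples, the hidden share grows.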

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0333687 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 33687&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0333687

DOI: 10.1371/journal.pone.0333687

Page updated 2025-11-29
Handle: RePEc:plo:pone00:0333687