Randomized Spatial PCA (RASP): A computationally efficient method for dimensionality reduction of high-resolution spatial transcriptomics data
Ian K Gingerich,
Brittany A Goods and
H Robert Frost
PLOS Computational Biology, 2025, vol. 21, issue 12, 1-28
Abstract:
Spatial transcriptomics (ST) provides critical insights into the spatial organization of gene expression, enabling researchers to unravel the intricate relationship between cellular environments and biological function. Identifying spatial domains within tissues is key to understanding tissue architecture and mechanisms underlying development and disease progression. Here, we present Randomized Spatial PCA (RASP), a novel spatially-aware dimensionality reduction method for ST data. RASP is designed to be orders-of-magnitude faster than existing techniques, scale to datasets with 100,000+ locations, support flexible integration of non-transcriptomic covariates, and reconstruct de-noised, spatially-smoothed gene expression values. RASP itself is not a clustering or domain detection method; cell types and spatial regions are obtained by clustering the RASP PCs, and the effective cluster resolution depends on the K-nearest-neighbor (kNN) graph and a smoothing parameter β. It employs a randomized two-stage PCA framework and configurable spatial smoothing. RASP was compared to BASS, GraphST, SEDR, SpatialPCA, STAGATE, and CellCharter using diverse ST datasets (10x Visium, Stereo-Seq, MERFISH, 10x Xenium) on human and mouse tissues. In these benchmarks, RASP delivers comparable or superior accuracy in tissue-domain detection while achieving substantial improvements in computational speed. Its efficiency not only reduces runtime and resource requirements but also makes it practical to explore a broad range of spatial-smoothing parameters in a high-throughput fashion. 
By enabling rapid re-analysis under different parameter settings, RASP empowers users to fine-tune the balance between resolution and noise suppression on large, high-resolution subcellular datasets, a critical capability when investigating complex tissue architecture.
Author summary:
Spatial transcriptomics (ST) technologies enable unprecedented insights into the spatial organization of gene expression within tissues, yet analysis of these increasingly large and complex datasets remains computationally challenging. We present Randomized Spatial PCA (RASP), a novel, scalable, and computationally efficient dimensionality reduction method tailored for spatial transcriptomics data. Unlike existing methods, RASP can rapidly process datasets with hundreds of thousands of spatial locations and integrates non-transcriptomic covariates to improve biological signal recovery. By combining randomized linear algebra with spatial smoothing, RASP produces spatially informed principal components that support downstream clustering and spatial domain identification across diverse ST platforms, including high-throughput sequencing and in situ imaging technologies. Benchmarking on multiple real and simulated datasets demonstrates that RASP achieves comparable or superior accuracy to state-of-the-art methods while drastically reducing computational time and resource requirements. This efficiency empowers researchers to explore biological questions at multiple spatial resolutions and scales, facilitating robust, high-throughput spatial analysis critical for advancing our understanding of complex tissue architectures.
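Illustrative sketch (not the authors' implementation): the abstract describes a randomized PCA followed by kNN-based spatial smoothing controlled by a parameter β. The NumPy code below shows one plausible reading of that pipeline on toy data. The function names, the weight form 1/(1+d)^β, the dense distance matrix, and the final re-orthogonalization step are all assumptions made for illustration; the published method may differ in each of these choices.

```python
import numpy as np

def randomized_pca(X, n_components, n_oversamples=10, n_iter=2, seed=0):
    """Approximate the top principal components with a randomized range finder."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                       # center each gene (column)
    k = min(n_components + n_oversamples, min(Xc.shape))
    Q = np.linalg.qr(Xc @ rng.standard_normal((Xc.shape[1], k)))[0]
    for _ in range(n_iter):                       # power iterations sharpen the subspace
        Q = np.linalg.qr(Xc.T @ Q)[0]
        Q = np.linalg.qr(Xc @ Q)[0]
    U, s, Vt = np.linalg.svd(Q.T @ Xc, full_matrices=False)
    scores = (Q @ U[:, :n_components]) * s[:n_components]
    return scores, Vt[:n_components]

def knn_smoothing_weights(coords, k, beta):
    """Row-normalized kNN weights; the form 1/(1+d)**beta is an assumption."""
    n = coords.shape[0]
    # Dense O(n^2) distances: fine for a toy example, not for 100,000+ locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k + 1]        # each location plus its k nearest neighbors
    rows = np.repeat(np.arange(n), k + 1)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = 1.0 / (1.0 + d[rows, cols]) ** beta
    return W / W.sum(axis=1, keepdims=True)

def spatial_pcs_sketch(X, coords, n_components=5, k=6, beta=2.0):
    """Stage 1: randomized PCA. Smooth the scores. Stage 2: re-orthogonalize."""
    scores, _ = randomized_pca(X, n_components)
    smoothed = knn_smoothing_weights(coords, k, beta) @ scores
    smoothed -= smoothed.mean(axis=0)
    U, s, _ = np.linalg.svd(smoothed, full_matrices=False)
    return U * s                                  # spatially smoothed components

# Toy data: 60 locations, 30 genes (synthetic, for shape checking only).
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(60, 2))
X = rng.poisson(2.0, size=(60, 30)).astype(float)
pcs = spatial_pcs_sketch(X, coords, n_components=5, k=6, beta=2.0)
print(pcs.shape)  # (60, 5)
```

In this sketch, beta = 0 gives a uniform average over each neighborhood and larger values concentrate weight on the nearest locations, which is one way to picture the trade-off between resolution and noise suppression mentioned in the abstract. Domain labels would then come from clustering the rows of `pcs` with a separate clustering method.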
Date: 2025
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013759 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13759&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013759
DOI: 10.1371/journal.pcbi.1013759
More articles in PLOS Computational Biology from Public Library of Science