EconPapers    
Economics at your fingertips  
 

STICI: Split-Transformer with integrated convolutions for genotype imputation

Mohammad Erfan Mowlaei, Chong Li, Oveis Jamialahmadi, Raquel Dias, Junjie Chen, Benyamin Jamialahmadi, Timothy Richard Rebbeck, Vincenzo Carnevale, Sudhir Kumar and Xinghua Shi ()
Additional contact information
Mohammad Erfan Mowlaei: Temple University
Chong Li: Temple University
Oveis Jamialahmadi: University of Gothenburg
Raquel Dias: University of Florida
Junjie Chen: Harbin Institute of Technology
Benyamin Jamialahmadi: University of Waterloo
Timothy Richard Rebbeck: Dana-Farber Cancer Institute
Vincenzo Carnevale: Temple University
Sudhir Kumar: Temple University
Xinghua Shi: Temple University

Nature Communications, 2025, vol. 16, issue 1, 1-14

Abstract: Abstract Despite advances in sequencing technologies, genome-scale datasets often contain missing bases and genomic segments, hindering downstream analyses. Genotype imputation addresses this issue and has been a cornerstone pre-processing step in genetic and genomic studies. Although various methods have been widely adopted for genotype imputation, it remains challenging to impute certain genomic regions and large structural variants. Here, we present a transformer-based framework, named STICI, for accurate genotype imputation. STICI models automatically learn genome-wide patterns of linkage disequilibrium, evidenced by much higher imputation accuracy in regions with highly linked variants. Our imputation results on the human 1000 Genomes Project and non-human genomes show that STICI can achieve high imputation accuracy comparable to the state-of-the-art genotype imputation methods, with the additional capability to impute multi-allelic variants and various types of genetic variants. STICI can be trained for any collection of genomes automatically using self-supervision. Moreover, STICI shows excellent performance without needing any special presuppositions about the underlying patterns in collections of non-human genomes, pointing to adaptability and applications of STICI to impute missing genotypes in any species.

Date: 2025
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.nature.com/articles/s41467-025-56273-3 Abstract (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-56273-3

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-025-56273-3

Access Statistics for this article

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-22
Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-56273-3