Analytical and computational solution for the estimation of SNP-heritability in biobank-scale and distributed datasets
Guo-An Qi, Qi-Xin Zhang, Jingyu Kang, Tianyuan Li, Xiyun Xu, Zhe Zhang, Zhe Fan, Siyang Liu and Guo-Bo Chen
PLOS Computational Biology, 2025, vol. 21, issue 10, 1-20
Abstract:
Author summary: For a complex trait, heritability (h2) quantifies the genetic contribution to its variation. With the emergence of biobank-scale data, a more powerful method is needed to estimate h2. Building on the framework of Haseman-Elston regression, we integrate a fast randomization algorithm to estimate h2; the resulting method, RHE-reg, can handle biobank-scale data, such as the UK Biobank (UKB), very efficiently. Furthermore, we present an analytical solution that balances computational cost against the precision of the estimate, a property that is important when dealing with biobank-scale data. We investigated the performance of RHE-reg in simulated data and also applied it to 81 UKB quantitative traits; tested on UKB data of nearly 300,000 unrelated individuals, an estimation took about 4.5 hours on average using 10 CPUs. We extended RHE-reg to distributed datasets in a way that does not compromise privacy. In both UKB and simulated data, RHE-reg estimated h2 accurately. The software for estimating SNP-heritability in biobank-scale data is released.
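The summary does not spell out the estimator, so the following is only a minimal sketch of the general idea behind a randomized Haseman-Elston regression, not the authors' released software. It assumes standardized genotypes Z (n individuals by m SNPs), a standardized phenotype y, and the genetic relationship matrix K = ZZ'/m. The method-of-moments equations need tr(K^2), which is approximated here with B random Gaussian vectors so that K is never formed explicitly. The function names, the simulation settings, and the choice of B are illustrative assumptions.

```python
import numpy as np

def simulate(n, m, h2, rng):
    """Simulate standardized genotypes and a phenotype with heritability h2 (toy data)."""
    p = rng.uniform(0.1, 0.5, size=m)                    # allele frequencies (assumed range)
    G = rng.binomial(2, p, size=(n, m)).astype(float)    # genotype counts 0/1/2
    Z = (G - G.mean(axis=0)) / G.std(axis=0)             # standardize each SNP
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)      # SNP effects
    y = Z @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    y = (y - y.mean()) / y.std()                         # standardize the phenotype
    return Z, y

def randomized_he_h2(Z, y, B, rng):
    """Method-of-moments h2 estimate with a randomized approximation of tr(K^2)."""
    n, m = Z.shape
    tr_K = np.sum(Z ** 2) / m                # exact tr(K) without forming K
    W = rng.standard_normal((n, B))          # B random probe vectors
    KW = Z @ (Z.T @ W) / m                   # K @ W using two thin matrix products
    tr_K2 = np.sum(KW ** 2) / B              # average of ||K w_b||^2 approximates tr(K^2)
    yKy = np.sum((Z.T @ y) ** 2) / m         # y' K y
    yy = y @ y                               # y' y
    # Solve the 2x2 moment equations for the genetic and residual variances.
    A = np.array([[tr_K2, tr_K], [tr_K, float(n)]])
    sigma_g, sigma_e = np.linalg.solve(A, np.array([yKy, yy]))
    return sigma_g / (sigma_g + sigma_e)

rng = np.random.default_rng(1)
Z, y = simulate(n=2000, m=1000, h2=0.5, rng=rng)
h2_hat = randomized_he_h2(Z, y, B=50, rng=rng)
print(f"estimated h2: {h2_hat:.3f}")
```

In this sketch a larger B reduces the noise in the trace approximation at the price of more matrix products, which is one simple way to see the trade-off between computational cost and precision that the summary refers to.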
Date: 2025
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013568 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13568&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013568
DOI: 10.1371/journal.pcbi.1013568