New composite phenotypes enhance chronic kidney disease classification and genetic associations
Kim Ngan Tran,
Heidi G Sutherland,
Andrew J Mallett,
Lyn R Griffiths and
Rodney A Lea
PLOS Genetics, 2025, vol. 21, issue 5, 1-15
Abstract:
Chronic kidney disease (CKD) is a multifactorial condition driven by diverse etiologies that lead to a gradual loss of kidney function. Although genome-wide association studies (GWAS) have identified numerous genetic loci linked to CKD, a large portion of its genetic basis remains unexplained. This knowledge gap may partly arise from the reliance on single biomarkers, such as estimated glomerular filtration rate (eGFR), to assess kidney function. To address this limitation, we developed and applied a novel multi-phenotype approach, combinatorial Principal Component Analysis (cPCA), to better understand the complex genetic architecture of CKD. Using the UK Biobank dataset (n = 337,112), we analyzed 21 CKD-related phenotypes, generating over 2 million composite phenotypes (CPs) through cPCA. Nearly 50,000 of these CPs demonstrated significantly higher classification power for clinical CKD than individual biomarkers. The top-ranked CP, a combination of albumin, cystatin C, eGFR, gamma-glutamyltransferase, HbA1c, low-density lipoprotein, and microalbuminuria, achieved an AUC of 0.878 (95% CI: 0.873–0.882), significantly outperforming eGFR alone (AUC: 0.830, 95% CI: 0.825–0.835). Genetic association analysis of the ~50,000 high-performing CPs recovered all major eGFR-associated loci and additionally identified the SH2B3 locus (rs3184504, a loss-of-function variant), which was detected in CPs (p = 3.1×10⁻⁵⁶) but not in eGFR at the same sample size. In addition, the SH2B3 locus showed strong evidence of colocalization with eGFR, supporting its role in kidney function. These results highlight the power of the multi-phenotype cPCA approach in understanding the genetic basis of CKD, with potential applications to other complex diseases.
Author summary:
Chronic kidney disease (CKD) can result from diverse underlying causes, such as diabetes, high blood pressure, infections, and lifestyle factors. However, most CKD studies rely on single measurements, such as estimated glomerular filtration rate (eGFR), which assesses kidney filtration but may not fully capture the complexity of the disease. Here, we applied a novel approach to explore CKD from a broader perspective. Using a large dataset of over 300,000 individuals, we combined 21 kidney-related health measures into millions of new composite traits, providing a more comprehensive view of kidney function. One of these composite traits, a combination of albumin, cystatin C, eGFR, gamma-glutamyltransferase, HbA1c, low-density lipoprotein, and microalbuminuria, proved to be significantly more effective at identifying CKD than any single measurement. Additionally, we identified key genetic factors associated with CKD, including the SH2B3 gene. By integrating multiple measurements, our work offers a clearer understanding of the genetic basis of CKD and paves the way for similar approaches to unravel other complex diseases, ultimately aiding in their prevention and treatment.
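The abstract does not spell out how cPCA is implemented, so the following is only a minimal sketch under stated assumptions: that a composite phenotype is the first principal component of a standardized subset of traits, and that each composite is ranked by its AUC for a binary disease label. Under the further assumption that every subset of two or more of the 21 phenotypes counts once, there are 2^21 − 21 − 1 = 2,097,130 subsets, which is consistent with "over 2 million" but is not confirmed by the source. The trait names, the synthetic data, and all function names are hypothetical.

```python
import itertools
import math
import random


def count_composites(n_traits, min_size=2):
    """Number of trait subsets with at least `min_size` members (assumed counting rule)."""
    return sum(math.comb(n_traits, k) for k in range(min_size, n_traits + 1))


def standardize(column):
    """Center to mean 0 and scale to sample standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (len(column) - 1))
    return [(x - mean) / sd for x in column]


def first_pc_scores(columns, iters=200):
    """Scores on the first principal component, found by power iteration."""
    n, p = len(columns[0]), len(columns)
    z = [standardize(c) for c in columns]
    # Correlation matrix of the standardized traits.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(v[j] * z[j][i] for j in range(p)) for i in range(n)]


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U), using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Synthetic stand-in data: three noisy readouts of one latent severity, plus pure noise.
random.seed(0)
n = 400
latent = [random.gauss(0, 1) for _ in range(n)]
labels = [1 if x > 1.0 else 0 for x in latent]
traits = {
    "trait_a": [x + random.gauss(0, 1) for x in latent],
    "trait_b": [-x + random.gauss(0, 1) for x in latent],
    "trait_c": [x + random.gauss(0, 1) for x in latent],
    "noise": [random.gauss(0, 1) for _ in range(n)],
}

results = {}
for k in range(1, len(traits) + 1):
    for combo in itertools.combinations(traits, k):
        if k == 1:
            scores = traits[combo[0]]
        else:
            scores = first_pc_scores([traits[name] for name in combo])
        a = auc(scores, labels)
        # The sign of a principal component is arbitrary, so orient the AUC.
        results[combo] = max(a, 1 - a)

best = max(results, key=results.get)
print(count_composites(21), best, round(results[best], 3))
```

At UK Biobank scale, an exhaustive search of this kind would need a vectorized linear-algebra library and some parallelism; the pure-Python loops here are only meant to keep the logic visible.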
Date: 2025
Downloads:
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011718 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 11718&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1011718
DOI: 10.1371/journal.pgen.1011718
More articles in PLOS Genetics from Public Library of Science