Detecting Sample Misidentifications in Genetic Association Studies
Ekstrøm Claus T. and
Feenstra Bjarke
Additional contact information
Ekstrøm Claus T.: University of Southern Denmark, Biostatistics, Faculty of Health Sciences
Feenstra Bjarke: Department of Epidemiology Research, Statens Serum Institut, Denmark
Statistical Applications in Genetics and Molecular Biology, 2012, vol. 11, issue 3, 19
Abstract:
Genetic association studies require that the genotype data from a given person can be correctly linked to the phenotype data from the same person. However, sample misidentification errors sometimes happen, whereby the link becomes invalid for some of the subjects in a study. This can have substantial consequences in terms of power to detect truly associated variants. In family-based studies, Mendelian inconsistencies can be used to detect sample misidentification. Genome-wide association studies (GWAS), however, typically use unrelated individuals, making error detection more problematic.Here we present a method for identifying potential sample misidentifications in GWAS and other genetic association studies building on ideas from forensic sciences. A widely used ad-hoc method for error detection is to check if the sex of an individual matches its X-linked genotype. We generalize this idea to less stringent associations between known genotypes and phenotypes, and show that if several known associations are combined, the power to detect misidentifications increases substantially. Individuals with an unlikely set of phenotypes given their genotypes are flagged as potential errors.We provide analytical and simulation results comparing the odds that the genotype and phenotype are both from the same individual for different numbers of available genotype-phenotype associations and for different information content of the associations. Our method has good sensitivity and specificity with as few as ten moderately informative genotype-phenotype associations. We apply the method to GWAS data from the Danish National Birth Cohort.
Keywords: error detection; genome-wide association studies; known genotype-phenotype associations; outlier detection (search for similar items in EconPapers)
Date: 2012
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://doi.org/10.1515/1544-6115.1772 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:11:y:2012:i:3:n:13
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/1544-6115.1772
Access Statistics for this article
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla ().