Detecting Sample Misidentifications in Genetic Association Studies

Claus T. Ekstrøm and Bjarke Feenstra
Additional contact information
Claus T. Ekstrøm: University of Southern Denmark, Biostatistics, Faculty of Health Sciences
Bjarke Feenstra: Department of Epidemiology Research, Statens Serum Institut, Denmark

Statistical Applications in Genetics and Molecular Biology, 2012, vol. 11, issue 3, 19

Abstract: Genetic association studies require that the genotype data from a given person can be correctly linked to the phenotype data from the same person. However, sample misidentification errors sometimes happen, whereby the link becomes invalid for some of the subjects in a study. This can have substantial consequences in terms of power to detect truly associated variants. In family-based studies, Mendelian inconsistencies can be used to detect sample misidentification. Genome-wide association studies (GWAS), however, typically use unrelated individuals, making error detection more problematic.

Here we present a method for identifying potential sample misidentifications in GWAS and other genetic association studies, building on ideas from forensic sciences. A widely used ad hoc method for error detection is to check whether the sex of an individual matches that individual's X-linked genotype. We generalize this idea to less stringent associations between known genotypes and phenotypes, and show that if several known associations are combined, the power to detect misidentifications increases substantially. Individuals with an unlikely set of phenotypes given their genotypes are flagged as potential errors.

We provide analytical and simulation results comparing the odds that the genotype and phenotype are both from the same individual for different numbers of available genotype-phenotype associations and for different information content of the associations. Our method has good sensitivity and specificity with as few as ten moderately informative genotype-phenotype associations. We apply the method to GWAS data from the Danish National Birth Cohort.
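The idea of combining several known genotype-phenotype associations into a single odds for "genotype and phenotype come from the same person" can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary phenotypes, genotypes coded 0/1/2, independent associations, and known penetrances and genotype frequencies. The function name `log_odds_match` and all numbers are hypothetical and chosen only for illustration.

```python
import math

def log_odds_match(genotypes, phenotypes, penetrance, geno_freq):
    """Log likelihood ratio: sample correctly matched vs. mismatched.

    Assumptions (illustrative only, not taken from the paper):
      genotypes[k]  : genotype (0, 1 or 2) at the k-th known association
      phenotypes[k] : binary phenotype (0 or 1) for that association
      penetrance[k] : tuple of P(phenotype = 1 | genotype = g), g = 0, 1, 2
      geno_freq[k]  : tuple of population genotype frequencies, g = 0, 1, 2
    Associations are treated as independent, so the per-association
    log ratios simply add up.
    """
    total = 0.0
    for g, y, pen, freq in zip(genotypes, phenotypes, penetrance, geno_freq):
        # Under a mismatch the phenotype is independent of this genotype,
        # so its probability is the population marginal.
        p_marginal = sum(f * p for f, p in zip(freq, pen))
        p_match = pen[g] if y == 1 else 1.0 - pen[g]
        p_mismatch = p_marginal if y == 1 else 1.0 - p_marginal
        total += math.log(p_match / p_mismatch)
    return total

# Three hypothetical associations with identical (made-up) parameters.
pen = [(0.1, 0.5, 0.9)] * 3
freq = [(0.25, 0.5, 0.25)] * 3

consistent = log_odds_match([0, 2, 0], [0, 1, 0], pen, freq)
inconsistent = log_odds_match([0, 2, 0], [1, 0, 1], pen, freq)
print(consistent, inconsistent)
```

In this toy setting, a phenotype set that agrees with the genotypes gives a positive log odds and a conflicting set gives a negative one. Each additional informative association adds another term to the sum, which is one way to see why combining several associations increases the power to flag misidentified samples.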

Keywords: error detection; genome-wide association studies; known genotype-phenotype associations; outlier detection
Date: 2012
Full text: https://doi.org/10.1515/1544-6115.1772
Access to the full text requires a subscription to the journal or payment for the individual article.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:11:y:2012:i:3:n:13

Ordering information: This journal article can be ordered from
https://www.degruyte ... urnal/key/sagmb/html

DOI: 10.1515/1544-6115.1772

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
