EconPapers    
Economics at your fingertips  
 

Markov chain Monte Carlo sampling of gene genealogies conditional on unphased SNP genotype data

Burkett Kelly M. (), McNeney Brad and Graham Jinko
Additional contact information
Burkett Kelly M.: Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada Current address: Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Street, Ottawa, ON, K1N 6N5, Canada
McNeney Brad: Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Graham Jinko: Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada

Statistical Applications in Genetics and Molecular Biology, 2013, vol. 12, issue 5, 559-581

Abstract: The gene genealogy is a tree describing the ancestral relationships among genes sampled from unrelated individuals. Knowledge of the tree is useful for inference of population-genetic parameters and has potential application in gene-mapping. Markov chain Monte Carlo approaches that sample genealogies conditional on observed genetic data typically assume that haplotype data are observed even though commonly-used genotyping technologies provide only unphased genotype data. We have extended our haplotype-based genealogy sampler, sampletrees, to handle unphased genotype data. We use the sampled haplotype configurations as a diagnostic for adequate sampling of the tree space based on the reasoning that if haplotype sampling is restricted, sampling from the tree space will also be restricted. We compare the distributions of sampled haplotypes across multiple runs of sampletrees, and to those estimated by the phase inference program, PHASE. Performance was excellent for the majority of individuals as shown by the consistency of results across multiple runs. However, for some individuals in some datasets, sampletrees had problems sampling haplotype configurations; longer run lengths would be required for these datasets. For many datasets though, we expect that sampletrees will be useful for sampling from the posterior distribution of gene genealogies given unphased genotype data.

Keywords: coalescent model; gene genealogy; haplotype uncertainty; Markov chain Monte Carlo (search for similar items in EconPapers)
Date: 2013
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://doi.org/10.1515/sagmb-2012-0011 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:12:y:2013:i:5:p:559-581:n:2

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.1515/sagmb-2012-0011

Access Statistics for this article

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla ().

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:12:y:2013:i:5:p:559-581:n:2