EconPapers    
Economics at your fingertips  
 

Accuracy and Computational Efficiency of a Graphical Modeling Approach to Linkage Disequilibrium Estimation

Abel Haley J and Thomas Alun

Statistical Applications in Genetics and Molecular Biology, 2011, vol. 10, issue 1, 1-15

Abstract: We develop recent work on using graphical models for linkage disequilibrium to provide efficient programs for model fitting, phasing, and imputation of missing data in large data sets. Two important features contribute to the computational efficiency: the separation of the model fitting and phasing-imputation processes into different programs, and holding in memory only the data within a moving window of loci during model fitting. Optimal parameter values were chosen by cross-validation to maximize the probability of correctly imputing masked genotypes. The best accuracy obtained is slightly below than that from the Beagle program of Browning and Browning, and our fitting program is slower. However, for large data sets, it uses less storage. For a reference set of n individuals genotyped at m markers, the time and storage required for fitting a graphical model are approximately O(nm) and O(n+m), respectively. To impute the phases and missing data on n individuals using an already fitted graphical model requires O(nm) time and O(m) storage. While the times for fitting and imputation are both O(nm), the imputation process is considerably faster; thus, once a model is estimated from a reference data set, the marginal cost of phasing and imputing further samples is very low.

Keywords: phasing-imputation; cross validation; SNP genotype assays (search for similar items in EconPapers)
Date: 2011
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://doi.org/10.2202/1544-6115.1615 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:10:y:2011:i:1:n:5

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.2202/1544-6115.1615

Access Statistics for this article

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla ().

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:10:y:2011:i:1:n:5