Impact of reference population and marker density on accuracy of population imputation
Anita Kranjčevičová, Eva Kašná, Michaela Brzáková, Josef Přibyl and Luboš Vostrý
Additional contact information
Anita Kranjčevičová: Department of Genetics and Breeding, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Prague, Czech Republic
Eva Kašná: Department of Genetics and Breeding of Farm Animals, Institute of Animal Science, Prague-Uhříněves, Czech Republic
Michaela Brzáková: Department of Genetics and Breeding, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Prague, Czech Republic
Josef Přibyl: Department of Genetics and Breeding of Farm Animals, Institute of Animal Science, Prague-Uhříněves, Czech Republic
Luboš Vostrý: Department of Genetics and Breeding, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Prague, Czech Republic
Czech Journal of Animal Science, 2019, vol. 64, issue 10, 405-410
Abstract:
This study determined the effect of reference population size and the number of missing single nucleotide polymorphisms (SNPs) on imputation accuracy, using the population imputation method in the FImpute software. The data were taken from the database of the Holstein Cattle Breeders Association of the Czech Republic and comprised 1000 animals genotyped with the Illumina BovineSNP50 v.2 BeadChip. Two datasets were created: the first contained the original genotypes, including missing SNPs; the second contained the same genotypes modified to avoid missing data. In each dataset, animals were randomly selected for the reference population (10, 25, 50 and 75%), and SNPs were randomly selected for deletion (15, 30, 55, 70 and 95%) in the animals not included in the reference population. Imputation accuracy was then assessed by two parameters: the correlation between original and imputed SNPs and the percentage of correctly imputed SNPs. Because animals and SNPs were selected at random, the whole process, including imputation, was repeated 100 times, and accuracy was reported as the average over all repetitions. Imputation accuracy was influenced by both the reference population size and the number of missing SNPs. If the reference population is sufficiently large, imputation accuracy is higher even when a large number of SNPs are missing.
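The sampling design and the two accuracy parameters described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not call FImpute. The function names, the 0/1/2 genotype coding, and the choice to mask the same SNPs in all target animals are assumptions made for illustration; only the scenario percentages and the number of repetitions come from the abstract.

```python
import random
from math import sqrt

# Scenario grid as stated in the abstract.
REFERENCE_FRACTIONS = (0.10, 0.25, 0.50, 0.75)
MISSING_FRACTIONS = (0.15, 0.30, 0.55, 0.70, 0.95)
N_REPETITIONS = 100


def split_reference(animal_ids, ref_fraction, rng):
    """Randomly assign a fraction of animals to the reference population.

    Returns (reference_set, target_list); SNPs are deleted only in targets.
    """
    ids = list(animal_ids)
    n_ref = round(len(ids) * ref_fraction)
    reference = set(rng.sample(ids, n_ref))
    targets = [a for a in ids if a not in reference]
    return reference, targets


def choose_masked_snps(n_snps, missing_fraction, rng):
    """Randomly pick SNP indices to delete (assumed shared by all targets)."""
    n_mask = round(n_snps * missing_fraction)
    return sorted(rng.sample(range(n_snps), n_mask))


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either sequence is constant
    return sxy / sqrt(sxx * syy)


def imputation_accuracy(original, imputed):
    """Return (correlation, percent correct) for the masked genotypes.

    Both arguments are flat lists of genotype codes (assumed 0/1/2)
    at the positions that were deleted and then imputed.
    """
    if not original or len(original) != len(imputed):
        raise ValueError("need two non-empty lists of equal length")
    n_correct = sum(o == i for o, i in zip(original, imputed))
    pct_correct = 100.0 * n_correct / len(original)
    return pearson(original, imputed), pct_correct


# Toy usage: one scenario with 1000 animals and a hypothetical 200-SNP panel.
rng = random.Random(2019)  # arbitrary seed, for reproducibility only
reference, targets = split_reference(range(1000), 0.25, rng)
masked = choose_masked_snps(200, 0.15, rng)
r, pct = imputation_accuracy([0, 1, 2, 2, 1, 0], [0, 1, 2, 1, 1, 0])
```

In a full run, each of the 20 combinations of reference fraction and missing fraction would be repeated `N_REPETITIONS` times with fresh random draws, with the external imputation step placed between masking and scoring, and both parameters averaged over the repetitions.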
Keywords: cattle; genomics; marker density; missing SNPs; simulation
Date: 2019
Downloads:
http://cjas.agriculturejournals.cz/doi/10.17221/148/2019-CJAS.html (text/html)
http://cjas.agriculturejournals.cz/doi/10.17221/148/2019-CJAS.pdf (application/pdf)
free of charge
Persistent link: https://EconPapers.repec.org/RePEc:caa:jnlcjs:v:64:y:2019:i:10:id:148-2019-cjas
DOI: 10.17221/148/2019-CJAS
Czech Journal of Animal Science is currently edited by Bc. Michaela Polcarová
More articles in Czech Journal of Animal Science from Czech Academy of Agricultural Sciences
Bibliographic data for series maintained by Ivo Andrle.