EconPapers    
Economics at your fingertips  
 

A machine-compiled database of genome-wide association studies

Volodymyr Kuleshov (), Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Ré, Serafim Batzoglou and Michael Snyder
Additional contact information
Volodymyr Kuleshov: Stanford University
Jialin Ding: Stanford University
Christopher Vo: Stanford University
Braden Hancock: Stanford University
Alexander Ratner: Stanford University
Yang Li: University of Chicago
Christopher Ré: Stanford University
Serafim Batzoglou: Stanford University
Michael Snyder: Stanford University School of Medicine

Nature Communications, 2019, vol. 10, issue 1, 1-8

Abstract: Abstract Tens of thousands of genotype-phenotype associations have been discovered to date, yet not all of them are easily accessible to scientists. Here, we describe GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms. Our information extraction system helps curators by automatically collecting over 6,000 associations from open-access publications with an estimated recall of 60–80% and with an estimated precision of 78–94% (measured relative to existing manually curated knowledge bases). This system represents a fully automated GWAS curation effort and is made possible by a paradigm for constructing machine learning systems called data programming. Our work represents a step towards making the curation of scientific literature more efficient using automated systems.

Date: 2019
References: Add references at CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://www.nature.com/articles/s41467-019-11026-x Abstract (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-11026-x

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-019-11026-x

Access Statistics for this article

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-11026-x