A graphical model method for integrating multiple sources of genome-scale data

Dvorkin Daniel, Biehs Brian and Kechris Katerina
Additional contact information
Dvorkin Daniel: Computational Bioscience Program, University of Colorado School of Medicine, Mail Stop 8303, 12801 E. 17th Ave., RC1S-L18 6103, Aurora, CO 80045–0511, USA
Biehs Brian: Cardiovascular Research Institute and Department of Biochemistry and Biophysics, University of California at San Francisco, San Francisco, CA 94143–2711, USA
Kechris Katerina: Computational Bioscience Program, University of Colorado School of Medicine, Mail Stop 8303, 12801 E. 17th Ave., RC1S-L18 6103, Aurora, CO 80045–0511, USA; Department of Biostatistics and Informatics, Colorado School of Public Health, 13001 E. 17th Place, B-119, Aurora, CO 80045, USA

Statistical Applications in Genetics and Molecular Biology, 2013, vol. 12, issue 4, 469-487

Abstract: Making effective use of multiple data sources is a major challenge in modern bioinformatics. Genome-wide data such as measures of transcription factor binding, gene expression, and sequence conservation are used to identify binding regions and genes that are important to major biological processes such as development and disease. These heterogeneous data types can be difficult to use together because they differ in biological meaning and statistical distribution, but each can provide valuable information for understanding the processes under study. Here we present methods for integrating multiple data sources to gain a more complete picture of gene regulation and expression. Our goal is to identify genes and cis-regulatory regions that play specific biological roles. We describe a graphical mixture model approach for data integration, examine the effect of using different model topologies, and discuss methods for evaluating the effectiveness of the models. Model fitting is computationally efficient and produces results that have clear biological and statistical interpretations. The Hedgehog and Dorsal signaling pathways in Drosophila, which are critical in embryonic development, are used as examples.

Keywords: data integration; genomics; graphical models; mixture models
Date: 2013

Downloads: (external link)
https://doi.org/10.1515/sagmb-2012-0051 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:12:y:2013:i:4:p:469-487:n:4

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.1515/sagmb-2012-0051

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:12:y:2013:i:4:p:469-487:n:4