Integration and Fixation Preferences of Human and Mouse Endogenous Retroviruses Uncovered with Functional Data Analysis

Rebeca Campos-Sánchez, Marzia A Cremona, Alessia Pini, Francesca Chiaromonte and Kateryna D Makova

PLOS Computational Biology, 2016, vol. 12, issue 6, 1-41

Abstract: Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration and fixation of ERVs. Here we conducted a genome-wide study of the most recently active ERVs in the human and mouse genome. We investigated 826 fixed and 1,065 in vitro HERV-Ks in human, and 1,624 fixed and 242 polymorphic ETns, as well as 3,964 fixed and 1,986 polymorphic IAPs, in mouse. We quantitated >40 human and mouse genomic features (e.g., non-B DNA structure, recombination rates, and histone modifications) in ±32 kb of these ERVs’ integration sites and in control regions, and analyzed them using Functional Data Analysis (FDA) methodology. In one of the first applications of FDA in genomics, we identified genomic scales and locations at which these features display their influence, and how they work in concert, to provide signals essential for integration and fixation of ERVs. The investigation of ERVs of different evolutionary ages (young in vitro and polymorphic ERVs, older fixed ERVs) allowed us to disentangle integration vs. fixation preferences. As a result of these analyses, we built a comprehensive model explaining the uneven distribution of ERVs along the genome. We found that ERVs integrate in late-replicating AT-rich regions with abundant microsatellites, mirror repeats, and repressive histone marks. Regions favoring fixation are depleted of genes and evolutionarily conserved elements, and have low recombination rates, reflecting the effects of purifying selection and ectopic recombination removing ERVs from the genome. In addition to providing these biological insights, our study demonstrates the power of exploiting multiple scales and localization with FDA. 
These powerful techniques are expected to be applicable to many other genomic investigations.

Author Summary: Approximately half of the human genome is composed of repetitive elements. Yet we do not completely understand why certain elements insert in particular genomic locations, and what determines which elements are retained and which are eliminated from the genome. To address these questions we studied endogenous retroviruses, one type of repetitive elements which occupy ~10% of the human and mouse genomes, together with genomic features characterizing various biological processes (e.g., recombination and transcription) in the neighborhoods of these elements. Using statistical techniques, we identified enrichment of genomic features in the vicinity of endogenous retroviruses of different evolutionary ages. Features overrepresented adjacent to young endogenous retroviruses are expected to have facilitated their insertion in the genome. Features overrepresented adjacent to older endogenous retroviruses are expected to have facilitated both their insertion and their chances of being sustained in the genome. Our analyses allowed us to explain the uneven distribution of endogenous retroviruses along the genome, and thus to better understand the interaction of different biological processes in shaping the evolution of genome architecture.

Date: 2016
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004956 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 04956&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1004956

DOI: 10.1371/journal.pcbi.1004956

More articles in PLOS Computational Biology from Public Library of Science
 
Page updated 2025-03-22
Handle: RePEc:plo:pcbi00:1004956