ExpertRNA: A New Framework for RNA Secondary Structure Prediction

Menghan Liu, Erik Poppleton, Giulia Pedrielli, Petr Šulc and Dimitri P. Bertsekas
Additional contact information
Menghan Liu: School of Computing Informatics and Decision Systems Engineering, Arizona State University, Tempe, Arizona 85281
Erik Poppleton: School of Molecular Sciences and Center for Molecular Design and Biomimetics, Arizona State University, Tempe, Arizona 85281
Giulia Pedrielli: School of Computing Informatics and Decision Systems Engineering, Arizona State University, Tempe, Arizona 85281
Petr Šulc: School of Molecular Sciences and Center for Molecular Design and Biomimetics, Arizona State University, Tempe, Arizona 85281
Dimitri P. Bertsekas: School of Computing Informatics and Decision Systems Engineering, Arizona State University, Tempe, Arizona 85281; Massachusetts Institute of Technology, Electrical Engineering, Cambridge, Massachusetts 02139

INFORMS Journal on Computing, 2022, vol. 34, issue 5, 2464-2484

Abstract: Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms and performs a versatile array of cellular tasks. The function of many RNA molecules is strongly related to the structures they adopt. As a result, great effort is being dedicated to the design of efficient algorithms that solve the “folding problem”: given a sequence of nucleotides, return a probable list of base pairs, referred to as the secondary structure prediction. Early algorithms largely rely on finding the structure with minimum free energy. However, these predictions depend on simplified effective free energy models, under which the structure with the lowest free energy may not be the correct one. In light of this, new data-driven approaches that not only consider free energy but also use machine learning techniques to learn motifs have been investigated and have recently been shown to outperform free energy-based algorithms on several experimental data sets. In this work, we introduce ExpertRNA, a new algorithm that provides a modular framework that can easily incorporate an arbitrary number of rewards (free energy or nonparametric/data driven) and secondary structure prediction algorithms. We argue that this capability gives ExpertRNA the potential to balance the different strengths and weaknesses of state-of-the-art folding tools. We test ExpertRNA on several RNA sequence-structure data sets and compare its performance against a state-of-the-art folding algorithm. We find that ExpertRNA produces, on average, more accurate predictions of nonpseudoknotted secondary structures than the structure prediction algorithm used, thus validating the promise of the approach.
Summary of Contribution: ExpertRNA is a new algorithm inspired by a biological problem. It is applied to solve the problem of secondary structure prediction for RNA molecules given an input sequence. The computational contribution is the design of a multibranch, multiexpert rollout algorithm that enables the use of several state-of-the-art approaches as base heuristics and allows several experts to evaluate the partial candidate solutions generated, thus avoiding the need to assume which reward an RNA molecule optimizes when folding. Our implementation allows for the effective use of parallel computational resources as well as control of the size of the rollout tree as the algorithm progresses. The problem of RNA secondary structure prediction is of primary importance within the biology field because the structure of the molecule is strongly related to its functionality. Whereas the contribution of the paper is in the algorithm, the importance of the application makes ExpertRNA a showcase of the relevance of computationally efficient algorithms in supporting scientific discovery.
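The multibranch, multiexpert rollout idea described above can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`feasible_partners`, `greedy_completion`, `pair_count`, `stacking_count`, `multi_expert_rollout`) is hypothetical, the greedy completion merely stands in for a real folding tool used as base heuristic, and the two scoring functions stand in for free energy or data-driven experts. The sketch assumes canonical and wobble base pairs only, a minimum hairpin loop of three bases, and nonpseudoknotted structures.

```python
from typing import AbstractSet, Callable, Dict, FrozenSet, List, Tuple

Pair = Tuple[int, int]
Structure = FrozenSet[Pair]
Expert = Callable[[str, Structure], float]
Heuristic = Callable[[str, Structure, int], Structure]

CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def feasible_partners(seq: str, pairs: AbstractSet[Pair], i: int) -> List[int]:
    """Positions j > i that base i may pair with without crossing existing pairs."""
    paired = {k for pair in pairs for k in pair}
    if i in paired:
        return []
    # The innermost pair enclosing i bounds how far to the right j may lie.
    limit = min((b for a, b in pairs if a < i < b), default=len(seq))
    return [j for j in range(i + MIN_LOOP + 1, limit)
            if j not in paired and (seq[i], seq[j]) in CANONICAL]


def greedy_completion(seq: str, pairs: AbstractSet[Pair], start: int) -> Structure:
    """Toy base heuristic: pair each remaining base with its farthest feasible partner."""
    completed = set(pairs)
    for i in range(start, len(seq)):
        options = feasible_partners(seq, completed, i)
        if options:
            completed.add((i, options[-1]))
    return frozenset(completed)


def pair_count(seq: str, structure: Structure) -> float:
    """Toy expert: more base pairs is better (a crude stand-in for free energy)."""
    return float(len(structure))


def stacking_count(seq: str, structure: Structure) -> float:
    """Toy expert: rewards pairs stacked directly on another pair."""
    return float(sum((i + 1, j - 1) in structure for i, j in structure))


def multi_expert_rollout(seq: str, experts: Dict[str, Expert],
                         heuristics: List[Heuristic]) -> List[Structure]:
    """Decide positions left to right, keeping one branch per expert's favourite."""
    branches: List[Structure] = [frozenset()]
    for i in range(len(seq)):
        best: Dict[str, Tuple[float, Structure]] = {}
        for partial in branches:
            # Either leave i as it is, or pair it with any feasible partner.
            extensions = [partial] + [partial | {(i, j)}
                                      for j in feasible_partners(seq, partial, i)]
            for ext in extensions:
                for heuristic in heuristics:
                    completed = heuristic(seq, ext, i + 1)  # roll out to a full structure
                    for name, expert in experts.items():
                        score = expert(seq, completed)
                        if name not in best or score > best[name][0]:
                            best[name] = (score, ext)
        # Keep the distinct partial structures preferred by at least one expert.
        branches = list(dict.fromkeys(ext for _, ext in best.values()))
    return branches


def to_dot_bracket(n: int, structure: Structure) -> str:
    chars = ["."] * n
    for i, j in structure:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)


sequence = "GGGAAAUCCC"
toy_experts = {"pairs": pair_count, "stacking": stacking_count}
for result in multi_expert_rollout(sequence, toy_experts, [greedy_completion]):
    print(to_dot_bracket(len(sequence), result))
```

In this sketch the number of branches is bounded by the number of experts, which is one simple way to control the size of the rollout tree; the inner loops over extensions and heuristics are independent and could be evaluated in parallel.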

Keywords: computational science; biology; computational methods; dynamic programming; applications; deterministic; industries; pharmaceutical
Date: 2022

Downloads: (external link)
http://dx.doi.org/10.1287/ijoc.2022.1188 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:orijoc:v:34:y:2022:i:5:p:2464-2484

Page updated 2025-03-19
Handle: RePEc:inm:orijoc:v:34:y:2022:i:5:p:2464-2484