- Author(s)
- Christina Otto
- Mathias Möhl
- Steffen Heyne
- Mika Amit
- Gad M. Landau
- Rolf Backofen
- Sebastian Will
- Title
- ExpaRNA-P : simultaneous exact pattern matching and folding of RNAs
- Citable URL:
- https://nbn-resolving.org/urn:nbn:de:bsz:15-qucosa-159847
- Source
- BMC Bioinformatics 2014, 15:404 doi:10.1186/s12859-014-0404-0
- First published
- 2014
- Abstract (EN)
- Background: Identifying sequence-structure motifs common to two RNAs can speed up the comparison of structural RNAs substantially. The core algorithm of the existing approach ExpaRNA solves this problem for a priori known input structures. However, such structures are rarely known; moreover, predicting them computationally is no rescue, since single sequence structure prediction is highly unreliable. Results: The novel algorithm ExpaRNA-P computes exactly matching sequence-structure motifs in entire Boltzmann-distributed structure ensembles of two RNAs; thereby we match and fold RNAs simultaneously, analogous to the well-known “simultaneous alignment and folding” of RNAs. While this implies much higher flexibility compared to ExpaRNA, ExpaRNA-P has the same very low complexity (quadratic in time and space), which is enabled by its novel structure ensemble-based sparsification. Furthermore, we devise a generalized chaining algorithm to compute compatible subsets of ExpaRNA-P’s sequence-structure motifs. We use the best chain as anchor constraints for the sequence-structure alignment tool LocARNA, which yields the very fast RNA alignment approach ExpLoc-P. ExpLoc-P is benchmarked in several variants and versus state-of-the-art approaches. In particular, we formally introduce and evaluate strict and relaxed variants of the problem; the latter makes the approach sensitive to compensatory mutations. Across a benchmark set of typical non-coding RNAs, ExpLoc-P has similar accuracy to LocARNA but is four times faster (in both variants), while it achieves a speed-up of over 30-fold for the longest benchmark sequences (≈400 nt). Finally, different ExpLoc-P variants enable tailoring of the method to specific application scenarios. ExpaRNA-P and ExpLoc-P are distributed as part of the LocARNA package. The source code is freely available at http://www.bioinf.uni-freiburg.de/Software/ExpaRNA-P.
Conclusions: ExpaRNA-P’s novel ensemble-based sparsification reduces its complexity to quadratic time and space. Thereby, ExpaRNA-P significantly speeds up sequence-structure alignment while maintaining the alignment quality. Different ExpaRNA-P variants support a wide range of applications.
- Other edition
- Link to the original publication in the journal BMC Bioinformatics
Link: http://dx.doi.org/10.1186/s12859-014-0404-0
- Free keywords (DE)
- RNA, bioinformatics, structure, RNA, comparison, simplification
- Free keywords (EN)
- RNA bioinformatics; Structure-based comparison of RNA
- Classification (DDC)
- 000
- 570
- Issuing institution
- Universität Freiburg
- Universität Leipzig
- Max-Planck-Institut für Immunbiologie und Epigenetik
- University of Haifa
- New York University-Poly
- University of Copenhagen
- Publisher
- BioMed Central, London
- URN Qucosa
- urn:nbn:de:bsz:15-qucosa-159847
- Qucosa publication date
- 22.01.2014
- Document type
- Article
- Document language
- English