
Towards Optimal Grammars for RNA Structures

Mar 2024 (Written: Nov 2023)

Eva Onokpasa, Sebastian Wild, and Prudence Wong:

Data Compression Conference (DCC) 2024

| read here: PDF, arXiv, GitHub, Colab |

In this paper, we search for the best context-free grammars for joint RNA sequence and structure compression.

In our DCC 2023 paper, we applied the same grammar-based compression approach to RNA, using grammars curated by human experts from the RNA secondary-structure prediction literature, and found that prediction quality and compression ability are strongly correlated.

In this work, we use exhaustive and random searches to test whether the human-expert grammars are the most suitable grammars for this task, as measured by compression ability. We find that while the best expert grammars do perform well, they are indeed beaten by other grammars.
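To make "compression ability" concrete, here is a minimal sketch of how a grammar with rule probabilities can assign a code length to an RNA secondary structure. It is an illustration only, not the code from the paper: the toy grammar, the rule names, the probabilities, and the function names `derivation` and `code_length_bits` are all assumptions, and it covers only the dot-bracket structure, not the joint sequence and structure. The idea is that each rule applied in the derivation costs -log2(p) bits, which is what an ideal arithmetic coder would spend on it.

```python
import math

# Hypothetical toy grammar for dot-bracket structures (not from the paper):
#   S -> ( S ) S    rule "pair"
#   S -> . S        rule "unpaired"
#   S -> epsilon    rule "end"
# The grammar is unambiguous, so the next character determines the rule.

def derivation(structure):
    """Return the rules used in the leftmost derivation of `structure`."""
    rules = []
    stack = ["S"]  # symbols still to be matched; top of stack is the list end
    pos = 0
    while stack:
        symbol = stack.pop()
        nxt = structure[pos] if pos < len(structure) else None
        if symbol == "S":
            if nxt == "(":
                rules.append("pair")
                pos += 1
                stack.extend(["S", ")", "S"])  # remaining symbols: S ) S
            elif nxt == ".":
                rules.append("unpaired")
                pos += 1
                stack.append("S")
            else:
                rules.append("end")
        else:  # symbol is the terminal ")"
            if nxt != ")":
                raise ValueError("unbalanced structure")
            pos += 1
    if pos != len(structure):
        raise ValueError("unbalanced structure")
    return rules

def code_length_bits(structure, probs):
    """Ideal code length: sum of -log2(p) over the rules in the derivation."""
    return sum(-math.log2(probs[rule]) for rule in derivation(structure))

# Assumed rule probabilities, chosen only for illustration.
probs = {"pair": 0.25, "unpaired": 0.5, "end": 0.25}
bits = code_length_bits("(.)", probs)
```

Under this view, comparing grammars amounts to comparing the total code length each one assigns to a set of structures, with shorter totals indicating better compression.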

Data and figures are available on Google Colab; the code is on GitHub.