Displaying detailed RNA secondary structure at PDBe entry pages

Image of RNA base pairing focused on the specific base pairs in the center of the image

Visualising RNA secondary structures as two-dimensional (2D) topology diagrams is a widespread practice in RNA biology, because it gives a clear and concise view of the structural features within an RNA molecule. Last year, the Protein Data Bank in Europe (PDBe) introduced a 2D topology component on its RNA macromolecule pages (/pdbe/news/rna-topology-viewer-added-pdbe-pages). This feature lets users view RNA structures as a 2D topology diagram alongside a three-dimensional (3D) view in the Mol* viewer.

These components are interactive: corresponding regions are highlighted in both the 2D and 3D viewers. Users can identify and examine regions of interest in the 3D viewer by interacting with the secondary structure component, which is also fully zoomable. This integration aims to make RNA structure analysis more accessible to a broader scientific audience.

The updated RNA 2D topology component now displays base pairing information, which highlights secondary structure elements such as helical stems and single-stranded loops. This enhancement was developed and integrated into the PDBe RNA topology viewer by Holly McCann and Caeden Meade, collaborators from Georgia Tech. Base pairing information is generated using FR3D and incorporated into PDBe's weekly release process with assistance from partners at Bowling Green State University. This collaboration makes base pair data available for every new RNA-containing entry in the Protein Data Bank (PDB). PDBe offers this information as JSON files, keyed on PDB entry and chain identifiers.

The open-source RNA 2D topology component, available on GitHub, presents the secondary structure in consistent, reproducible, and easily recognisable layouts generated by R2DT. Base pairing annotations are superimposed on these R2DT-generated layouts. Standard Watson-Crick base pairs appear in the viewer as lines connecting the base letters, whereas non-Watson-Crick base pairs are depicted using Leontis-Westhof notation to distinguish the interaction types within the secondary structure elements. Clicking on a base in the 2D topology diagram takes users to the corresponding nucleotide in the Mol* viewer. Further interaction with that base in the Mol* viewer reveals the local region in greater detail, including the base pairing interactions between nucleic acid bases.

The example below illustrates the macromolecule page for the 23S ribosomal RNA derived from E. coli, seen in PDB entry 3cc2. The 2D RNA topology component on the left side displays a zoomed-in region encompassing residues 652 to 753. By clicking on residue G691, highlighted in orange, the Mol* viewer on the right side zooms in on this nucleotide, highlighted in pink. Seamless interactivity enables users to visualise RNA structures with varying levels of detail.

 

Image of RNA structure components, with topology shown on the left and the 3D structure on the right
Snapshot of the 2D and 3D components on the macromolecule page for 23S ribosomal RNA from E. coli, in PDB entry 3cc2.

 

The presence of specific non-Watson-Crick base pairs may indicate recurrent structural motifs. These additional annotations are not displayed by default in the component, but they can be switched on using the 'base pairings' drop-down menu at the bottom. For instance, a kink-turn internal loop motif (Kt-23) is found in helix 23 of the 16S ribosomal RNA from PDB entry 5j7l.

 

Image of RNA structure components, with topology shown on the left and the 3D structure on the right
Snapshot of the 2D and 3D components on the macromolecule page for 16S ribosomal RNA from E. coli, in PDB entry 5j7l.

 

The RNA topology component is zoomed into the region around the kink-turn, and the base pairs are drawn using the Leontis-Westhof nomenclature, in which different symbols indicate the type of base pair formed. The tHS base pairs G685-G705 and A687-G703 are marked with squares and triangles, which represent the Hoogsteen and Sugar edges of the bases respectively. The same region is highlighted in orange in the adjacent Mol* 3D viewer.
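To make the nomenclature concrete, the following minimal sketch decodes a three-letter Leontis-Westhof class such as tHS into its orientation and the two interacting edges. The function name `parse_lw` and the dictionary layout are illustrative assumptions and are not taken from the PDBe component. The circle for the Watson-Crick edge follows the usual Leontis-Westhof convention rather than anything stated in this article, and annotation variants beyond the plain three-letter form are deliberately not handled.

```python
# Leontis-Westhof (LW) classes name an orientation plus one interacting edge per base.
ORIENTATIONS = {"c": "cis", "t": "trans"}
EDGES = {"W": "Watson-Crick", "H": "Hoogsteen", "S": "Sugar"}
# Glyph conventionally drawn for each edge (squares and triangles are named in the text;
# the circle for the Watson-Crick edge is the usual LW convention).
SYMBOLS = {"W": "circle", "H": "square", "S": "triangle"}


def parse_lw(code):
    """Split a three-letter LW class such as 'tHS' into its parts.

    Hypothetical helper for illustration only; it is not part of the viewer's API.
    """
    if len(code) != 3:
        raise ValueError(f"expected a three-letter LW class, got {code!r}")
    orientation = code[0].lower()
    edge1, edge2 = code[1].upper(), code[2].upper()
    if orientation not in ORIENTATIONS or edge1 not in EDGES or edge2 not in EDGES:
        raise ValueError(f"unrecognised LW class: {code!r}")
    return {
        "orientation": ORIENTATIONS[orientation],
        "edges": (EDGES[edge1], EDGES[edge2]),
        "symbols": (SYMBOLS[edge1], SYMBOLS[edge2]),
    }


if __name__ == "__main__":
    # The tHS pairs mentioned above: trans orientation, Hoogsteen and Sugar edges.
    print(parse_lw("tHS"))
```

Read this way, tHS describes a trans pair in which the Hoogsteen edge of the first base meets the Sugar edge of the second, which is why the diagram shows a square and a triangle.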

Users can also view long-range interactions that link distant regions in the 3D structure using the options in the 'base pairings' drop-down. For example, cSS/tSS base pairs form long-range interactions called A-minor motifs, which bring distant regions of the secondary structure into close contact and help the RNA fold into a compact 3D structure. These additional annotations can help users of RNA structure data better interpret specific topologies.

 

Acknowledgements

We would like to thank Dr Anton S. Petrov, Holly McCann, and Caeden Meade (Georgia Tech) for adding the base pair visualisation to the frontend component, Prof Craig Zirbel and Jacob Mitchell (Bowling Green State University) for helping to deploy FR3D at PDBe, as well as Dr Anton I. Petrov (Riboscope Ltd) for coordination and feedback.

 

Data and Code Availability

Weekly updated basepair information is available from static JSON files, keyed on PDB entry and chain identifiers. The URL pattern is /pdbe/static/entry/[PDBID]_[CHAINID]_basepair.json, for example, /pdbe/static/entry/5j7l_CA_basepair.json
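As a minimal sketch of the URL pattern above, the helper below builds the path for a given entry and chain. The function names are hypothetical, and lowercasing the PDB ID while leaving the chain ID unchanged is an assumption based only on the example shown. The host is not given in the pattern, so the `base_url` argument and the `https://example.org` value are placeholders to be replaced with the actual server address.

```python
def basepair_json_path(pdb_id, chain_id):
    """Return the static path of the base pair JSON file for one entry and chain.

    Assumption: the PDB ID is lowercased (as in the 5j7l example) and the
    chain ID is used exactly as supplied.
    """
    return f"/pdbe/static/entry/{pdb_id.lower()}_{chain_id}_basepair.json"


def basepair_json_url(pdb_id, chain_id, base_url):
    """Join a caller-supplied host (placeholder, not specified in the article) with the path."""
    return base_url.rstrip("/") + basepair_json_path(pdb_id, chain_id)


if __name__ == "__main__":
    print(basepair_json_path("5j7l", "CA"))
    # example.org is a stand-in host for illustration only.
    print(basepair_json_url("5j7l", "CA", "https://example.org"))
```

The structure of the returned JSON is not described here, so the sketch stops at building the address instead of parsing the file.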

The source code of the RNA topology viewer is available on GitHub under the Apache 2.0 license. Further information is available on the component's documentation page.