Enriched chemical component files at PDBe


For users who want data on the small molecules in the PDB archive, the wwPDB Chemical Component Dictionary (CCD) provides detailed geometry and linkage information for these ligands. To offer more, the PDBe team has created a process that generates enriched, updated versions of these CCD files with additional data.

These files are available through the PDBe FTP area at the following URL: 

The folders are organised by the first character of the CCD ID, followed by the full-length CCD ID, for example:
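
The folder layout described above can be sketched as a small helper. This is only an illustration, not PDBe's own tooling: the function name `ccd_folder`, the `/` separator and the upper-casing of the ID are assumptions, and the base FTP location is left as a parameter for the reader to supply.

```python
def ccd_folder(ccd_id: str, base: str = "") -> str:
    """Return '<base>/<first character>/<full CCD ID>' for a chemical component."""
    ccd_id = ccd_id.strip().upper()  # assumption: IDs are stored in upper case
    if not ccd_id:
        raise ValueError("CCD ID must not be empty")
    parts = [ccd_id[0], ccd_id]
    if base:
        # Drop a trailing slash so the joined path has no doubled separator.
        parts.insert(0, base.rstrip("/"))
    return "/".join(parts)


print(ccd_folder("ATP"))
```

Passing the FTP root as `base` would give the full folder location for a component; without it, the helper returns only the relative part of the path.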

The additional data in the PDBe updated CCD files includes links to external databases generated through the UniChem service, including identifiers for ChEMBL, ChEBI and more. This process also gathers more extensive synonym information from these related databases. There is also a further ID mapping that provides information on each drug's classification and its protein targets.

Chemical structure data is also provided, with information about the molecule's Murcko scaffolds and fragments. These files also include a number of physicochemical properties calculated with the RDKit software, such as the number of rotatable bonds and the number of hydrogen bond donors and acceptors.
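
For readers who want to see what such properties look like, the sketch below computes a few of them directly with RDKit from a SMILES string. It does not reproduce the PDBe pipeline or read the enriched CCD files: the function name `describe`, the choice of descriptors and the output keys are assumptions made for illustration, and RDKit must be installed for the code to run.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors
from rdkit.Chem.Scaffolds import MurckoScaffold


def describe(smiles: str) -> dict:
    """Compute a few descriptors and the Murcko scaffold for a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # The Murcko scaffold keeps ring systems and the linkers between them.
    scaffold = MurckoScaffold.GetScaffoldForMol(mol)
    return {
        "rotatable_bonds": rdMolDescriptors.CalcNumRotatableBonds(mol),
        "h_bond_donors": rdMolDescriptors.CalcNumHBD(mol),
        "h_bond_acceptors": rdMolDescriptors.CalcNumHBA(mol),
        "murcko_scaffold": Chem.MolToSmiles(scaffold),
    }


# Ibuprofen written as a SMILES string.
props = describe("CC(C)Cc1ccc(cc1)C(C)C(=O)O")
for name, value in props.items():
    print(f"{name}: {value}")
```

The values stored in the enriched CCD files may be calculated with different settings, so they should be treated as the reference rather than the output of this sketch.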

Finally, we also make available the 2D atom coordinates and bond orders required to create standardised images of the molecule. This information is used to display ligands in the 2D ligand interactions component on the PDBe website (e.g. the ibuprofen binding site in PDB entry 3p6h). The component displays the ligand's chemical structure and highlights its interactions with other components in the structure. Users can also find idealised 3D conformers generated using RDKit in these updated CCD files.
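
To illustrate why 2D coordinates and bond orders are enough to draw a molecule, here is a minimal sketch that turns them into an SVG line drawing. The input shape (a dictionary of atom positions and a list of bonds), the function name `to_svg` and the three-atom example are hypothetical and do not mirror the layout of the CCD files; only integer bond orders are handled, and atom labels are not drawn.

```python
from math import hypot

# Hypothetical input: atom name -> (x, y), and bonds as (atom1, atom2, order).
atoms = {"C1": (0.0, 0.0), "C2": (1.5, 0.0), "O1": (2.25, 1.3)}
bonds = [("C1", "C2", 1), ("C2", "O1", 2)]


def to_svg(atoms, bonds, scale=40.0, pad=20.0, gap=3.0):
    """Render bonds as SVG lines; a bond of order n becomes n parallel lines."""
    xs = [x for x, _ in atoms.values()]
    ys = [y for _, y in atoms.values()]
    min_x, max_y = min(xs), max(ys)

    def px(point):
        # Shift to the origin, scale, and flip y because SVG's y axis points down.
        x, y = point
        return (x - min_x) * scale + pad, (max_y - y) * scale + pad

    lines = []
    for a, b, order in bonds:
        (x1, y1), (x2, y2) = px(atoms[a]), px(atoms[b])
        length = hypot(x2 - x1, y2 - y1) or 1.0
        # Unit vector perpendicular to the bond, used to offset parallel lines.
        nx, ny = -(y2 - y1) / length, (x2 - x1) / length
        for i in range(order):
            off = (i - (order - 1) / 2) * gap
            lines.append(
                f'<line x1="{x1 + nx * off:.1f}" y1="{y1 + ny * off:.1f}" '
                f'x2="{x2 + nx * off:.1f}" y2="{y2 + ny * off:.1f}" stroke="black"/>'
            )
    width = (max(xs) - min_x) * scale + 2 * pad
    height = (max_y - min(ys)) * scale + 2 * pad
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width:.0f}" height="{height:.0f}">'
        + "".join(lines)
        + "</svg>"
    )


svg = to_svg(atoms, bonds)
print(svg)
```

Because every image is drawn from the same stored coordinates, the same ligand looks identical wherever it is displayed, which is the point of providing standardised 2D layouts.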