Visualize biomolecules with MDAnalysis
This example shows how to visualize biomolecules in chemiscope with MDAnalysis, and how to leverage the select_atoms() method to show only a subset of the atoms.
Biomolecules often contain a large number of atoms, which makes the classical ball-and-stick representation hard to read. The “cartoon” representation focuses on the main structural elements, such as backbone atoms, to highlight the secondary structure, and is often more readable.
import urllib.request
import MDAnalysis as mda
import chemiscope
Retrieving the PDB file from RCSB Protein Data Bank
The RCSB Protein Data Bank (RCSB PDB) is a database of crystal structures of proteins, nucleic acids and small molecules. To start, we retrieve a structure from the PDB. Here we choose “10MH”, a complex consisting of a protein, a nucleic acid, small molecules, and crystallographic water.
pdb_id = "10MH"
urllib.request.urlretrieve(
    f"https://files.rcsb.org/view/{pdb_id}.pdb", f"./{pdb_id}.pdb"
)
('./10MH.pdb', <http.client.HTTPMessage object at 0x7f5f31490b90>)
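Running the snippet above downloads the file again on every run. A minimal sketch of a cached download follows. The names `rcsb_pdb_url` and `fetch_pdb` are hypothetical helpers, not part of chemiscope or MDAnalysis, and the `retrieve` parameter is an assumption added only so the function can be exercised without network access.

```python
import os
import urllib.request


def rcsb_pdb_url(pdb_id):
    """Build the download URL used in this example for a given PDB ID."""
    return f"https://files.rcsb.org/view/{pdb_id}.pdb"


def fetch_pdb(pdb_id, directory=".", retrieve=urllib.request.urlretrieve):
    """Download a PDB file unless it already exists; return the local path.

    Hypothetical helper: ``retrieve`` defaults to ``urlretrieve`` and can be
    replaced by a stub when working offline.
    """
    path = os.path.join(directory, f"{pdb_id}.pdb")
    if not os.path.exists(path):
        retrieve(rcsb_pdb_url(pdb_id), path)
    return path
```

Calling `fetch_pdb(pdb_id)` returns the local path, which can be passed directly to `mda.Universe`.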
Reading the PDB file and visualizing it in Chemiscope
We use MDAnalysis to read the PDB file, including the metadata that describes the structure of the protein.
universe = mda.Universe(f"./{pdb_id}.pdb")
chemiscope.show() accepts an MDAnalysis.AtomGroup as frames. You can toggle the cartoon representation from the hamburger menu in the top-right corner of the widget. When the cartoon representation is off, the viewer falls back to the ball-and-stick representation.
ag = universe.atoms
chemiscope.show(
    frames=ag,
    mode="structure",
    settings=chemiscope.quick_settings(structure_settings={"cartoon": True}),
)
Selecting atoms of interest
The crystallographic water in the structure is not of interest, so we can use the select_atoms() method to only show the complex for a cleaner visualization.
sol = universe.select_atoms("not water")
chemiscope.show(
    frames=sol,
    mode="structure",
    settings=chemiscope.quick_settings(structure_settings={"cartoon": True}),
)
Exploring the sampled conformational space
We can use the map mode to explore the conformational space sampled by an MD simulation. Here, we use a protein-lipid system from the MDAnalysisTests package as an example.
from MDAnalysis.tests.datafiles import GRO_MEMPROT, XTC_MEMPROT # noqa
complx = mda.Universe(GRO_MEMPROT, XTC_MEMPROT)
We describe the conformational space by two features: the z-axis distance between the geometric centers of protein and lipid, and the root mean square deviation (RMSD) of the atomic positions of protein with respect to its initial conformation.
import numpy as np # noqa
from MDAnalysis.analysis.rms import RMSD # noqa
# Distance calculation
distances = []
for _ in complx.trajectory:
    lipid_center = complx.select_atoms("resname POP*").center_of_geometry()
    protein_center = complx.select_atoms("protein").center_of_geometry()
    distances.append((protein_center - lipid_center)[2])
distances = np.abs(distances)
# RMSD calculation
ref = mda.Universe(GRO_MEMPROT)
R = RMSD(complx, ref, select="backbone")
R.run()
# Columns of R.results.rmsd are frame index, time and RMSD; keep the RMSD
rmsd = R.results.rmsd.T[2]
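To make the two features concrete, here is a minimal NumPy sketch on synthetic coordinates. It illustrates the definitions under simplifying assumptions and is not what MDAnalysis does internally: the function names are hypothetical, and `plain_rmsd` does not superimpose the structures first, whereas the `RMSD` analysis class fits each frame onto the reference before measuring the deviation.

```python
import numpy as np


def z_distance(positions_a, positions_b):
    """Absolute z-separation between the geometric centers of two groups."""
    center_a = positions_a.mean(axis=0)
    center_b = positions_b.mean(axis=0)
    return abs(center_a[2] - center_b[2])


def plain_rmsd(positions, reference):
    """RMSD between two (n_atoms, 3) arrays, without any superposition."""
    diff = positions - reference
    return np.sqrt((diff**2).sum(axis=1).mean())


# Synthetic coordinates, for illustration only
protein = np.array([[0.0, 0.0, 10.0], [2.0, 0.0, 14.0]])
lipid = np.array([[0.0, 0.0, 0.0], [0.0, 2.0, 4.0]])
shifted = protein + np.array([0.0, 0.0, 1.0])

print(z_distance(protein, lipid))    # centers sit at z=12 and z=2
print(plain_rmsd(shifted, protein))  # every atom moved by 1 along z
```

Because no fitting is done here, a rigid translation of the whole protein contributes to `plain_rmsd`, while it would be removed by the superposition step in the analysis class.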
We can then use the map mode to visualize the sampled conformational space.
# Trajectories can be very large, so we write the frames to external files on
# disk to reduce the memory usage of the viewer
external_frames = chemiscope.write_external_structures(complx.atoms, "protein-rmsd")
chemiscope.show(
    frames=external_frames,
    meta={
        "name": "Protein-Lipid Complex",
        "description": (
            "Conformational space of a protein-lipid complex featurized "
            "by the protein-lipid z-axis distance and the protein RMSD"
        ),
    },
    properties={
        "Protein-Lipid Distance": {
            "target": "structure",
            "values": distances,
            "units": "Å",
            "description": (
                "Z-axis distance between the geometric centers of protein and lipid"
            ),
        },
        "Protein Backbone RMSD": {
            "target": "structure",
            "values": rmsd,
            "units": "Å",
            "description": (
                "RMSD of the atomic positions of protein with respect to "
                "its initial conformation"
            ),
        },
    },
)
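Each property with "target": "structure" needs one value per frame, so a quick length check before calling chemiscope.show() can catch a mismatch early. The helper below is a hypothetical sketch (`check_structure_properties` is not a chemiscope function) and assumes properties are given in the dictionary form used above.

```python
def check_structure_properties(properties, n_frames):
    """Raise ValueError if a per-structure property has the wrong length."""
    for name, prop in properties.items():
        if prop.get("target") != "structure":
            continue  # only per-structure properties are checked here
        n_values = len(prop["values"])
        if n_values != n_frames:
            raise ValueError(
                f"property {name!r} has {n_values} values, expected {n_frames}"
            )


# Illustrative data, not taken from the trajectory
example = {
    "Protein-Lipid Distance": {"target": "structure", "values": [1.0, 2.0, 3.0]},
}
check_structure_properties(example, n_frames=3)  # passes silently
```

In this example, the expected number of frames would be `len(complx.trajectory)`.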