Jupyter notebooks

Chemiscope can be used as a widget in Jupyter notebooks, and should work in both Jupyter classic and JupyterLab. The widget can be created in default mode (showing both a structure and a map panel), or used to display only structures or only properties.

Once created, the widget can be controlled through a traitlets interface, modeled after Jupyter widgets.

Creating a chemiscope widget

chemiscope.show(frames=None, properties=None, meta=None, environments=None, shapes=None, settings=None, mode='default', warning_timeout=10000)

Show the dataset defined by the given frames and properties (optionally meta, environments and shapes as well) using an embedded chemiscope visualizer inside a Jupyter notebook. These parameters have the same meaning as in the chemiscope.create_input() function.

The mode keyword overrides the default two-panel visualization to show only the structure panel (mode = "structure") or only the map panel (mode = "map"). These modes also make it possible to view a dataset for which properties (or frames) are not available. The widget displays warning messages, which disappear after the specified warning_timeout (in ms). Set warning_timeout to a negative value to disable warnings, or to zero to make them persistent.
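As a rough illustration of how the three modes relate to the available data, the hypothetical helper below (not part of chemiscope) picks a mode from whichever of frames and properties is present. Only the mode names and keyword names come from the signature above; the selection logic and the toy property values are assumptions made for this sketch.

```python
def choose_mode(frames=None, properties=None):
    """Pick a chemiscope widget mode from the data at hand (hypothetical helper)."""
    has_frames = frames is not None
    has_properties = properties is not None
    if has_frames and has_properties:
        return "default"    # structure panel and map panel
    if has_frames:
        return "structure"  # no properties available
    if has_properties:
        return "map"        # no frames available
    raise ValueError("need at least one of frames or properties")


# Toy placeholder data: two points with two components each
properties = {"PCA": [[0.0, 1.0], [1.0, 0.0]]}

# Keyword arguments that could then be forwarded to chemiscope.show(**kwargs)
kwargs = {
    "properties": properties,
    "mode": choose_mode(properties=properties),
    "warning_timeout": 0,  # zero makes warnings persistent
}
```

In a notebook, the resulting dictionary would be passed on as chemiscope.show(**kwargs).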

When inside a Jupyter notebook, the returned object will create a new chemiscope visualizer displaying the dataset. The object exposes a settings traitlet, which can be used to modify the visualization options (possibly even linking the parameters to another widget). Printing the value of the settings property is also a good way to see the full list of available options.

The returned object also has a save function that can be used to save the dataset to a .json or .json.gz file, which can later be loaded in the main website. The visualization options will be those used in the active widget, so this is also a good way to tweak the appearance of the visualization before saving it.

import chemiscope
from sklearn.decomposition import PCA
import ase.io

pca = PCA(n_components=3)

frames = ase.io.read(...)
properties = {
    "PCA": pca.fit_transform(some_data),
}

widget = chemiscope.show(frames, properties)
# display the dataset in a chemiscope visualizer inside the notebook
widget
# ...

# NB: due to how traitlets work, you should always assign a new value
# to the `settings` property. Only the properties that are explicitly
# indicated will be modified.
widget.settings = {"map": {"symbol": "tag"}}
widget.settings["map"]["symbol"] = "tag"  # << does nothing!

# Save the file for later use
widget.save("dataset.json")
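The difference between the two settings assignments above can be reproduced with a plain Python property, used here as a stand-in for the traitlet. This is an analogy rather than chemiscope's implementation, and FakeWidget is a hypothetical class: assigning to the attribute runs the setter, while mutating the returned dictionary in place never does.

```python
class FakeWidget:
    """Hypothetical stand-in for a widget with a `settings` attribute."""

    def __init__(self):
        self._settings = {"map": {"symbol": ""}}
        self.updates = []  # records every value the setter has seen

    @property
    def settings(self):
        return self._settings

    @settings.setter
    def settings(self, value):
        self._settings = value
        self.updates.append(value)  # a real widget would refresh its display here


widget = FakeWidget()

# In-place mutation: the dictionary changes, but the setter is never called
widget.settings["map"]["symbol"] = "tag"
n_after_mutation = len(widget.updates)  # 0

# Assignment: the setter runs, so the change is noticed
widget.settings = {"map": {"symbol": "tag"}}
n_after_assignment = len(widget.updates)  # 1
```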

chemiscope.show_input(path, mode='default', warning_timeout=10000)

Loads and shows the chemiscope input in path.

If path ends with .gz, the file is loaded as a gzip compressed JSON string. If path is a file-like object, it is read as JSON input.

Parameters:

path (str | Path | file-like) – load the chemiscope input from this path or file-like object
mode (str) – widget mode, either default, structure or map.
warning_timeout (float) – timeout (in ms) for warnings. Set to a negative value to disable warnings, and to zero to make them persistent.

import chemiscope

widget = chemiscope.show_input("dataset.json")

# or

with open("dataset.json", "r") as f:
    widget = chemiscope.show_input(f)
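For reference, a .json.gz file is simply JSON text compressed with gzip. The sketch below uses only the standard library to write and read such a file. The load_json helper is hypothetical: it mirrors the suffix rule described above but is not chemiscope's own loader, and the dictionary is placeholder data rather than a valid chemiscope input.

```python
import gzip
import json
import os
import tempfile


def load_json(path):
    """Read JSON from `path`, decompressing first if the name ends with .gz."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


data = {"meta": {"name": "placeholder"}}  # placeholder, not a full dataset

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.json.gz")
    # Write the JSON text through a gzip stream
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(data, f)
    loaded = load_json(path)
```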

Dataset exploration

chemiscope.explore(frames, featurizer=None, properties=None, environments=None, settings=None, mode='default', write_input=None, **kwargs)

Automatically generate an interactive Chemiscope visualization of atomic structures.

This function creates a low-dimensional representation of the input frames and displays them using a Chemiscope widget. It supports automatic featurization with PETMADFeaturizer or a custom featurization function.

The default PETMADFeaturizer computes PET-MAD features from the structures and projects them into the 3D MAD latent space.

If available, all properties are extracted automatically from the structures.

If no featurizer is specified (or it is set to None), only properties will be displayed on the map panel, as long as there are at least two of them.

Overall, the visualization can include properties extracted from the frames, additional user-provided properties, and features computed either by the built-in PET-MAD featurizer (with dimensionality reduction) or by a custom user-provided featurization function.

Parameters:

frames (list) – list of frames
featurizer – either a string specifying a featurizer version (currently only ‘pet-mad-1.0’), a custom callable function, or None. Used to compute features and perform dimensionality reduction on the frames. To use the automatic default option, pass pet-mad-1.0. The callable should take frames as the first argument and environments as the second argument. The return value must be a features array of shape (n_frames, n_features) if environments is None, or (n_environments, n_features) otherwise.
properties (dict) – optional. Additional properties to be included in the visualization. This dictionary can contain any other relevant data associated with the atomic structures. Properties can be extracted from frames with extract_properties() or manually defined by the user.
environments – optional. List of environments (described as (structure id, center id, cutoff)) to include when extracting the atomic properties. Can be extracted from frames with all_atomic_environments() or manually defined.
settings (dict) – optional dictionary of settings to use when displaying the data. Possible entries for the settings dictionary are documented in the chemiscope input file reference.
mode (str) – optional. Visualization mode for the chemiscope widget. Can be one of “default”, “structure”, or “map”. The default mode is “default”.
device (str) – torch device to use for the calculation with the default PETMADFeaturizer. If None, we will try the options in the model’s supported_device in order.
batch_size (int) – optional. Number of structures processed in each batch with the default PETMADFeaturizer.
write_input (string) – optional. A path to save the chemiscope input file created by this function. Afterwards, the file can be loaded using chemiscope.read_input()
kwargs – additional keyword arguments passed to support backward compatibility. Currently, only the deprecated featurize argument is supported, which was renamed to featurizer

Returns:

a chemiscope widget for interactive visualization

To use this function, additional dependencies are required, specifically, pet-mad used for the default dimensionality reduction. They can be installed with the following command:

pip install chemiscope[explore]

Here is an example using this function with the default featurizer and with a custom featurizer function. The frames are obtained by reading the structures from a file that ase can read. The custom featurizer computes SOAP descriptors using the dscribe library and reduces them with Kernel PCA from sklearn.

import chemiscope
import ase.io
import dscribe.descriptors
import sklearn.decomposition

# Read the structures from the dataset
frames = ase.io.read("trajectory.xyz", ":")

# 1) Basic usage with default featurizer (PET-MAD featurization + Sketch-Map)
chemiscope.explore(frames, featurizer="pet-mad-1.0")

# or
featurizer = chemiscope.get_featurizer("pet-mad-1.0")
chemiscope.explore(frames, featurizer=featurizer)


# Define a function for dimensionality reduction
def soap_kpca_featurize(frames, environments):
    if environments is not None:
        raise ValueError("'environments' are not supported by this featurizer")
    # Compute descriptors
    soap = dscribe.descriptors.SOAP(
        species=["C"],
        r_cut=4.5,
        n_max=8,
        l_max=6,
        periodic=True,
    )
    descriptors = soap.create(frames)

    # Apply KPCA
    kpca = sklearn.decomposition.KernelPCA(n_components=2, gamma=0.05)

    # Return a 2D array of reduced features
    return kpca.fit_transform(descriptors)


# 2) Example with a custom featurizer function
chemiscope.explore(frames, featurizer=soap_kpca_featurize)
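The custom featurizer above rejects environments. For callables that do accept them, the environment tuples and the expected number of feature rows can be sketched in plain Python. Both helpers below are hypothetical illustrations (for real frames, chemiscope provides all_atomic_environments()), the cutoff value is arbitrary, and the only assumption about a frame is that len(frame) gives its number of atoms, as it does for ase.Atoms.

```python
def list_environments(frames, cutoff=3.5):
    """Build (structure id, center id, cutoff) tuples for every atom (sketch)."""
    return [
        (structure, center, cutoff)
        for structure, frame in enumerate(frames)
        for center in range(len(frame))
    ]


def expected_rows(frames, environments=None):
    """Number of feature rows a featurizer should return for these inputs."""
    return len(frames) if environments is None else len(environments)


# Stand-in frames: two structures with 2 and 3 atoms
frames = [["C", "C"], ["C", "C", "C"]]
environments = list_environments(frames)

rows_per_structure = expected_rows(frames)                  # 2
rows_per_environment = expected_rows(frames, environments)  # 5
```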

For more examples, see the related documentation.

chemiscope.get_featurizer(name)

Get a featurizer by name for feature extraction. The only version currently available is “pet-mad-1.0”, which returns an instance of PETMADFeaturizer.

Parameters:

name (str) – name of the featurizer. Must match one of the known versions. Currently available is “pet-mad-1.0”

Warning

This function requires additional dependencies. Install them using:

pip install chemiscope[explore]

chemiscope.metatomic_featurizer(model, *, extensions_directory=None, check_consistency=None, device=None, length_unit='Angstrom')

Create a featurizer function using a metatomic model to obtain the features from structures. The model must be able to create a "features" output.

Parameters:

model – model to use for the calculation. It can be a file path, a Python instance of metatomic.torch.AtomisticModel, or the output of torch.jit.script() on metatomic.torch.AtomisticModel.
extensions_directory – a directory where model extensions are located
check_consistency – whether to check the model for consistency when running; defaults to False.
device – a torch device to use for the calculation. If None, the function will use the options in the model’s supported_device attribute.
length_unit – unit of length used in the structures

Returns:

a function that takes a list of frames and returns the features.

To use this function, additional dependencies are required. They can be installed with the following command:

pip install chemiscope[explore]

Here is an example using a pre-trained metatomic model, stored as a model.pt file with the compiled extensions stored in the extensions/ directory. The frames are obtained by reading structures from a file that ase can read.

import chemiscope
import ase.io

# Read the structures from the dataset
frames = ase.io.read("data/explore_c-gap-20u.xyz", ":")

# Provide model file ("model.pt") to `metatomic_featurizer`
featurizer = chemiscope.metatomic_featurizer(
    "model.pt", extensions_directory="extensions"
)

chemiscope.explore(frames, featurizer=featurizer)

For more examples, see the related documentation.