Jupyter notebooks
Chemiscope can be used as a widget in Jupyter notebooks, and should work in
both Jupyter classic and JupyterLab. The widget can be created in default
mode (showing both a structure and a map panel), or used to display only
structures or only properties.
Once created, the widget can be controlled through a traitlet interface, modeled after Jupyter widgets.
Creating a chemiscope widget
- chemiscope.show(frames=None, properties=None, meta=None, environments=None, shapes=None, settings=None, mode='default', warning_timeout=10000)
Show the dataset defined by the given frames and properties (optionally meta, environments and shapes as well) using an embedded chemiscope visualizer inside a Jupyter notebook. These parameters have the same meaning as in the chemiscope.create_input() function.

The mode keyword also allows overriding the default two-panel visualization to show only a structure panel (mode = "structure") or only the map panel (mode = "map"). These modes also make it possible to view a dataset for which properties (or frames) are not available. The widget displays warning messages, which disappear after the specified warning_timeout (in ms). Set it to a negative value to disable warnings, and to zero to make them persistent.

When inside a Jupyter notebook, the returned object will create a new chemiscope visualizer displaying the dataset. The object exposes a settings traitlet, which allows modifying the visualization options (possibly even linking the parameters to another widget). Printing the value of the settings property is also a good way to see a full list of the available options.

The returned object also has a save function that can be used to save the dataset to a .json or .json.gz file, to load it in the main website later. The visualization options will be those used in the active widget, so this is also a good way to tweak the appearance of the visualization before saving it.

    import chemiscope
    from sklearn.decomposition import PCA
    import ase.io

    pca = PCA(n_components=3)

    frames = ase.io.read(...)
    properties = {
        "PCA": pca.fit_transform(some_data),
    }

    widget = chemiscope.show(frames, properties)
    # display the dataset in a chemiscope visualizer inside the notebook
    widget
    # ...

    # NB: due to how traitlets work, you should always set the value of
    # the `settings` property. Only the properties that are explicitly
    # indicated will be modified.
    widget.settings = {"map": {"symbol": "tag"}}
    widget.settings["map"]["symbol"] = "tag"  # << does nothing!

    # Save the file for later use
    widget.save("dataset.json")
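The note about always assigning to settings can be illustrated with a small toy model. The sketch below is not chemiscope's implementation: ToyWidget, merge_nested, updates_sent and the "palette" entry are hypothetical names used only for illustration, and the merge-on-assignment behaviour is an assumption based on the statement that only explicitly indicated properties are modified. It only shows why an update that is triggered by assignment cannot see an in-place edit of a nested dictionary.

```python
class ToyWidget:
    """Toy model of assignment-triggered updates; NOT chemiscope's code."""

    def __init__(self, settings):
        self._settings = settings
        self.updates_sent = 0  # notifications that would reach the viewer

    @property
    def settings(self):
        return self._settings

    @settings.setter
    def settings(self, partial):
        # Assumption: assigned values are merged, unspecified options are kept
        merge_nested(self._settings, partial)
        self.updates_sent += 1


def merge_nested(target, partial):
    """Recursively copy the entries of `partial` into `target`."""
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge_nested(target[key], value)
        else:
            target[key] = value


# Illustrative settings only, not the full list of chemiscope options
widget = ToyWidget({"map": {"symbol": "", "palette": "inferno"}})

# In-place mutation bypasses the setter: no notification is sent
widget.settings["map"]["symbol"] = "tag"
updates_after_mutation = widget.updates_sent

# Assignment goes through the setter: one notification, other options kept
widget.settings = {"map": {"symbol": "tag"}}
updates_after_assignment = widget.updates_sent
```

In this toy model the in-place edit leaves updates_sent at zero, while the assignment increments it once and keeps the "palette" entry untouched.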
- chemiscope.show_input(path, mode='default', warning_timeout=10000)
Loads and shows the chemiscope input in path.

If path ends with .gz, the file is loaded as a gzip compressed JSON string. If path is a file-like object, it is read as JSON input.
- Parameters:
path (str | Path | file-like) – load the chemiscope input from this path or file-like object
mode (str) – widget mode, either default, structure or map.
warning_timeout (float) – timeout (in ms) for warnings. Set to a negative value to disable warnings, and to zero to make them persistent.

    import chemiscope

    widget = chemiscope.show_input("dataset.json")

    # or
    with open("dataset.json", "r") as f:
        widget = chemiscope.show_input(f)
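To inspect a dataset file outside of the widget, the .gz convention described above can be mirrored with the standard library. This is a minimal sketch, not the loader used by chemiscope: read_json_maybe_gz is a hypothetical helper, and the dictionary written to disk is placeholder content rather than a complete chemiscope dataset.

```python
import gzip
import json
import os
import tempfile


def read_json_maybe_gz(path):
    """Read JSON from `path`, treating a .gz suffix as gzip-compressed JSON."""
    path = str(path)
    if path.endswith(".gz"):
        with gzip.open(path, "rt", encoding="utf-8") as fd:
            return json.load(fd)
    with open(path, "r", encoding="utf-8") as fd:
        return json.load(fd)


# Placeholder content, NOT a complete chemiscope dataset
dataset = {"meta": {"name": "demo"}}

with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "dataset.json")
    packed = os.path.join(tmp, "dataset.json.gz")

    with open(plain, "w", encoding="utf-8") as fd:
        json.dump(dataset, fd)
    with gzip.open(packed, "wt", encoding="utf-8") as fd:
        json.dump(dataset, fd)

    # Both files decode to the same Python object
    from_plain = read_json_maybe_gz(plain)
    from_packed = read_json_maybe_gz(packed)
```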
Dataset exploration
- chemiscope.explore(frames, featurizer=None, properties=None, environments=None, settings=None, mode='default', write_input=None, **kwargs)
Automatically generate an interactive Chemiscope visualization of atomic structures.

This function creates a low-dimensional representation of the input frames and displays them using a Chemiscope widget. It supports automatic featurization with PETMADFeaturizer or a custom featurization function.

The default PETMADFeaturizer computes PET-MAD features from the structures and projects them into the 3D MAD latent space.

If available, all properties are extracted automatically from the structures.

If no featurizer is specified (or it is set to None), only properties will be displayed on the map visualizer panel, as long as there are at least two of them.

Overall, the visualization can include: properties extracted from the frames, additional user-provided properties, and features from either the built-in PET-MAD featurizer (with dimensionality reduction) or a custom user-provided featurization function.
- Parameters:
frames (list) – list of frames
featurizer – either a string specifying a featurizer version (currently only ‘pet-mad-1.0’), a custom callable function, or None. Used to compute features and perform dimensionality reduction on the frames. For the automatic default option, use pet-mad-1.0. The callable should take frames as the first argument and environments as the second argument. The return value must be a features array of shape (n_frames, n_features) if environments is None, or (n_environments, n_features) otherwise.
properties (dict) – optional. Additional properties to be included in the visualization. This dictionary can contain any other relevant data associated with the atomic structures. Properties can be extracted from frames with extract_properties() or manually defined by the user.
environments – optional. List of environments (described as (structure id, center id, cutoff)) to include when extracting the atomic properties. Can be extracted from frames with all_atomic_environments() or manually defined.
settings (dict) – optional dictionary of settings to use when displaying the data. Possible entries for the settings dictionary are documented in the chemiscope input file reference.
mode (str) – optional. Visualization mode for the chemiscope widget. Can be one of “default”, “structure”, or “map”. The default mode is “default”.
device (str) – torch device to use for the calculation with the default PETMADFeaturizer. If None, we will try the options in the model’s supported_device in order.
batch_size (int) – optional. Number of structures processed in each batch with the default PETMADFeaturizer.
write_input (string) – optional. A path to save the chemiscope input file created by this function. Afterwards, the file can be loaded using chemiscope.read_input().
kwargs – additional keyword arguments passed to support backward compatibility. Currently, only the deprecated featurize argument is supported, which was renamed to featurizer.
- Returns:
a chemiscope widget for interactive visualization
To use this function, additional dependencies are required, specifically pet-mad, which is used for the default dimensionality reduction. They can be installed with the following command:
pip install chemiscope[explore]
Here is an example using this function with the default featurizer and with a custom featurizer function. The frames are obtained by reading structures from a file that ase can read. The custom featurizer performs Kernel PCA using sklearn on descriptors computed with SOAP using the dscribe library.
    import chemiscope
    import ase.io
    import dscribe.descriptors
    import sklearn.decomposition

    # Read the structures from the dataset
    frames = ase.io.read("trajectory.xyz", ":")

    # 1) Basic usage with default featurizer (PET-MAD featurization + Sketch-Map)
    chemiscope.explore(frames, featurizer="pet-mad-1.0")

    # or
    featurizer = chemiscope.get_featurizer("pet-mad-1.0")
    chemiscope.explore(frames, featurizer=featurizer)


    # Define a function for dimensionality reduction
    def soap_kpca_featurize(frames, environments):
        if environments is not None:
            raise ValueError("'environments' are not supported by this featurizer")
        # Compute descriptors
        soap = dscribe.descriptors.SOAP(
            species=["C"],
            r_cut=4.5,
            n_max=8,
            l_max=6,
            periodic=True,
        )
        descriptors = soap.create(frames)

        # Apply KPCA
        kpca = sklearn.decomposition.KernelPCA(n_components=2, gamma=0.05)

        # Return a 2D array of reduced features
        return kpca.fit_transform(descriptors)


    # 2) Example with a custom featurizer function
    chemiscope.explore(frames, featurizer=soap_kpca_featurize)
For more examples, see the related documentation.
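The environments parameter of chemiscope.explore() can also be defined manually, as a list of (structure id, center id, cutoff) tuples. The sketch below builds such a list for the atoms of a single species. It is only an illustration: environments_for_species is a hypothetical helper, plain lists of chemical symbols stand in for real frames (with ase they could come from frame.get_chemical_symbols()), and zero-based indices are assumed.

```python
def environments_for_species(symbols_per_frame, species, cutoff=3.5):
    """Build (structure id, center id, cutoff) tuples for atoms of one species."""
    environments = []
    for structure_id, symbols in enumerate(symbols_per_frame):
        for center_id, symbol in enumerate(symbols):
            if symbol == species:
                environments.append((structure_id, center_id, cutoff))
    return environments


# Stand-in for [frame.get_chemical_symbols() for frame in frames]
symbols_per_frame = [
    ["C", "H", "H", "H", "H"],
    ["C", "O", "O"],
]

# Assumption: indices are zero-based
carbon_environments = environments_for_species(symbols_per_frame, "C", cutoff=4.0)
print(carbon_environments)  # [(0, 0, 4.0), (1, 0, 4.0)]
```

A featurizer receiving this list would then be expected to return one row of features per environment, that is an array of shape (n_environments, n_features).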
- chemiscope.get_featurizer(name)
Get a featurizer by name for feature extraction. The only currently available version is “pet-mad-1.0”, which returns an instance of PETMADFeaturizer.
- Parameters:
name (str) – name of the featurizer. Must match one of the known versions. Currently, only “pet-mad-1.0” is available.
Warning
This function requires additional dependencies. Install them using:
pip install chemiscope[explore]
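Since the featurizers rely on optional dependencies, it can be convenient to check for them before starting a long notebook session. The following is a minimal sketch using only the standard library; missing_modules is a hypothetical helper, and the module name pet_mad is an assumption about what the explore extra installs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: the `explore` extra provides the `pet_mad` module
missing = missing_modules(["pet_mad"])
if missing:
    print("missing optional dependencies:", ", ".join(missing))
    print("install them with: pip install chemiscope[explore]")
else:
    print("optional dependencies found")
```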
- chemiscope.metatomic_featurizer(model, *, extensions_directory=None, check_consistency=None, device=None, length_unit='Angstrom')
Create a featurizer function using a metatomic model to obtain the features from structures. The model must be able to create a "features" output.
- Parameters:
model – model to use for the calculation. It can be a file path, a Python instance of metatomic.torch.AtomisticModel, or the output of torch.jit.script() on metatomic.torch.AtomisticModel.
extensions_directory – a directory where model extensions are located
check_consistency – whether to check the model for consistency when running; defaults to False.
device – a torch device to use for the calculation. If None, the function will use the options in the model’s supported_device attribute.
length_unit – unit of length used in the structures
- Returns:
a function that takes a list of frames and returns the features.
To use this function, additional dependencies are required. They can be installed with the following command:
pip install chemiscope[explore]
Here is an example using a pre-trained metatomic model, stored as a model.pt file with the compiled extensions stored in the extensions/ directory. The frames are obtained by reading structures from a file that ase can read.

    import chemiscope
    import ase.io

    # Read the structures from the dataset
    frames = ase.io.read("data/explore_c-gap-20u.xyz", ":")

    # Provide model file ("model.pt") to `metatomic_featurizer`
    featurizer = chemiscope.metatomic_featurizer(
        "model.pt", extensions_directory="extensions"
    )

    chemiscope.explore(frames, featurizer=featurizer)
For more examples, see the related documentation.