core.preprocessing.atoms_to_graphs#

Copyright (c) Meta, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.

Attributes#

Classes#

AtomsToGraphs

A class to help convert periodic atomic structures to graphs.

Module Contents#

core.preprocessing.atoms_to_graphs.AseAtomsAdaptor = None#
core.preprocessing.atoms_to_graphs.shell#
class core.preprocessing.atoms_to_graphs.AtomsToGraphs(max_neigh: int = 200, radius: int = 6, r_energy: bool = False, r_forces: bool = False, r_distances: bool = False, r_edges: bool = True, r_fixed: bool = True, r_pbc: bool = False, r_stress: bool = False, r_data_keys: collections.abc.Sequence[str] | None = None)#

A class to help convert periodic atomic structures to graphs.

The AtomsToGraphs class takes in periodic atomic structures in the form of ASE atoms objects and converts them into graph representations for use in PyTorch. The primary purpose of this class is to determine the nearest neighbors within some radius around each individual atom, taking into account periodic boundary conditions (PBC), and to set the pair index and distance between atom pairs appropriately. Lastly, atomic properties and the graph information are put into a PyTorch Geometric data object for use with PyTorch.
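
To make the neighbor search concrete, here is a minimal brute-force sketch in plain Python. It is not the implementation used by this class: the function name `periodic_neighbors` is hypothetical, and the sketch assumes the cutoff is small relative to the cell, so that checking the 27 nearest periodic images is enough.

```python
import itertools
import math


def periodic_neighbors(positions, cell, radius):
    """Brute-force neighbor search under periodic boundary conditions.

    positions: list of (x, y, z) Cartesian coordinates in Angstroms.
    cell: three lattice vectors, each an (x, y, z) tuple.
    radius: cutoff in Angstroms (assumed small relative to the cell).
    Returns a list of (center, neighbor, distance, offset) tuples.
    """
    edges = []
    for i, j in itertools.product(range(len(positions)), repeat=2):
        for offset in itertools.product((-1, 0, 1), repeat=3):
            if i == j and offset == (0, 0, 0):
                continue  # skip an atom paired with itself in the home cell
            # Shift atom j by an integer combination of the lattice vectors.
            shift = [sum(offset[k] * cell[k][d] for k in range(3)) for d in range(3)]
            vec = [positions[j][d] + shift[d] - positions[i][d] for d in range(3)]
            dist = math.sqrt(sum(v * v for v in vec))
            if dist <= radius:
                edges.append((i, j, dist, offset))
    return edges


# Two atoms in a 10 Angstrom cubic cell, sitting near opposite faces.
cell = [(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
positions = [(0.5, 5.0, 5.0), (9.5, 5.0, 5.0)]
edges = periodic_neighbors(positions, cell, radius=2.0)
```

Without periodic images the two atoms are 9 Angstroms apart and would not be connected. With the cell offset taken into account they are 1 Angstrom apart, so each becomes a neighbor of the other.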

Parameters:
  • max_neigh (int) – Maximum number of neighbors to consider.

  • radius (int or float) – Cutoff radius in Angstroms to search for neighbors.

  • r_energy (bool) – Return the energy with other properties. Default is False, so the energy will not be returned.

  • r_forces (bool) – Return the forces with other properties. Default is False, so the forces will not be returned.

  • r_stress (bool) – Return the stress with other properties. Default is False, so the stress will not be returned.

  • r_distances (bool) – Return the distances with other properties. Default is False, so the distances will not be returned.

  • r_edges (bool) – Return interatomic edges with other properties. Default is True, so edges will be returned.

  • r_fixed (bool) – Return a binary vector with flags for fixed (1) vs free (0) atoms. Default is True, so the fixed indices will be returned.

  • r_pbc (bool) – Return the periodic boundary conditions with other properties. Default is False, so the periodic boundary conditions will not be returned.

  • r_data_keys (sequence of str, optional) – Return values corresponding to given keys in atoms.info data with other properties. Default is None, so no data will be returned as properties.

max_neigh#

Maximum number of neighbors to consider.

Type:

int
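
One plausible way to apply such a cap is to keep only the closest neighbors of each center atom. The sketch below illustrates that idea in plain Python. The nearest-first truncation is an assumption (this page only says the value is a maximum), and `keep_nearest` is a hypothetical helper, not part of this module.

```python
from collections import defaultdict


def keep_nearest(edges, max_neigh):
    """Keep at most max_neigh closest neighbors for every center atom.

    edges: iterable of (center, neighbor, distance) tuples.
    """
    by_center = defaultdict(list)
    for center, neighbor, distance in edges:
        by_center[center].append((distance, neighbor))

    kept = []
    for center in sorted(by_center):
        # Sort by distance and truncate to the max_neigh nearest entries.
        for distance, neighbor in sorted(by_center[center])[:max_neigh]:
            kept.append((center, neighbor, distance))
    return kept


edges = [(0, 1, 2.5), (0, 2, 1.0), (0, 3, 4.0), (1, 0, 2.5)]
kept = keep_nearest(edges, max_neigh=2)
```

Here atom 0 has three candidate neighbors, so its farthest one (atom 3 at 4.0 Angstroms) is dropped, while atom 1 keeps its single neighbor.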

radius#

Cutoff radius in Angstroms to search for neighbors.

Type:

int or float

r_energy#

Return the energy with other properties. Default is False, so the energy will not be returned.

Type:

bool

r_forces#

Return the forces with other properties. Default is False, so the forces will not be returned.

Type:

bool

r_stress#

Return the stress with other properties. Default is False, so the stress will not be returned.

Type:

bool

r_distances#

Return the distances with other properties. Default is False, so the distances will not be returned.

Type:

bool

r_edges#

Return interatomic edges with other properties. Default is True, so edges will be returned.

Type:

bool

r_fixed#

Return a binary vector with flags for fixed (1) vs free (0) atoms. Default is True, so the fixed indices will be returned.

Type:

bool

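
As an illustration of the fixed (1) vs free (0) encoding, the sketch below builds such a vector from a list of constrained atom indices. It assumes the fixed atoms are already known as integer indices (for example, taken from a constraint on the structure). `fixed_flags` is a hypothetical helper, not part of this module.

```python
def fixed_flags(num_atoms, fixed_indices):
    """Return a list with 1 for fixed atoms and 0 for free atoms."""
    flags = [0] * num_atoms
    for index in fixed_indices:
        flags[index] = 1
    return flags


# Five atoms, of which atoms 0 and 3 are held fixed.
flags = fixed_flags(5, fixed_indices=[0, 3])
```
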
r_pbc#

Return the periodic boundary conditions with other properties.

Type:

bool

Default is False, so the periodic boundary conditions will not be returned.
r_data_keys#

Return values corresponding to given keys in atoms.info data with other properties. Default is None, so no data will be returned as properties.

Type:

sequence of str, optional

_get_neighbors_pymatgen(atoms: ase.Atoms)#

Performs a nearest neighbor search and returns the edge index, distances, and cell offsets.

_reshape_features(c_index, n_index, n_distance, offsets)#

Stacks the center and neighbor indices and reshapes the distances. Takes in NumPy arrays and returns torch tensors.

get_edge_distance_vec(pos, edge_index, cell, cell_offsets)#
convert(atoms: ase.Atoms, sid=None)#

Convert a single atomic structure to a graph.

Parameters:
  • atoms (ase.atoms.Atoms) – An ASE atoms object.

  • sid (uniquely identifying object) – An identifier that can be used to track the structure in downstream tasks. Common sids used in OCP datasets include unique strings or integers.

Returns:

A torch geometric data object with positions, atomic_numbers, tags, and optionally, energy, forces, distances, edges, and periodic boundary conditions. Optional properties can be included by setting r_property=True when constructing the class.

Return type:

data (torch_geometric.data.Data)

convert_all(atoms_collection, processed_file_path: str | None = None, collate_and_save=False, disable_tqdm=False)#

Convert all atoms objects in a list or in an ase.db to graphs.

Parameters:
  • atoms_collection (list of ase.atoms.Atoms or ase.db.sqlite.SQLite3Database) – Either a list of ASE atoms objects or an ASE database.

  • processed_file_path (str) – A string of the path to where the processed file will be written. Default is None.

  • collate_and_save (bool) – Whether to collate and save the converted graphs. Default is False, so no file is written.

Returns:

A list of torch geometric data objects containing molecular graph info and properties.

Return type:

data_list (list of torch_geometric.data.Data)
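
The following sketch shows how the constructor, convert, and convert_all documented above might be used together. It relies only on the signatures listed on this page. The `fairchem` package root in the import path is an assumption (this page shows only `core.preprocessing.atoms_to_graphs`), and the copper test structure and the `convert_example` wrapper are illustrative choices.

```python
def convert_example():
    """Build a converter, then convert one structure and a small list of structures."""
    # Imports live inside the function so the sketch can be loaded without the
    # packages installed. The `fairchem` package root is an assumption.
    from ase.build import bulk
    from fairchem.core.preprocessing.atoms_to_graphs import AtomsToGraphs

    a2g = AtomsToGraphs(
        max_neigh=50,
        radius=6,
        r_energy=False,  # the example structure carries no calculator results
        r_forces=False,
        r_distances=True,
        r_edges=True,
        r_fixed=True,
        r_pbc=True,
    )

    # A small periodic test structure (illustrative choice).
    atoms = bulk("Cu", "fcc", a=3.61, cubic=True)

    # One structure in, one graph data object out.
    single = a2g.convert(atoms, sid="cu-bulk")

    # A list of structures in, a list of graph data objects out.
    many = a2g.convert_all([atoms, atoms.copy()], disable_tqdm=True)
    return single, many
```

Energy and forces are left off here because the example structure has no calculator attached. For structures that do carry those results, the corresponding r_energy and r_forces flags can be enabled.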