data.oc.core#

Submodules#

Classes#

Adsorbate

Initializes an adsorbate object in one of 4 ways:

AdsorbateSlabConfig

Initializes a list of adsorbate-catalyst systems for a given Adsorbate and Slab.

Bulk

Initializes a bulk object in one of 4 ways:

MultipleAdsorbateSlabConfig

Class to represent a slab with multiple adsorbates on it. This class only

Slab

Initializes a slab object, i.e. a particular slab tiled along xyz, in

Package Contents#

class data.oc.core.Adsorbate(adsorbate_atoms: ase.Atoms = None, adsorbate_id_from_db: int | None = None, adsorbate_smiles_from_db: str | None = None, adsorbate_db_path: str = ADSORBATE_PKL_PATH, adsorbate_db: dict[int, tuple[Any, Ellipsis]] | None = None, adsorbate_binding_indices: list | None = None)#

Initializes an adsorbate object in one of 4 ways: - Directly pass in an ase.Atoms object.

For this, you should also provide the index of the binding atom.

  • Pass in index of adsorbate to select from adsorbate database.

  • Pass in the SMILES string of the adsorbate to select from the database.

  • Randomly sample an adsorbate from the adsorbate database.

Parameters:
  • adsorbate_atoms (ase.Atoms) – Adsorbate structure.

  • adsorbate_id_from_db (int) – Index of adsorbate to select.

  • adsorbate_smiles_from_db (str) – A SMILES string of the desired adsorbate.

  • adsorbate_db_path (str) – Path to adsorbate database.

  • adsorbate_binding_indices (list) – The index/indices of the adsorbate atoms which are expected to bind.

__len__()#
__str__()#

Return str(self).

__repr__()#

Return repr(self).

_get_adsorbate_from_random(adsorbate_db)#
_load_adsorbate(adsorbate: tuple[Any, Ellipsis]) None#

Saves the fields from an adsorbate stored in a database. Fields added after the first revision are conditionally added for backwards compatibility with older database files.

class data.oc.core.AdsorbateSlabConfig(slab: fairchem.data.oc.core.slab.Slab, adsorbate: fairchem.data.oc.core.slab.Adsorbate, num_sites: int = 100, num_augmentations_per_site: int = 1, interstitial_gap: float = 0.1, mode: str = 'random')#

Initializes a list of adsorbate-catalyst systems for a given Adsorbate and Slab.

Parameters:
  • slab (Slab) – Slab object.

  • adsorbate (Adsorbate) – Adsorbate object.

  • num_sites (int) – Number of sites to sample.

  • num_augmentations_per_site (int) – Number of augmentations of the adsorbate per site. Total number of generated structures will be num_sites * num_augmentations_per_site.

  • interstitial_gap (float) – Minimum distance in Angstroms between adsorbate and slab atoms.

  • mode (str) –

    “random”, “heuristic”, or “random_site_heuristic_placement”. This affects surface site sampling and adsorbate placement on each site.

    In “random”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along xyz, and place it such that the center of mass is at the site.

    In “heuristic”, we use Pymatgen’s AdsorbateSiteFinder to find the most energetically favorable sites, i.e., ontop, bridge, or hollow sites. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In “random_site_heuristic_placement”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In all cases, the adsorbate is placed at the closest position of no overlap with the slab plus interstitial_gap along the surface normal.

get_binding_sites(num_sites: int)#

Returns up to num_sites sites given the surface atoms’ positions.

place_adsorbate_on_site(adsorbate: fairchem.data.oc.core.slab.Adsorbate, site: numpy.ndarray, interstitial_gap: float = 0.1)#

Place the adsorbate at the given binding site.

place_adsorbate_on_sites(sites: list, num_augmentations_per_site: int = 1, interstitial_gap: float = 0.1)#

Place the adsorbate at the given binding sites.

_get_scaled_normal(adsorbate_c: ase.Atoms, slab_c: ase.Atoms, site: numpy.ndarray, unit_normal: numpy.ndarray, interstitial_gap: float = 0.1)#

Get the scaled normal that gives a proximate configuration without atomic overlap by:

  1. Projecting the adsorbate and surface atoms onto the surface plane.

  2. Identify all adsorbate atom - surface atom combinations for which

    an itersection when translating along the normal would occur. This is where the distance between the projected points is less than r_surface_atom + r_adsorbate_atom

  3. Explicitly solve for the scaled normal at which the distance between

    surface atom and adsorbate atom = r_surface_atom + r_adsorbate_atom + interstitial_gap. This exploits the superposition of vectors and the distance formula, so it requires root finding.

Assumes that the adsorbate’s binding atom or center-of-mass (depending on mode) is already placed at the site.

Parameters:
  • adsorbate_c (ase.Atoms) – A copy of the adsorbate with coordinates at the site

  • slab_c (ase.Atoms) – A copy of the slab

  • site (np.ndarray) – the coordinate of the site

  • adsorbate_atoms (ase.Atoms) – the translated adsorbate

  • unit_normal (np.ndarray) – the unit vector normal to the surface

  • interstitial_gap (float) – the desired distance between the covalent radii of the closest surface and adsorbate atom

Returns:

the magnitude of the normal vector for placement

Return type:

(float)

_find_combos_to_check(adsorbate_c2: ase.Atoms, slab_c2: ase.Atoms, unit_normal: numpy.ndarray, interstitial_gap: float)#

Find the pairs of surface and adsorbate atoms that would have an intersection event while traversing the normal vector. For each pair, return pertanent information for finding the point of intersection. :param adsorbate_c2: A copy of the adsorbate with coordinates at the centered site :type adsorbate_c2: ase.Atoms :param slab_c2: A copy of the slab with atoms wrapped s.t. things are centered

about the site

Parameters:
  • unit_normal (np.ndarray) – the unit vector normal to the surface

  • interstitial_gap (float) – the desired distance between the covalent radii of the closest surface and adsorbate atom

Returns:

each entry in the list corresponds to one pair to check. With the
following information:

[(adsorbate_idx, slab_idx), r_adsorbate_atom + r_slab_atom, slab_atom_position]

Return type:

(list[lists])

_get_projected_points(adsorbate_c2: ase.Atoms, slab_c2: ase.Atoms, unit_normal: numpy.ndarray)#

Find the x and y coordinates of each atom projected onto the surface plane. :param adsorbate_c2: A copy of the adsorbate with coordinates at the centered site :type adsorbate_c2: ase.Atoms :param slab_c2: A copy of the slab with atoms wrapped s.t. things are centered

about the site

Parameters:

unit_normal (np.ndarray) – the unit vector normal to the surface

Returns:

{“ads”: [[x1, y1], [x2, y2], …], “slab”: [[x1, y1], [x2, y2], …],}

Return type:

(dict)

get_metadata_dict(ind)#

Returns a dict containing the atoms object and metadata for one specified config, used for writing to files.

class data.oc.core.Bulk(bulk_atoms: ase.Atoms = None, bulk_id_from_db: int | None = None, bulk_src_id_from_db: str | None = None, bulk_db_path: str = BULK_PKL_PATH, bulk_db: list[dict[str, Any]] | None = None)#

Initializes a bulk object in one of 4 ways: - Directly pass in an ase.Atoms object. - Pass in index of bulk to select from bulk database. - Pass in the src_id of the bulk to select from the bulk database. - Randomly sample a bulk from bulk database if no other option is passed.

Parameters:
  • bulk_atoms (ase.Atoms) – Bulk structure.

  • bulk_id_from_db (int) – Index of bulk in database pkl to select.

  • bulk_src_id_from_db (int) – Src id of bulk to select (e.g. “mp-30”).

  • bulk_db_path (str) – Path to bulk database.

  • bulk_db (List[Dict[str, Any]]) – Already-loaded database.

_get_bulk_from_random(bulk_db)#
set_source_dataset_id(src_id: str)#
set_bulk_id_from_db(bulk_id_from_db: int)#
get_slabs(max_miller=2, precomputed_slabs_dir=None)#

Returns a list of possible slabs for this bulk instance.

__len__()#
__str__()#

Return str(self).

__repr__()#

Return repr(self).

__eq__(other) bool#

Return self==value.

class data.oc.core.MultipleAdsorbateSlabConfig(slab: fairchem.data.oc.core.slab.Slab, adsorbates: list[fairchem.data.oc.core.adsorbate.Adsorbate], num_sites: int = 100, num_configurations: int = 1, interstitial_gap: float = 0.1, mode: str = 'random_site_heuristic_placement')#

Bases: fairchem.data.oc.core.adsorbate_slab_config.AdsorbateSlabConfig

Class to represent a slab with multiple adsorbates on it. This class only returns a fixed combination of adsorbates placed on the surface. Unlike AdsorbateSlabConfig which enumerates all possible adsorbate placements, this problem gets combinatorially large.

Parameters:
  • slab (Slab) – Slab object.

  • adsorbates (List[Adsorbate]) – List of adsorbate objects to place on the slab.

  • num_sites (int) – Number of sites to sample.

  • num_configurations (int) – Number of configurations to generate per slab+adsorbate(s) combination. This corresponds to selecting different site combinations to place the adsorbates on.

  • interstitial_gap (float) – Minimum distance, in Angstroms, between adsorbate and slab atoms as well as the inter-adsorbate distance.

  • mode (str) –

    “random”, “heuristic”, or “random_site_heuristic_placement”. This affects surface site sampling and adsorbate placement on each site.

    In “random”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along xyz, and place it such that the center of mass is at the site.

    In “heuristic”, we use Pymatgen’s AdsorbateSiteFinder to find the most energetically favorable sites, i.e., ontop, bridge, or hollow sites. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In “random_site_heuristic_placement”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In all cases, the adsorbate is placed at the closest position of no overlap with the slab plus interstitial_gap along the surface normal.

place_adsorbates_on_sites(sites: list, num_configurations: int = 1, interstitial_gap: float = 0.1)#

Place the adsorbate at the given binding sites.

This method generates a fixed number of configurations where sites are selected to ensure that adsorbate binding indices are at least a fair distance away from each other (covalent radii + interstitial gap). While this helps prevent adsorbate overlap it does not gaurantee it since non-binding adsorbate atoms can overlap if the right combination of angles is sampled.

get_metadata_dict(ind)#

Returns a dict containing the atoms object and metadata for one specified config, used for writing to files.

class data.oc.core.Slab(bulk=None, slab_atoms: ase.Atoms = None, millers: tuple | None = None, shift: float | None = None, top: bool | None = None, oriented_bulk: pymatgen.core.structure.Structure = None, min_ab: float = 0.8)#

Initializes a slab object, i.e. a particular slab tiled along xyz, in one of 2 ways: - Pass in a Bulk object and a slab 5-tuple containing (atoms, miller, shift, top, oriented bulk). - Pass in a Bulk object and randomly sample a slab.

Parameters:
  • bulk (Bulk) – Corresponding Bulk object.

  • slab_atoms (ase.Atoms) – Slab atoms, tiled and tagged

  • millers (tuple) – Miller indices of slab.

  • shift (float) – Shift of slab.

  • top (bool) – Whether slab is top or bottom.

  • min_ab (float) – To confirm that the tiled structure spans this distance

classmethod from_bulk_get_random_slab(bulk=None, max_miller=2, min_ab=8.0, save_path=None)#
classmethod from_bulk_get_specific_millers(specific_millers, bulk=None, min_ab=8.0, save_path=None)#
classmethod from_bulk_get_all_slabs(bulk=None, max_miller=2, min_ab=8.0, save_path=None)#
classmethod from_precomputed_slabs_pkl(bulk=None, precomputed_slabs_pkl=None, max_miller=2, min_ab=8.0)#
classmethod from_atoms(atoms: ase.Atoms = None, bulk=None, **kwargs)#
has_surface_tagged()#
get_metadata_dict()#
__len__()#
__str__()#

Return str(self).

__repr__()#

Return repr(self).

__eq__(other)#

Return self==value.