data.oc.core

Contents

data.oc.core#

Submodules#

Classes#

Adsorbate

Initializes an adsorbate object in one of 4 ways:

AdsorbateSlabConfig

Initializes a list of adsorbate-catalyst systems for a given Adsorbate and Slab.

Bulk

Initializes a bulk object in one of 4 ways:

InterfaceConfig

Class to represent a solvent, adsorbate, slab, ion config. This class only

Ion

Initializes an ion object in one of 2 ways:

MultipleAdsorbateSlabConfig

Class to represent a slab with multiple adsorbates on it. This class only

Slab

Initializes a slab object, i.e. a particular slab tiled along xyz, in

Solvent

Initializes a solvent object in one of 2 ways:

Package Contents#

class data.oc.core.Adsorbate(adsorbate_atoms: ase.Atoms = None, adsorbate_id_from_db: int | None = None, adsorbate_smiles_from_db: str | None = None, adsorbate_db_path: str = ADSORBATE_PKL_PATH, adsorbate_db: dict[int, tuple[Any, Ellipsis]] | None = None, adsorbate_binding_indices: list | None = None)#

Initializes an adsorbate object in one of 4 ways: - Directly pass in an ase.Atoms object.

For this, you should also provide the index of the binding atom.

  • Pass in index of adsorbate to select from adsorbate database.

  • Pass in the SMILES string of the adsorbate to select from the database.

  • Randomly sample an adsorbate from the adsorbate database.

Parameters:
  • adsorbate_atoms (ase.Atoms) – Adsorbate structure.

  • adsorbate_id_from_db (int) – Index of adsorbate to select.

  • adsorbate_smiles_from_db (str) – A SMILES string of the desired adsorbate.

  • adsorbate_db_path (str) – Path to adsorbate database.

  • adsorbate_binding_indices (list) – The index/indices of the adsorbate atoms which are expected to bind.

adsorbate_id_from_db#
adsorbate_db_path#
__len__()#
__str__()#
__repr__()#
_get_adsorbate_from_random(adsorbate_db)#
_load_adsorbate(adsorbate: tuple[Any, Ellipsis]) None#

Saves the fields from an adsorbate stored in a database. Fields added after the first revision are conditionally added for backwards compatibility with older database files.

class data.oc.core.AdsorbateSlabConfig(slab: fairchem.data.oc.core.slab.Slab, adsorbate: fairchem.data.oc.core.slab.Adsorbate, num_sites: int = 100, num_augmentations_per_site: int = 1, interstitial_gap: float = 0.1, mode: str = 'random')#

Initializes a list of adsorbate-catalyst systems for a given Adsorbate and Slab.

Parameters:
  • slab (Slab) – Slab object.

  • adsorbate (Adsorbate) – Adsorbate object.

  • num_sites (int) – Number of sites to sample.

  • num_augmentations_per_site (int) – Number of augmentations of the adsorbate per site. Total number of generated structures will be num_sites * num_augmentations_per_site.

  • interstitial_gap (float) – Minimum distance in Angstroms between adsorbate and slab atoms.

  • mode (str) –

    “random”, “heuristic”, or “random_site_heuristic_placement”. This affects surface site sampling and adsorbate placement on each site.

    In “random”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along xyz, and place it such that the center of mass is at the site.

    In “heuristic”, we use Pymatgen’s AdsorbateSiteFinder to find the most energetically favorable sites, i.e., ontop, bridge, or hollow sites. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In “random_site_heuristic_placement”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In all cases, the adsorbate is placed at the closest position of no overlap with the slab plus interstitial_gap along the surface normal.

slab#
adsorbate#
num_sites#
num_augmentations_per_site#
interstitial_gap#
mode#
sites#
get_binding_sites(num_sites: int)#

Returns up to num_sites sites given the surface atoms’ positions.

place_adsorbate_on_site(adsorbate: fairchem.data.oc.core.slab.Adsorbate, site: numpy.ndarray, interstitial_gap: float = 0.1)#

Place the adsorbate at the given binding site.

place_adsorbate_on_sites(sites: list, num_augmentations_per_site: int = 1, interstitial_gap: float = 0.1)#

Place the adsorbate at the given binding sites.

_get_scaled_normal(adsorbate_c: ase.Atoms, slab_c: ase.Atoms, site: numpy.ndarray, unit_normal: numpy.ndarray, interstitial_gap: float = 0.1)#

Get the scaled normal that gives a proximate configuration without atomic overlap by:

  1. Projecting the adsorbate and surface atoms onto the surface plane.

  2. Identify all adsorbate atom - surface atom combinations for which

    an itersection when translating along the normal would occur. This is where the distance between the projected points is less than r_surface_atom + r_adsorbate_atom

  3. Explicitly solve for the scaled normal at which the distance between

    surface atom and adsorbate atom = r_surface_atom + r_adsorbate_atom + interstitial_gap. This exploits the superposition of vectors and the distance formula, so it requires root finding.

Assumes that the adsorbate’s binding atom or center-of-mass (depending on mode) is already placed at the site.

Parameters:
  • adsorbate_c (ase.Atoms) – A copy of the adsorbate with coordinates at the site

  • slab_c (ase.Atoms) – A copy of the slab

  • site (np.ndarray) – the coordinate of the site

  • adsorbate_atoms (ase.Atoms) – the translated adsorbate

  • unit_normal (np.ndarray) – the unit vector normal to the surface

  • interstitial_gap (float) – the desired distance between the covalent radii of the closest surface and adsorbate atom

Returns:

the magnitude of the normal vector for placement

Return type:

(float)

_find_combos_to_check(adsorbate_c2: ase.Atoms, slab_c2: ase.Atoms, unit_normal: numpy.ndarray, interstitial_gap: float)#

Find the pairs of surface and adsorbate atoms that would have an intersection event while traversing the normal vector. For each pair, return pertanent information for finding the point of intersection. :param adsorbate_c2: A copy of the adsorbate with coordinates at the centered site :type adsorbate_c2: ase.Atoms :param slab_c2: A copy of the slab with atoms wrapped s.t. things are centered

about the site

Parameters:
  • unit_normal (np.ndarray) – the unit vector normal to the surface

  • interstitial_gap (float) – the desired distance between the covalent radii of the closest surface and adsorbate atom

Returns:

each entry in the list corresponds to one pair to check. With the
following information:

[(adsorbate_idx, slab_idx), r_adsorbate_atom + r_slab_atom, slab_atom_position]

Return type:

(list[lists])

_get_projected_points(adsorbate_c2: ase.Atoms, slab_c2: ase.Atoms, unit_normal: numpy.ndarray)#

Find the x and y coordinates of each atom projected onto the surface plane. :param adsorbate_c2: A copy of the adsorbate with coordinates at the centered site :type adsorbate_c2: ase.Atoms :param slab_c2: A copy of the slab with atoms wrapped s.t. things are centered

about the site

Parameters:

unit_normal (np.ndarray) – the unit vector normal to the surface

Returns:

{“ads”: [[x1, y1], [x2, y2], …], “slab”: [[x1, y1], [x2, y2], …],}

Return type:

(dict)

get_metadata_dict(ind)#

Returns a dict containing the atoms object and metadata for one specified config, used for writing to files.

class data.oc.core.Bulk(bulk_atoms: ase.Atoms = None, bulk_id_from_db: int | None = None, bulk_src_id_from_db: str | None = None, bulk_db_path: str = BULK_PKL_PATH, bulk_db: list[dict[str, Any]] | None = None)#

Initializes a bulk object in one of 4 ways: - Directly pass in an ase.Atoms object. - Pass in index of bulk to select from bulk database. - Pass in the src_id of the bulk to select from the bulk database. - Randomly sample a bulk from bulk database if no other option is passed.

Parameters:
  • bulk_atoms (ase.Atoms) – Bulk structure.

  • bulk_id_from_db (int) – Index of bulk in database pkl to select.

  • bulk_src_id_from_db (int) – Src id of bulk to select (e.g. “mp-30”).

  • bulk_db_path (str) – Path to bulk database.

  • bulk_db (List[Dict[str, Any]]) – Already-loaded database.

bulk_id_from_db#
bulk_db_path#
_get_bulk_from_random(bulk_db)#
set_source_dataset_id(src_id: str)#
set_bulk_id_from_db(bulk_id_from_db: int)#
get_slabs(max_miller=2, precomputed_slabs_dir=None)#

Returns a list of possible slabs for this bulk instance.

__len__()#
__str__()#
__repr__()#
__eq__(other) bool#
class data.oc.core.InterfaceConfig(slab: fairchem.data.oc.core.slab.Slab, adsorbates: list[fairchem.data.oc.core.adsorbate.Adsorbate], solvent: fairchem.data.oc.core.solvent.Solvent, ions: list[fairchem.data.oc.core.ion.Ion] | None = None, num_sites: int = 100, num_configurations: int = 1, interstitial_gap: float = 0.1, vacuum_size: int = 15, solvent_interstitial_gap: float = 2, solvent_depth: float = 8, pbc_shift: float = 0.0, packmol_tolerance: float = 2, mode: str = 'random_site_heuristic_placement')#

Bases: fairchem.data.oc.core.multi_adsorbate_slab_config.MultipleAdsorbateSlabConfig

Class to represent a solvent, adsorbate, slab, ion config. This class only returns a fixed combination of adsorbates placed on the surface. Solvent placement is performed by packmol (https://m3g.github.io/packmol/userguide.shtml), with the number of solvent molecules controlled by its corresponding density. Ion placement is random within the desired volume.

Parameters:
  • slab (Slab) – Slab object.

  • adsorbates (List[Adsorbate]) – List of adsorbate objects to place on the slab.

  • solvent (Solvent) – Solvent object

  • ions (List[Ion] = []) – List of ion objects to place

  • num_sites (int) – Number of sites to sample.

  • num_configurations (int) – Number of configurations to generate per slab+adsorbate(s) combination. This corresponds to selecting different site combinations to place the adsorbates on.

  • interstitial_gap (float) – Minimum distance, in Angstroms, between adsorbate and slab atoms as well as the inter-adsorbate distance.

  • vacuum_size (int) – Size of vacuum layer to add to both ends of the resulting atoms object.

  • solvent_interstitial_gap (float) – Minimum distance, in Angstroms, between the solvent environment and the adsorbate-slab environment.

  • solvent_depth (float) – Volume depth to be used to pack solvents inside.

  • pbc_shift (float) – Cushion to add to the packmol volume to avoid overlapping atoms over pbc.

  • packmol_tolerance (float) – Packmol minimum distance to impose between molecules.

  • mode (str) –

    “random”, “heuristic”, or “random_site_heuristic_placement”. This affects surface site sampling and adsorbate placement on each site.

    In “random”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along xyz, and place it such that the center of mass is at the site.

    In “heuristic”, we use Pymatgen’s AdsorbateSiteFinder to find the most energetically favorable sites, i.e., ontop, bridge, or hollow sites. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In “random_site_heuristic_placement”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In all cases, the adsorbate is placed at the closest position of no overlap with the slab plus interstitial_gap along the surface normal.

solvent#
ions#
vacuum_size#
solvent_depth#
solvent_interstitial_gap#
pbc_shift#
packmol_tolerance#
n_mol_per_volume#
create_interface_on_sites(atoms_list: list[ase.Atoms], metadata_list: list[dict])#

Given adsorbate+slab configurations generated from (Multi)AdsorbateSlabConfig and its corresponding metadata, create the solvent/ion interface on top of the provided atoms objects.

create_packmol_atoms(geometry: fairchem.data.oc.utils.geometry.Geometry, n_solvent_mols: int)#

Pack solvent molecules in a provided unit cell volume. Packmol is used to randomly pack solvent molecules in the desired volume.

Parameters:
  • geometry (Geometry) – Geometry object corresponding to the desired cell.

  • n_solvent_mols (int) – Number of solvent molecules to pack in the volume.

run_packmol(packmol_input: str)#

Run packmol.

randomize_coords(atoms: ase.Atoms)#

Randomly place the atoms in its unit cell.

class data.oc.core.Ion(ion_atoms: ase.Atoms = None, ion_id_from_db: int | None = None, ion_db_path: str = ION_PKL_PATH)#

Initializes an ion object in one of 2 ways: - Directly pass in an ase.Atoms object. - Pass in index of ion to select from ion database.

Parameters:
  • ion_atoms (ase.Atoms) – ion structure.

  • ion_id_from_db (int) – Index of ion to select.

  • ion_db_path (str) – Path to ion database.

ion_id_from_db#
ion_db_path#
__len__()#
__str__()#
_load_ion(ion: dict) None#

Saves the fields from an ion stored in a database. Fields added after the first revision are conditionally added for backwards compatibility with older database files.

get_ion_concentration(volume)#

Compute the ion concentration units of M, given a volume in units of Angstrom^3.

class data.oc.core.MultipleAdsorbateSlabConfig(slab: fairchem.data.oc.core.slab.Slab, adsorbates: list[fairchem.data.oc.core.adsorbate.Adsorbate], num_sites: int = 100, num_configurations: int = 1, interstitial_gap: float = 0.1, mode: str = 'random_site_heuristic_placement')#

Bases: fairchem.data.oc.core.adsorbate_slab_config.AdsorbateSlabConfig

Class to represent a slab with multiple adsorbates on it. This class only returns a fixed combination of adsorbates placed on the surface. Unlike AdsorbateSlabConfig which enumerates all possible adsorbate placements, this problem gets combinatorially large.

Parameters:
  • slab (Slab) – Slab object.

  • adsorbates (List[Adsorbate]) – List of adsorbate objects to place on the slab.

  • num_sites (int) – Number of sites to sample.

  • num_configurations (int) – Number of configurations to generate per slab+adsorbate(s) combination. This corresponds to selecting different site combinations to place the adsorbates on.

  • interstitial_gap (float) – Minimum distance, in Angstroms, between adsorbate and slab atoms as well as the inter-adsorbate distance.

  • mode (str) –

    “random”, “heuristic”, or “random_site_heuristic_placement”. This affects surface site sampling and adsorbate placement on each site.

    In “random”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along xyz, and place it such that the center of mass is at the site.

    In “heuristic”, we use Pymatgen’s AdsorbateSiteFinder to find the most energetically favorable sites, i.e., ontop, bridge, or hollow sites. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In “random_site_heuristic_placement”, we do a Delaunay triangulation of the surface atoms, then sample sites uniformly at random within each triangle. When placing the adsorbate, we randomly rotate it along z with only slight rotation along x and y, and place it such that the binding atom is at the site.

    In all cases, the adsorbate is placed at the closest position of no overlap with the slab plus interstitial_gap along the surface normal.

slab#
adsorbates#
num_sites#
interstitial_gap#
mode#
sites#
place_adsorbates_on_sites(sites: list, num_configurations: int = 1, interstitial_gap: float = 0.1)#

Place the adsorbate at the given binding sites.

This method generates a fixed number of configurations where sites are selected to ensure that adsorbate binding indices are at least a fair distance away from each other (covalent radii + interstitial gap). While this helps prevent adsorbate overlap it does not gaurantee it since non-binding adsorbate atoms can overlap if the right combination of angles is sampled.

get_metadata_dict(ind)#

Returns a dict containing the atoms object and metadata for one specified config, used for writing to files.

class data.oc.core.Slab(bulk=None, slab_atoms: ase.Atoms = None, millers: tuple | None = None, shift: float | None = None, top: bool | None = None, oriented_bulk: pymatgen.core.structure.Structure = None, min_ab: float = 8.0)#

Initializes a slab object, i.e. a particular slab tiled along xyz, in one of 2 ways: - Pass in a Bulk object and a slab 5-tuple containing (atoms, miller, shift, top, oriented bulk). - Pass in a Bulk object and randomly sample a slab.

Parameters:
  • bulk (Bulk) – Corresponding Bulk object.

  • slab_atoms (ase.Atoms) – Slab atoms, tiled and tagged

  • millers (tuple) – Miller indices of slab.

  • shift (float) – Shift of slab.

  • top (bool) – Whether slab is top or bottom.

  • min_ab (float) – To confirm that the tiled structure spans this distance

bulk#
atoms#
millers#
shift#
top#
oriented_bulk#
classmethod from_bulk_get_random_slab(bulk=None, max_miller=2, min_ab=8.0, save_path=None)#
classmethod from_bulk_get_specific_millers(specific_millers, bulk=None, min_ab=8.0, save_path=None)#
classmethod from_bulk_get_all_slabs(bulk=None, max_miller=2, min_ab=8.0, save_path=None)#
classmethod from_precomputed_slabs_pkl(bulk=None, precomputed_slabs_pkl=None, max_miller=2, min_ab=8.0)#
classmethod from_atoms(atoms: ase.Atoms = None, bulk=None, **kwargs)#
has_surface_tagged()#
get_metadata_dict()#
__len__()#
__str__()#
__repr__()#
__eq__(other)#
class data.oc.core.Solvent(solvent_atoms: ase.Atoms = None, solvent_id_from_db: int | None = None, solvent_db_path: str | None = SOLVENT_PKL_PATH, solvent_density: float | None = None)#

Initializes a solvent object in one of 2 ways: - Directly pass in an ase.Atoms object. - Pass in index of solvent to select from solvent database.

Parameters:
  • solvent_atoms (ase.Atoms) – Solvent molecule

  • solvent_id_from_db (int) – Index of solvent to select.

  • solvent_db_path (str) – Path to solvent database.

  • solvent_density (float) – Desired solvent density to use. If not specified, the default is used from the solvent databases.

solvent_id_from_db#
solvent_db_path#
solvent_density#
molar_mass#
__len__()#
__str__()#
_load_solvent(solvent: dict) None#

Saves the fields from an adsorbate stored in a database. Fields added after the first revision are conditionally added for backwards compatibility with older database files.

property molecules_per_volume#
Convert the solvent density in g/cm3 to the number of molecules per
angstrom cubed of volume.