Common gotchas with fairchem

Common gotchas with fairchem#

OutOfMemoryError#

If you see errors like:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 390.00 MiB (GPU 0; 10.76 GiB total capacity; 9.59 GiB already allocated; 170.06 MiB free; 9.81 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It means your GPU is out of memory. Some reasons could be that you have multiple notebooks open that are using the GPU, e.g. they have loaded a calculator or something. Try closing all the other notebooks.

It could also mean the batch size is too large to fit in memory. You can try making it smaller in the yml config file (optim.batch_size).

It is recommended you use automatic mixed precision, –amp, in the options to main.py, or in the config.yml.

If it is an option, you can try a GPU with more memory, or you may be able to split the job over multiple GPUs.

I want the energy of a gas phase atom#

But I get an error like

RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

The problem here is that no neighbors are found for the single atom which causes an error. This may be model dependent. There is currently no way to get atomic energies for some models.

from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/fairchem_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)

/home/runner/work/fairchem/fairchem/src/fairchem/core/models/scn/spherical_harmonics.py:23: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  _Jd = torch.load(os.path.join(os.path.dirname(__file__), "Jd.pt"))
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/equiformer_v2/wigner.py:10: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  _Jd = torch.load(os.path.join(os.path.dirname(__file__), "Jd.pt"))
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/escn/so3.py:23: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  _Jd = torch.load(os.path.join(os.path.dirname(__file__), "Jd.pt"))
/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))

WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: oc22_lmdb
  key_mapping:
    force: forces
    y: energy
  normalize_labels: false
  oc20_ref: /checkpoint/janlan/ocp/other_data/final_ref_energies_02_07_2021.pkl
  raw_energy_target: true
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 1
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: atomwisel2
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/57632342
  gpus_per_node: 8
  job_id: '57632342'
  job_name: gnoc_oc22_oc20_all_s2ef
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: ocp,learnaccel
  time: 4320
task:
  dataset: oc22_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: gemnet_oc

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

INFO:root:Loaded GemNetOC with 38864438 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

INFO:root:Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint. 

WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

%%capture
from ase.build import bulk
atoms = bulk('Cu', a=10)
atoms.set_calculator(calc)
atoms.get_potential_energy()

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 4
atoms = bulk('Cu', a=10)
atoms.set_calculator(calc)
----> 4 atoms.get_potential_energy()

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ase/atoms.py:770, in Atoms.get_potential_energy(self, force_consistent, apply_constraint)
   energy = self._calc.get_potential_energy(
       self, force_consistent=force_consistent)
else:
--> 770     energy = self._calc.get_potential_energy(self)
if apply_constraint:
   for constraint in self.constraints:

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ase/calculators/abc.py:24, in GetPropertiesMixin.get_potential_energy(self, atoms, force_consistent)
else:
   name = 'energy'
---> 24 return self.get_property(name, atoms)

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ase/calculators/calculator.py:538, in BaseCalculator.get_property(self, name, atoms, allow_calculation)
   if self.use_cache:
       self.atoms = atoms.copy()
--> 538     self.calculate(atoms, [name], system_changes)
if name not in self.results:
   # For some reason the calculator was not able to do what we want,
   # and that is OK.
   raise PropertyNotImplementedError(
       '{} not present in this ' 'calculation'.format(name)
   )

File ~/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:304, in OCPCalculator.calculate(self, atoms, properties, system_changes)
else:
   batch = atoms
--> 304 predictions = self.trainer.predict(batch, per_image=False, disable_tqdm=True)
for key in predictions:
   _pred = predictions[key]

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
@functools.wraps(func)
def decorate_context(*args, **kwargs):
   with ctx_factory():
--> 116         return func(*args, **kwargs)

File ~/work/fairchem/fairchem/src/fairchem/core/trainers/ocp_trainer.py:473, in OCPTrainer.predict(self, data_loader, per_image, results_file, disable_tqdm)
for _, batch in tqdm(
   enumerate(data_loader),
   total=len(data_loader),
   (...)    470     disable=disable_tqdm,
):
   with torch.autocast("cuda", enabled=self.scaler is not None):
--> 473         out = self._forward(batch)
   for target_key in self.config["outputs"]:
       pred = self._denorm_preds(target_key, out[target_key], batch)

File ~/work/fairchem/fairchem/src/fairchem/core/trainers/ocp_trainer.py:254, in OCPTrainer._forward(self, batch)
def _forward(self, batch):
--> 254     out = self.model(batch.to(self.device))
   outputs = {}
   batch_size = batch.natoms.numel()

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1553, in Module._wrapped_call_impl(self, *args, **kwargs)
   return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
else:
-> 1553     return self._call_impl(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1562, in Module._call_impl(self, *args, **kwargs)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
       or _global_backward_pre_hooks or _global_backward_hooks
       or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562     return forward_call(*args, **kwargs)
try:
   result = None

File ~/work/fairchem/fairchem/src/fairchem/core/common/utils.py:177, in conditional_grad.<locals>.decorator.<locals>.cls_method(self, *args, **kwargs)
if self.regress_forces and not getattr(self, "direct_forces", 0):
   f = dec(func)
--> 177 return f(self, *args, **kwargs)

File ~/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/gemnet_oc.py:1217, in GemNetOC.forward(self, data)
(
   main_graph,
   a2a_graph,
   (...)   1204     quad_idx,
) = self.get_graphs_and_indices(data)
_, idx_t = main_graph["edge_index"]
(
   basis_rad_raw,
   basis_atom_update,
   basis_output,
   bases_qint,
   bases_e2e,
   bases_a2e,
   bases_e2a,
   basis_a2a_rad,
-> 1217 ) = self.get_bases(
   main_graph=main_graph,
   a2a_graph=a2a_graph,
   a2ee2a_graph=a2ee2a_graph,
   qint_graph=qint_graph,
   trip_idx_e2e=trip_idx_e2e,
   trip_idx_a2e=trip_idx_a2e,
   trip_idx_e2a=trip_idx_e2a,
   quad_idx=quad_idx,
   num_atoms=num_atoms,
)
# Embedding block
h = self.atom_emb(atomic_numbers)

File ~/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/gemnet_oc.py:1119, in GemNetOC.get_bases(self, main_graph, a2a_graph, a2ee2a_graph, qint_graph, trip_idx_e2e, trip_idx_a2e, trip_idx_e2a, quad_idx, num_atoms)
if self.quad_interaction:
   bases_qint["rad"] = self.mlp_rbf_qint(basis_rad_main_raw)
-> 1119     bases_qint["cir"] = self.mlp_cbf_qint(
       rad_basis=basis_rad_cir_qint_raw,
       sph_basis=basis_cir_qint_raw,
       idx_sph_outer=quad_idx["triplet_in"]["out"],
   )
   bases_qint["sph"] = self.mlp_sbf_qint(
       rad_basis=basis_rad_sph_qint_raw,
       sph_basis=basis_sph_qint_raw,
       idx_sph_outer=quad_idx["out"],
       idx_sph_inner=quad_idx["out_agg"],
   )
bases_a2e = {}

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1553, in Module._wrapped_call_impl(self, *args, **kwargs)
   return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
else:
-> 1553     return self._call_impl(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1562, in Module._call_impl(self, *args, **kwargs)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
       or _global_backward_pre_hooks or _global_backward_hooks
       or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562     return forward_call(*args, **kwargs)
try:
   result = None

File ~/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/layers/efficient.py:103, in BasisEmbedding.forward(self, rad_basis, sph_basis, idx_rad_outer, idx_rad_inner, idx_sph_outer, idx_sph_inner, num_atoms)
   rad_W1 = rad_basis @ self.weight.reshape(self.weight.shape[0], -1)
   # (num_edges, emb_size_interm * num_spherical)
--> 103     rad_W1 = rad_W1.reshape(num_edges, -1, sph_basis.shape[-1])
   # (num_edges, emb_size_interm, num_spherical)
else:
   # MatMul: mul + sum over num_radial
   rad_W1 = rad_basis @ self.weight.T

RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 7] because the unspecified dimension size -1 can be any value and is ambiguous

I get wildly different energies from the different models#

Some models are trained on adsorption energies, and some are trained on total energies. You have to know which one you are using.

Sometimes you can tell by the magnitude of energies, but you should use care with this. If energies are “small” and near zero they are likely adsorption energies. If energies are “large” in magnitude they are probably total energies. This can be misleading though, as it depends on the total number of atoms in the systems.

# These are to suppress the output from making the calculators.
from io import StringIO
import contextlib

from ase.build import fcc111, add_adsorbate
slab = fcc111('Pt', size=(2, 2, 5), vacuum=10.0)
add_adsorbate(slab, 'O', height=1.2, position='fcc')

from fairchem.core.models.model_registry import model_name_to_local_file

# OC20 model - trained on adsorption energies
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EF-OC20-All', local_cache='/tmp/fairchem_checkpoints/')

with contextlib.redirect_stdout(StringIO()) as _:
    calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)

slab.set_calculator(calc)
slab.get_potential_energy()

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model GemNet-OC-S2EF-OC20-All

/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))
WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: trajectory_lmdb
  grad_target_mean: 0.0
  grad_target_std: 2.887317180633545
  key_mapping:
    force: forces
    y: energy
  normalize_labels: true
  target_mean: -0.7554450631141663
  target_std: 2.887317180633545
  transforms:
    normalizer:
      energy:
        mean: -0.7554450631141663
        stdev: 2.887317180633545
      forces:
        mean: 0.0
        stdev: 2.887317180633545
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 100
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 100
  load_balancing: atoms
  loss_energy: mae
  loss_force: l2mae
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/46876566
  gpus_per_node: 8
  job_id: '46876566'
  job_name: gemnet_q_all_fc100
  mem: 480GB
  nodes: 4
  ntasks_per_node: 8
  partition: learnaccel
  time: 4320
task:
  dataset: trajectory_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: gemnet_oc

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

INFO:root:Loaded GemNetOC with 38864438 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

INFO:root:Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint. 

/home/runner/work/fairchem/fairchem/src/fairchem/core/modules/normalization/normalizer.py:69: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  "mean": torch.tensor(state_dict["mean"]),
WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

/tmp/ipykernel_3657/2356712572.py:11: FutureWarning: Please use atoms.calc = calc
  slab.set_calculator(calc)

1.2851653099060059

# An OC22 checkpoint - trained on total energy
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/fairchem_checkpoints/')

with contextlib.redirect_stdout(StringIO()) as _:
    calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)



slab.set_calculator(calc)
slab.get_potential_energy()

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model GemNet-OC-S2EFS-OC20+OC22

/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))
WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: oc22_lmdb
  key_mapping:
    force: forces
    y: energy
  normalize_labels: false
  oc20_ref: /checkpoint/janlan/ocp/other_data/final_ref_energies_02_07_2021.pkl
  raw_energy_target: true
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 1
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: atomwisel2
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/57632342
  gpus_per_node: 8
  job_id: '57632342'
  job_name: gnoc_oc22_oc20_all_s2ef
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: ocp,learnaccel
  time: 4320
task:
  dataset: oc22_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: gemnet_oc

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

INFO:root:Loaded GemNetOC with 38864438 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

INFO:root:Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint. 

WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

/tmp/ipykernel_3657/2054440827.py:9: FutureWarning: Please use atoms.calc = calc
  slab.set_calculator(calc)

-110.40040588378906

# This eSCN model is trained on adsorption energies
checkpoint_path = model_name_to_local_file('eSCN-L4-M2-Lay12-S2EF-OC20-2M', local_cache='/tmp/fairchem_checkpoints/')

with contextlib.redirect_stdout(StringIO()) as _:
    calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)

slab.set_calculator(calc)
slab.get_potential_energy()

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model eSCN-L4-M2-Lay12-S2EF-OC20-2M

/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))
WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: trajectory_lmdb
  grad_target_mean: 0.0
  grad_target_std: 2.887317180633545
  key_mapping:
    force: forces
    y: energy
  normalize_labels: true
  target_mean: -0.7554450631141663
  target_std: 2.887317180633545
  transforms:
    normalizer:
      energy:
        mean: -0.7554450631141663
        stdev: 2.887317180633545
      forces:
        mean: 0.0
        stdev: 2.887317180633545
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 2
    fn: mae
- forces:
    coefficient: 100
    fn: l2mae
model:
  basis_width_scalar: 2.0
  cutoff: 12.0
  distance_function: gaussian
  hidden_channels: 256
  lmax_list:
  - 4
  max_neighbors: 20
  mmax_list:
  - 2
  name: escn
  num_layers: 12
  num_sphere_samples: 128
  otf_graph: true
  regress_forces: true
  sphere_channels: 128
  use_pbc: true
optim:
  batch_size: 6
  clip_grad_norm: 20
  ema_decay: 0.999
  energy_coefficient: 2
  eval_batch_size: 6
  eval_every: 5000
  force_coefficient: 100
  loss_energy: mae
  loss_force: l2mae
  lr_gamma: 0.3
  lr_initial: 0.0008
  lr_milestones:
  - 145833
  - 187500
  - 229166
  max_epochs: 12
  num_workers: 8
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  warmup_factor: 0.2
  warmup_steps: 100
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  cpus_per_task: 9
  folder: /checkpoint/zitnick/ocp_logs/3710525
  gpus_per_node: 8
  job_id: '3710525'
  job_name: eSCN-L4-M2-Lay12-2M
  mem: 480GB
  nodes: 2
  ntasks_per_node: 8
  partition: learnaccel
  time: 4320
task:
  dataset: trajectory_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  relax_dataset:
    src: /checkpoint/electrocatalysis/relaxations/features/init_to_relaxed/test/id/data.lmdb
  relax_opt:
    alpha: 70.0
    damping: 1.0
    maxstep: 0.04
    memory: 50
    name: lbfgs
    traj_dir: /checkpoint/zitnick/ocp/mloutputs/scn_relaxations/SCNF72-6-lay12/val_id/
  relaxation_steps: 200
  train_on_free_atoms: true
  type: regression
  write_pos: true
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: escn

INFO:root:Loaded eSCN with 36112896 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

/home/runner/work/fairchem/fairchem/src/fairchem/core/modules/normalization/normalizer.py:69: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  "mean": torch.tensor(state_dict["mean"]),
WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

/tmp/ipykernel_3657/1817216860.py:7: FutureWarning: Please use atoms.calc = calc
  slab.set_calculator(calc)

1.6758108139038086

Miscellaneous warnings#

In general, warnings are not errors.

Unrecognized arguments#

With Gemnet models you might see warnings like:

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

You can ignore this warning, it is not important for predictions.

Unable to identify ocp trainer#

The trainer is not specified in some checkpoints, and defaults to forces which means energy and forces are calculated. This is the default for the ASE OCP calculator, and this warning just alerts you it is setting that.

WARNING:root:Unable to identify ocp trainer, defaulting to `forces`. Specify the `trainer` argument into OCPCalculator if otherwise.

Request entity too large - can’t save your Notebook#

If you run commands that generate a lot of output in a notebook, sometimes the Jupyter notebook will become too large to save. It is kind of sad, the only thing I know to do is delete the output of the cell. Then maybe you can save it.

A solution after you know it happens is redirect output to a file.

This has happened when running training in a notebook where there are too many lines of output, or if you have a lot (20+) of inline images.

You need at least four atoms for molecules with some models#

Gemnet in particular seems to require at least 4 atoms. This has to do with interactions between atoms and their neighbors.

%%capture
from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
import os

checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/fairchem_checkpoints/')

calc = OCPCalculator(checkpoint_path=checkpoint_path)

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model GemNet-OC-S2EFS-OC20+OC22

WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: oc22_lmdb
  key_mapping:
    force: forces
    y: energy
  normalize_labels: false
  oc20_ref: /checkpoint/janlan/ocp/other_data/final_ref_energies_02_07_2021.pkl
  raw_energy_target: true
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 1
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: atomwisel2
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/57632342
  gpus_per_node: 8
  job_id: '57632342'
  job_name: gnoc_oc22_oc20_all_s2ef
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: ocp,learnaccel
  time: 4320
task:
  dataset: oc22_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: gemnet_oc

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

INFO:root:Loaded GemNetOC with 38864438 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

INFO:root:Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint. 

WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

%%capture
from ase.build import molecule
import numpy as np

atoms = molecule('H2O')
atoms.set_tags(np.ones(len(atoms)))
atoms.set_calculator(calc)
atoms.get_potential_energy()

To tag or not?#

Some models use tags to determine which atoms to calculate energies for. For example, Gemnet uses a tag=1 to indicate the atom should be calculated. You will get an error with this model

%%capture
from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
import os

checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/fairchem_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model GemNet-OC-S2EFS-OC20+OC22

WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: oc22_lmdb
  key_mapping:
    force: forces
    y: energy
  normalize_labels: false
  oc20_ref: /checkpoint/janlan/ocp/other_data/final_ref_energies_02_07_2021.pkl
  raw_energy_target: true
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 1
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: atomwisel2
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/57632342
  gpus_per_node: 8
  job_id: '57632342'
  job_name: gnoc_oc22_oc20_all_s2ef
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: ocp,learnaccel
  time: 4320
task:
  dataset: oc22_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: gemnet_oc

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

INFO:root:Loaded GemNetOC with 38864438 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

INFO:root:Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint. 

WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

%%capture
atoms = molecule('CH4')
atoms.set_calculator(calc)
atoms.get_potential_energy()  # error

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[11], line 3
atoms = molecule('CH4')
atoms.set_calculator(calc)
----> 3 atoms.get_potential_energy()  # error

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ase/atoms.py:770, in Atoms.get_potential_energy(self, force_consistent, apply_constraint)
   energy = self._calc.get_potential_energy(
       self, force_consistent=force_consistent)
else:
--> 770     energy = self._calc.get_potential_energy(self)
if apply_constraint:
   for constraint in self.constraints:

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ase/calculators/abc.py:24, in GetPropertiesMixin.get_potential_energy(self, atoms, force_consistent)
else:
   name = 'energy'
---> 24 return self.get_property(name, atoms)

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ase/calculators/calculator.py:538, in BaseCalculator.get_property(self, name, atoms, allow_calculation)
   if self.use_cache:
       self.atoms = atoms.copy()
--> 538     self.calculate(atoms, [name], system_changes)
if name not in self.results:
   # For some reason the calculator was not able to do what we want,
   # and that is OK.
   raise PropertyNotImplementedError(
       '{} not present in this ' 'calculation'.format(name)
   )

File ~/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:304, in OCPCalculator.calculate(self, atoms, properties, system_changes)
else:
   batch = atoms
--> 304 predictions = self.trainer.predict(batch, per_image=False, disable_tqdm=True)
for key in predictions:
   _pred = predictions[key]

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
@functools.wraps(func)
def decorate_context(*args, **kwargs):
   with ctx_factory():
--> 116         return func(*args, **kwargs)

File ~/work/fairchem/fairchem/src/fairchem/core/trainers/ocp_trainer.py:473, in OCPTrainer.predict(self, data_loader, per_image, results_file, disable_tqdm)
for _, batch in tqdm(
   enumerate(data_loader),
   total=len(data_loader),
   (...)    470     disable=disable_tqdm,
):
   with torch.autocast("cuda", enabled=self.scaler is not None):
--> 473         out = self._forward(batch)
   for target_key in self.config["outputs"]:
       pred = self._denorm_preds(target_key, out[target_key], batch)

File ~/work/fairchem/fairchem/src/fairchem/core/trainers/ocp_trainer.py:254, in OCPTrainer._forward(self, batch)
def _forward(self, batch):
--> 254     out = self.model(batch.to(self.device))
   outputs = {}
   batch_size = batch.natoms.numel()

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1553, in Module._wrapped_call_impl(self, *args, **kwargs)
   return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
else:
-> 1553     return self._call_impl(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1562, in Module._call_impl(self, *args, **kwargs)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
       or _global_backward_pre_hooks or _global_backward_hooks
       or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562     return forward_call(*args, **kwargs)
try:
   result = None

File ~/work/fairchem/fairchem/src/fairchem/core/common/utils.py:177, in conditional_grad.<locals>.decorator.<locals>.cls_method(self, *args, **kwargs)
if self.regress_forces and not getattr(self, "direct_forces", 0):
   f = dec(func)
--> 177 return f(self, *args, **kwargs)

File ~/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/gemnet_oc.py:1217, in GemNetOC.forward(self, data)
(
   main_graph,
   a2a_graph,
   (...)   1204     quad_idx,
) = self.get_graphs_and_indices(data)
_, idx_t = main_graph["edge_index"]
(
   basis_rad_raw,
   basis_atom_update,
   basis_output,
   bases_qint,
   bases_e2e,
   bases_a2e,
   bases_e2a,
   basis_a2a_rad,
-> 1217 ) = self.get_bases(
   main_graph=main_graph,
   a2a_graph=a2a_graph,
   a2ee2a_graph=a2ee2a_graph,
   qint_graph=qint_graph,
   trip_idx_e2e=trip_idx_e2e,
   trip_idx_a2e=trip_idx_a2e,
   trip_idx_e2a=trip_idx_e2a,
   quad_idx=quad_idx,
   num_atoms=num_atoms,
)
# Embedding block
h = self.atom_emb(atomic_numbers)

File ~/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/gemnet_oc.py:1119, in GemNetOC.get_bases(self, main_graph, a2a_graph, a2ee2a_graph, qint_graph, trip_idx_e2e, trip_idx_a2e, trip_idx_e2a, quad_idx, num_atoms)
if self.quad_interaction:
   bases_qint["rad"] = self.mlp_rbf_qint(basis_rad_main_raw)
-> 1119     bases_qint["cir"] = self.mlp_cbf_qint(
       rad_basis=basis_rad_cir_qint_raw,
       sph_basis=basis_cir_qint_raw,
       idx_sph_outer=quad_idx["triplet_in"]["out"],
   )
   bases_qint["sph"] = self.mlp_sbf_qint(
       rad_basis=basis_rad_sph_qint_raw,
       sph_basis=basis_sph_qint_raw,
       idx_sph_outer=quad_idx["out"],
       idx_sph_inner=quad_idx["out_agg"],
   )
bases_a2e = {}

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1553, in Module._wrapped_call_impl(self, *args, **kwargs)
   return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
else:
-> 1553     return self._call_impl(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:1562, in Module._call_impl(self, *args, **kwargs)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
       or _global_backward_pre_hooks or _global_backward_hooks
       or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562     return forward_call(*args, **kwargs)
try:
   result = None

File ~/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/layers/efficient.py:103, in BasisEmbedding.forward(self, rad_basis, sph_basis, idx_rad_outer, idx_rad_inner, idx_sph_outer, idx_sph_inner, num_atoms)
   rad_W1 = rad_basis @ self.weight.reshape(self.weight.shape[0], -1)
   # (num_edges, emb_size_interm * num_spherical)
--> 103     rad_W1 = rad_W1.reshape(num_edges, -1, sph_basis.shape[-1])
   # (num_edges, emb_size_interm, num_spherical)
else:
   # MatMul: mul + sum over num_radial
   rad_W1 = rad_basis @ self.weight.T

RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 7] because the unspecified dimension size -1 can be any value and is ambiguous

atoms = molecule('CH4')
atoms.set_tags(np.ones(len(atoms)))  # <- critical line for Gemnet
atoms.set_calculator(calc)
atoms.get_potential_energy()

/tmp/ipykernel_3657/3906293788.py:3: FutureWarning: Please use atoms.calc = calc
  atoms.set_calculator(calc)

-23.71796226501465

Not all models require tags though. This EquiformerV2 model does not use them. This is another detail that is important to keep in mind.

from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
import os

checkpoint_path = model_name_to_local_file('EquiformerV2-31M-S2EF-OC20-All+MD', local_cache='/tmp/fairchem_checkpoints/')

calc = OCPCalculator(checkpoint_path=checkpoint_path)

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model EquiformerV2-31M-S2EF-OC20-All+MD

/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))
WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: trajectory_lmdb_v2
  grad_target_mean: 0.0
  grad_target_std: 2.887317180633545
  key_mapping:
    force: forces
    y: energy
  normalize_labels: true
  target_mean: -0.7554450631141663
  target_std: 2.887317180633545
  transforms:
    normalizer:
      energy:
        mean: -0.7554450631141663
        stdev: 2.887317180633545
      forces:
        mean: 0.0
        stdev: 2.887317180633545
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 4
    fn: mae
- forces:
    coefficient: 100
    fn: l2mae
model:
  alpha_drop: 0.1
  attn_activation: silu
  attn_alpha_channels: 64
  attn_hidden_channels: 64
  attn_value_channels: 16
  distance_function: gaussian
  drop_path_rate: 0.1
  edge_channels: 128
  ffn_activation: silu
  ffn_hidden_channels: 128
  grid_resolution: 18
  lmax_list:
  - 4
  max_neighbors: 20
  max_num_elements: 90
  max_radius: 12.0
  mmax_list:
  - 2
  name: equiformer_v2
  norm_type: layer_norm_sh
  num_distance_basis: 512
  num_heads: 8
  num_layers: 8
  num_sphere_samples: 128
  otf_graph: true
  proj_drop: 0.0
  regress_forces: true
  sphere_channels: 128
  use_atom_edge_embedding: true
  use_gate_act: false
  use_grid_mlp: true
  use_pbc: true
  use_s2_act_attn: false
  weight_init: uniform
optim:
  batch_size: 8
  clip_grad_norm: 100
  ema_decay: 0.999
  energy_coefficient: 4
  eval_batch_size: 8
  eval_every: 10000
  force_coefficient: 100
  grad_accumulation_steps: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: l2mae
  lr_initial: 0.0004
  max_epochs: 3
  num_workers: 8
  optimizer: AdamW
  optimizer_params:
    weight_decay: 0.001
  scheduler: LambdaLR
  scheduler_params:
    epochs: 1009275
    lambda_type: cosine
    lr: 0.0004
    lr_min_factor: 0.01
    warmup_epochs: 3364.25
    warmup_factor: 0.2
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 9
  folder: /checkpoint/abhshkdz/open-catalyst-project/logs/equiformer_v2/8307793
  gpus_per_node: 8
  job_id: '8307793'
  job_name: eq2s_051701_allmd
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: learnaccel
  time: 4320
task:
  dataset: trajectory_lmdb_v2
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  primary_metric: forces_mae
  train_on_free_atoms: true
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: equiformer_v2

WARNING:root:equiformer_v2 (EquiformerV2) class is deprecated in favor of equiformer_v2_backbone_and_heads  (EquiformerV2BackboneAndHeads)

INFO:root:Loaded EquiformerV2 with 31058690 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

/home/runner/work/fairchem/fairchem/src/fairchem/core/modules/normalization/normalizer.py:69: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  "mean": torch.tensor(state_dict["mean"]),
WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

atoms = molecule('CH4')

atoms.set_calculator(calc)
atoms.get_potential_energy()

/tmp/ipykernel_3657/4094489779.py:3: FutureWarning: Please use atoms.calc = calc
  atoms.set_calculator(calc)

-0.42973771691322327

Stochastic simulation results#

Some models are not deterministic (SCN/eSCN/EqV2), i.e. you can get slightly different answers each time you run it. An example is shown below. See Issue 563 for more discussion. This happens because a random selection of is made to sample edges, and a different selection is made each time you run it.

from fairchem.core.models.model_registry import model_name_to_local_file
from fairchem.core.common.relaxation.ase_utils import OCPCalculator

checkpoint_path = model_name_to_local_file('EquiformerV2-31M-S2EF-OC20-All+MD', local_cache='/tmp/fairchem_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=True)

from ase.build import fcc111, add_adsorbate
from ase.optimize import BFGS
slab = fcc111('Pt', size=(2, 2, 5), vacuum=10.0)
add_adsorbate(slab, 'O', height=1.2, position='fcc')
slab.set_calculator(calc)

results = []
for i in range(10):
    calc.calculate(slab, ['energy'], None)
    results += [slab.get_potential_energy()]

import numpy as np
print(np.mean(results), np.std(results))
for result in results:
    print(result)

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model EquiformerV2-31M-S2EF-OC20-All+MD

/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))

WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: trajectory_lmdb_v2
  grad_target_mean: 0.0
  grad_target_std: 2.887317180633545
  key_mapping:
    force: forces
    y: energy
  normalize_labels: true
  target_mean: -0.7554450631141663
  target_std: 2.887317180633545
  transforms:
    normalizer:
      energy:
        mean: -0.7554450631141663
        stdev: 2.887317180633545
      forces:
        mean: 0.0
        stdev: 2.887317180633545
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 4
    fn: mae
- forces:
    coefficient: 100
    fn: l2mae
model:
  alpha_drop: 0.1
  attn_activation: silu
  attn_alpha_channels: 64
  attn_hidden_channels: 64
  attn_value_channels: 16
  distance_function: gaussian
  drop_path_rate: 0.1
  edge_channels: 128
  ffn_activation: silu
  ffn_hidden_channels: 128
  grid_resolution: 18
  lmax_list:
  - 4
  max_neighbors: 20
  max_num_elements: 90
  max_radius: 12.0
  mmax_list:
  - 2
  name: equiformer_v2
  norm_type: layer_norm_sh
  num_distance_basis: 512
  num_heads: 8
  num_layers: 8
  num_sphere_samples: 128
  otf_graph: true
  proj_drop: 0.0
  regress_forces: true
  sphere_channels: 128
  use_atom_edge_embedding: true
  use_gate_act: false
  use_grid_mlp: true
  use_pbc: true
  use_s2_act_attn: false
  weight_init: uniform
optim:
  batch_size: 8
  clip_grad_norm: 100
  ema_decay: 0.999
  energy_coefficient: 4
  eval_batch_size: 8
  eval_every: 10000
  force_coefficient: 100
  grad_accumulation_steps: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: l2mae
  lr_initial: 0.0004
  max_epochs: 3
  num_workers: 8
  optimizer: AdamW
  optimizer_params:
    weight_decay: 0.001
  scheduler: LambdaLR
  scheduler_params:
    epochs: 1009275
    lambda_type: cosine
    lr: 0.0004
    lr_min_factor: 0.01
    warmup_epochs: 3364.25
    warmup_factor: 0.2
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 9
  folder: /checkpoint/abhshkdz/open-catalyst-project/logs/equiformer_v2/8307793
  gpus_per_node: 8
  job_id: '8307793'
  job_name: eq2s_051701_allmd
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: learnaccel
  time: 4320
task:
  dataset: trajectory_lmdb_v2
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  primary_metric: forces_mae
  train_on_free_atoms: true
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: equiformer_v2

WARNING:root:equiformer_v2 (EquiformerV2) class is deprecated in favor of equiformer_v2_backbone_and_heads  (EquiformerV2BackboneAndHeads)

INFO:root:Loaded EquiformerV2 with 31058690 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

/home/runner/work/fairchem/fairchem/src/fairchem/core/modules/normalization/normalizer.py:69: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  "mean": torch.tensor(state_dict["mean"]),
WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

/tmp/ipykernel_3657/3396863997.py:11: FutureWarning: Please use atoms.calc = calc
  slab.set_calculator(calc)

212760066986084 1.4032305688440477e-06
2127609252929688
212761402130127
2127597332000732
2127599716186523
212759256362915
2127573490142822
212759017944336
2127597332000732
2127604484558105
2127628326416016

The forces don’t sum to zero#

In DFT, the forces on all the atoms should sum to zero; otherwise, there is a net translational or rotational force present. This is not enforced in fairchem models. Instead, individual forces are predicted, with no constraint that they sum to zero. If the force predictions are very accurate, then they sum close to zero. You can further improve this if you subtract the mean force from each atom.

from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('EquiformerV2-31M-S2EF-OC20-All+MD', local_cache='/tmp/fairchem_checkpoints/')

from fairchem.core.common.relaxation.ase_utils import OCPCalculator
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=True)

from ase.build import fcc111, add_adsorbate
from ase.optimize import BFGS
slab = fcc111('Pt', size=(2, 2, 5), vacuum=10.0)
add_adsorbate(slab, 'O', height=1.2, position='fcc')
slab.set_calculator(calc)

f = slab.get_forces()
f.sum(axis=0)

INFO:root:Checking local cache: /tmp/fairchem_checkpoints/ for model EquiformerV2-31M-S2EF-OC20-All+MD

/home/runner/work/fairchem/fairchem/src/fairchem/core/common/relaxation/ase_utils.py:200: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=torch.device("cpu"))
WARNING:root:Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.

INFO:root:amp: true
cmd:
  checkpoint_dir: /home/runner/work/fairchem/fairchem/docs/core/checkpoints/2025-03-29-04-41-36
  commit: core:b88b66e,experimental:NA
  identifier: ''
  logs_dir: /home/runner/work/fairchem/fairchem/docs/core/logs/wandb/2025-03-29-04-41-36
  print_every: 100
  results_dir: /home/runner/work/fairchem/fairchem/docs/core/results/2025-03-29-04-41-36
  seed: null
  timestamp_id: 2025-03-29-04-41-36
  version: 0.1.dev1+gb88b66e
dataset:
  format: trajectory_lmdb_v2
  grad_target_mean: 0.0
  grad_target_std: 2.887317180633545
  key_mapping:
    force: forces
    y: energy
  normalize_labels: true
  target_mean: -0.7554450631141663
  target_std: 2.887317180633545
  transforms:
    normalizer:
      energy:
        mean: -0.7554450631141663
        stdev: 2.887317180633545
      forces:
        mean: 0.0
        stdev: 2.887317180633545
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: wandb
loss_functions:
- energy:
    coefficient: 4
    fn: mae
- forces:
    coefficient: 100
    fn: l2mae
model:
  alpha_drop: 0.1
  attn_activation: silu
  attn_alpha_channels: 64
  attn_hidden_channels: 64
  attn_value_channels: 16
  distance_function: gaussian
  drop_path_rate: 0.1
  edge_channels: 128
  ffn_activation: silu
  ffn_hidden_channels: 128
  grid_resolution: 18
  lmax_list:
  - 4
  max_neighbors: 20
  max_num_elements: 90
  max_radius: 12.0
  mmax_list:
  - 2
  name: equiformer_v2
  norm_type: layer_norm_sh
  num_distance_basis: 512
  num_heads: 8
  num_layers: 8
  num_sphere_samples: 128
  otf_graph: true
  proj_drop: 0.0
  regress_forces: true
  sphere_channels: 128
  use_atom_edge_embedding: true
  use_gate_act: false
  use_grid_mlp: true
  use_pbc: true
  use_s2_act_attn: false
  weight_init: uniform
optim:
  batch_size: 8
  clip_grad_norm: 100
  ema_decay: 0.999
  energy_coefficient: 4
  eval_batch_size: 8
  eval_every: 10000
  force_coefficient: 100
  grad_accumulation_steps: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: l2mae
  lr_initial: 0.0004
  max_epochs: 3
  num_workers: 8
  optimizer: AdamW
  optimizer_params:
    weight_decay: 0.001
  scheduler: LambdaLR
  scheduler_params:
    epochs: 1009275
    lambda_type: cosine
    lr: 0.0004
    lr_min_factor: 0.01
    warmup_epochs: 3364.25
    warmup_factor: 0.2
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 9
  folder: /checkpoint/abhshkdz/open-catalyst-project/logs/equiformer_v2/8307793
  gpus_per_node: 8
  job_id: '8307793'
  job_name: eq2s_051701_allmd
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: learnaccel
  time: 4320
task:
  dataset: trajectory_lmdb_v2
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  primary_metric: forces_mae
  train_on_free_atoms: true
test_dataset: {}
trainer: ocp
val_dataset: {}

INFO:root:Loading model: equiformer_v2

WARNING:root:equiformer_v2 (EquiformerV2) class is deprecated in favor of equiformer_v2_backbone_and_heads  (EquiformerV2BackboneAndHeads)

INFO:root:Loaded EquiformerV2 with 31058690 parameters.

INFO:root:Loading checkpoint in inference-only mode, not loading keys associated with trainer state!

/home/runner/work/fairchem/fairchem/src/fairchem/core/modules/normalization/normalizer.py:69: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  "mean": torch.tensor(state_dict["mean"]),
WARNING:root:No seed has been set in modelcheckpoint or OCPCalculator! Results may not be reproducible on re-run

/tmp/ipykernel_3657/4037009387.py:11: FutureWarning: Please use atoms.calc = calc
  slab.set_calculator(calc)

array([ 0.01599004,  0.00170514, -0.07198024], dtype=float32)

# This makes them sum closer to zero by removing net translational force
(f - f.mean(axis=0)).sum(axis=0)

array([ 1.3370300e-07,  5.7567377e-08, -2.3841858e-07], dtype=float32)