| Property | Value |
|---|---|
| Size | 932 NEB relaxation trajectories |
| Reaction Types | Desorptions, Dissociations, Transfers |
| Purpose | Transition state energy calculations |
| Paper | CatTSunami (arXiv) |
| License | CC-BY-4.0 |
Overview
This validation dataset was used to assess model performance in CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks. It comprises 932 NEB relaxation trajectories covering three types of reactions: desorptions, dissociations, and transfers. Nudged elastic band (NEB) calculations are used to locate transition states. Because the transition state energy determines the rate of reaction, access to transition states is very important for catalysis research. For more information, see the paper.
File Structure and Contents
The tar file contains 3 subdirectories: dissociations, desorptions, and transfers. As the names imply, these directories contain the converged DFT trajectories for each of the reaction classes. Within these directories, the trajectories are named to identify the contents of the file. Here is an example and the anatomy of the name:
desorption_id_83_2409_9_111-4_neb1.0.traj
- `desorption` indicates the reaction type (`dissociation` and `transfer` are the other possibilities).
- `id` identifies that the material belongs to the validation in-domain split (`ood`, out of domain, is the other possibility).
- `83` is the task id. This does not provide relevant information.
- `2409` is the bulk index of the bulk used in the ocdata bulk pickle file.
- `9` is the reaction index. For each reaction type there is a reaction pickle file in the repository. In this case it is the 9th entry in that pickle file.
- `111-4`: the first 3 numbers are the Miller indices (i.e. the (1,1,1) surface), and the last number corresponds to the shift value. In this case the 4th shift enumerated was the one used.
- `neb1.0`: the number here indicates the k value used. For the full dataset, 1.0 was used, so this does not distinguish any of the trajectories from one another.
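The naming scheme above can be unpacked programmatically. The following is a minimal sketch, not part of the dataset tooling: `NebTrajName` and `parse_neb_traj_name` are hypothetical names introduced here, and the parser assumes that each Miller index is a single non-negative digit, as in the example filename. Names that encode Miller indices differently would be rejected.

```python
from __future__ import annotations

from dataclasses import dataclass

REACTION_TYPES = {"desorption", "dissociation", "transfer"}
SPLITS = {"id", "ood"}


@dataclass(frozen=True)
class NebTrajName:
    """Fields encoded in a trajectory filename (hypothetical helper type)."""

    reaction_type: str
    split: str
    task_id: int
    bulk_index: int
    reaction_index: int
    miller: tuple[int, int, int]
    shift_index: int
    k: float


def parse_neb_traj_name(filename: str) -> NebTrajName:
    # Drop any directory prefix and the ".traj" extension.
    stem = filename.rsplit("/", 1)[-1]
    if stem.endswith(".traj"):
        stem = stem[: -len(".traj")]

    fields = stem.split("_")
    if len(fields) != 7:
        raise ValueError(f"expected 7 underscore-separated fields: {filename!r}")
    reaction_type, split, task_id, bulk_index, reaction_index, surface, neb = fields

    if reaction_type not in REACTION_TYPES:
        raise ValueError(f"unknown reaction type: {reaction_type!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")

    # Assumption: three single-digit Miller indices, then "-", then the shift.
    miller_digits, _, shift = surface.rpartition("-")
    if len(miller_digits) != 3 or not miller_digits.isdigit():
        raise ValueError(f"unsupported surface field: {surface!r}")
    if not neb.startswith("neb"):
        raise ValueError(f"unexpected k-value field: {neb!r}")

    h, k_index, l = (int(digit) for digit in miller_digits)
    return NebTrajName(
        reaction_type=reaction_type,
        split=split,
        task_id=int(task_id),
        bulk_index=int(bulk_index),
        reaction_index=int(reaction_index),
        miller=(h, k_index, l),
        shift_index=int(shift),
        k=float(neb[len("neb"):]),
    )


info = parse_neb_traj_name("desorption_id_83_2409_9_111-4_neb1.0.traj")
print(info)
```

A parser like this could be used to filter the trajectories in a directory by split or reaction type without opening the files.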
Each trajectory file contains repeating frame sets. Although the initial and final frames are not optimized during the NEB, they are saved for every iteration in the trajectory. For the dataset, 10 frames were used, 8 of which were optimized over the NEB, so the length of the trajectory is the number of iterations (N) * 10. If you wanted to look at the frame set prior to optimization and the optimized frame set, you could get them like this:
from __future__ import annotations

# Download one example trajectory (notebook shell command)
!wget https://dl.fbaipublicfiles.com/opencatalystproject/data/large_files/desorption_id_83_2409_9_111-4_neb1.0.traj

from ase.io import read

# Read every frame saved in the trajectory
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
unrelaxed_frames = traj[0:10]  # frame set before optimization
relaxed_frames = traj[-10:]  # frame set from the final iteration
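To inspect intermediate iterations rather than only the first and last, the flat frame list can be grouped into one frame set per iteration. This is a minimal sketch based on the layout described above (10 frames per iteration); `split_into_frame_sets` is a hypothetical helper, not part of ASE or the dataset tooling. It works on any sequence, including the list returned by `read(..., ":")`.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# 2 fixed endpoint frames + 8 optimized frames, per the description above.
FRAMES_PER_SET = 10


def split_into_frame_sets(
    frames: Sequence[T], frames_per_set: int = FRAMES_PER_SET
) -> List[List[T]]:
    """Group a flat trajectory into consecutive frame sets, one per iteration."""
    if len(frames) % frames_per_set != 0:
        raise ValueError(
            f"{len(frames)} frames is not a multiple of {frames_per_set}"
        )
    return [
        list(frames[start : start + frames_per_set])
        for start in range(0, len(frames), frames_per_set)
    ]


# Stand-in for a trajectory with 3 iterations; integers replace Atoms objects.
demo_sets = split_into_frame_sets(list(range(30)))
print(len(demo_sets), "iterations")
```

With a real trajectory, the first and last entries of the returned list would match `unrelaxed_frames` and `relaxed_frames` from the snippet above.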
Download
| Splits | Size of compressed version | Size of uncompressed version | MD5 checksum (download link) |
|---|---|---|---|
| ASE Trajectories | 1.5G | 6.3G | 52af34a93758c82fae951e52af445089 |
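After downloading, the archive can be checked against the MD5 value in the table. The sketch below uses only the Python standard library. The archive filename in the usage comment is a placeholder, since the actual name is not given here, and it is assumed that the checksum refers to the compressed download.

```python
import hashlib

# MD5 value copied from the table above.
EXPECTED_MD5 = "52af34a93758c82fae951e52af445089"


def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5sum(path) == expected.lower()


# Usage (placeholder filename, replace with the path of the downloaded archive):
# if not matches_expected("downloaded_archive.tar.gz"):
#     raise RuntimeError("checksum mismatch, download may be corrupted")
```

Reading in chunks keeps memory use flat, which matters for a 1.5G archive.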
Use
One more note: we have not prepared an lmdb for this dataset, because NEB calculations are not supported directly in ocp. You must use the ASE-native OCP calculator class along with ASE infrastructure to run NEB calculations. Here is an example:
import os

from ase.io import read
from ase.mep import DyNEB
from ase.optimize import BFGS
from fairchem.core import FAIRChemCalculator, pretrained_mlip

# Start from the unrelaxed frame set (the first 10 frames of the trajectory)
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
images = traj[0:10]

predictor = pretrained_mlip.get_predict_unit("uma-s-1p2")
neb = DyNEB(images, k=1)
for image in images:
    image.calc = FAIRChemCalculator(predictor, task_name="oc20")

optimizer = BFGS(
    neb,
    trajectory="neb.traj",
)

# Use a small number of steps to keep the docs fast during CI; otherwise use reasonable settings.
fast_docs = os.environ.get("FAST_DOCS", "false").lower() == "true"
if fast_docs:
    optimization_steps = 20
else:
    optimization_steps = 300

# Relax the band loosely first, then enable the climbing image and tighten the force threshold
conv = optimizer.run(fmax=0.45, steps=optimization_steps)
if conv:
    neb.climb = True
    conv = optimizer.run(fmax=0.05, steps=optimization_steps)
Step Time Energy fmax
BFGS: 0 06:30:58 -305.702815 5.240339
BFGS: 1 06:31:01 -305.626497 11.579487
BFGS: 2 06:31:03 -305.852102 1.880920
BFGS: 3 06:31:05 -305.868544 2.642307
BFGS: 4 06:31:09 -305.945521 2.276070
BFGS: 5 06:31:13 -305.943725 6.797752
BFGS: 6 06:31:18 -306.192788 9.295232
BFGS: 7 06:31:26 -306.167309 3.395485
BFGS: 8 06:31:31 -306.230167 4.795827
BFGS: 9 06:31:38 -306.255482 0.709488
BFGS: 10 06:31:49 -306.269536 0.620074
BFGS: 11 06:31:56 -306.299099 1.588225
BFGS: 12 06:32:00 -306.353745 1.870320
BFGS: 13 06:32:06 -306.385057 0.461430
BFGS: 14 06:32:09 -306.423201 0.729227
BFGS: 15 06:32:17 -306.461802 1.995952
BFGS: 16 06:32:22 -306.472030 0.835504
BFGS: 17 06:32:26 -306.483140 0.476550
BFGS: 18 06:32:31 -306.503220 0.977266
BFGS: 19 06:32:36 -306.523605 1.272048
BFGS: 20 06:32:40 -306.533017 0.881958
BFGS: 21 06:32:47 -306.535390 1.868985
BFGS: 22 06:32:51 -306.548758 0.544064
BFGS: 23 06:33:00 -306.561675 0.731448
BFGS: 24 06:33:05 -306.571411 0.855244
BFGS: 25 06:33:09 -306.580968 0.496279
BFGS: 26 06:33:16 -306.540646 0.414831
BFGS: 27 06:33:20 -306.282177 2.578132
BFGS: 28 06:33:26 -306.454088 0.717098
BFGS: 29 06:33:30 -306.462170 0.715020
BFGS: 30 06:33:36 -306.457041 0.768458
BFGS: 31 06:33:42 -306.441175 0.804552
BFGS: 32 06:33:46 -306.433374 0.811759
BFGS: 33 06:33:51 -306.396212 0.817737
BFGS: 34 06:33:56 -306.368560 0.774275
BFGS: 35 06:34:02 -306.302298 0.596570
BFGS: 36 06:34:10 -306.261979 0.511326
BFGS: 37 06:34:16 -306.242581 0.525419
BFGS: 38 06:34:21 -306.233574 0.435672
BFGS: 39 06:34:25 -306.228991 0.357363
BFGS: 40 06:34:29 -306.227357 0.294984
BFGS: 41 06:34:34 -306.231344 0.246929
BFGS: 42 06:34:44 -306.236044 0.244458
BFGS: 43 06:34:53 -306.243061 0.243958
BFGS: 44 06:34:59 -306.241176 0.278472
BFGS: 45 06:35:03 -306.229698 0.298817
BFGS: 46 06:35:07 -306.225879 0.222160
BFGS: 47 06:35:09 -306.229998 0.146080
BFGS: 48 06:35:14 -306.233410 0.142553
BFGS: 49 06:35:19 -306.236912 0.164969
BFGS: 50 06:35:21 -306.245134 0.176785
BFGS: 51 06:35:29 -306.255608 0.160678
BFGS: 52 06:35:33 -306.257590 0.127198
BFGS: 53 06:35:37 -306.258149 0.109303
BFGS: 54 06:35:40 -306.258773 0.111618
BFGS: 55 06:35:44 -306.259673 0.150208
BFGS: 56 06:35:47 -306.261080 0.173427
BFGS: 57 06:35:51 -306.260775 0.128418
BFGS: 58 06:35:55 -306.266180 0.127103
BFGS: 59 06:35:59 -306.269670 0.128857
BFGS: 60 06:36:03 -306.270875 0.105971
BFGS: 61 06:36:10 -306.273330 0.229309
BFGS: 62 06:36:16 -306.274814 0.071356
BFGS: 63 06:36:22 -306.276950 0.136389
BFGS: 64 06:36:26 -306.279597 0.219186
BFGS: 65 06:36:30 -306.281142 0.106237
BFGS: 66 06:36:34 -306.281540 0.055007
BFGS: 67 06:36:37 -306.281540 0.141969
BFGS: 68 06:36:43 -306.281540 0.150631
BFGS: 69 06:36:47 -306.281540 0.156283
BFGS: 70 06:36:52 -306.281540 0.104621
BFGS: 71 06:37:00 -306.281540 0.048919
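Once the band has converged, the transition state energy can be estimated from the energies of the final frame set. The sketch below works on a plain list of energies so that it stays independent of any particular library. `barrier_from_band` is a hypothetical helper, and the energies in the example are made-up illustration values, not results from this dataset. It takes the highest-energy frame as the transition state estimate, which is only an approximation unless the climbing image has converged.

```python
from typing import Sequence, Tuple


def barrier_from_band(energies: Sequence[float]) -> Tuple[int, float, float]:
    """Return (highest-frame index, forward barrier, reaction energy).

    `energies` holds one energy per frame, ordered from the initial
    frame to the final frame of a single frame set.
    """
    if len(energies) < 3:
        raise ValueError("need both endpoints and at least one interior frame")
    ts_index = max(range(len(energies)), key=lambda i: energies[i])
    forward_barrier = energies[ts_index] - energies[0]
    reaction_energy = energies[-1] - energies[0]
    return ts_index, forward_barrier, reaction_energy


# Made-up energies (eV) relative to the initial frame, for illustration only.
example_energies = [0.0, 0.1, 0.3, 0.6, 0.9, 0.7, 0.5, 0.4, 0.3, 0.2]
ts_index, forward_barrier, reaction_energy = barrier_from_band(example_energies)
print(f"highest frame: {ts_index}, barrier: {forward_barrier:.2f} eV")
```

With ASE, the per-frame energies could be collected by calling `get_potential_energy()` on each of the last 10 frames of `neb.traj`, assuming the energies were stored when the trajectory was written.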