Open Catalyst 2020 Nudged Elastic Band (OC20NEB)

Overview

This validation dataset was used to assess model performance in CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks. It consists of 932 NEB relaxation trajectories covering three types of reactions: desorptions, dissociations, and transfers. NEB calculations locate transition states, and because the transition state energy determines the rate of reaction, access to transition states is very important for catalysis research. For more information, see the paper.

File Structure and Contents

The tar file contains 3 subdirectories: dissociations, desorptions, and transfers. As the names imply, these directories contain the converged DFT trajectories for each of the reaction classes. Within these directories, the trajectories are named to identify the contents of the file. Here is an example and the anatomy of the name:

desorption_id_83_2409_9_111-4_neb1.0.traj

  1. desorption indicates the reaction type (dissociation and transfer are the other possibilities).

  2. id indicates that the material belongs to the validation in-domain split (ood, out of domain, is the other possibility).

  3. 83 is the task id. This does not provide relevant information.

  4. 2409 is the bulk index of the bulk used in the ocdata bulk pickle file.

  5. 9 is the reaction index. For each reaction type there is a reaction pickle file in the repository; in this case it is the 9th entry in that pickle file.

  6. 111-4: the first 3 numbers are the Miller indices (i.e. the (1,1,1) surface), and the last number corresponds to the shift value. In this case, the 4th shift enumerated was the one used.

  7. neb1.0: the number indicates the k value used. For the full dataset, 1.0 was used, so this does not distinguish any of the trajectories from one another.
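The naming scheme above can be unpacked programmatically. The following is a minimal sketch, not part of the dataset tooling: the names `NebTrajName` and `parse_neb_traj_name` are hypothetical, and the parser assumes every file name has exactly the seven underscore-separated fields shown in the example. The Miller indices are kept as the raw string (e.g. "111") rather than split into integers, since the source does not say how multi-digit or negative indices would be written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NebTrajName:
    """Fields encoded in an OC20NEB trajectory file name (hypothetical helper)."""

    reaction_type: str   # desorption, dissociation, or transfer
    split: str           # "id" (in domain) or "ood" (out of domain)
    task_id: int
    bulk_index: int
    reaction_index: int
    miller: str          # raw Miller index string, e.g. "111"
    shift: int           # which enumerated shift was used
    k: float             # k value used for the NEB


def parse_neb_traj_name(filename: str) -> NebTrajName:
    """Split a name like 'desorption_id_83_2409_9_111-4_neb1.0.traj' into fields."""
    stem = filename[: -len(".traj")] if filename.endswith(".traj") else filename
    parts = stem.split("_")
    if len(parts) != 7:
        raise ValueError(f"expected 7 underscore-separated fields, got {len(parts)}")
    reaction_type, split, task_id, bulk_index, reaction_index, surface, neb = parts
    # The surface field is '<miller>-<shift>'; split on the last hyphen.
    miller, shift = surface.rsplit("-", 1)
    if not neb.startswith("neb"):
        raise ValueError(f"expected last field to start with 'neb', got {neb!r}")
    return NebTrajName(
        reaction_type=reaction_type,
        split=split,
        task_id=int(task_id),
        bulk_index=int(bulk_index),
        reaction_index=int(reaction_index),
        miller=miller,
        shift=int(shift),
        k=float(neb[len("neb"):]),
    )
```

For example, `parse_neb_traj_name("desorption_id_83_2409_9_111-4_neb1.0.traj").bulk_index` gives the bulk index as an integer, which could then be used to look up the bulk in the ocdata bulk pickle file.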

The trajectory files contain repeating frame sets. Although the initial and final frames are not optimized during the NEB, they are saved for every iteration in the trajectory. For the dataset, 10 frames were used, 8 of which were optimized over the NEB, so the length of the trajectory is the number of iterations (N) * 10. To look at the frame set prior to optimization and the optimized frame set, you could get them like this:

from __future__ import annotations

!wget https://dl.fbaipublicfiles.com/opencatalystproject/data/large_files/desorption_id_83_2409_9_111-4_neb1.0.traj

from ase.io import read

traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
unrelaxed_frames = traj[0:10]
relaxed_frames = traj[-10:]
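To inspect intermediate iterations rather than only the first and last, the flat trajectory can be cut into one frame set per iteration. This is a minimal sketch under the layout described above (10 frames per iteration, endpoints included); the helper name `split_into_frame_sets` is hypothetical, and it works on any sequence, including the list returned by `read(..., ":")`.

```python
def split_into_frame_sets(frames, frames_per_set=10):
    """Group a flat list of frames into consecutive sets, one per NEB iteration."""
    if len(frames) % frames_per_set != 0:
        raise ValueError(
            f"{len(frames)} frames is not a multiple of {frames_per_set}"
        )
    return [
        frames[start:start + frames_per_set]
        for start in range(0, len(frames), frames_per_set)
    ]
```

With `traj` loaded as in the example above, `frame_sets = split_into_frame_sets(traj)` would give `frame_sets[0]` as the unrelaxed set and `frame_sets[-1]` as the relaxed set, and `frame_sets[i][1:-1]` as the 8 images that were optimized at iteration `i`.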

Download

| Splits | Size of compressed version | Size of uncompressed version | MD5 checksum (download link) |
| --- | --- | --- | --- |
| ASE Trajectories | 1.5G | 6.3G | 52af34a93758c82fae951e52af445089 |
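After downloading, the archive can be checked against the MD5 checksum listed above. The sketch below uses only the Python standard library; the helper name `md5_of_file` and the archive file name in the usage note are hypothetical, and it assumes the listed checksum refers to the compressed download.

```python
import hashlib

# Checksum listed in the download table above.
EXPECTED_MD5 = "52af34a93758c82fae951e52af445089"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A check such as `md5_of_file("oc20neb.tar.gz") == EXPECTED_MD5` (with the path replaced by wherever the archive was saved) should evaluate to `True` for an intact download.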

Use

One more note: we have not prepared an LMDB for this dataset, because NEB calculations are not supported directly in ocp. You must use the ASE-native OCP class along with ASE infrastructure to run NEB calculations. Here is an example:

import os

from ase.io import read
from ase.mep import DyNEB
from ase.optimize import BFGS
from fairchem.core import FAIRChemCalculator, pretrained_mlip

traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
images = traj[0:10]
predictor = pretrained_mlip.get_predict_unit("uma-s-1p1")

neb = DyNEB(images, k=1)
for image in images:
    image.calc = FAIRChemCalculator(predictor, task_name="oc20")

optimizer = BFGS(
    neb,
    trajectory="neb.traj",
)

# Use fewer steps when FAST_DOCS is set to keep the docs build fast in CI; otherwise use more realistic settings.
fast_docs = os.environ.get("FAST_DOCS", "false").lower() == "true"
if fast_docs:
    optimization_steps = 20
else:
    optimization_steps = 300

conv = optimizer.run(fmax=0.45, steps=optimization_steps)
if conv:
    neb.climb = True
    conv = optimizer.run(fmax=0.05, steps=optimization_steps)
WARNING:root:device was not explicitly set, using device='cuda'.
INFO:matplotlib.font_manager:generated new fontManager
      Step     Time          Energy          fmax
BFGS:    0 00:23:33     -305.763014        5.169706
BFGS:    1 00:23:34     -305.691690       11.366598
BFGS:    2 00:23:35     -305.916311        1.889963
BFGS:    3 00:23:36     -305.932505        2.616029
BFGS:    4 00:23:37     -306.010364        2.264345
BFGS:    5 00:23:38     -306.003679        6.892189
BFGS:    6 00:23:39     -306.254761        9.617125
BFGS:    7 00:23:40     -306.224749        3.371285
BFGS:    8 00:23:42     -306.290792        4.665820
BFGS:    9 00:23:43     -306.315121        0.727091
BFGS:   10 00:23:44     -306.329403        0.653947
BFGS:   11 00:23:45     -306.357723        1.619294
BFGS:   12 00:23:46     -306.412187        1.940798
BFGS:   13 00:23:47     -306.441267        0.604978
BFGS:   14 00:23:48     -306.471018        0.559646
BFGS:   15 00:23:49     -306.495146        2.148019
BFGS:   16 00:23:50     -306.497873        0.480605
BFGS:   17 00:23:51     -306.504490        0.516435
BFGS:   18 00:23:52     -306.511322        0.708891
BFGS:   19 00:23:53     -306.508459        0.831833
BFGS:   20 00:23:54     -306.478218        1.208057
BFGS:   21 00:23:55     -306.508818        0.552390
BFGS:   22 00:23:56     -306.510044        0.379422
BFGS:   23 00:23:57     -306.394710        3.084677
BFGS:   24 00:23:59     -306.427078        1.007963
BFGS:   25 00:24:00     -306.392054        0.994024
BFGS:   26 00:24:01     -306.184511        0.895734
BFGS:   27 00:24:02     -306.127008        0.642143
BFGS:   28 00:24:03     -306.158203        0.672238
BFGS:   29 00:24:04     -306.240436        0.423648
BFGS:   30 00:24:05     -306.258085        0.529111
BFGS:   31 00:24:06     -306.257112        0.610357
BFGS:   32 00:24:07     -306.249447        0.647151
BFGS:   33 00:24:08     -306.256735        0.535627
BFGS:   34 00:24:09     -306.273350        0.433455
BFGS:   35 00:24:10     -306.310816        0.512639
BFGS:   36 00:24:11     -306.361214        0.544150
BFGS:   37 00:24:12     -306.433222        0.516889
BFGS:   38 00:24:13     -306.504661        0.482603
BFGS:   39 00:24:14     -306.531591        0.793364
BFGS:   40 00:24:15     -306.458557        1.430714
BFGS:   41 00:24:16     -306.301132        1.010522
BFGS:   42 00:24:17     -306.237249        0.791099
BFGS:   43 00:24:17     -306.261778        0.388615
BFGS:   44 00:24:18     -306.289652        0.345346
BFGS:   45 00:24:19     -306.317480        0.403818
BFGS:   46 00:24:20     -306.326824        0.514846
BFGS:   47 00:24:21     -306.306648        0.538523
BFGS:   48 00:24:22     -306.282291        0.424176
BFGS:   49 00:24:23     -306.273052        0.487417
BFGS:   50 00:24:24     -306.272732        0.292699
BFGS:   51 00:24:25     -306.276274        0.362387
BFGS:   52 00:24:26     -306.292303        0.291724
BFGS:   53 00:24:27     -306.316896        0.357494
BFGS:   54 00:24:27     -306.316236        0.356624
BFGS:   55 00:24:28     -306.308653        0.319056
BFGS:   56 00:24:29     -306.312988        0.271624
BFGS:   57 00:24:30     -306.320799        0.294268
BFGS:   58 00:24:31     -306.324556        0.262827
BFGS:   59 00:24:32     -306.330988        0.286278
BFGS:   60 00:24:33     -306.335382        0.247002
BFGS:   61 00:24:34     -306.332397        0.276673
BFGS:   62 00:24:35     -306.335266        0.216994
BFGS:   63 00:24:36     -306.344211        0.202654
BFGS:   64 00:24:37     -306.352611        0.222141
BFGS:   65 00:24:38     -306.349783        0.274684
BFGS:   66 00:24:39     -306.345886        0.271670
BFGS:   67 00:24:40     -306.364408        0.230120
BFGS:   68 00:24:41     -306.369257        0.380298
BFGS:   69 00:24:42     -306.326990        0.388998
BFGS:   70 00:24:43     -306.341980        0.211296
BFGS:   71 00:24:44     -306.350273        0.243286
BFGS:   72 00:24:45     -306.352587        0.127783
BFGS:   73 00:24:46     -306.351372        0.092888
BFGS:   74 00:24:47     -306.351833        0.074129
BFGS:   75 00:24:48     -306.355076        0.192559
BFGS:   76 00:24:49     -306.351450        0.227728
BFGS:   77 00:24:50     -306.348654        0.179328
BFGS:   78 00:24:50     -306.351635        0.179561
BFGS:   79 00:24:51     -306.356544        0.200022
BFGS:   80 00:24:52     -306.357462        0.092459
BFGS:   81 00:24:53     -306.350241        0.194663
BFGS:   82 00:24:54     -306.337191        0.261703
BFGS:   83 00:24:55     -306.236305        0.527327
BFGS:   84 00:24:56     -306.226904        0.544698
BFGS:   85 00:24:57     -306.302145        0.616413
BFGS:   86 00:24:59     -306.348980        0.873442
BFGS:   87 00:25:00     -306.356737        0.644207
BFGS:   88 00:25:01     -306.342197        0.285928
BFGS:   89 00:25:02     -306.349890        0.137757
BFGS:   90 00:25:03     -306.349125        0.133680
BFGS:   91 00:25:04     -306.346142        0.146332
BFGS:   92 00:25:05     -306.336712        0.196609
BFGS:   93 00:25:06     -306.320446        0.265598
BFGS:   94 00:25:07     -306.303858        0.478625
BFGS:   95 00:25:08     -306.324236        0.862113
BFGS:   96 00:25:09     -306.309334        1.142772
BFGS:   97 00:25:10     -306.310905        0.775792
BFGS:   98 00:25:11     -306.352718        0.711500
BFGS:   99 00:25:12     -306.359920        0.252151
BFGS:  100 00:25:13     -306.356319        0.143572
BFGS:  101 00:25:14     -306.352728        0.190386
BFGS:  102 00:25:15     -306.350527        0.185382
BFGS:  103 00:25:16     -306.342928        0.400624
BFGS:  104 00:25:17     -306.346033        0.311215
BFGS:  105 00:25:18     -306.370729        0.606570
BFGS:  106 00:25:20     -306.372637        0.434609
BFGS:  107 00:25:21     -306.351589        0.185459
BFGS:  108 00:25:22     -306.359194        0.140847
BFGS:  109 00:25:23     -306.362461        0.237821
BFGS:  110 00:25:24     -306.361626        0.336477
BFGS:  111 00:25:25     -306.349756        0.404897
BFGS:  112 00:25:26     -306.342737        0.286626
BFGS:  113 00:25:27     -306.350254        0.268007
BFGS:  114 00:25:28     -306.356449        0.161728
BFGS:  115 00:25:29     -306.359920        0.141921
BFGS:  116 00:25:30     -306.361425        0.160321
BFGS:  117 00:25:31     -306.361132        0.170817
BFGS:  118 00:25:32     -306.348057        0.420472
BFGS:  119 00:25:33     -306.347004        0.386306
BFGS:  120 00:25:34     -306.354097        0.197166
BFGS:  121 00:25:35     -306.357624        0.124620
BFGS:  122 00:25:36     -306.360071        0.227305
BFGS:  123 00:25:37     -306.360758        0.107308
BFGS:  124 00:25:38     -306.360989        0.102954
BFGS:  125 00:25:39     -306.360687        0.095947
BFGS:  126 00:25:39     -306.358166        0.132361
BFGS:  127 00:25:40     -306.360659        0.097813
BFGS:  128 00:25:41     -306.360659        0.111395
BFGS:  129 00:25:41     -306.360659        0.085474
BFGS:  130 00:25:42     -306.360659        0.064465
BFGS:  131 00:25:42     -306.360659        0.045615
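Once the climbing-image run has converged, the highest-energy image along the band is the usual estimate of the transition state. The sketch below shows one way to turn per-image energies into a transition state index and barriers; the helper name `barrier_from_energies` is hypothetical, and it assumes the energies are ordered from the initial frame to the final frame.

```python
def barrier_from_energies(energies):
    """Return (ts_index, forward_barrier, reverse_barrier) for a band of image energies.

    The transition state is taken to be the highest-energy image, which is an
    approximation that relies on the NEB having converged.
    """
    if len(energies) < 3:
        raise ValueError("need at least the two endpoints and one interior image")
    ts_index = max(range(len(energies)), key=lambda i: energies[i])
    forward_barrier = energies[ts_index] - energies[0]
    reverse_barrier = energies[ts_index] - energies[-1]
    return ts_index, forward_barrier, reverse_barrier
```

With the `images` list from the example above, the energies could be collected with `[image.get_potential_energy() for image in images]` and passed to this helper; the same could be done with the DFT-relaxed frame set from the dataset to compare barriers.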