OMol25 Leaderboard

OMol25 Leaderboard#

As part of the OMol25 release, we present a community leaderboard for researchers to submit their predictions for evaluation - fairchem_leaderboard. Below we outline the steps to generate predictions and submit them to the leaderboard.

The leaderboard is broken into two different sections - “S2EF” and “Evaluations”. Structure to Energy and Forces (S2EF) is the most straightforward evaluation for MLIPs - given a structure, how well can you predict the total energy and per-atom forces. Evaluations correspond to several chemistry relevant tasks (spin gap, ligand-strain, etc.) introduced in OMol25 to evaluate MLIPs beyond simple energy and force metrics (see the paper for more details).

The simplest way to get started is to have an ASE-compatible MLIP calculator that can make energy and force predictions. Input data for the different benchmarks can be downloaded below.

💾 Download#

Benchmarks	URL
S2EF (Val/Test)	https://huggingface.co/facebook/OMol25/blob/main/DATASET.md#dataset-splits
Evaluations	https://huggingface.co/facebook/OMol25/blob/main/DATASET.md#evaluation-data

Install the necessary packages#

pip install "fairchem-core>=2.5.0"
pip install "fairchem-data-omol>=0.1.1"

S2EF#

The leadebroard supports S2EF evaluations for both the OMol25 “Validation” and “Test” sets. Validation labels are already accessible in the released dataset for local benchmarking and debugging, so we highly encourage users to make Test submissions to fairly and accurately compare models. The size of each split is as follows:

Split	Size
Val	2,762,021
Test	2,805,046

Predictions must be saved as “.npz” files and shall contain the following information:

ids <class 'numpy.ndarray'>
energy <class 'numpy.ndarray'>
forces <class 'numpy.ndarray'>
natoms <class 'numpy.ndarray'>

Where,

ids corresponds to the unique identifier, atoms.info["source"]
energy is the predicted energy
forces is the predicted forces, concatenated across all systems
natoms is the number of atoms corresponding to each prediction

As an example:

from fairchem.core.datasets import AseDBDataset
from fairchem.core import pretrained_mlip, FAIRChemCalculator

### Define your MLIP calculator
predictor = pretrained_mlip.get_predict_unit(args.checkpoint, device="cuda")
calc = FAIRChemCalculator(predictor, task_name="omol")

### Read in the dataset you wish to submit predictions to
dataset = AseDBDataset({"src": "path/to/omol/test_data"})

ids = []
energy = []
forces = []
natoms = []
for idx in range(len(dataset)):
    atoms = dataset.get_atoms(idx)
    atoms.calc = calc
    ids.append(atoms.info["source"])
    natoms.append(len(atoms))
    energy.append(atoms.get_potential_energy())
    forces.append(atoms.get_forces())

### Do not forget this! Your submission will fail.
forces = np.concatenate(forces)

np.savez_compressed(
    "test_predictions.npz",
    ids=ids,
    energy=energy,
    forces=forces,
    natoms=natoms,
)

:warning: DISCLAIMER: The above example can be very slow on a single GPU and we encourage users to parallelize this however they like. We provide the example as a means to understand the expected format for the leaderboard.

Once a prediction file is generated, proceed to the leaderboard, fill in the submission form, upload your file, select “Validation” or “Test” and hit submit. Stay on the page until you see the success message.

Evaluations#

The following evaluations are currently available on the OMol25 leaderboard:

Ligand pocket: Protein-ligand interaction energy as a proxy to the binding energy, central to many biological processes.
Ligand strain: Ligand-strain energy is an important task to understanding protein-ligand binding.
Conformers: Identifying the lowest energy conformer is a crucial part of many biological and pharmaceutical tasks.
Protonation: As a proxy to pKa prediction, we evaluate energy differences of structures differing by one proton.
Distance scaling: Short range and long range intermolecular interactions are essential for observable properties like phase changes, density, etc.
IE/EA: The addition, removal, and transfer of electrons is central to many redox processes.
Spin gap: Differences between spin states can play a critical role of molecular optic devices and photactive catalysts.

For a detailed descripion of each task we refer people to the original manuscript.

To generate prediction files for the different tasks, we have released a set of recipes to be used with ASE-compatible calculators. Each evaluation task has its own unique structure, a detailed description of the expected output is provided in the recipe docstrings. The following recipes should be used to evaluate the corresponding task:

As an example, to run the ligand_pocket evaluation:

import json
import pickle
from fairchem.core import pretrained_mlip, FAIRChemCalculator
from fairchem.core.components.calculate.recipes.omol import ligand_pocket

### Define your MLIP calculator
predictor = pretrained_mlip.get_predict_unit(args.checkpoint, device="cuda")
calc = FAIRChemCalculator(predictor, task_name="omol")

### Load the desired evaluation task input data
with open("path/to/ligand_pocket_inputs.pkl", "rb") as f:
    ligand_pocket_data = pickle.load(f)

results = ligand_pocket(ligand_pocket_data, calc)
with open("ligand_pocket_results.json") as f:
    json.dump(results, f)

:warning: DISCLAIMER: Conformers, Protonation, Ligand strain, and Distance scaling can be quite slow on a single GPU and we encourage userse to parallelize this however they like.

Once a prediction file is generated, proceed to the leaderboard, fill in the submission form, upload your file, select the corresponding evaluation task and hit submit. Stay on the page until you see the success message.