
Evaluating Pretrained Models

FAIRChem v2 provides a number of tools for benchmarking and evaluating the UMA models, enabling apples-to-apples comparisons with the results reported in the paper.

Running Model Evaluations

To evaluate a UMA model, point the evaluation script at a pre-existing configuration file. Example configuration files used to evaluate UMA models are stored in configs/uma/evaluate.

Run the evaluation script:

fairchem --config evaluation_config.yaml

Replace evaluation_config.yaml with the desired config file, for example configs/uma/evaluate/uma_conserving.yaml.

Evaluation Configuration File Format

Evaluation configuration files are written in Hydra YAML format and specify how a model evaluation should be run. UMA evaluation configuration files, which can be used as templates to evaluate other models if needed, are located in configs/uma/evaluate/.

Top-Level Keys

Similar to training configuration files, the only allowed top-level keys are job and runner, as well as interpolation keys that are resolved at runtime.

Important configuration options are nested under these keys as follows:

Under job:

Specifies how the job itself is run. The configuration options here are the same as those for a training job; some notable flags are detailed below.

Under runner:

The actual benchmark details, such as the model checkpoint and the dataset, are specified under the runner key. An evaluation run should use the EvalRunner class, which relies on an MLIPEvalUnit to run inference with a pretrained model.
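As a rough illustration of this two-key layout, an evaluation config might look like the sketch below. Apart from the job and runner keys and the EvalRunner/MLIPEvalUnit names, every field name and value here is a hypothetical placeholder; the files in configs/uma/evaluate show the real schema.

```yaml
# Hypothetical sketch only -- the actual field names and import paths are in
# the configs under configs/uma/evaluate.
job:
  # how the job itself runs; same options as a training job
  run_name: uma_eval_example          # placeholder value
runner:
  # EvalRunner drives the evaluation, using an MLIPEvalUnit for inference
  _target_: EvalRunner                # full import path omitted here
  checkpoint_path: /path/to/model.pt  # placeholder key and value
  dataset: /path/to/eval_data         # placeholder key and value
```

The `_target_` key follows the standard Hydra convention for naming the class to instantiate at runtime.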

Using the defaults Key to Define Config Groups

The defaults key is a Hydra feature that allows you to compose configuration files from modular config groups. Each entry under defaults refers to a config group (such as model, data, or other reusable components) that is merged into the final configuration at runtime. This makes it easy to swap out models, datasets, or other settings without duplicating configuration code.

Using config groups allows you to easily override defaults in the CLI. For example:

fairchem --config evaluation_config.yaml cluster=cluster_config checkpoint=checkpoint_config

Here, cluster_config and checkpoint_config are cluster and checkpoint configuration files placed in directories named cluster and checkpoint, respectively. See the files in configs/uma/evaluate for a full example.
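To sketch how the defaults list composes those groups (the group names cluster and checkpoint mirror the CLI example above; the file contents are illustrative, not the actual UMA configs):

```yaml
# evaluation_config.yaml (illustrative sketch)
defaults:
  - cluster: cluster_config        # merges cluster/cluster_config.yaml
  - checkpoint: checkpoint_config  # merges checkpoint/checkpoint_config.yaml
  - _self_                         # this file's own keys are applied last
```

With this layout, passing cluster=other_config on the command line swaps in cluster/other_config.yaml without editing the main config file, which is the standard Hydra config-group override mechanism.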