core.models.equiformer_v2.equiformer_v2#

Copyright (c) Meta, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.

Attributes#

_AVG_NUM_NODES

_AVG_DEGREE

Classes#

EquiformerV2ForceHead

Force output head for the EquiformerV2 backbone.

EquiformerV2EnergyHead

Energy output head for the EquiformerV2 backbone.

EquiformerV2Backbone

Equiformer with graph attention built upon SO(2) convolution and a feedforward network built upon S2 activation.

Module Contents#

core.models.equiformer_v2.equiformer_v2._AVG_NUM_NODES = 77.81317#
core.models.equiformer_v2.equiformer_v2._AVG_DEGREE = 23.395238876342773#
class core.models.equiformer_v2.equiformer_v2.EquiformerV2ForceHead(backbone)#

Bases: fairchem.core.models.equiformer_v2.heads.EqV2VectorHead

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call to(), etc.

Note

As per the example above, an __init__() call to the parent class must be made before assignment on the child.

Variables:

training (bool) – Boolean representing whether this module is in training or evaluation mode.
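
In practice, the head is constructed from an existing backbone (per the (backbone) constructor above) so it can reuse the backbone's configuration. A minimal sketch, assuming the classes are importable from the installed fairchem package under fairchem.core.models.equiformer_v2.equiformer_v2; the comment about the forward call is an assumption about the head interface, not something stated on this page:

from fairchem.core.models.equiformer_v2.equiformer_v2 import (
    EquiformerV2Backbone,
    EquiformerV2ForceHead,
)

# Build a backbone with its documented defaults (the full 12-layer model),
# then wrap it with the force head.
backbone = EquiformerV2Backbone()
force_head = EquiformerV2ForceHead(backbone)

# Assumption: like other fairchem heads, the head consumes the backbone's
# output embeddings at inference time, roughly force_head(data, backbone(data)).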

class core.models.equiformer_v2.equiformer_v2.EquiformerV2EnergyHead(backbone, reduce: str = 'sum')#

Bases: fairchem.core.models.equiformer_v2.heads.EqV2ScalarHead

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call to(), etc.

Note

As per the example above, an __init__() call to the parent class must be made before assignment on the child.

Variables:

training (bool) – Boolean representing whether this module is in training or evaluation mode.
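
The energy head is wired the same way. Its extra reduce argument defaults to 'sum' per the signature above; the interpretation that it controls how per-atom contributions are pooled into a per-structure energy is an assumption, not something stated on this page. A minimal sketch:

from fairchem.core.models.equiformer_v2.equiformer_v2 import (
    EquiformerV2Backbone,
    EquiformerV2EnergyHead,
)

backbone = EquiformerV2Backbone()

# reduce="sum" is the documented default; "mean" would be the natural
# alternative if the usual sum/mean pooling convention applies (an assumption).
energy_head = EquiformerV2EnergyHead(backbone, reduce="sum")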

class core.models.equiformer_v2.equiformer_v2.EquiformerV2Backbone(use_pbc: bool = True, use_pbc_single: bool = False, regress_forces: bool = True, otf_graph: bool = True, max_neighbors: int = 500, max_radius: float = 5.0, max_num_elements: int = 90, num_layers: int = 12, sphere_channels: int = 128, attn_hidden_channels: int = 128, num_heads: int = 8, attn_alpha_channels: int = 32, attn_value_channels: int = 16, ffn_hidden_channels: int = 512, norm_type: str = 'rms_norm_sh', lmax_list: list[int] | None = None, mmax_list: list[int] | None = None, grid_resolution: int | None = None, num_sphere_samples: int = 128, edge_channels: int = 128, use_atom_edge_embedding: bool = True, share_atom_edge_embedding: bool = False, use_m_share_rad: bool = False, distance_function: str = 'gaussian', num_distance_basis: int = 512, attn_activation: str = 'scaled_silu', use_s2_act_attn: bool = False, use_attn_renorm: bool = True, ffn_activation: str = 'scaled_silu', use_gate_act: bool = False, use_grid_mlp: bool = False, use_sep_s2_act: bool = True, alpha_drop: float = 0.1, drop_path_rate: float = 0.05, proj_drop: float = 0.0, weight_init: str = 'normal', enforce_max_neighbors_strictly: bool = True, avg_num_nodes: float | None = None, avg_degree: float | None = None, use_energy_lin_ref: bool | None = False, load_energy_lin_ref: bool | None = False, activation_checkpoint: bool | None = False)#

Bases: torch.nn.Module, fairchem.core.models.base.GraphModelMixin

Equiformer with graph attention built upon SO(2) convolution and a feedforward network built upon S2 activation. A minimal instantiation sketch follows the parameter list below.

Parameters:
  • use_pbc (bool) – Use periodic boundary conditions

  • use_pbc_single (bool) – Process batch PBC graphs one at a time

  • regress_forces (bool) – Compute forces

  • otf_graph (bool) – Compute graph On The Fly (OTF)

  • max_neighbors (int) – Maximum number of neighbors per atom

  • max_radius (float) – Maximum distance between neighboring atoms in Angstroms

  • max_num_elements (int) – Maximum atomic number

  • num_layers (int) – Number of layers in the GNN

  • sphere_channels (int) – Number of spherical channels (one set per resolution)

  • attn_hidden_channels (int) – Number of hidden channels used in SO(2) graph attention

  • num_heads (int) – Number of attention heads

  • attn_alpha_channels (int) – Number of channels for the alpha vector in each attention head

  • attn_value_channels (int) – Number of channels for the value vector in each attention head

  • ffn_hidden_channels (int) – Number of hidden channels used in the feedforward network

  • norm_type (str) – Type of normalization layer ([‘layer_norm’, ‘layer_norm_sh’, ‘rms_norm_sh’])

  • lmax_list (list[int]) – List of maximum degrees of the spherical harmonics (1 to 10)

  • mmax_list (list[int]) – List of maximum orders of the spherical harmonics (0 to lmax)

  • grid_resolution (int) – Resolution of SO3_Grid

  • num_sphere_samples (int) – Number of samples used to approximate the integration over the sphere in the output blocks

  • edge_channels (int) – Number of channels for the edge invariant features

  • use_atom_edge_embedding (bool) – Whether to use atomic embedding along with relative distance for edge scalar features

  • share_atom_edge_embedding (bool) – Whether to share atom_edge_embedding across all blocks

  • use_m_share_rad (bool) – Whether all m components within a type-L vector of one channel share radial function weights

  • distance_function ("gaussian", "sigmoid", "linearsigmoid", "silu") – Basis function used for distances

  • num_distance_basis (int) – Number of basis functions used to expand interatomic distances

  • attn_activation (str) – Type of activation function for SO(2) graph attention

  • use_s2_act_attn (bool) – Whether to use attention after S2 activation. Otherwise, use the same attention as Equiformer

  • use_attn_renorm (bool) – Whether to re-normalize attention weights

  • ffn_activation (str) – Type of activation function for feedforward network

  • use_gate_act (bool) – If True, use gate activation. Otherwise, use S2 activation

  • use_grid_mlp (bool) – If True, project features onto grids and apply MLPs for the feedforward networks (FFNs).

  • use_sep_s2_act (bool) – If True, use separable S2 activation when use_gate_act is False.

  • alpha_drop (float) – Dropout rate for attention weights

  • drop_path_rate (float) – Drop path rate

  • proj_drop (float) – Dropout rate for outputs of attention and FFN in Transformer blocks

  • weight_init (str) – Initialization scheme ([‘normal’, ‘uniform’]) for the weights of linear layers, except those in radial functions

  • enforce_max_neighbors_strictly (bool) – When edges are subselected based on the max_neighbors arg, arbitrarily select amongst equidistant / degenerate edges to have exactly the correct number.

  • avg_num_nodes (float) – Average number of nodes per graph

  • avg_degree (float) – Average degree of nodes in the graph

  • use_energy_lin_ref (bool) – Whether to add the per-atom energy references during prediction. During training and validation, this should be kept False since we use the lin_ref parameter in the OC22 dataloader to subtract the per-atom linear references from the energy targets. During prediction (where we don’t have energy targets), this can be set to True to add the per-atom linear references to the predicted energies.

  • load_energy_lin_ref (bool) – Whether to add nn.Parameters for the per-element energy references. This additional flag is there to ensure compatibility when strict-loading checkpoints, since the use_energy_lin_ref flag can be either True or False even if the model is trained with linear references. You can’t have use_energy_lin_ref = True and load_energy_lin_ref = False, since the model will not have the parameters for the linear references. All other combinations are fine.
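
A minimal instantiation sketch using only arguments documented above; the values shown are the documented defaults, so they are illustrative rather than a recommended configuration, and the import path is assumed to follow the installed fairchem package layout:

from fairchem.core.models.equiformer_v2.equiformer_v2 import EquiformerV2Backbone

backbone = EquiformerV2Backbone(
    max_radius=5.0,             # neighbor cutoff in Angstroms
    max_neighbors=500,          # cap on neighbors per atom
    otf_graph=True,             # build the radius graph on the fly
    alpha_drop=0.1,             # attention-weight dropout
    drop_path_rate=0.05,        # stochastic depth
    use_energy_lin_ref=False,   # keep False for training/validation (see above)
    load_energy_lin_ref=False,  # must be True whenever use_energy_lin_ref is True
)

# forward(data) takes a torch_geometric Batch and returns a dict of tensors
# (see the forward signature below); the energy/force heads consume that dict.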

activation_checkpoint#
use_pbc#
use_pbc_single#
regress_forces#
otf_graph#
max_neighbors#
max_radius#
cutoff#
max_num_elements#
num_layers#
sphere_channels#
attn_hidden_channels#
num_heads#
attn_alpha_channels#
attn_value_channels#
ffn_hidden_channels#
norm_type#
lmax_list#
mmax_list#
grid_resolution#
num_sphere_samples#
edge_channels#
use_atom_edge_embedding#
share_atom_edge_embedding#
use_m_share_rad#
distance_function#
num_distance_basis#
attn_activation#
use_s2_act_attn#
use_attn_renorm#
ffn_activation#
use_gate_act#
use_grid_mlp#
use_sep_s2_act#
alpha_drop#
drop_path_rate#
proj_drop#
avg_num_nodes#
avg_degree#
use_energy_lin_ref#
load_energy_lin_ref#
weight_init#
enforce_max_neighbors_strictly#
device = 'cpu'#
grad_forces = False#
num_resolutions: int#
sphere_channels_all: int#
sphere_embedding#
edge_channels_list#
SO3_rotation#
mappingReduced#
SO3_grid#
edge_degree_embedding#
blocks#
norm#
forward(data: torch_geometric.data.batch.Batch) → dict[str, torch.Tensor]#
_init_gp_partitions(atomic_numbers_full, data_batch_full, edge_index, edge_distance, edge_distance_vec)#

Graph parallelism: creates the required partial tensors for each rank from the full tensors. The tensors are split along the node-index dimension using node_partition.
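
The sketch below only illustrates the node-wise split described here (slicing node-indexed tensors into the chunk owned by each rank); it is not the actual graph-parallel implementation, and split_by_node_partition / node_partition are hypothetical names:

import torch

def split_by_node_partition(x: torch.Tensor, node_partition: torch.Tensor) -> torch.Tensor:
    # x is indexed by node along dim 0 (e.g. atomic numbers or node features);
    # node_partition holds the node indices owned by the current rank.
    return x[node_partition]

# Toy illustration: rank 0 owns nodes {0, 1}, rank 1 owns nodes {2, 3, 4}.
atomic_numbers_full = torch.tensor([1, 6, 8, 8, 1])
rank0_part = split_by_node_partition(atomic_numbers_full, torch.tensor([0, 1]))
rank1_part = split_by_node_partition(atomic_numbers_full, torch.tensor([2, 3, 4]))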

_init_edge_rot_mat(data, edge_index, edge_distance_vec)#
property num_params#
no_weight_decay() → set#

Returns the set of parameters excluded from weight decay.
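
A common way to consume such a method is to build optimizer parameter groups that exempt the returned entries from weight decay. The sketch assumes no_weight_decay() returns parameter names (this page only states that it returns a set), so treat the matching logic as illustrative:

import torch

def build_param_groups(model: torch.nn.Module, weight_decay: float = 1e-3):
    # Assumption: model.no_weight_decay() yields parameter-name substrings that
    # should be exempt from weight decay.
    skip = model.no_weight_decay() if hasattr(model, "no_weight_decay") else set()
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        (no_decay if any(s in name for s in skip) else decay).append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]

# optimizer = torch.optim.AdamW(build_param_groups(backbone), lr=1e-4)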