core.models.equiformer_v2#
Submodules#
- core.models.equiformer_v2.activation
- core.models.equiformer_v2.drop
- core.models.equiformer_v2.edge_rot_mat
- core.models.equiformer_v2.equiformer_v2
- core.models.equiformer_v2.equiformer_v2_deprecated
- core.models.equiformer_v2.eqv2_to_eqv2_hydra
- core.models.equiformer_v2.gaussian_rbf
- core.models.equiformer_v2.heads
- core.models.equiformer_v2.input_block
- core.models.equiformer_v2.layer_norm
- core.models.equiformer_v2.module_list
- core.models.equiformer_v2.radial_function
- core.models.equiformer_v2.so2_ops
- core.models.equiformer_v2.so3
- core.models.equiformer_v2.trainers
- core.models.equiformer_v2.transformer_block
- core.models.equiformer_v2.weight_initialization
- core.models.equiformer_v2.wigner
Classes#
EquiformerV2 | THIS CLASS HAS BEEN DEPRECATED! Please use "EquiformerV2BackboneAndHeads"
Package Contents#
- class core.models.equiformer_v2.EquiformerV2(use_pbc: bool = True, use_pbc_single: bool = False, regress_forces: bool = True, otf_graph: bool = True, max_neighbors: int = 500, max_radius: float = 5.0, max_num_elements: int = 90, num_layers: int = 12, sphere_channels: int = 128, attn_hidden_channels: int = 128, num_heads: int = 8, attn_alpha_channels: int = 32, attn_value_channels: int = 16, ffn_hidden_channels: int = 512, norm_type: str = 'rms_norm_sh', lmax_list: list[int] | None = None, mmax_list: list[int] | None = None, grid_resolution: int | None = None, num_sphere_samples: int = 128, edge_channels: int = 128, use_atom_edge_embedding: bool = True, share_atom_edge_embedding: bool = False, use_m_share_rad: bool = False, distance_function: str = 'gaussian', num_distance_basis: int = 512, attn_activation: str = 'scaled_silu', use_s2_act_attn: bool = False, use_attn_renorm: bool = True, ffn_activation: str = 'scaled_silu', use_gate_act: bool = False, use_grid_mlp: bool = False, use_sep_s2_act: bool = True, alpha_drop: float = 0.1, drop_path_rate: float = 0.05, proj_drop: float = 0.0, weight_init: str = 'normal', enforce_max_neighbors_strictly: bool = True, avg_num_nodes: float | None = None, avg_degree: float | None = None, use_energy_lin_ref: bool | None = False, load_energy_lin_ref: bool | None = False)#
Bases: torch.nn.Module, fairchem.core.models.base.GraphModelMixin
THIS CLASS HAS BEEN DEPRECATED! Please use “EquiformerV2BackboneAndHeads”
Equiformer with graph attention built upon SO(2) convolution and feedforward network built upon S2 activation
- Parameters:
use_pbc (bool) – Use periodic boundary conditions
use_pbc_single (bool) – Process batch PBC graphs one at a time
regress_forces (bool) – Compute forces
otf_graph (bool) – Compute graph On The Fly (OTF)
max_neighbors (int) – Maximum number of neighbors per atom
max_radius (float) – Maximum distance between neighboring atoms in Angstroms
max_num_elements (int) – Maximum atomic number
num_layers (int) – Number of layers in the GNN
sphere_channels (int) – Number of spherical channels (one set per resolution)
attn_hidden_channels (int) – Number of hidden channels used during SO(2) graph attention
num_heads (int) – Number of attention heads
attn_alpha_channels (int) – Number of channels for alpha vector in each attention head
attn_value_channels (int) – Number of channels for value vector in each attention head
ffn_hidden_channels (int) – Number of hidden channels used during feedforward network
norm_type (str) – Type of normalization layer ([‘layer_norm’, ‘layer_norm_sh’, ‘rms_norm_sh’])
lmax_list (list[int]) – List of maximum degrees of the spherical harmonics (1 to 10)
mmax_list (list[int]) – List of maximum orders of the spherical harmonics (0 to lmax)
grid_resolution (int) – Resolution of SO3_Grid
num_sphere_samples (int) – Number of samples used to approximate the integration of the sphere in the output blocks
edge_channels (int) – Number of channels for the edge invariant features
use_atom_edge_embedding (bool) – Whether to use atomic embedding along with relative distance for edge scalar features
share_atom_edge_embedding (bool) – Whether to share atom_edge_embedding across all blocks
use_m_share_rad (bool) – Whether all m components within a type-L vector of one channel share radial function weights
distance_function ("gaussian", "sigmoid", "linearsigmoid", "silu") – Basis function used for distances
attn_activation (str) – Type of activation function for SO(2) graph attention
use_s2_act_attn (bool) – Whether to use attention after S2 activation. Otherwise, use the same attention as Equiformer
use_attn_renorm (bool) – Whether to re-normalize attention weights
ffn_activation (str) – Type of activation function for feedforward network
use_gate_act (bool) – If True, use gate activation. Otherwise, use S2 activation
use_grid_mlp (bool) – If True, project to grids and apply MLPs for the FFNs.
use_sep_s2_act (bool) – If True, use separable S2 activation when use_gate_act is False.
alpha_drop (float) – Dropout rate for attention weights
drop_path_rate (float) – Drop path rate
proj_drop (float) – Dropout rate for outputs of attention and FFN in Transformer blocks
weight_init (str) – [‘normal’, ‘uniform’] initialization of weights of linear layers except those in radial functions
enforce_max_neighbors_strictly (bool) – When edges are subselected based on the max_neighbors arg, arbitrarily select amongst equidistant / degenerate edges to have exactly the correct number.
avg_num_nodes (float) – Average number of nodes per graph
avg_degree (float) – Average degree of nodes in the graph
use_energy_lin_ref (bool) – Whether to add the per-atom energy references during prediction. During training and validation, this should be kept False since we use the lin_ref parameter in the OC22 dataloader to subtract the per-atom linear references from the energy targets. During prediction (where we don’t have energy targets), this can be set to True to add the per-atom linear references to the predicted energies.
load_energy_lin_ref (bool) – Whether to add nn.Parameters for the per-element energy references. This additional flag is there to ensure compatibility when strict-loading checkpoints, since the use_energy_lin_ref flag can be either True or False even if the model is trained with linear references. You can’t have use_energy_lin_ref = True and load_energy_lin_ref = False, since the model will not have the parameters for the linear references. All other combinations are fine.
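Several of the parameter descriptions above state constraints: the allowed values of norm_type, weight_init and distance_function, the ranges for lmax_list and mmax_list, and the rule that use_energy_lin_ref = True requires load_energy_lin_ref = True. The following is a minimal sketch of a pre-flight check over a dict of constructor keyword arguments. The helper check_equiformer_v2_kwargs is hypothetical and not part of fairchem; it encodes only what the descriptions above say, and the model itself may check more or less than this.

```python
# Hypothetical helper, not part of fairchem. It encodes only the constraints
# written in the parameter descriptions above.
NORM_TYPES = {"layer_norm", "layer_norm_sh", "rms_norm_sh"}
WEIGHT_INITS = {"normal", "uniform"}
DISTANCE_FUNCTIONS = {"gaussian", "sigmoid", "linearsigmoid", "silu"}


def check_equiformer_v2_kwargs(kwargs: dict) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []

    # String options; missing keys fall back to the documented defaults.
    if kwargs.get("norm_type", "rms_norm_sh") not in NORM_TYPES:
        problems.append("norm_type must be one of " + ", ".join(sorted(NORM_TYPES)))
    if kwargs.get("weight_init", "normal") not in WEIGHT_INITS:
        problems.append("weight_init must be 'normal' or 'uniform'")
    if kwargs.get("distance_function", "gaussian") not in DISTANCE_FUNCTIONS:
        problems.append("distance_function is not one of the documented options")

    # Degrees are documented as 1 to 10, orders as 0 to lmax.
    lmax_list = kwargs.get("lmax_list")
    mmax_list = kwargs.get("mmax_list")
    if lmax_list is not None:
        for lmax in lmax_list:
            if not 1 <= lmax <= 10:
                problems.append(f"lmax={lmax} is outside the documented range 1 to 10")
    if lmax_list is not None and mmax_list is not None:
        for lmax, mmax in zip(lmax_list, mmax_list):
            if not 0 <= mmax <= lmax:
                problems.append(f"mmax={mmax} must lie between 0 and lmax={lmax}")

    # use_energy_lin_ref=True needs the reference parameters to exist.
    if kwargs.get("use_energy_lin_ref", False) and not kwargs.get("load_energy_lin_ref", False):
        problems.append("use_energy_lin_ref=True requires load_energy_lin_ref=True")

    return problems


if __name__ == "__main__":
    for problem in check_equiformer_v2_kwargs(
        {"lmax_list": [4], "mmax_list": [6], "use_energy_lin_ref": True}
    ):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in a config at once.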
- use_pbc#
- use_pbc_single#
- regress_forces#
- otf_graph#
- max_neighbors#
- max_radius#
- cutoff#
- max_num_elements#
- num_layers#
- sphere_channels#
- num_heads#
- attn_alpha_channels#
- attn_value_channels#
- norm_type#
- lmax_list#
- mmax_list#
- grid_resolution#
- num_sphere_samples#
- edge_channels#
- use_atom_edge_embedding#
- distance_function#
- num_distance_basis#
- attn_activation#
- use_s2_act_attn#
- use_attn_renorm#
- ffn_activation#
- use_gate_act#
- use_grid_mlp#
- use_sep_s2_act#
- alpha_drop#
- drop_path_rate#
- proj_drop#
- avg_num_nodes#
- avg_degree#
- use_energy_lin_ref#
- load_energy_lin_ref#
- weight_init#
- enforce_max_neighbors_strictly#
- device = 'cpu'#
- grad_forces = False#
- num_resolutions: int#
- sphere_channels_all: int#
- sphere_embedding#
- edge_channels_list#
- SO3_rotation#
- mappingReduced#
- SO3_grid#
- edge_degree_embedding#
- blocks#
- norm#
- energy_block#
- _init_gp_partitions(atomic_numbers_full, data_batch_full, edge_index, edge_distance, edge_distance_vec)#
Graph Parallel: creates the required partial tensors for each rank given the full tensors. The tensors are split on the dimension along the node index using node_partition.
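The idea of splitting full tensors along the node index can be illustrated with a small pure-Python sketch. It is not the fairchem implementation, which operates on torch tensors and is not shown on this page. The function names, the contiguous chunking scheme, and the rule that an edge belongs to the rank owning its target node are all assumptions made for illustration.

```python
def contiguous_node_partition(num_nodes: int, world_size: int, rank: int) -> range:
    """Node indices owned by `rank` when nodes are split into contiguous chunks.

    Assumption: chunks are as even as possible, with earlier ranks taking
    one extra node when num_nodes is not divisible by world_size.
    """
    base, extra = divmod(num_nodes, world_size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def partition_for_rank(atomic_numbers_full, edge_index, edge_distance, world_size, rank):
    """Build the per-rank slices of node and edge data (illustrative only)."""
    nodes = contiguous_node_partition(len(atomic_numbers_full), world_size, rank)
    atomic_numbers = [atomic_numbers_full[i] for i in nodes]

    # Assumption: an edge is kept by the rank that owns its target node.
    sources, targets = edge_index
    keep = [e for e, target in enumerate(targets) if target in nodes]
    local_edge_index = ([sources[e] for e in keep], [targets[e] for e in keep])
    local_edge_distance = [edge_distance[e] for e in keep]
    return nodes, atomic_numbers, local_edge_index, local_edge_distance


if __name__ == "__main__":
    atomic_numbers_full = [8, 1, 1, 6, 8]
    edge_index = ([0, 1, 4, 3], [1, 0, 3, 4])  # (sources, targets)
    edge_distance = [0.96, 0.96, 1.16, 1.16]
    for rank in range(2):
        print(rank, partition_for_rank(atomic_numbers_full, edge_index, edge_distance, 2, rank))
```

Under these assumptions every node and every edge is assigned to exactly one rank, so the per-rank pieces together cover the full graph.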
- forward(data)#
- _init_edge_rot_mat(data, edge_index, edge_distance_vec)#
- property num_params#
- _init_weights(m)#
- _uniform_init_rad_func_linear_weights(m)#
- _uniform_init_linear_weights(m)#
- no_weight_decay() → set#
Returns the set of parameters with no weight decay.
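A set like this is commonly used to build two optimizer parameter groups, one with weight decay and one without. The sketch below shows only the grouping step in plain Python. It assumes the returned set holds parameter names, which this page does not confirm; the helper split_by_weight_decay and the example names are hypothetical.

```python
def split_by_weight_decay(param_names, no_decay_names):
    """Split parameter names into (decay, no_decay) lists, preserving order.

    Assumption: `no_decay_names` is a set of parameter names such as the
    one returned by no_weight_decay().
    """
    decay, no_decay = [], []
    for name in param_names:
        if name in no_decay_names:
            no_decay.append(name)
        else:
            decay.append(name)
    return decay, no_decay


if __name__ == "__main__":
    # Made-up names for illustration; real names depend on the model.
    names = ["blocks.0.weight", "blocks.0.bias", "norm.affine_weight"]
    skip = {"blocks.0.bias", "norm.affine_weight"}
    decay, no_decay = split_by_weight_decay(names, skip)
    print("decay:", decay)
    print("no decay:", no_decay)
```

In PyTorch, the two groups would typically be passed to an optimizer as separate parameter groups, with weight_decay set to 0.0 for the second group.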