core.models.equiformer_v2.transformer_block#

Classes#

SO2EquivariantGraphAttention: Perform MLP attention + non-linear message passing

FeedForwardNetwork: Perform feedforward network with S2 activation or gate activation

TransBlockV2

Module Contents#

class core.models.equiformer_v2.transformer_block.SO2EquivariantGraphAttention(sphere_channels: int, hidden_channels: int, num_heads: int, attn_alpha_channels: int, attn_value_channels: int, output_channels: int, lmax_list: list[int], mmax_list: list[int], SO3_rotation, mappingReduced, SO3_grid, max_num_elements: int, edge_channels_list, use_atom_edge_embedding: bool = True, use_m_share_rad: bool = False, activation='scaled_silu', use_s2_act_attn: bool = False, use_attn_renorm: bool = True, use_gate_act: bool = False, use_sep_s2_act: bool = True, alpha_drop: float = 0.0)#

Bases: torch.nn.Module

SO2EquivariantGraphAttention: Perform MLP attention + non-linear message passing

SO(2) convolution with radial function -> S2 activation -> SO(2) convolution -> attention weights and non-linear messages

attention weights * non-linear messages -> Linear

Parameters:
  • sphere_channels (int) – Number of spherical channels

  • hidden_channels (int) – Number of hidden channels used during the SO(2) conv

  • num_heads (int) – Number of attention heads

  • attn_alpha_channels (int) – Number of channels for alpha vector in each attention head

  • attn_value_channels (int) – Number of channels for value vector in each attention head

  • output_channels (int) – Number of output channels

  • lmax_list (list[int]) – List of degrees (l) for each resolution

  • mmax_list (list[int]) – List of orders (m) for each resolution

  • SO3_rotation (list[SO3_Rotation]) – Class to calculate Wigner-D matrices and rotate embeddings

  • mappingReduced (CoefficientMappingModule) – Class to convert l and m indices once node embedding is rotated

  • SO3_grid (SO3_grid) – Class used to convert between grid and spherical harmonic representations

  • max_num_elements (int) – Maximum number of atomic numbers

  • edge_channels_list (list[int]) – List of sizes of invariant edge embedding. For example, [input_channels, hidden_channels, hidden_channels]. The last one will be used as hidden size when use_atom_edge_embedding is True.

  • use_atom_edge_embedding (bool) – Whether to use atomic embedding along with relative distance for edge scalar features

  • use_m_share_rad (bool) – Whether all m components within a type-L vector of one channel share radial function weights

  • activation (str) – Type of activation function

  • use_s2_act_attn (bool) – Whether to use attention after S2 activation. Otherwise, use the same attention as Equiformer

  • use_attn_renorm (bool) – Whether to re-normalize attention weights

  • use_gate_act (bool) – If True, use gate activation. Otherwise, use S2 activation.

  • use_sep_s2_act (bool) – If True, use separable S2 activation when use_gate_act is False.

  • alpha_drop (float) – Dropout rate for attention weights

sphere_channels#
hidden_channels#
num_heads#
attn_alpha_channels#
attn_value_channels#
output_channels#
lmax_list#
mmax_list#
num_resolutions#
SO3_rotation#
mappingReduced#
SO3_grid#
max_num_elements#
edge_channels_list#
use_atom_edge_embedding#
use_m_share_rad#
use_s2_act_attn#
use_attn_renorm#
use_gate_act#
use_sep_s2_act#
extra_m0_output_channels = None#
so2_conv_1#
alpha_dropout = None#
so2_conv_2#
proj#
forward(x: torch.Tensor, atomic_numbers, edge_distance: torch.Tensor, edge_index, node_offset: int = 0)#
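The arrows in the class description above summarize one round of attention-weighted message passing. Below is a minimal, scalar-only sketch of that pattern in plain PyTorch: it drops all of the SO(2)/SO(3) equivariant machinery (Wigner-D rotations, spherical-harmonic coefficients, per-degree channel mixing), and every name in it (EdgeAttentionSketch, radial_mlp, conv_1, conv_2) is an illustrative assumption, not the fairchem API.

```python
# Scalar-only sketch of: conv -> activation -> conv -> (attention weights, messages)
# -> attention weights * messages -> linear. Names and shapes are illustrative only.
import torch
import torch.nn as nn


class EdgeAttentionSketch(nn.Module):
    def __init__(self, channels: int, hidden: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        # Radial function on edge distances (stands in for the invariant edge embedding).
        self.radial_mlp = nn.Sequential(nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, hidden))
        # Stand-ins for the two SO(2) convolutions.
        self.conv_1 = nn.Linear(2 * channels + hidden, hidden)
        self.conv_2 = nn.Linear(hidden, num_heads + channels)  # alpha logits + message values
        self.act = nn.SiLU()
        self.proj = nn.Linear(channels, channels)

    def forward(self, x, edge_distance, edge_index):
        src, dst = edge_index                            # edges point src -> dst
        radial = self.radial_mlp(edge_distance.unsqueeze(-1))
        msg = torch.cat([x[src], x[dst], radial], dim=-1)
        msg = self.act(self.conv_1(msg))                 # activation between the two "convs"
        out = self.conv_2(msg)
        alpha_logits, values = out[:, : self.num_heads], out[:, self.num_heads:]

        # Softmax of attention logits over the incoming edges of each target node.
        alpha = torch.exp(alpha_logits - alpha_logits.max())
        denom = alpha.new_zeros(x.size(0), self.num_heads).index_add_(0, dst, alpha)
        alpha = alpha / denom[dst].clamp_min(1e-12)

        # Weight the non-linear messages (heads averaged for simplicity) and aggregate.
        weighted = values * alpha.mean(dim=-1, keepdim=True)
        agg = torch.zeros_like(x).index_add_(0, dst, weighted)
        return self.proj(agg)
```

With x of shape [num_nodes, channels], edge_distance of shape [num_edges], and edge_index of shape [2, num_edges], the sketch returns per-node features of shape [num_nodes, channels], mirroring the role of forward() above in a drastically simplified setting.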
class core.models.equiformer_v2.transformer_block.FeedForwardNetwork(sphere_channels: int, hidden_channels: int, output_channels: int, lmax_list: list[int], mmax_list: list[int], SO3_grid, activation: str = 'scaled_silu', use_gate_act: bool = False, use_grid_mlp: bool = False, use_sep_s2_act: bool = True)#

Bases: torch.nn.Module

FeedForwardNetwork: Perform feedforward network with S2 activation or gate activation

Parameters:
  • sphere_channels (int) – Number of spherical channels

  • hidden_channels (int) – Number of hidden channels used during feedforward network

  • output_channels (int) – Number of output channels

  • lmax_list (list[int]) – List of degrees (l) for each resolution

  • mmax_list (list[int]) – List of orders (m) for each resolution

  • SO3_grid (SO3_grid) – Class used to convert between grid and spherical harmonic representations

  • activation (str) – Type of activation function

  • use_gate_act (bool) – If True, use gate activation. Otherwise, use S2 activation

  • use_grid_mlp (bool) – If True, project features onto grids and perform MLPs.

  • use_sep_s2_act (bool) – If True, use separable grid MLP when use_grid_mlp is True.

sphere_channels#
hidden_channels#
output_channels#
lmax_list#
mmax_list#
num_resolutions#
sphere_channels_all#
SO3_grid#
use_gate_act#
use_grid_mlp#
use_sep_s2_act#
max_lmax#
so3_linear_1#
so3_linear_2#
forward(input_embedding)#
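As a rough illustration of the gate-activation branch (use_gate_act=True), the sketch below lets the degree-0 (scalar) channels pass through an ordinary non-linearity while also producing gates that scale the higher-degree coefficients, which cannot be fed through a pointwise non-linearity without breaking equivariance. Shapes, names, and the choice of sigmoid gates are illustrative assumptions, not the fairchem implementation.

```python
# Simplified gate-activation feedforward block. x is laid out as
# [num_nodes, num_coefficients, channels], with coefficient 0 holding the scalar part.
import torch
import torch.nn as nn


class GateFFNSketch(nn.Module):
    def __init__(self, channels: int, hidden: int):
        super().__init__()
        self.linear_1 = nn.Linear(channels, hidden)
        self.gate_mlp = nn.Linear(channels, hidden)   # gates derived from degree-0 features
        self.linear_2 = nn.Linear(hidden, channels)
        self.act = nn.SiLU()

    def forward(self, x):
        scalars = x[:, :1, :]                              # degree-0 coefficients only
        hidden = self.linear_1(x)                          # lift every coefficient to hidden width
        gates = torch.sigmoid(self.gate_mlp(scalars))      # [num_nodes, 1, hidden]
        hidden = torch.cat(
            [self.act(hidden[:, :1, :]),                   # ordinary activation on the scalar part
             hidden[:, 1:, :] * gates],                    # gate the higher-degree coefficients
            dim=1,
        )
        return self.linear_2(hidden)
```

The use_grid_mlp path would instead project the coefficients onto a spherical grid, apply an MLP pointwise on the grid, and project back; only the gate branch is sketched here.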
class core.models.equiformer_v2.transformer_block.TransBlockV2(sphere_channels: int, attn_hidden_channels: int, num_heads: int, attn_alpha_channels: int, attn_value_channels: int, ffn_hidden_channels: int, output_channels: int, lmax_list: list[int], mmax_list: list[int], SO3_rotation, mappingReduced, SO3_grid, max_num_elements: int, edge_channels_list: list[int], use_atom_edge_embedding: bool = True, use_m_share_rad: bool = False, attn_activation: str = 'silu', use_s2_act_attn: bool = False, use_attn_renorm: bool = True, ffn_activation: str = 'silu', use_gate_act: bool = False, use_grid_mlp: bool = False, use_sep_s2_act: bool = True, norm_type: str = 'rms_norm_sh', alpha_drop: float = 0.0, drop_path_rate: float = 0.0, proj_drop: float = 0.0)#

Bases: torch.nn.Module

Parameters:
  • sphere_channels (int) – Number of spherical channels

  • attn_hidden_channels (int) – Number of hidden channels used during SO(2) graph attention

  • num_heads (int) – Number of attention heads

  • attn_alpha_channels (int) – Number of channels for alpha vector in each attention head

  • attn_value_channels (int) – Number of channels for value vector in each attention head

  • ffn_hidden_channels (int) – Number of hidden channels used during feedforward network

  • output_channels (int) – Number of output channels

  • lmax_list (list[int]) – List of degrees (l) for each resolution

  • mmax_list (list[int]) – List of orders (m) for each resolution

  • SO3_rotation (list[SO3_Rotation]) – Class to calculate Wigner-D matrices and rotate embeddings

  • mappingReduced (CoefficientMappingModule) – Class to convert l and m indices once node embedding is rotated

  • SO3_grid (SO3_grid) – Class used to convert between grid and spherical harmonic representations

  • max_num_elements (int) – Maximum number of atomic numbers

  • edge_channels_list (list[int]) – List of sizes of invariant edge embedding. For example, [input_channels, hidden_channels, hidden_channels]. The last one will be used as hidden size when use_atom_edge_embedding is True.

  • use_atom_edge_embedding (bool) – Whether to use atomic embedding along with relative distance for edge scalar features

  • use_m_share_rad (bool) – Whether all m components within a type-L vector of one channel share radial function weights

  • attn_activation (str) – Type of activation function for SO(2) graph attention

  • use_s2_act_attn (bool) – Whether to use attention after S2 activation. Otherwise, use the same attention as Equiformer

  • use_attn_renorm (bool) – Whether to re-normalize attention weights

  • ffn_activation (str) – Type of activation function for feedforward network

  • use_gate_act (bool) – If True, use gate activation. Otherwise, use S2 activation

  • use_grid_mlp (bool) – If True, project features onto grids and perform MLPs for the FFN.

  • use_sep_s2_act (bool) – If True, use separable S2 activation when use_gate_act is False.

  • norm_type (str) – Type of normalization layer (['layer_norm', 'layer_norm_sh', 'rms_norm_sh'])

  • alpha_drop (float) – Dropout rate for attention weights

  • drop_path_rate (float) – Drop path rate

  • proj_drop (float) – Dropout rate for outputs of attention and FFN

max_lmax#
norm_1#
ga#
drop_path#
proj_drop#
norm_2#
ffn#
forward(x, atomic_numbers, edge_distance, edge_index, batch, node_offset: int = 0)#
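The attribute list (norm_1, ga, drop_path, proj_drop, norm_2, ffn) suggests a pre-normalization residual layout: normalize, apply SO(2) graph attention, add the residual, then normalize, apply the feedforward network, and add the residual again. The sketch below shows that composition with plain tensors; the exact ordering, the omission of the batch and node_offset arguments, and the plain-tensor interface are assumptions, not the verified source.

```python
# Hedged sketch of a pre-norm residual transformer block built from the listed
# attributes. The submodules are passed in, so this only illustrates composition.
import torch.nn as nn


class TransBlockSketch(nn.Module):
    def __init__(self, norm_1, ga, norm_2, ffn, drop_path=None, proj_drop=None):
        super().__init__()
        self.norm_1, self.ga = norm_1, ga          # normalization + SO(2) graph attention
        self.norm_2, self.ffn = norm_2, ffn        # normalization + feedforward network
        self.drop_path = drop_path if drop_path is not None else nn.Identity()
        self.proj_drop = proj_drop if proj_drop is not None else nn.Identity()

    def forward(self, x, atomic_numbers, edge_distance, edge_index):
        # Attention sub-block with residual connection.
        residual = x
        x = self.ga(self.norm_1(x), atomic_numbers, edge_distance, edge_index)
        x = residual + self.drop_path(self.proj_drop(x))
        # Feedforward sub-block with residual connection.
        residual = x
        x = self.ffn(self.norm_2(x))
        x = residual + self.drop_path(self.proj_drop(x))
        return x
```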