2024-09-18 21:13:15 (INFO): Running in local mode without elastic launch (single gpu only)
2024-09-18 21:13:15 (INFO): Setting env PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
2024-09-18 21:13:15 (INFO): Project root: /home/runner/work/fairchem/fairchem/src/fairchem
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/escn/so3.py:23: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  _Jd = torch.load(os.path.join(os.path.dirname(__file__), "Jd.pt"))
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/scn/spherical_harmonics.py:23: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  _Jd = torch.load(os.path.join(os.path.dirname(__file__), "Jd.pt"))
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/equiformer_v2/wigner.py:10: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  _Jd = torch.load(os.path.join(os.path.dirname(__file__), "Jd.pt"))
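The three `torch.load` warnings above all come from the same pattern: loading precomputed coefficient tensors (`Jd.pt`) without `weights_only=True`. A minimal sketch of the migration the warning itself recommends, assuming the file contains only tensors (`MyCustomClass` below is a hypothetical placeholder, not a fairchem type):

```python
import os
import torch

# Safer load for tensor-only files such as Jd.pt: restricts unpickling
# to allowlisted types, as the FutureWarning recommends.
_Jd = torch.load(
    os.path.join(os.path.dirname(__file__), "Jd.pt"),
    weights_only=True,
)

# For trusted files that pickle custom objects, allowlist them first
# (MyCustomClass is a hypothetical stand-in):
# torch.serialization.add_safe_globals([MyCustomClass])
```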
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/equiformer_v2/layer_norm.py:75: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @torch.cuda.amp.autocast(enabled=False)
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/equiformer_v2/layer_norm.py:175: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @torch.cuda.amp.autocast(enabled=False)
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/equiformer_v2/layer_norm.py:263: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @torch.cuda.amp.autocast(enabled=False)
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/equiformer_v2/layer_norm.py:357: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @torch.cuda.amp.autocast(enabled=False)
2024-09-18 21:13:16 (INFO): amp: false
cmd:
  checkpoint_dir: fine-tuning/checkpoints/2024-09-18-21-13-36-ft-oxides
  commit: '8226618'
  identifier: ft-oxides
  logs_dir: fine-tuning/logs/tensorboard/2024-09-18-21-13-36-ft-oxides
  print_every: 10
  results_dir: fine-tuning/results/2024-09-18-21-13-36-ft-oxides
  seed: 0
  timestamp_id: 2024-09-18-21-13-36-ft-oxides
  version: 0.1.dev1+g8226618
dataset:
  a2g_args:
    r_energy: true
    r_forces: true
  format: ase_db
  src: train.db
evaluation_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gp_gpus: null
gpus: 0
logger: tensorboard
loss_functions:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 1
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
optim:
  batch_size: 4
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 10
  factor: 0.8
  force_coefficient: 1
  loss_energy: mae
  lr_initial: 0.0005
  max_epochs: 1
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
relax_dataset: {}
slurm: {}
task: {}
test_dataset:
  a2g_args:
    r_energy: false
    r_forces: false
  format: ase_db
  src: test.db
trainer: ocp
val_dataset:
  a2g_args:
    r_energy: true
    r_forces: true
  format: ase_db
  src: val.db
2024-09-18 21:13:16 (INFO): Loading model: gemnet_oc
2024-09-18 21:13:16 (WARNING): Unrecognized arguments: ['symmetric_edge_symmetrization']
2024-09-18 21:13:18 (INFO): Loaded GemNetOC with 38864438 parameters.
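The four `layer_norm.py` warnings above are one deprecation repeated per decorated function, and the migration the message suggests is mechanical. A sketch of both the decorator and context-manager forms (function and variable names here are illustrative, not fairchem's):

```python
import torch
import torch.nn.functional as F

# Old: @torch.cuda.amp.autocast(enabled=False)
# New: name the device type explicitly.
@torch.amp.autocast("cuda", enabled=False)
def layer_norm_fp32(x: torch.Tensor) -> torch.Tensor:
    # Normalization is often pinned to fp32 for numerical stability.
    return F.layer_norm(x, x.shape[-1:])

def forward(model, batch, scaler=None):
    # Context-manager form of the same migration: autocast only when a
    # GradScaler is in use.
    with torch.amp.autocast("cuda", enabled=scaler is not None):
        return model(batch)
```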
2024-09-18 21:13:18 (WARNING): log_summary for Tensorboard not supported
2024-09-18 21:13:18 (INFO): Loading dataset: ase_db
2024-09-18 21:13:18 (WARNING): Could not find dataset metadata.npz files in '[PosixPath('train.db')]'
2024-09-18 21:13:18 (WARNING): Disabled BalancedBatchSampler because num_replicas=1.
2024-09-18 21:13:18 (WARNING): Failed to get data sizes, falling back to uniform partitioning. BalancedBatchSampler requires a dataset that has a metadata attribute with the number of atoms.
2024-09-18 21:13:18 (INFO): rank: 0: Sampler created...
2024-09-18 21:13:18 (INFO): Created BalancedBatchSampler with sampler=, batch_size=4, drop_last=False
2024-09-18 21:13:18 (WARNING): Could not find dataset metadata.npz files in '[PosixPath('val.db')]'
2024-09-18 21:13:18 (WARNING): Disabled BalancedBatchSampler because num_replicas=1.
2024-09-18 21:13:18 (WARNING): Failed to get data sizes, falling back to uniform partitioning. BalancedBatchSampler requires a dataset that has a metadata attribute with the number of atoms.
2024-09-18 21:13:18 (INFO): rank: 0: Sampler created...
2024-09-18 21:13:18 (INFO): Created BalancedBatchSampler with sampler=, batch_size=16, drop_last=False
2024-09-18 21:13:18 (WARNING): Could not find dataset metadata.npz files in '[PosixPath('test.db')]'
2024-09-18 21:13:18 (WARNING): Disabled BalancedBatchSampler because num_replicas=1.
2024-09-18 21:13:18 (WARNING): Failed to get data sizes, falling back to uniform partitioning. BalancedBatchSampler requires a dataset that has a metadata attribute with the number of atoms.
2024-09-18 21:13:18 (INFO): rank: 0: Sampler created...
2024-09-18 21:13:18 (INFO): Created BalancedBatchSampler with sampler=, batch_size=16, drop_last=False
2024-09-18 21:13:18 (WARNING): Using `weight_decay` from `optim` instead of `optim.optimizer_params`. Please update your config to use `optim.optimizer_params.weight_decay`. `optim.weight_decay` will soon be deprecated.
2024-09-18 21:13:18 (INFO): Attempting to load user specified checkpoint at /tmp/fairchem_checkpoints/gnoc_oc22_oc20_all_s2ef.pt
2024-09-18 21:13:18 (INFO): Loading checkpoint from: /tmp/fairchem_checkpoints/gnoc_oc22_oc20_all_s2ef.pt
/home/runner/work/fairchem/fairchem/src/fairchem/core/trainers/base_trainer.py:590: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
2024-09-18 21:13:19 (INFO): Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint.
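The missing `metadata.npz` warnings are benign here: with a single GPU (`num_replicas=1`) the `BalancedBatchSampler` is disabled anyway and uniform partitioning is used. For multi-GPU fine-tuning, balanced sampling needs per-structure atom counts. A minimal sketch of computing them from the ASE databases, assuming the file stores a `natoms` array; the key name and where fairchem expects the file to live are assumptions to check against your fairchem version:

```python
import numpy as np
from ase.db import connect

for src in ("train.db", "val.db", "test.db"):
    # One atom count per structure, in database order.
    natoms = np.array([row.natoms for row in connect(src).select()])
    # Hypothetical placement next to the .db file; verify the expected
    # path (or the dataset config's metadata option) for your version.
    np.savez(f"{src}.metadata.npz", natoms=natoms)
```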
/home/runner/work/fairchem/fairchem/src/fairchem/core/trainers/ocp_trainer.py:155: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=self.scaler is not None):
/home/runner/work/fairchem/fairchem/src/fairchem/core/models/gemnet_oc/gemnet_oc.py:1270: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(False):
2024-09-18 21:13:42 (INFO): energy_mae: 9.48e+00, forcesx_mae: 7.26e-02, forcesy_mae: 3.95e-02, forcesz_mae: 5.74e-02, forces_mae: 5.65e-02, forces_cosine_similarity: 1.12e-01, forces_magnitude_error: 1.11e-01, energy_forces_within_threshold: 0.00e+00, loss: 9.61e+00, lr: 5.00e-04, epoch: 1.69e-01, step: 1.00e+01
2024-09-18 21:13:43 (INFO): Evaluating on val.
device 0: 0%| | 0/2 [00:00
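Each training step logs every metric on a single comma-separated line, as above. If you want to track fine-tuning progress outside TensorBoard, a small sketch for parsing such lines into floats (the regex is illustrative, tailored to the format shown):

```python
import re

line = ("energy_mae: 9.48e+00, forcesx_mae: 7.26e-02, forces_mae: 5.65e-02, "
        "loss: 9.61e+00, lr: 5.00e-04, epoch: 1.69e-01, step: 1.00e+01")

# Pull "name: value" pairs out of a log line into a dict of floats.
metrics = {key: float(val)
           for key, val in re.findall(r"(\w+): ([0-9.e+-]+)", line)}
print(metrics["forces_mae"])  # 0.0565
```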