core.common.test_utils#
Classes#
- ForkedPdb – A Pdb subclass that may be used from a forked multiprocessing child.
- PGConfig
Functions#
- init_env_rank_and_launch_test
- init_pg_and_rank_and_launch_test
- spawn_multi_process – Spawn single node, multi-rank function.
- init_local_distributed_process_group
Module Contents#
- class core.common.test_utils.ForkedPdb(completekey='tab', stdin=None, stdout=None, skip=None, nosigint=False, readrc=True)#
Bases: pdb.Pdb
A Pdb subclass that may be used from a forked multiprocessing child. See https://stackoverflow.com/questions/4716533/how-to-attach-debugger-to-a-python-subproccess/23654936#23654936
Example usage to debug a torch distributed run on rank 0:
```python
if torch.distributed.get_rank() == 0:
    from fairchem.core.common.test_utils import ForkedPdb
    ForkedPdb().set_trace()
```
- interaction(*args, **kwargs)#
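The idea behind overriding interaction (per the linked StackOverflow recipe) is that a forked child's stdin is detached from the terminal, so the debugger re-attaches it before delegating to Pdb. A minimal sketch under that assumption; ForkedPdbSketch is a hypothetical local name, not the fairchem class:

```python
import pdb
import sys


class ForkedPdbSketch(pdb.Pdb):
    """Sketch of a Pdb usable from a forked multiprocessing child
    (an assumption based on the linked StackOverflow recipe)."""

    def interaction(self, *args, **kwargs):
        # Temporarily point this process's stdin back at the controlling
        # terminal so the (Pdb) prompt is usable inside the child.
        _stdin = sys.stdin
        try:
            sys.stdin = open("/dev/stdin")
            super().interaction(*args, **kwargs)
        finally:
            sys.stdin = _stdin
```

The override is transparent to callers: set_trace() eventually calls interaction, so the stdin swap happens automatically at every breakpoint.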
- class core.common.test_utils.PGConfig#
- backend: str#
- world_size: int#
- gp_group_size: int = 1#
- port: str = '12345'#
- use_gp: bool = True#
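The typed attributes and defaults above suggest PGConfig is a dataclass. A local sketch mirroring those fields (an assumption for illustration; in real tests you would import PGConfig from fairchem.core.common.test_utils) plus an example configuration:

```python
from dataclasses import dataclass


# Sketch of PGConfig mirroring the fields documented above (assumed layout;
# the real class lives in fairchem.core.common.test_utils).
@dataclass
class PGConfig:
    backend: str
    world_size: int
    gp_group_size: int = 1
    port: str = "12345"
    use_gp: bool = True


# A 2-rank CPU test configuration: gloo backend, graph parallelism disabled.
config = PGConfig(backend="gloo", world_size=2, use_gp=False)
```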
- core.common.test_utils.init_env_rank_and_launch_test(rank: int, pg_setup_params: PGConfig, mp_output_dict: dict[int, object], test_method: callable, args: list[object], kwargs: dict[str, object]) None #
- core.common.test_utils.init_pg_and_rank_and_launch_test(rank: int, pg_setup_params: PGConfig, mp_output_dict: dict[int, object], test_method: callable, args: list[object], kwargs: dict[str, object]) None #
- core.common.test_utils.spawn_multi_process(config: PGConfig, test_method: callable, init_and_launch: callable, *test_method_args: Any, **test_method_kwargs: Any) list[Any] #
Spawn single node, multi-rank function. Uses localhost and free port to communicate.
- Parameters:
  - config – PGConfig carrying the process-group settings: world_size (number of processes) and backend (for example, "nccl" or "gloo")
  - test_method – callable to spawn; its first 3 arguments are rank, world_size and the mp output dict
  - init_and_launch – callable that sets up each rank (env vars and/or process group) before invoking test_method, e.g. one of the init_*_and_launch_test functions above
  - test_method_args – positional args for the test method
  - test_method_kwargs – kwargs for the test method
- Returns:
A list, l, where l[i] is the return value of test_method on rank i
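The return contract can be sketched with plain multiprocessing: spawn one process per rank, have each rank record its return value in a shared dict, and return the values ordered by rank. This is a simplified stand-in that creates no torch process group; spawn_multi_process_sketch and add_rank are hypothetical names, not fairchem APIs:

```python
import multiprocessing as mp


def _launch(rank, world_size, output, fn, args):
    # Mirror of the init_*_and_launch_test pattern: run the test method on
    # this rank and store its return value in the shared output dict.
    output[rank] = fn(rank, world_size, *args)


def spawn_multi_process_sketch(world_size, fn, *args):
    # Spawn a single-node, multi-rank run and return a list l where l[i]
    # is fn's return value on rank i.
    ctx = mp.get_context("fork")  # fork: child inherits fn without pickling
    with ctx.Manager() as manager:
        output = manager.dict()
        procs = [
            ctx.Process(target=_launch, args=(r, world_size, output, fn, args))
            for r in range(world_size)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return [output[r] for r in range(world_size)]


def add_rank(rank, world_size, base):
    return base + rank


results = spawn_multi_process_sketch(3, add_rank, 10)
print(results)  # [10, 11, 12]
```

The real helper additionally wires up rendezvous over localhost and a free port so that ranks can form a torch.distributed process group before test_method runs.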
- core.common.test_utils.init_local_distributed_process_group(backend='nccl')#
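These helpers communicate over localhost on a free port. A common way to implement that (a sketch of the general pattern, not necessarily fairchem's exact code; get_free_port and set_rendezvous_env are hypothetical names) is to bind to port 0 so the OS picks an unused port, then export the env:// rendezvous variables that torch.distributed.init_process_group reads:

```python
import os
import socket


def get_free_port() -> int:
    # Bind to port 0 and let the OS choose an unused port; the usual trick
    # for finding a free rendezvous port on a single node.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]


def set_rendezvous_env(rank: int, world_size: int, port: int) -> None:
    # Environment variables read by torch.distributed's env:// init method;
    # a launcher sets something equivalent on each rank before calling
    # torch.distributed.init_process_group.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = str(port)
    os.environ["RANK"] = str(rank)
    os.environ["WORLD_SIZE"] = str(world_size)


port = get_free_port()
set_rendezvous_env(rank=0, world_size=1, port=port)
```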