fastvideo.v1.distributed.device_communicators.pynccl
Module Contents

Classes

PyNcclCommunicator

Data

API
- class fastvideo.v1.distributed.device_communicators.pynccl.PyNcclCommunicator(group: Union[torch.distributed.ProcessGroup, fastvideo.v1.distributed.utils.StatelessProcessGroup], device: Union[int, str, torch.device], library_path: Optional[str] = None)[source]
Initialization
- Parameters:
group – the process group to work on. If None, it will use the default process group.
device – the device to bind the PyNcclCommunicator to. If None, it will be bound to f"cuda:{local_rank}".
library_path – the path to the NCCL library. If None, it will use the default library path.
It is the caller's responsibility to make sure each communicator is bound to a unique device.
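A minimal construction sketch, not the library's canonical setup: it assumes the script runs under a launcher such as torchrun with one GPU visible per rank, and that a CPU-capable backend (e.g. gloo) is used for the bootstrap group, as is typical for NCCL wrappers of this kind.

```python
# A minimal sketch, assuming torch.distributed is launched via torchrun
# with one GPU per rank; names below mirror the signature documented above.
import torch.distributed as dist

from fastvideo.v1.distributed.device_communicators.pynccl import (
    PyNcclCommunicator,
)

# Bootstrap group for metadata exchange; a CPU backend such as gloo is
# typical here, since the communicator drives NCCL itself.
dist.init_process_group(backend="gloo")

# Single-node assumption: global rank == local rank.
local_rank = dist.get_rank()

# Each communicator must be bound to a unique device (see the note above).
comm = PyNcclCommunicator(
    group=dist.group.WORLD,
    device=f"cuda:{local_rank}",
)
```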
- all_gather(output_tensor: torch.Tensor, input_tensor: torch.Tensor, stream=None)[source]
- all_reduce(in_tensor: torch.Tensor, op: torch.distributed.ReduceOp = ReduceOp.SUM, stream=None) -> torch.Tensor [source]
- broadcast(tensor: torch.Tensor, src: int, stream=None)[source]
- recv(tensor: torch.Tensor, src: int, stream=None)[source]
- reduce_scatter(output_tensor: torch.Tensor, input_tensor: torch.Tensor, op: torch.distributed.ReduceOp = ReduceOp.SUM, stream=None)[source]
- send(tensor: torch.Tensor, dst: int, stream=None)[source]
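A hedged usage sketch of the methods above, continuing from the construction example; the tensor sizes and the two-rank send/recv pairing are illustrative assumptions, not requirements of the API.

```python
# Illustrative shapes only; assumes `comm` and `dist` from the sketch above,
# and that 16 is divisible by the world size for reduce_scatter.
import torch

rank = dist.get_rank()
world_size = dist.get_world_size()
device = torch.device(f"cuda:{rank}")

t = torch.ones(16, dtype=torch.float32, device=device)

# all_reduce returns the reduced tensor (ReduceOp.SUM by default).
reduced = comm.all_reduce(t)

# all_gather concatenates every rank's input into output_tensor.
gathered = torch.empty(16 * world_size, dtype=t.dtype, device=device)
comm.all_gather(gathered, t)

# reduce_scatter reduces across ranks, leaving each rank with one shard.
shard = torch.empty(16 // world_size, dtype=t.dtype, device=device)
comm.reduce_scatter(shard, t)

# broadcast overwrites `t` in place with rank 0's copy.
comm.broadcast(t, src=0)

# Point-to-point pairing between two ranks (assumes world_size >= 2).
if rank == 0:
    comm.send(t, dst=1)
elif rank == 1:
    comm.recv(t, src=0)
```

All methods accept an optional `stream` argument; when it is left as None, implementations of this kind typically enqueue the operation on the current CUDA stream.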