fastvideo.v1.distributed.device_communicators.pynccl#

Module Contents#

Classes#

Data#

API#

class fastvideo.v1.distributed.device_communicators.pynccl.PyNcclCommunicator(group: Union[torch.distributed.ProcessGroup, fastvideo.v1.distributed.utils.StatelessProcessGroup], device: Union[int, str, torch.device], library_path: Optional[str] = None)[source]#

Initialization

Parameters:
  • group – the process group to work on. If None, it will use the default process group.

  • device – the device to bind the PyNcclCommunicator to. If None, it will be bound to f"cuda:{local_rank}".

  • library_path – the path to the NCCL library. If None, it will use the default library path.

It is the caller’s responsibility to make sure each communicator is bound to a unique device.
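
A minimal construction sketch (not part of this module's documentation), assuming the launcher (e.g. torchrun) sets LOCAL_RANK and that torch.distributed.init_process_group has already been called; the group and environment variable names are illustrative:

```python
import os

import torch
import torch.distributed as dist

from fastvideo.v1.distributed.device_communicators.pynccl import PyNcclCommunicator

# Assumption: LOCAL_RANK is set by the launcher and the default
# torch.distributed process group is already initialized.
local_rank = int(os.environ.get("LOCAL_RANK", "0"))
device = torch.device(f"cuda:{local_rank}")

# Each communicator must be bound to a unique device (see the note above).
comm = PyNcclCommunicator(
    group=dist.group.WORLD,  # or a fastvideo StatelessProcessGroup
    device=device,
)
```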

all_gather(output_tensor: torch.Tensor, input_tensor: torch.Tensor, stream=None)[source]#
all_reduce(in_tensor: torch.Tensor, op: torch.distributed.ReduceOp = ReduceOp.SUM, stream=None) → torch.Tensor[source]#
broadcast(tensor: torch.Tensor, src: int, stream=None)[source]#
recv(tensor: torch.Tensor, src: int, stream=None)[source]#
reduce_scatter(output_tensor: torch.Tensor, input_tensor: torch.Tensor, op: torch.distributed.ReduceOp = ReduceOp.SUM, stream=None)[source]#
send(tensor: torch.Tensor, dst: int, stream=None)[source]#
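
A hedged usage sketch of the collectives above, reusing `comm` and `device` from the construction example and assuming a world size of at least two:

```python
# Reuses `comm` and `device` from the construction sketch above.
x = torch.ones(4, device=device)

# Sum-reduce across all ranks; per the signature above, the reduced tensor is returned.
summed = comm.all_reduce(x, op=dist.ReduceOp.SUM)

# Point-to-point transfer: rank 0 sends, rank 1 receives into a pre-allocated buffer.
rank = dist.get_rank()
if rank == 0:
    comm.send(x, dst=1)
elif rank == 1:
    buf = torch.empty_like(x)
    comm.recv(buf, src=0)

# All-gather each rank's tensor into one output buffer of world_size * numel elements.
world_size = dist.get_world_size()
gathered = torch.empty(world_size * x.numel(), device=device)
comm.all_gather(gathered, x)
```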
fastvideo.v1.distributed.device_communicators.pynccl.logger[source]#

‘init_logger(…)’