
transform

Classes

fastvideo.dataset.transform.CenterCropResizeVideo

CenterCropResizeVideo(size, top_crop=False, interpolation_mode='bilinear')

First center-crop the video, using the short side to determine the crop extent, then resize it to the specified size.

Source code in fastvideo/dataset/transform.py
def __init__(
    self,
    size,
    top_crop=False,
    interpolation_mode="bilinear",
) -> None:
    if len(size) != 2:
        raise ValueError(
            f"size should be tuple (height, width), instead got {size}")
    self.size = size
    self.top_crop = top_crop
    self.interpolation_mode = interpolation_mode

Functions

fastvideo.dataset.transform.CenterCropResizeVideo.__call__
__call__(clip) -> Tensor

Parameters:

clip (torch.tensor): Video clip to be cropped. Size is (T, C, H, W). Required.

Returns: torch.tensor: center-cropped and scale-resized video clip. Size is (T, C, crop_size, crop_size).

Source code in fastvideo/dataset/transform.py
def __call__(self, clip) -> torch.Tensor:
    """
    Args:
        clip (torch.tensor): Video clip to be cropped. Size is (T, C, H, W)
    Returns:
        torch.tensor: scale resized / center cropped video clip.
            size is (T, C, crop_size, crop_size)
    """
    clip_center_crop = center_crop_th_tw(clip,
                                         self.size[0],
                                         self.size[1],
                                         top_crop=self.top_crop)
    clip_center_crop_resize = resize(
        clip_center_crop,
        target_size=self.size,
        interpolation_mode=self.interpolation_mode,
    )
    return clip_center_crop_resize
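The helpers `center_crop_th_tw` and `resize` are defined elsewhere in `fastvideo/dataset/transform.py`. A minimal self-contained sketch of the same crop-then-resize behavior in plain `torch` (the function name `center_crop_resize` and the exact rounding are assumptions, not the library's API):

```python
import torch
import torch.nn.functional as F

def center_crop_resize(clip, th, tw, top_crop=False, mode="bilinear"):
    """Crop the largest (th/tw)-aspect region from clip (T, C, H, W), then resize.
    Sketch only; the real helpers are center_crop_th_tw and resize."""
    _, _, h, w = clip.shape
    if h / w > th / tw:
        # clip is too tall: keep full width, crop the height
        new_h, new_w = int(round(w * th / tw)), w
    else:
        # clip is too wide: keep full height, crop the width
        new_h, new_w = h, int(round(h * tw / th))
    i = 0 if top_crop else (h - new_h) // 2  # top_crop anchors the crop at the top
    j = (w - new_w) // 2
    cropped = clip[..., i:i + new_h, j:j + new_w]
    # F.interpolate treats the T dimension as the batch of a 4D tensor
    return F.interpolate(cropped.float(), size=(th, tw), mode=mode)

clip = torch.randint(0, 256, (8, 3, 480, 640), dtype=torch.uint8)
out = center_crop_resize(clip, 256, 256)  # shape (8, 3, 256, 256)
```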

fastvideo.dataset.transform.Normalize255

Normalize255()

Convert tensor data type from uint8 to float and divide values by 255.0.

Source code in fastvideo/dataset/transform.py
def __init__(self) -> None:
    pass

Functions

fastvideo.dataset.transform.Normalize255.__call__
__call__(clip) -> Tensor

Parameters:

clip (torch.tensor, dtype=torch.uint8): Size is (T, C, H, W). Required.

Returns: clip (torch.tensor, dtype=torch.float): Size is (T, C, H, W)

Source code in fastvideo/dataset/transform.py
def __call__(self, clip) -> torch.Tensor:
    """
    Args:
        clip (torch.tensor, dtype=torch.uint8): Size is (T, C, H, W)
    Return:
        clip (torch.tensor, dtype=torch.float): Size is (T, C, H, W)
    """
    return normalize_video(clip)

fastvideo.dataset.transform.TemporalRandomCrop

TemporalRandomCrop(size)

Temporally crop the given frame indices at a random location.

Parameters:

size (int): Desired number of frames to be seen by the model. Required.
Source code in fastvideo/dataset/transform.py
def __init__(self, size) -> None:
    self.size = size
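The call method of `TemporalRandomCrop` is not shown on this page. A plausible sketch of sampling a random window of `size` consecutive frame indices (the standalone function name and the `(begin, end)` return convention are assumptions):

```python
import random

def temporal_random_crop(total_frames: int, size: int):
    """Pick a random contiguous window of up to `size` frame indices.
    Sketch of what TemporalRandomCrop.__call__ plausibly does."""
    rand_end = max(0, total_frames - size - 1)
    begin = random.randint(0, rand_end)  # inclusive on both ends
    end = min(begin + size, total_frames)
    return begin, end

begin, end = temporal_random_crop(100, 16)
```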

Functions

fastvideo.dataset.transform.crop

crop(clip, i, j, h, w) -> Tensor

Parameters:

clip (torch.tensor): Video clip to be cropped. Size is (T, C, H, W). Required.
Source code in fastvideo/dataset/transform.py
def crop(clip, i, j, h, w) -> torch.Tensor:
    """
    Args:
        clip (torch.tensor): Video clip to be cropped. Size is (T, C, H, W)
    """
    if len(clip.size()) != 4:
        raise ValueError("clip should be a 4D tensor")
    return clip[..., i:i + h, j:j + w]
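Since `crop` is plain spatial slicing on the last two dimensions, its behavior can be checked directly (the example values are illustrative):

```python
import torch

# A small (T, C, H, W) clip with distinct values per pixel
clip = torch.arange(4 * 3 * 8 * 8).reshape(4, 3, 8, 8)
i, j, h, w = 2, 3, 4, 5
patch = clip[..., i:i + h, j:j + w]  # equivalent to crop(clip, i, j, h, w)
```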

fastvideo.dataset.transform.normalize_video

normalize_video(clip) -> Tensor

Convert tensor data type from uint8 to float and divide values by 255.0.

Args:
    clip (torch.tensor, dtype=torch.uint8): Size is (T, C, H, W)
Returns:
    clip (torch.tensor, dtype=torch.float): Size is (T, C, H, W)

Source code in fastvideo/dataset/transform.py
def normalize_video(clip) -> torch.Tensor:
    """
    Convert tensor data type from uint8 to float and divide values by 255.0
    (the dimension permute is currently disabled; see the commented line below)
    Args:
        clip (torch.tensor, dtype=torch.uint8): Size is (T, C, H, W)
    Return:
        clip (torch.tensor, dtype=torch.float): Size is (T, C, H, W)
    """
    _is_tensor_video_clip(clip)
    if not clip.dtype == torch.uint8:
        raise TypeError(
            f"clip tensor should have data type uint8. Got {clip.dtype}")
    # return clip.float().permute(3, 0, 1, 2) / 255.0
    return clip.float() / 255.0
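A self-contained equivalent of `normalize_video`, omitting the `_is_tensor_video_clip` shape check, which lives elsewhere in the module:

```python
import torch

def normalize_video(clip: torch.Tensor) -> torch.Tensor:
    """Map uint8 pixel values in [0, 255] to float32 values in [0.0, 1.0]."""
    if clip.dtype != torch.uint8:
        raise TypeError(
            f"clip tensor should have data type uint8. Got {clip.dtype}")
    return clip.float() / 255.0

frames = torch.randint(0, 256, (4, 3, 16, 16), dtype=torch.uint8)
out = normalize_video(frames)
```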