image_processor

Minimal image processing utilities for FastVideo. This module provides lightweight image preprocessing without external dependencies beyond PyTorch/NumPy/PIL.

Classes

fastvideo.image_processor.ImageProcessor

ImageProcessor(vae_scale_factor: int = 8)

Minimal image processor for video frame preprocessing.

This is a lightweight alternative to diffusers.VideoProcessor that handles:
- PIL image to tensor conversion
- Resizing to specified dimensions
- Normalization to [-1, 1] range
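
The conversion and normalization steps listed above can be sketched with NumPy alone. This is a minimal illustration, not FastVideo's implementation: the helper name `to_normalized_batch` is hypothetical, and it assumes 8-bit input in height-width-channel layout that is mapped linearly from [0, 255] to [-1, 1]. Resizing is left out.

```python
import numpy as np


def to_normalized_batch(frame: np.ndarray) -> np.ndarray:
    """Hypothetical helper: uint8 (H, W, C) or (H, W) frame -> float32 (1, C, H, W) in [-1, 1]."""
    if frame.ndim == 2:
        # Grayscale input: add a trailing channel axis so the layout is (H, W, 1).
        frame = frame[..., None]
    # Assumed mapping: 0 -> -1.0 and 255 -> 1.0.
    scaled = frame.astype(np.float32) / 127.5 - 1.0
    # Reorder (H, W, C) -> (C, H, W), then add a leading batch axis.
    return np.transpose(scaled, (2, 0, 1))[None, ...]


rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[..., 0] = 255  # red channel at full intensity, other channels at zero
batch = to_normalized_batch(rgb)
print(batch.shape)  # (1, 3, 4, 6)
```

The channel-first layout with a leading batch axis matches the (1, 3, height, width) shape that `preprocess` documents for its return value.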

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| vae_scale_factor | `int` | The VAE scale factor used to ensure dimensions are multiples of this value. | `8` |

Source code in fastvideo/image_processor.py
def __init__(self, vae_scale_factor: int = 8) -> None:
    self.vae_scale_factor = vae_scale_factor
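
To illustrate what "multiples of this value" means for a target size, here is a small sketch. The helper `align_to_multiple` is hypothetical and rounds down. The page does not state which direction, or at which step, `ImageProcessor` applies the adjustment, so treat both as assumptions.

```python
def align_to_multiple(size: int, factor: int = 8) -> int:
    """Hypothetical helper: round `size` down to the nearest multiple of `factor`."""
    if factor <= 0:
        raise ValueError("factor must be a positive integer")
    # Assumption: excess pixels are dropped rather than padded.
    return size - size % factor


for height, width in [(480, 832), (481, 839), (720, 1280)]:
    aligned = (align_to_multiple(height), align_to_multiple(width))
    print((height, width), "->", aligned)
```

Sizes that are already multiples of the factor pass through unchanged, so the adjustment only affects inputs such as 481 x 839.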

Functions

fastvideo.image_processor.ImageProcessor.preprocess
preprocess(image: Image | ndarray | Tensor, height: int | None = None, width: int | None = None) -> Tensor

Preprocess an image to a normalized torch tensor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| image | `Image \| ndarray \| Tensor` | Input image (PIL Image, NumPy array, or torch tensor) | required |
| height | `int \| None` | Target height. If None, uses image's original height. | `None` |
| width | `int \| None` | Target width. If None, uses image's original width. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | torch.Tensor: Normalized tensor of shape (1, 3, height, width), or (1, 1, height, width) for grayscale, with values in range [-1, 1]. |
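
A caller may want to verify a result against the return contract described above before passing it on. The sketch below is an assumption-labeled convenience, not part of FastVideo: `check_preprocessed` is a hypothetical name, and it relies only on `.shape`, `.min()`, and `.max()`, which both NumPy arrays and torch tensors provide. A NumPy array stands in for the tensor here so the example runs without PyTorch.

```python
import numpy as np


def check_preprocessed(output) -> None:
    """Hypothetical check: raise ValueError if `output` violates the documented shape or range."""
    shape = tuple(output.shape)
    # Documented shapes: (1, 3, H, W) for color, (1, 1, H, W) for grayscale.
    if len(shape) != 4 or shape[0] != 1 or shape[1] not in (1, 3):
        raise ValueError(f"expected (1, 3, H, W) or (1, 1, H, W), got {shape}")
    low, high = float(output.min()), float(output.max())
    # Documented value range: [-1, 1].
    if low < -1.0 or high > 1.0:
        raise ValueError(f"values outside [-1, 1]: min={low}, max={high}")


# Stand-in for a preprocess() result: one RGB frame of 16 x 24 pixels.
stand_in = np.random.default_rng(0).uniform(-1.0, 1.0, size=(1, 3, 16, 24))
check_preprocessed(stand_in)  # passes silently when the contract holds
```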

Source code in fastvideo/image_processor.py
def preprocess(
    self,
    image: PIL.Image.Image | np.ndarray | torch.Tensor,
    height: int | None = None,
    width: int | None = None,
) -> torch.Tensor:
    """
    Preprocess an image to a normalized torch tensor.

    Args:
        image: Input image (PIL Image, NumPy array, or torch tensor)
        height: Target height. If None, uses image's original height.
        width: Target width. If None, uses image's original width.

    Returns:
        torch.Tensor: Normalized tensor of shape (1, 3, height, width) or (1, 1, height, width) for grayscale,
                     with values in range [-1, 1].
    """
    # Handle different input types
    if isinstance(image, PIL.Image.Image):
        return self._preprocess_pil(image, height, width)
    elif isinstance(image, np.ndarray):
        return self._preprocess_numpy(image, height, width)
    elif isinstance(image, torch.Tensor):
        return self._preprocess_tensor(image, height, width)
    else:
        raise ValueError(
            f"Unsupported image type: {type(image)}. "
            "Supported types: PIL.Image.Image, np.ndarray, torch.Tensor")