image_processor

Minimal image processing utilities for FastVideo. This module provides lightweight image preprocessing without external dependencies beyond PyTorch/NumPy/PIL.

Classes

fastvideo.image_processor.ImageProcessor

ImageProcessor(vae_scale_factor: int = 8)

Minimal image processor for video frame preprocessing.

This is a lightweight alternative to diffusers.VideoProcessor that handles:

- PIL image to tensor conversion
- Resizing to specified dimensions
- Normalization to the [-1, 1] range
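
The three steps listed above can be sketched for a PIL input roughly as follows. This is a minimal illustration, not the FastVideo implementation: the helper name `pil_to_normalized_tensor`, the forced RGB conversion, and the use of PIL's default resampling filter are all assumptions made for the example.

```python
import numpy as np
import torch
from PIL import Image


def pil_to_normalized_tensor(image: Image.Image, height: int, width: int) -> torch.Tensor:
    """Hypothetical helper: resize, convert to a tensor, and scale to [-1, 1]."""
    # PIL's resize takes (width, height), not (height, width).
    resized = image.convert("RGB").resize((width, height))
    # uint8 HWC array in [0, 255] -> float32 array in [0, 1].
    array = np.asarray(resized, dtype=np.float32) / 255.0
    # HWC -> CHW, then add a batch dimension: (1, 3, height, width).
    tensor = torch.from_numpy(array).permute(2, 0, 1).unsqueeze(0)
    # Map [0, 1] to [-1, 1].
    return tensor * 2.0 - 1.0


# A solid red frame, 64 pixels wide and 48 pixels high.
frame = Image.new("RGB", (64, 48), color=(255, 0, 0))
out = pil_to_normalized_tensor(frame, height=32, width=32)
print(out.shape)
```

In this sketch the red channel maps to 1.0 and the empty green and blue channels map to -1.0, which matches the [-1, 1] range described above.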

Parameters:

Name	Type	Description	Default
`vae_scale_factor`	`int`	The VAE scale factor used to ensure dimensions are multiples of this value.	`8`
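
The source shown here only stores `vae_scale_factor`; it does not show how the value is applied. One common way to keep a dimension on a multiple of the scale factor is to round it down, sketched below. The helper `round_down_to_multiple` is hypothetical, and rounding down (rather than up or to nearest) is an assumption.

```python
def round_down_to_multiple(size: int, factor: int = 8) -> int:
    """Hypothetical helper: largest multiple of `factor` that is <= size."""
    if factor <= 0:
        raise ValueError("factor must be a positive integer")
    # Subtract the remainder so the result divides evenly by `factor`.
    return size - (size % factor)


print(round_down_to_multiple(483))  # 480
print(round_down_to_multiple(720))  # 720
```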

Source code in fastvideo/image_processor.py

def __init__(self, vae_scale_factor: int = 8) -> None:
    self.vae_scale_factor = vae_scale_factor

Functions

fastvideo.image_processor.ImageProcessor.preprocess

preprocess(image: Image | ndarray | Tensor, height: int | None = None, width: int | None = None) -> Tensor

Preprocess an image to a normalized torch tensor.

Parameters:

Name	Type	Description	Default
`image`	`Image \| ndarray \| Tensor`	Input image (PIL Image, NumPy array, or torch tensor)	required
`height`	`int \| None`	Target height. If None, uses image's original height.	`None`
`width`	`int \| None`	Target width. If None, uses image's original width.	`None`
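
Only the three input types in the table are accepted; per the source for `preprocess`, anything else raises `ValueError`. The stand-in below mirrors that type check so the behavior can be tried without FastVideo installed. The function `describe_input` is hypothetical and is not part of the library.

```python
import numpy as np
import torch
from PIL import Image


def describe_input(image: object) -> str:
    """Hypothetical stand-in mirroring the isinstance checks in preprocess."""
    if isinstance(image, Image.Image):
        return "pil"
    if isinstance(image, np.ndarray):
        return "numpy"
    if isinstance(image, torch.Tensor):
        return "tensor"
    raise ValueError(
        f"Unsupported image type: {type(image)}. "
        "Supported types: PIL.Image.Image, np.ndarray, torch.Tensor")


print(describe_input(Image.new("RGB", (8, 8))))
print(describe_input(np.zeros((8, 8, 3), dtype=np.uint8)))
print(describe_input(torch.zeros(1, 3, 8, 8)))

# A file path is not an image object, so it is rejected.
try:
    describe_input("frame.png")
except ValueError as err:
    print(f"rejected: {err}")
```

A practical consequence is that a caller holding a file path would need to open it first (for example with `PIL.Image.open`) before passing it in.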

Returns:

Type	Description
`Tensor`	Normalized tensor of shape (1, 3, height, width), or (1, 1, height, width) for grayscale input, with values in the range [-1, 1].

Source code in fastvideo/image_processor.py

def preprocess(
    self,
    image: PIL.Image.Image | np.ndarray | torch.Tensor,
    height: int | None = None,
    width: int | None = None,
) -> torch.Tensor:
    """
    Preprocess an image to a normalized torch tensor.

    Args:
        image: Input image (PIL Image, NumPy array, or torch tensor)
        height: Target height. If None, uses image's original height.
        width: Target width. If None, uses image's original width.

    Returns:
        torch.Tensor: Normalized tensor of shape (1, 3, height, width) or (1, 1, height, width) for grayscale,
                     with values in range [-1, 1].
    """
    # Handle different input types
    if isinstance(image, PIL.Image.Image):
        return self._preprocess_pil(image, height, width)
    elif isinstance(image, np.ndarray):
        return self._preprocess_numpy(image, height, width)
    elif isinstance(image, torch.Tensor):
        return self._preprocess_tensor(image, height, width)
    else:
        raise ValueError(
            f"Unsupported image type: {type(image)}. "
            "Supported types: PIL.Image.Image, np.ndarray, torch.Tensor")