fastvideo.v1.models.vision_utils#

Module Contents#

Functions#

get_default_height_width

Returns the height and width of the image, downscaled to the next integer multiple of vae_scale_factor.

load_image

Loads an image into a PIL Image.

normalize

Normalize an image array to [-1,1].

numpy_to_pt

Convert a NumPy image to a PyTorch tensor.

pil_to_numpy

Convert a PIL image or a list of PIL images to NumPy arrays.

resize

Resize image.

API#

fastvideo.v1.models.vision_utils.get_default_height_width(image: Union[PIL.Image.Image, numpy.ndarray, torch.Tensor], vae_scale_factor: int, height: Optional[int] = None, width: Optional[int] = None) Tuple[int, int][source]#

Returns the height and width of the image, downscaled to the nearest integer multiple of vae_scale_factor (rounding down).

Parameters:
  • image (Union[PIL.Image.Image, np.ndarray, torch.Tensor]) – The image input, which can be a PIL image, NumPy array, or PyTorch tensor. If it is a NumPy array, it should have shape [batch, height, width] or [batch, height, width, channels]. If it is a PyTorch tensor, it should have shape [batch, channels, height, width].

  • height (Optional[int], optional, defaults to None) – The height of the preprocessed image. If None, the height of the image input will be used.

  • width (Optional[int], optional, defaults to None) – The width of the preprocessed image. If None, the width of the image input will be used.

Returns:

A tuple containing the height and width, each rounded down to the nearest integer multiple of vae_scale_factor.

Return type:

Tuple[int, int]
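The rounding rule above can be sketched as follows. This is a simplified, hypothetical illustration of the dimension math only (it omits the per-type dispatch that reads height and width from a PIL image, NumPy array, or tensor), not the library implementation:

```python
# Hypothetical sketch: floor each dimension to a multiple of vae_scale_factor.
def default_height_width(height: int, width: int,
                         vae_scale_factor: int = 8) -> tuple[int, int]:
    # Drop the remainder so the result is divisible by vae_scale_factor.
    height = height - height % vae_scale_factor
    width = width - width % vae_scale_factor
    return height, width

print(default_height_width(513, 769))  # -> (512, 768)
```

Dimensions that are already multiples of the scale factor pass through unchanged.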

fastvideo.v1.models.vision_utils.load_image(image: Union[str, PIL.Image.Image], convert_method: Optional[Callable[[PIL.Image.Image], PIL.Image.Image]] = None) PIL.Image.Image[source]#

Loads an image into a PIL Image.

Parameters:
  • image (str or PIL.Image.Image) – The image to convert to the PIL Image format.

  • convert_method (Callable[[PIL.Image.Image], PIL.Image.Image], optional) – A conversion method to apply to the image after loading it. When None, the image is converted to “RGB”.

Returns:

A PIL Image.

Return type:

PIL.Image.Image
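A minimal sketch of the loading behavior described above, covering only the local-file case. The helper here is hypothetical and is meant to illustrate the documented default of converting to “RGB” when convert_method is None:

```python
from PIL import Image

# Hypothetical sketch of load_image for local paths only.
def load_image(image, convert_method=None):
    if isinstance(image, str):
        image = Image.open(image)  # open a file path as a PIL image
    if convert_method is not None:
        return convert_method(image)  # caller-supplied conversion
    return image.convert("RGB")  # documented default: convert to RGB
```

A grayscale file loaded through this path comes back as a three-channel RGB image.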

fastvideo.v1.models.vision_utils.normalize(images: Union[numpy.ndarray, torch.Tensor]) Union[numpy.ndarray, torch.Tensor][source]#

Normalize an image array to [-1,1].

Parameters:

images (np.ndarray or torch.Tensor) – The image array to normalize.

Returns:

The normalized image array.

Return type:

np.ndarray or torch.Tensor
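The normalization is a single affine map. The sketch below assumes the input is already scaled to [0, 1] (the usual convention for image arrays at this stage); the library function applies the same transform to NumPy arrays and PyTorch tensors alike:

```python
import numpy as np

# Sketch of the [0, 1] -> [-1, 1] mapping.
def normalize(images):
    return 2.0 * images - 1.0

x = np.array([0.0, 0.5, 1.0])
print(normalize(x))  # -> [-1.  0.  1.]
```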

fastvideo.v1.models.vision_utils.numpy_to_pt(images: numpy.ndarray) torch.Tensor[source]#

Convert a NumPy image to a PyTorch tensor.

Parameters:

images (np.ndarray) – The NumPy image array to convert to PyTorch format.

Returns:

A PyTorch tensor representation of the images.

Return type:

torch.Tensor
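The conversion is essentially a memory-layout change: channels-last NumPy ([batch, height, width, channels]) becomes channels-first PyTorch ([batch, channels, height, width]). A sketch, assuming the input is a batched float array such as the output of pil_to_numpy:

```python
import numpy as np
import torch

# Sketch: channels-last NumPy -> channels-first PyTorch.
def numpy_to_pt(images: np.ndarray) -> torch.Tensor:
    if images.ndim == 3:          # single image without a batch axis
        images = images[None, ...]
    # [B, H, W, C] -> [B, C, H, W]
    return torch.from_numpy(images.transpose(0, 3, 1, 2)).contiguous()

x = np.zeros((2, 64, 48, 3), dtype=np.float32)
print(numpy_to_pt(x).shape)  # -> torch.Size([2, 3, 64, 48])
```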

fastvideo.v1.models.vision_utils.pil_to_numpy(images: Union[List[PIL.Image.Image], PIL.Image.Image]) numpy.ndarray[source]#

Convert a PIL image or a list of PIL images to NumPy arrays.

Parameters:

images (PIL.Image.Image or List[PIL.Image.Image]) – The PIL image or list of images to convert to NumPy format.

Returns:

A NumPy array representation of the images.

Return type:

np.ndarray
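A sketch of the conversion, under the common assumption (not stated in this signature) that pixel values are scaled from [0, 255] to [0, 1] and single images are wrapped into a batch:

```python
import numpy as np
from PIL import Image

# Sketch: PIL image(s) -> float32 array of shape [batch, height, width, channels].
def pil_to_numpy(images):
    if not isinstance(images, list):
        images = [images]
    arrays = [np.asarray(img.convert("RGB"), dtype=np.float32) / 255.0
              for img in images]
    return np.stack(arrays)

img = Image.new("RGB", (48, 32))        # PIL size is (width, height)
print(pil_to_numpy(img).shape)          # -> (1, 32, 48, 3)
```

Note the axis swap: PIL reports size as (width, height), while the resulting array is (height, width, channels).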

fastvideo.v1.models.vision_utils.resize(image: Union[PIL.Image.Image, numpy.ndarray, torch.Tensor], height: int, width: int, resize_mode: str = 'default', resample: str = 'lanczos') Union[PIL.Image.Image, numpy.ndarray, torch.Tensor][source]#

Resize image.

Parameters:
  • image (PIL.Image.Image, np.ndarray or torch.Tensor) – The image input, which can be a PIL image, NumPy array, or PyTorch tensor.

  • height (int) – The height to resize to.

  • width (int) – The width to resize to.

  • resize_mode (str, optional, defaults to default) – The resize mode to use, one of default, fill, or crop. If default, resizes the image to the specified width and height, which may not maintain the original aspect ratio. If fill, resizes the image to fit within the specified dimensions while maintaining the aspect ratio, then centers it, filling the empty space with data from the image. If crop, resizes the image to fit within the specified dimensions while maintaining the aspect ratio, then centers it, cropping the excess. Note that fill and crop are only supported for PIL image input.

Returns:

The resized image.

Return type:

PIL.Image.Image, np.ndarray or torch.Tensor
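For the PIL case, the default mode described above behaves like a plain PIL resize to the requested dimensions, ignoring aspect ratio (fill and crop additionally letterbox or center-crop, and are PIL-only). A usage sketch:

```python
from PIL import Image

# "default" mode sketch: resize straight to (width, height), no aspect-ratio
# preservation, using the documented lanczos resampling filter.
img = Image.new("RGB", (640, 480))
resized = img.resize((512, 320), resample=Image.LANCZOS)
print(resized.size)  # -> (512, 320)
```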