fastvideo.pipelines.stages.decoding#

Decoding stage for diffusion pipelines.

Module Contents#

Classes#

DecodingStage

Stage for decoding latent representations into pixel space.

Data#

API#

class fastvideo.pipelines.stages.decoding.DecodingStage(vae, pipeline=None)[source]#

Bases: fastvideo.pipelines.stages.base.PipelineStage

Stage for decoding latent representations into pixel space.

This stage handles the decoding of latent representations into the final output format (e.g., pixel values).

Initialization
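A minimal construction sketch. The `vae` value below is a placeholder standing in for a VAE module loaded by the surrounding pipeline; how it is loaded is outside the scope of this stage.

```python
from fastvideo.pipelines.stages.decoding import DecodingStage

# `vae` stands in for a VAE module loaded by the surrounding pipeline;
# its construction is not shown here.
vae = ...  # placeholder for a loaded VAE module
decoding_stage = DecodingStage(vae=vae, pipeline=None)
```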

decode(latents: torch.Tensor, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → torch.Tensor[source]#

Decode latent representations into pixel space using VAE.

Parameters:
  • latents – Input latent tensor with shape (batch, channels, frames, height_latents, width_latents)

  • fastvideo_args –

    Configuration containing:

    • disable_autocast: Whether to disable automatic mixed precision (default: False)

    • pipeline_config.vae_precision: VAE computation precision ("fp32", "fp16", "bf16")

    • pipeline_config.vae_tiling: Whether to enable VAE tiling for memory efficiency

Returns:

Decoded video tensor with shape (batch, channels, frames, height, width), normalized to [0, 1] range and moved to CPU as float32
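Usage sketch for decode(), using the `decoding_stage` constructed above. The tensor sizes and the placeholder FastVideoArgs instance are illustrative assumptions, not values documented by this module.

```python
import torch

from fastvideo.fastvideo_args import FastVideoArgs

# `fastvideo_args` is assumed to be configured elsewhere in the pipeline
# (disable_autocast, pipeline_config.vae_precision, pipeline_config.vae_tiling);
# its construction is not shown here.
fastvideo_args: FastVideoArgs = ...  # placeholder

# Latents shaped (batch, channels, frames, height_latents, width_latents);
# the sizes below are illustrative only.
latents = torch.randn(1, 16, 21, 60, 104)

# Decoded video: (batch, channels, frames, height, width), values in [0, 1],
# returned on CPU as float32.
video = decoding_stage.decode(latents, fastvideo_args)
```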

forward(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → fastvideo.pipelines.pipeline_batch_info.ForwardBatch[source]#

Decode latent representations into pixel space.

This method processes the batch through the VAE decoder, converting latent representations to pixel-space video/images. It also optionally decodes trajectory latents for visualization purposes.

Parameters:
  • batch –

    The current batch containing:

    • latents: Tensor to decode (batch, channels, frames, height_latents, width_latents)

    • return_trajectory_decoded (optional): Flag to decode trajectory latents

    • trajectory_latents (optional): Latents at different timesteps

    • trajectory_timesteps (optional): Corresponding timesteps

  • fastvideo_args –

    Configuration containing:

    • output_type: "latent" to skip decoding, otherwise decode to pixels

    • vae_cpu_offload: Whether to offload VAE to CPU after decoding

    • model_loaded: Track VAE loading state

    • model_paths: Path to VAE model if loading needed

Returns:

Modified batch with:

  • output: Decoded frames (batch, channels, frames, height, width) as CPU float32

  • trajectory_decoded (if requested): List of decoded frames per timestep

Return type:

fastvideo.pipelines.pipeline_batch_info.ForwardBatch
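A pipeline-level sketch of how this stage is typically invoked. The contents of `batch` are assumptions about what earlier stages produce; only the documented forward() signature is used.

```python
from fastvideo.pipelines.pipeline_batch_info import ForwardBatch

# `batch` is assumed to come from the preceding denoising stage with `latents`
# already populated; `fastvideo_args` is the same configuration object as above.
batch: ForwardBatch = ...  # placeholder

batch = decoding_stage.forward(batch, fastvideo_args)

# Unless fastvideo_args.output_type == "latent", the decoded frames are attached
# to the returned batch as CPU float32 tensors.
video = batch.output
```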

verify_input(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → fastvideo.pipelines.stages.validators.VerificationResult[source]#

Verify decoding stage inputs.

verify_output(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → fastvideo.pipelines.stages.validators.VerificationResult[source]#

Verify decoding stage outputs.
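The verification hooks can bracket the forward pass; the VerificationResult interface itself is not detailed on this page, so the sketch below only illustrates the call pattern using the documented signatures.

```python
# Hedged sketch: verify inputs, run the stage, then verify outputs.
input_result = decoding_stage.verify_input(batch, fastvideo_args)
batch = decoding_stage.forward(batch, fastvideo_args)
output_result = decoding_stage.verify_output(batch, fastvideo_args)
```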

fastvideo.pipelines.stages.decoding.logger[source]#

'init_logger(...)'