fastvideo.pipelines.stages.decoding#

Decoding stage for diffusion pipelines.

Module Contents#

Classes#

DecodingStage

Stage for decoding latent representations into pixel space.

Data#

API#

class fastvideo.pipelines.stages.decoding.DecodingStage(vae, pipeline=None)[source]#

Bases: fastvideo.pipelines.stages.base.PipelineStage

Stage for decoding latent representations into pixel space.

This stage handles the decoding of latent representations into the final output format (e.g., pixel values).

Initialization
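A minimal construction sketch. The `vae` value below is a placeholder standing in for a VAE module loaded by the surrounding pipeline; how it is loaded is outside the scope of this stage.

```python
from fastvideo.pipelines.stages.decoding import DecodingStage

# `vae` stands in for a VAE module loaded by the surrounding pipeline;
# its construction is not shown here.
vae = ...  # placeholder for a loaded VAE module
decoding_stage = DecodingStage(vae=vae, pipeline=None)
```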

decode(latents: torch.Tensor, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → torch.Tensor[source]#

Decode latent representations into pixel space using VAE.

Parameters:
  • latents – Input latent tensor with shape (batch, channels, frames, height_latents, width_latents)

  • fastvideo_args –

    Configuration containing:

    • disable_autocast: Whether to disable automatic mixed precision (default: False)

    • pipeline_config.vae_precision: VAE computation precision ("fp32", "fp16", "bf16")

    • pipeline_config.vae_tiling: Whether to enable VAE tiling for memory efficiency

Returns:

Decoded video tensor with shape (batch, channels, frames, height, width), normalized to [0, 1] range and moved to CPU as float32
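Usage sketch for decode(), using the `decoding_stage` constructed above. The tensor sizes and the placeholder FastVideoArgs instance are illustrative assumptions, not values documented by this module.

```python
import torch

from fastvideo.fastvideo_args import FastVideoArgs

# `fastvideo_args` is assumed to be configured elsewhere in the pipeline
# (disable_autocast, pipeline_config.vae_precision, pipeline_config.vae_tiling);
# its construction is not shown here.
fastvideo_args: FastVideoArgs = ...  # placeholder

# Latents shaped (batch, channels, frames, height_latents, width_latents);
# the sizes below are illustrative only.
latents = torch.randn(1, 16, 21, 60, 104)

# Decoded video: (batch, channels, frames, height, width), values in [0, 1],
# returned on CPU as float32.
video = decoding_stage.decode(latents, fastvideo_args)
```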

forward(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → fastvideo.pipelines.pipeline_batch_info.ForwardBatch[source]#

Decode latent representations into pixel space.

This method processes the batch through the VAE decoder, converting latent representations to pixel-space video/images. It also optionally decodes trajectory latents for visualization purposes.

Parameters:
  • batch –

    The current batch containing:

    • latents: Tensor to decode (batch, channels, frames, height_latents, width_latents)

    • return_trajectory_decoded (optional): Flag to decode trajectory latents

    • trajectory_latents (optional): Latents at different timesteps

    • trajectory_timesteps (optional): Corresponding timesteps

  • fastvideo_args –

    Configuration containing:

    • output_type: "latent" to skip decoding, otherwise decode to pixels

    • vae_cpu_offload: Whether to offload VAE to CPU after decoding

    • model_loaded: Track VAE loading state

    • model_paths: Path to VAE model if loading needed

Returns:

Modified batch with:

  • output: Decoded frames (batch, channels, frames, height, width) as CPU float32

  • trajectory_decoded (if requested): List of decoded frames per timestep

Return type:

fastvideo.pipelines.pipeline_batch_info.ForwardBatch
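A pipeline-level sketch of how this stage is typically invoked. The contents of `batch` are assumptions about what earlier stages produce; only the documented forward() signature is used.

```python
from fastvideo.pipelines.pipeline_batch_info import ForwardBatch

# `batch` is assumed to come from the preceding denoising stage with `latents`
# already populated; `fastvideo_args` is the same configuration object as above.
batch: ForwardBatch = ...  # placeholder

batch = decoding_stage.forward(batch, fastvideo_args)

# Unless fastvideo_args.output_type == "latent", the decoded frames are attached
# to the returned batch as CPU float32 tensors.
video = batch.output
```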

verify_input(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → fastvideo.pipelines.stages.validators.VerificationResult[source]#

Verify decoding stage inputs.

verify_output(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → fastvideo.pipelines.stages.validators.VerificationResult[source]#

Verify decoding stage outputs.
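The verification hooks can bracket the forward pass; the VerificationResult interface itself is not detailed on this page, so the sketch below only illustrates the call pattern using the documented signatures.

```python
# Hedged sketch: verify inputs, run the stage, then verify outputs.
input_result = decoding_stage.verify_input(batch, fastvideo_args)
batch = decoding_stage.forward(batch, fastvideo_args)
output_result = decoding_stage.verify_output(batch, fastvideo_args)
```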

fastvideo.pipelines.stages.decoding.logger[source]#

'init_logger(...)'