fastvideo.pipelines.preprocess.preprocess_pipeline_base

Module Contents

Classes

BasePreprocessPipeline

Base class for preprocessing pipelines that handles common functionality.

Data

logger

API

class fastvideo.pipelines.preprocess.preprocess_pipeline_base.BasePreprocessPipeline(model_path: str, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs | fastvideo.fastvideo_args.TrainingArgs, required_config_modules: list[str] | None = None, loaded_modules: dict[str, torch.nn.Module] | None = None)

Bases: fastvideo.pipelines.composed_pipeline_base.ComposedPipelineBase

Base class for preprocessing pipelines that handles common functionality.

Initialization

Initialize the pipeline. After init, the pipeline should be ready to use. The pipeline should be stateless and not hold any batch state.
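The statelessness requirement above can be illustrated with a minimal sketch. Everything in it (SketchBatch, StatelessSketchPipeline, the model path string) is hypothetical and stands in for the real classes; it only shows the general pattern of keeping configuration on the pipeline and per-call state on the batch.

```python
from dataclasses import dataclass, field


@dataclass
class SketchBatch:
    """Hypothetical stand-in for a batch object; carries all per-call state."""
    prompt: str
    outputs: list[str] = field(default_factory=list)


class StatelessSketchPipeline:
    """Hypothetical pipeline: configuration lives on self, batch state does not."""

    def __init__(self, model_path: str) -> None:
        # Only configuration shared by every call is stored on the instance.
        self.model_path = model_path

    def forward(self, batch: SketchBatch) -> SketchBatch:
        # Results are written to the batch, never to self, so one call
        # cannot leak state into the next.
        batch.outputs.append(f"{self.model_path}:{batch.prompt}")
        return batch


pipeline = StatelessSketchPipeline("hypothetical/model")
first = pipeline.forward(SketchBatch(prompt="a cat"))
second = pipeline.forward(SketchBatch(prompt="a dog"))
```

Because the instance holds nothing but its configuration, the same pipeline object can be reused across batches without any reset step.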

create_pipeline_stages(fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs)

Set up pipeline stages with proper dependency injection.

create_record(video_name: str, vae_latent: numpy.ndarray, text_embedding: numpy.ndarray, valid_data: dict[str, Any], idx: int, extra_features: dict[str, Any] | None = None) → dict[str, Any]

Create a record for the Parquet dataset.

create_record_for_schema(preprocess_batch: fastvideo.dataset.preprocessing_datasets.PreprocessBatch, schema: pyarrow.Schema, strict: bool = False) → dict[str, Any]

Create a record for the Parquet dataset using a generic schema-based approach.

Parameters:
  • preprocess_batch – The batch containing the data to extract

  • schema – PyArrow schema defining the expected fields

  • strict – If True, raises an exception when required fields are missing or unfilled

Returns:

Dictionary record matching the schema

Raises:

ValueError – If strict=True and required fields are missing or unfilled
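The effect of the strict flag can be sketched in plain Python. This is not the library's implementation: build_record_sketch is a hypothetical helper, a list of field names stands in for the pyarrow.Schema, a dict stands in for the PreprocessBatch, and filling missing fields with None in non-strict mode is an assumption, since the documentation does not say how they are handled.

```python
from typing import Any


def build_record_sketch(
    batch: dict[str, Any],
    field_names: list[str],
    strict: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: build one record with exactly the schema's fields."""
    record: dict[str, Any] = {}
    missing: list[str] = []
    for name in field_names:
        value = batch.get(name)
        if value is None:
            missing.append(name)
        # Assumption: unfilled fields are kept as None when strict is False.
        record[name] = value
    if strict and missing:
        raise ValueError(f"Missing or unfilled fields: {missing}")
    return record


fields = ["id", "caption", "vae_latent_bytes"]  # hypothetical field names
batch = {"id": "clip_0", "caption": "a cat"}

lenient = build_record_sketch(batch, fields)
try:
    build_record_sketch(batch, fields, strict=True)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

In this sketch the lenient call returns a record with a None placeholder, while the strict call raises ValueError naming the unfilled field.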

forward(batch: fastvideo.pipelines.pipeline_batch_info.ForwardBatch, fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs, args)

get_extra_features(valid_data: dict[str, Any], fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs) → dict[str, Any]

Get additional features specific to the pipeline type. Override in subclasses.

abstract get_schema_fields() → list[str]

Get the schema fields for the pipeline type. Override in subclasses.
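The override pattern for get_schema_fields and get_extra_features might look like the sketch below. It does not subclass the real BasePreprocessPipeline: SketchPreprocessBase and SketchI2VPipeline are hypothetical stand-ins, the field names are invented for illustration, the fastvideo_args parameter is omitted, and the empty-dict default for get_extra_features is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchPreprocessBase(ABC):
    """Hypothetical stand-in for the base class, reduced to the two hooks."""

    def get_extra_features(self, valid_data: dict[str, Any]) -> dict[str, Any]:
        # Assumption: the default contributes no extra features.
        return {}

    @abstractmethod
    def get_schema_fields(self) -> list[str]:
        """Each concrete pipeline must name its record fields."""


class SketchI2VPipeline(SketchPreprocessBase):
    """Hypothetical subclass that adds one pipeline-specific feature."""

    def get_schema_fields(self) -> list[str]:
        return ["id", "caption", "first_frame"]  # illustrative names only

    def get_extra_features(self, valid_data: dict[str, Any]) -> dict[str, Any]:
        return {"first_frame": valid_data.get("first_frame")}


try:
    SketchPreprocessBase()  # abstract method not implemented
    base_error = None
except TypeError as exc:
    base_error = exc

pipeline = SketchI2VPipeline()
schema_fields = pipeline.get_schema_fields()
extras = pipeline.get_extra_features({"first_frame": b"\x00\x01"})
```

Marking get_schema_fields as abstract means a subclass that forgets to define it fails at instantiation rather than later, when records are written.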

preprocess_video_and_text(fastvideo_args: fastvideo.fastvideo_args.FastVideoArgs, args)

static process_chunk_range(args: Any) → int

fastvideo.pipelines.preprocess.preprocess_pipeline_base.logger

‘init_logger(…)’