preprocessing_datasets
Classes
fastvideo.dataset.preprocessing_datasets.DataValidationStage
Bases: DatasetFilterStage
Stage for validating data items.
Functions
fastvideo.dataset.preprocessing_datasets.DataValidationStage.process
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Process does nothing for validation - filtering is handled by should_keep.
fastvideo.dataset.preprocessing_datasets.DataValidationStage.should_keep
should_keep(batch: PreprocessBatch, **kwargs) -> bool
Validate data item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch to validate | required |
Returns:
| Type | Description |
|---|---|
| bool | True if valid, False if invalid |
Source code in fastvideo/dataset/preprocessing_datasets.py
fastvideo.dataset.preprocessing_datasets.DatasetFilterStage
Bases: ABC
Abstract base class for dataset filtering stages.
These stages can filter out items during metadata processing.
Functions
fastvideo.dataset.preprocessing_datasets.DatasetFilterStage.process
abstractmethod
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Process the dataset batch (for non-filtering operations).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch to process | required |
| **kwargs | | Additional processing parameters | {} |
Returns:
| Type | Description |
|---|---|
| PreprocessBatch | Processed batch |
fastvideo.dataset.preprocessing_datasets.DatasetFilterStage.should_keep
abstractmethod
should_keep(batch: PreprocessBatch, **kwargs) -> bool
Check if batch should be kept.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch to check | required |
| **kwargs | | Additional parameters | {} |
Returns:
| Type | Description |
|---|---|
| bool | True if batch should be kept, False otherwise |
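As a rough illustration of the two-method contract above, the sketch below defines a small filter stage. It does not import fastvideo: `Batch` and `FilterStage` are stand-ins written for this example, and `MinDurationStage` with its `min_seconds` parameter is hypothetical rather than part of the library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Batch:
    """Stand-in for PreprocessBatch (assumption: only two fields are needed here)."""
    path: str
    duration: Optional[float] = None


class FilterStage(ABC):
    """Stand-in that mirrors the two abstract methods documented above."""

    @abstractmethod
    def process(self, batch: Batch, **kwargs) -> Batch:
        ...

    @abstractmethod
    def should_keep(self, batch: Batch, **kwargs) -> bool:
        ...


class MinDurationStage(FilterStage):
    """Hypothetical filter: keep items that are at least `min_seconds` long."""

    def __init__(self, min_seconds: float) -> None:
        self.min_seconds = min_seconds

    def process(self, batch: Batch, **kwargs) -> Batch:
        # A pure filter has nothing to compute, so the batch passes through.
        return batch

    def should_keep(self, batch: Batch, **kwargs) -> bool:
        # Items with no recorded duration are rejected in this sketch.
        return batch.duration is not None and batch.duration >= self.min_seconds


stage = MinDurationStage(min_seconds=2.0)
items = [Batch("a.mp4", 6.88), Batch("b.mp4", 0.5), Batch("c.mp4")]
kept = [stage.process(b) for b in items if stage.should_keep(b)]
```

Calling `should_keep` before `process` is a choice made for this example; the page does not state the order in which the real pipeline calls them.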
fastvideo.dataset.preprocessing_datasets.DatasetStage
Bases: ABC
Abstract base class for dataset processing stages.
Similar to PipelineStage but designed for dataset preprocessing operations.
Functions
fastvideo.dataset.preprocessing_datasets.DatasetStage.process
abstractmethod
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Process the dataset batch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch to process | required |
| **kwargs | | Additional processing parameters | {} |
Returns:
| Type | Description |
|---|---|
| PreprocessBatch | Processed batch |
fastvideo.dataset.preprocessing_datasets.FrameSamplingStage
FrameSamplingStage(num_frames: int, train_fps: int, speed_factor: int = 1, video_length_tolerance_range: float = 5.0, drop_short_ratio: float = 0.0, seed: int = 42)
Bases: DatasetFilterStage
Stage for temporal frame sampling and indexing.
Functions
fastvideo.dataset.preprocessing_datasets.FrameSamplingStage.process
process(batch: PreprocessBatch, temporal_sample_fn=None, **kwargs) -> PreprocessBatch
Process frame sampling for video data items.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch | required |
| temporal_sample_fn | | Function for temporal sampling | None |
Returns:
| Type | Description |
|---|---|
| PreprocessBatch | Updated batch with frame sampling info |
fastvideo.dataset.preprocessing_datasets.FrameSamplingStage.should_keep
should_keep(batch: PreprocessBatch, **kwargs) -> bool
Check if video should be kept based on length constraints.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch | required |
Returns:
| Type | Description |
|---|---|
| bool | True if should be kept, False otherwise |
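This page does not spell out the length rule, so the following is only one plausible reading of how the constructor parameters could interact. It assumes that frames are taken at a stride of `fps / train_fps * speed_factor`, that clips longer than `video_length_tolerance_range` times the required span are dropped, and that clips too short to fill `num_frames` are dropped with probability `drop_short_ratio`. The function names are hypothetical and none of this is confirmed by the source.

```python
import random
from typing import List, Optional


def frame_stride(fps: float, train_fps: int, speed_factor: int = 1) -> float:
    # Source frames to advance per sampled frame (assumed rule); never below 1.
    return max(fps / train_fps * speed_factor, 1.0)


def sample_frame_indices(total_frames: int, fps: float, num_frames: int,
                         train_fps: int, speed_factor: int = 1) -> List[int]:
    # Evenly strided indices, truncated at the end of the clip.
    stride = frame_stride(fps, train_fps, speed_factor)
    indices = [int(round(i * stride)) for i in range(num_frames)]
    return [i for i in indices if i < total_frames]


def keep_by_length(total_frames: int, fps: float, num_frames: int, train_fps: int,
                   speed_factor: int = 1,
                   video_length_tolerance_range: float = 5.0,
                   drop_short_ratio: float = 0.0,
                   rng: Optional[random.Random] = None) -> bool:
    rng = rng or random.Random(42)
    # Number of source frames that one training clip spans.
    needed = num_frames * frame_stride(fps, train_fps, speed_factor)
    if total_frames > video_length_tolerance_range * needed:
        return False  # assumed: far longer than the training clip
    if total_frames < needed:
        # assumed: short clips are dropped with probability drop_short_ratio
        return rng.random() >= drop_short_ratio
    return True
```

Under these assumptions, a 50 fps source sampled for 25 fps training uses every second frame, so `sample_frame_indices(172, 50.0, 4, 25)` yields indices 0, 2, 4 and 6.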
fastvideo.dataset.preprocessing_datasets.ImageTransformStage
Bases: DatasetStage
Stage for image data transformation.
Functions
fastvideo.dataset.preprocessing_datasets.ImageTransformStage.process
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Transform image data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch with image information | required |
Returns:
| Type | Description |
|---|---|
| PreprocessBatch | Batch with transformed image tensor |
fastvideo.dataset.preprocessing_datasets.PreprocessBatch
dataclass
PreprocessBatch(path: str, cap: str | list[str], resolution: dict | None = None, fps: float | None = None, duration: float | None = None, num_frames: int | None = None, sample_frame_index: list[int] | None = None, sample_num_frames: int | None = None, pixel_values: Tensor | None = None, text: str | None = None, input_ids: Tensor | None = None, cond_mask: Tensor | None = None)
Batch information for dataset processing stages.
This class holds all the information about a video-caption or image-caption pair as it moves through the processing pipeline. Fields are populated by different stages.
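To make the "fields are populated by different stages" idea concrete, here is a minimal sketch built around a local stand-in dataclass that copies the field names and defaults from the signature above. `PreprocessBatchSketch` is not the library class, the tensor fields are typed as `Any` so the example runs without torch, and the sampling step is a hypothetical placeholder for a real stage.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Union


@dataclass
class PreprocessBatchSketch:
    """Local stand-in mirroring the documented PreprocessBatch fields."""
    path: str
    cap: Union[str, List[str]]
    resolution: Optional[dict] = None
    fps: Optional[float] = None
    duration: Optional[float] = None
    num_frames: Optional[int] = None
    sample_frame_index: Optional[List[int]] = None
    sample_num_frames: Optional[int] = None
    pixel_values: Any = None  # Tensor in the documented signature
    text: Optional[str] = None
    input_ids: Any = None     # Tensor in the documented signature
    cond_mask: Any = None     # Tensor in the documented signature


# Metadata known up front, as it would come from an annotation record.
batch = PreprocessBatchSketch(
    path="clip.mp4",
    cap=["A cat playing with a ball"],
    resolution={"width": 1920, "height": 1080},
    fps=25.0,
    duration=6.88,
    num_frames=172,
)

# A later stage fills in the sampling fields (placeholder logic).
batch.sample_frame_index = list(range(16))
batch.sample_num_frames = len(batch.sample_frame_index)
```

Only `path` and `cap` are required; every other field starts as `None` until some stage sets it.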
fastvideo.dataset.preprocessing_datasets.ResolutionFilterStage
ResolutionFilterStage(max_h_div_w_ratio: float = 17 / 16, min_h_div_w_ratio: float = 8 / 16, max_height: int = 1024, max_width: int = 1024)
Bases: DatasetFilterStage
Stage for filtering data items based on resolution constraints.
Functions
fastvideo.dataset.preprocessing_datasets.ResolutionFilterStage.filter_resolution
Filter based on height/width ratio.
fastvideo.dataset.preprocessing_datasets.ResolutionFilterStage.process
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Process does nothing for resolution filtering - filtering is handled by should_keep.
fastvideo.dataset.preprocessing_datasets.ResolutionFilterStage.should_keep
should_keep(batch: PreprocessBatch, **kwargs) -> bool
Check if data item passes resolution filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch with resolution information | required |
Returns:
| Type | Description |
|---|---|
| bool | True if passes filter, False otherwise |
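The height/width ratio check can be sketched as follows, using the default bounds from the constructor signature. The function name `ratio_in_range` is hypothetical, and treating the bounds as inclusive is an assumption. `max_height` and `max_width` are left out because this page does not say how they are applied.

```python
def ratio_in_range(height: int, width: int,
                   max_h_div_w_ratio: float = 17 / 16,
                   min_h_div_w_ratio: float = 8 / 16) -> bool:
    """Return True when height / width lies within the allowed band."""
    if height <= 0 or width <= 0:
        return False  # missing or invalid resolution metadata
    ratio = height / width
    # Inclusive bounds are an assumption of this sketch.
    return min_h_div_w_ratio <= ratio <= max_h_div_w_ratio


landscape_ok = ratio_in_range(1080, 1920)  # ratio 0.5625
portrait_ok = ratio_in_range(1920, 1080)   # ratio about 1.78
```

With the defaults, a 1920x1080 landscape frame falls inside the band and the same frame rotated to portrait falls outside it.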
fastvideo.dataset.preprocessing_datasets.TextDataset
Bases: IterableDataset, Stateful
Text-only dataset for processing prompts from a simple text file.
Assumes that data_merge_path is a text file with one prompt per line:
A cat playing with a ball
A dog running in the park
A person cooking dinner
...
This dataset processes text data through text encoding stages only.
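Reading such a prompt file takes only a few lines. The sketch below is not the TextDataset implementation: `read_prompts` is a hypothetical helper, and skipping blank lines is an assumption made for this example.

```python
import tempfile
from pathlib import Path
from typing import Iterator, Union


def read_prompts(path: Union[str, Path]) -> Iterator[str]:
    """Yield one stripped prompt per non-empty line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            prompt = line.strip()
            if prompt:  # assumption: blank lines are ignored
                yield prompt


# Write a small prompt file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "prompts.txt"
    prompt_file.write_text(
        "A cat playing with a ball\n"
        "A dog running in the park\n"
        "\n"
        "A person cooking dinner\n",
        encoding="utf-8",
    )
    prompts = list(read_prompts(prompt_file))
```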
Functions
fastvideo.dataset.preprocessing_datasets.TextDataset.__iter__
Iterator for the dataset.
fastvideo.dataset.preprocessing_datasets.TextDataset.load_state_dict
fastvideo.dataset.preprocessing_datasets.TextDataset.state_dict
fastvideo.dataset.preprocessing_datasets.TextEncodingStage
Bases: DatasetStage
Stage for text tokenization and encoding.
Functions
fastvideo.dataset.preprocessing_datasets.TextEncodingStage.process
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Process text data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch with caption information | required |
Returns:
| Type | Description |
|---|---|
| PreprocessBatch | Batch with encoded text information |
fastvideo.dataset.preprocessing_datasets.VideoCaptionMergedDataset
VideoCaptionMergedDataset(data_merge_path: str, args, transform, temporal_sample, transform_topcrop, start_idx: int = 0, seed: int = 42)
Bases: IterableDataset, Stateful
Merged dataset for video and caption data with stage-based processing.
Assumes that data_merge_path is a txt file with the following format:
The folder should contain videos.
The json file should be a list of dictionaries with the following format:
    [
        {
            "path": "1gGQy4nxyUo-Scene-016.mp4",
            "resolution": {
                "width": 1920,
                "height": 1080
            },
            "size": 2439112,
            "fps": 25.0,
            "duration": 6.88,
            "num_frames": 172,
            "cap": [
                "A watermelon wearing a helmet is crushed by a hydraulic press, causing it to flatten and burst open."
            ]
        },
        ...
    ]
This dataset processes video and image data through a series of stages:
- Data validation
- Resolution filtering
- Frame sampling
- Transformation
- Text encoding
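The first two stages in the list above can be approximated on raw annotation records like this. The first record is the example entry from the JSON format; the second is invented for illustration so that one item gets filtered out. `is_valid` and `has_resolution` are hypothetical checks, not the library's validation rules.

```python
import json

ANNOTATION_JSON = """
[
  {"path": "1gGQy4nxyUo-Scene-016.mp4",
   "resolution": {"width": 1920, "height": 1080},
   "size": 2439112, "fps": 25.0, "duration": 6.88, "num_frames": 172,
   "cap": ["A watermelon wearing a helmet is crushed by a hydraulic press, causing it to flatten and burst open."]},
  {"path": "missing-caption.mp4",
   "resolution": {"width": 1920, "height": 1080},
   "size": 1, "fps": 25.0, "duration": 1.0, "num_frames": 25,
   "cap": []}
]
"""


def is_valid(record: dict) -> bool:
    # Hypothetical validation: a usable item needs a path and a caption.
    return bool(record.get("path")) and bool(record.get("cap"))


def has_resolution(record: dict) -> bool:
    # Hypothetical check: width and height must both be positive.
    res = record.get("resolution") or {}
    return res.get("width", 0) > 0 and res.get("height", 0) > 0


# Filters run in the order the stages are listed; later stages
# (sampling, transformation, text encoding) would follow on `kept`.
FILTERS = [is_valid, has_resolution]

records = json.loads(ANNOTATION_JSON)
kept = [r for r in records if all(check(r) for check in FILTERS)]
```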
fastvideo.dataset.preprocessing_datasets.VideoTransformStage
Bases: DatasetStage
Stage for video data transformation.
Functions
fastvideo.dataset.preprocessing_datasets.VideoTransformStage.process
process(batch: PreprocessBatch, **kwargs) -> PreprocessBatch
Transform video data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | PreprocessBatch | Dataset batch with video information | required |
Returns:
| Type | Description |
|---|---|
| PreprocessBatch | Batch with transformed video tensor |