Profiling FastVideo
!!! warning
    Profiling is intended only for FastVideo developers and maintainers who need to understand how much time is spent in different parts of the codebase. End users should never turn on profiling, because it significantly slows down inference.
Profiling with PyTorch
FastVideo exposes a process-wide torch profiler that you can enable via environment variables. Set FASTVIDEO_TORCH_PROFILER_DIR to an absolute directory path to start collecting traces, and specify the regions you want recorded with FASTVIDEO_TORCH_PROFILE_REGIONS:
FASTVIDEO_TORCH_PROFILER_DIR=/mnt/traces/fastvideo \
FASTVIDEO_TORCH_PROFILE_REGIONS="profiler_region_model_loading,profiler_region_training_step"
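If you launch jobs from a Python script rather than a shell, the same two variables can be set on the child process environment. The following is a minimal sketch under stated assumptions: `build_profiling_env` is a hypothetical helper and not part of FastVideo, the absolute-path check is this sketch's own guard, and the command you run with the resulting environment is left to you.

```python
import os


def build_profiling_env(trace_dir, regions, base_env=None):
    """Return a copy of the environment with the two profiler variables set.

    Hypothetical helper for illustration; it is not part of FastVideo.
    """
    # The docs ask for an absolute directory, so reject relative paths early.
    if not os.path.isabs(trace_dir):
        raise ValueError(f"trace_dir must be an absolute path, got {trace_dir!r}")
    if not regions:
        raise ValueError("at least one region name is required")

    env = dict(os.environ if base_env is None else base_env)
    env["FASTVIDEO_TORCH_PROFILER_DIR"] = trace_dir
    # Regions are passed as one comma-separated string, as in the example above.
    env["FASTVIDEO_TORCH_PROFILE_REGIONS"] = ",".join(regions)
    return env


env = build_profiling_env(
    "/mnt/traces/fastvideo",
    ["profiler_region_model_loading", "profiler_region_training_step"],
    base_env={},  # start from an empty environment to keep the example small
)
print(env["FASTVIDEO_TORCH_PROFILE_REGIONS"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` together with your own launch command.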
All profiled regions must be registered in fastvideo.profiler; the current list includes:
- profiler_region_model_loading: pipeline/module loading
- profiler_region_inference_pre_denoising
- profiler_region_inference_denoising
- profiler_region_inference_post_denoising
- profiler_region_training_checkpoint_saving
- profiler_region_training_dit
- profiler_region_training_validation
- profiler_region_training_epoch
- profiler_region_training_step
- profiler_region_training_backward
- profiler_region_training_optimizer
- profiler_region_distillation_teacher_forward
- profiler_region_distillation_student_forward
- profiler_region_distillation_loss
- profiler_region_distillation_update
While profiling is enabled, FastVideo records additional annotations:
- fastvideo.region::<name> spans are emitted when entering a region.
- fastvideo.profiler.enable_collection / fastvideo.profiler.disable_collection events mark when torch profiler collection is toggled on or off.
Only one profiler instance is created per process; subsequent pipelines reuse the same controller. If an entry in FASTVIDEO_TORCH_PROFILE_REGIONS does not match a registered region (for example, a misspelled name), FastVideo logs a warning and ignores that entry.
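The warn-and-ignore behavior for unknown region names can be pictured with a small parser. This is only an illustrative sketch, not FastVideo's implementation: `parse_regions`, `KNOWN_REGIONS`, and the logger name are hypothetical, and trimming whitespace around names is an assumption of this sketch.

```python
import logging

logger = logging.getLogger("profiling_sketch")  # hypothetical logger name

# Region names copied from the list above.
KNOWN_REGIONS = frozenset({
    "profiler_region_model_loading",
    "profiler_region_inference_pre_denoising",
    "profiler_region_inference_denoising",
    "profiler_region_inference_post_denoising",
    "profiler_region_training_checkpoint_saving",
    "profiler_region_training_dit",
    "profiler_region_training_validation",
    "profiler_region_training_epoch",
    "profiler_region_training_step",
    "profiler_region_training_backward",
    "profiler_region_training_optimizer",
    "profiler_region_distillation_teacher_forward",
    "profiler_region_distillation_student_forward",
    "profiler_region_distillation_loss",
    "profiler_region_distillation_update",
})


def parse_regions(raw):
    """Split a comma-separated region string, dropping unknown names with a warning."""
    selected = []
    for name in raw.split(","):
        name = name.strip()  # assumption: surrounding whitespace is not significant
        if not name:
            continue
        if name not in KNOWN_REGIONS:
            logger.warning("Ignoring unknown profiler region: %s", name)
            continue
        selected.append(name)
    return selected


# The second entry is deliberately misspelled, so only the first one survives.
result = parse_regions("profiler_region_training_step,profiler_region_trainng_dit")
print(result)
```

Running a check like this before submitting a long job is a cheap way to catch a typo that would otherwise leave a region silently unprofiled.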
Additional knobs:
- FASTVIDEO_TORCH_PROFILER_RECORD_SHAPES
- FASTVIDEO_TORCH_PROFILER_WITH_PROFILE_MEMORY
- FASTVIDEO_TORCH_PROFILER_WITH_STACK
- FASTVIDEO_TORCH_PROFILER_WITH_FLOPS
Traces can be visualized using https://ui.perfetto.dev/.
Best Practices
- Keep the profiled step count small; traces can be large and slow down job shutdown while the profiler flushes data.
- After profiling, clean up trace directories to avoid filling disks.
- When adding new regions, register them in fastvideo.profiler and wrap the corresponding code block in a with self.profiler_controller.region("your_region"): block, or apply the @profile_region decorator.
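The two ways of marking a region mentioned in the last item can be sketched with stand-in classes. Everything below is hypothetical and exists only to show the pattern: `ToyProfilerController`, `ToyPipeline`, and this `profile_region` are not FastVideo's classes, and the real decorator's signature may differ, so check fastvideo.profiler before relying on it.

```python
import contextlib
import functools


class ToyProfilerController:
    """Hypothetical stand-in that only records region enter/exit events."""

    def __init__(self):
        self.events = []

    @contextlib.contextmanager
    def region(self, name):
        self.events.append(("enter", name))
        try:
            yield
        finally:
            # Always record the exit, even if the wrapped block raises.
            self.events.append(("exit", name))


def profile_region(name):
    """Hypothetical decorator: run a method inside self.profiler_controller.region(name)."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            with self.profiler_controller.region(name):
                return method(self, *args, **kwargs)
        return wrapper
    return decorator


class ToyPipeline:
    def __init__(self):
        self.profiler_controller = ToyProfilerController()

    def load(self):
        # Context-manager form: only this block is attributed to the region.
        with self.profiler_controller.region("your_region"):
            return "loaded"

    @profile_region("your_region")
    def step(self):
        # Decorator form: the whole method body is attributed to the region.
        return "stepped"


pipeline = ToyPipeline()
pipeline.load()
pipeline.step()
print(pipeline.profiler_controller.events)
```

The context-manager form suits a region that covers part of a method, while the decorator form is more compact when a region maps onto a whole method.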