Skip to content

hunyuan15

Classes

fastvideo.configs.pipelines.hunyuan15.Hunyuan15T2V480PConfig dataclass

Hunyuan15T2V480PConfig(model_path: str = '', pipeline_config_path: str | None = None, embedded_cfg_scale: float = 6.0, flow_shift: int = 5, disable_autocast: bool = False, is_causal: bool = False, dit_config: DiTConfig = HunyuanVideo15Config(), dit_precision: str = 'bf16', vae_config: VAEConfig = Hunyuan15VAEConfig(), vae_precision: str = 'fp16', vae_tiling: bool = True, vae_sp: bool = True, image_encoder_config: EncoderConfig = EncoderConfig(), image_encoder_precision: str = 'fp32', text_encoder_configs: tuple[EncoderConfig, ...] = (lambda: (Qwen2_5_VLConfig(), T5Config()))(), text_encoder_precisions: tuple[str, ...] = (lambda: ('bf16', 'fp32'))(), preprocess_text_funcs: tuple[Callable[[Any], Any], ...] = (lambda: (qwen_preprocess_text, byt5_preprocess_text))(), postprocess_text_funcs: tuple[Callable[..., Any], ...] = (lambda: (qwen_postprocess_text, byt5_postprocess_text))(), pos_magic: str | None = None, neg_magic: str | None = None, timesteps_scale: bool | None = None, mask_strategy_file_path: str | None = None, STA_mode: STA_Mode = STA_INFERENCE, skip_time_steps: int = 15, dmd_denoising_steps: list[int] | None = None, ti2v_task: bool = False, boundary_ratio: float | None = None, text_encoder_crop_start: int = PROMPT_TEMPLATE_TOKEN_LENGTH, text_encoder_max_lengths: tuple[int, ...] = (lambda: (1000 + PROMPT_TEMPLATE_TOKEN_LENGTH, 256))())

Bases: PipelineConfig

Base configuration for HunYuan pipeline architecture.

fastvideo.configs.pipelines.hunyuan15.Hunyuan15T2V720PConfig dataclass

Hunyuan15T2V720PConfig(model_path: str = '', pipeline_config_path: str | None = None, embedded_cfg_scale: float = 6.0, flow_shift: int = 9, disable_autocast: bool = False, is_causal: bool = False, dit_config: DiTConfig = HunyuanVideo15Config(), dit_precision: str = 'bf16', vae_config: VAEConfig = Hunyuan15VAEConfig(), vae_precision: str = 'fp16', vae_tiling: bool = True, vae_sp: bool = True, image_encoder_config: EncoderConfig = EncoderConfig(), image_encoder_precision: str = 'fp32', text_encoder_configs: tuple[EncoderConfig, ...] = (lambda: (Qwen2_5_VLConfig(), T5Config()))(), text_encoder_precisions: tuple[str, ...] = (lambda: ('bf16', 'fp32'))(), preprocess_text_funcs: tuple[Callable[[Any], Any], ...] = (lambda: (qwen_preprocess_text, byt5_preprocess_text))(), postprocess_text_funcs: tuple[Callable[..., Any], ...] = (lambda: (qwen_postprocess_text, byt5_postprocess_text))(), pos_magic: str | None = None, neg_magic: str | None = None, timesteps_scale: bool | None = None, mask_strategy_file_path: str | None = None, STA_mode: STA_Mode = STA_INFERENCE, skip_time_steps: int = 15, dmd_denoising_steps: list[int] | None = None, ti2v_task: bool = False, boundary_ratio: float | None = None, text_encoder_crop_start: int = PROMPT_TEMPLATE_TOKEN_LENGTH, text_encoder_max_lengths: tuple[int, ...] = (lambda: (1000 + PROMPT_TEMPLATE_TOKEN_LENGTH, 256))())

Bases: Hunyuan15T2V480PConfig

Base configuration for HunYuan pipeline architecture.

Functions

fastvideo.configs.pipelines.hunyuan15.extract_glyph_texts

extract_glyph_texts(prompt: str) -> str | None

Extract glyph texts from prompt using regex pattern.

Parameters:

Name Type Description Default
prompt str

Input prompt string

required

Returns:

Type Description
str | None

List of extracted glyph texts

Source code in fastvideo/configs/pipelines/hunyuan15.py
def extract_glyph_texts(prompt: str) -> str | None:
    """
    Extract glyph texts from prompt using regex pattern.

    Args:
        prompt: Input prompt string

    Returns:
        List of extracted glyph texts
    """
    pattern = r"\"(.*?)\"|“(.*?)”"
    matches = re.findall(pattern, prompt)
    result = [match[0] or match[1] for match in matches]
    result = list(dict.fromkeys(result)) if len(result) > 1 else result

    if result:
        formatted_result = ". ".join([f'Text "{text}"'
                                      for text in result]) + ". "
    else:
        formatted_result = None

    return formatted_result

fastvideo.configs.pipelines.hunyuan15.format_text_input

format_text_input(prompt: str, system_message: str) -> list[dict[str, Any]]

Apply text to template.

Parameters:

Name Type Description Default
prompt List[str]

Input text.

required
system_message str

System message.

required

Returns:

Type Description
list[dict[str, Any]]

List[Dict[str, Any]]: List of chat conversation.

Source code in fastvideo/configs/pipelines/hunyuan15.py
def format_text_input(prompt: str, system_message: str) -> list[dict[str, Any]]:
    """
    Apply text to template.

    Args:
        prompt (List[str]): Input text.
        system_message (str): System message.

    Returns:
        List[Dict[str, Any]]: List of chat conversation.
    """

    template = [{
        "role": "system",
        "content": system_message
    }, {
        "role": "user",
        "content": prompt if prompt else " "
    }]

    return template