abstract
Classes
fastvideo.attention.backends.abstract.AttentionBackend
fastvideo.attention.backends.abstract.AttentionImpl
AttentionImpl(num_heads: int, head_size: int, softmax_scale: float, causal: bool = False, num_kv_heads: int | None = None, prefix: str = '', **extra_impl_args)
Source code in fastvideo/attention/backends/abstract.py
Functions
fastvideo.attention.backends.abstract.AttentionImpl.postprocess_output
Postprocess the output tensor after the attention operation.
Default implementation returns the tensor unchanged. Subclasses can override this to implement custom postprocessing like untiling, scaling, or other transformations.
Called BEFORE all_to_all for distributed attention.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output | Tensor | The output tensor from the attention operation | required |
| attn_metadata | T | Metadata for the attention operation | required |
Returns:
| Type | Description |
|---|---|
| Tensor | Postprocessed output tensor |
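As an illustration of the override pattern described above, here is a minimal sketch that trims padding in `postprocess_output`. It does not import FastVideo: `AttentionImplSketch`, `PaddedMetadata`, `TrimPaddingImpl`, and the `valid_len` field are hypothetical stand-ins, and plain Python lists stand in for tensors. Only the hook name and its `(output, attn_metadata)` parameters come from the documentation.

```python
from dataclasses import dataclass


@dataclass
class PaddedMetadata:
    """Hypothetical metadata; `valid_len` is an assumption, not a FastVideo field."""
    current_timestep: int
    valid_len: int


class AttentionImplSketch:
    """Hypothetical stand-in for AttentionImpl with the documented default hook."""

    def postprocess_output(self, output, attn_metadata):
        # Default behaviour described in the docs: return the output unchanged.
        return output


class TrimPaddingImpl(AttentionImplSketch):
    """Example subclass that drops padded rows after attention."""

    def postprocess_output(self, output, attn_metadata):
        # Keep only the first `valid_len` rows; the rest are padding.
        return output[: attn_metadata.valid_len]


padded = [[1.0], [2.0], [0.0], [0.0]]  # a list of rows stands in for a tensor
meta = PaddedMetadata(current_timestep=0, valid_len=2)

default_out = AttentionImplSketch().postprocess_output(padded, meta)
trimmed_out = TrimPaddingImpl().postprocess_output(padded, meta)
```

With a real backend the same idea would presumably operate on a tensor slice rather than a list slice.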
fastvideo.attention.backends.abstract.AttentionImpl.preprocess_qkv
Preprocess the QKV tensor before performing the attention operation.
Default implementation returns the tensor unchanged. Subclasses can override this to implement custom preprocessing like reshaping, tiling, scaling, or other transformations.
Called AFTER all_to_all for distributed attention.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| qkv | Tensor | The query-key-value tensor | required |
| attn_metadata | T | Metadata for the attention operation | required |
Returns:
| Type | Description |
|---|---|
| Tensor | Processed QKV tensor |
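The ordering notes on the two hooks (`preprocess_qkv` runs after all_to_all, `postprocess_output` runs before all_to_all) can be pictured with the sketch below. It is an assumption about how a distributed attention layer might sequence the calls, not FastVideo's actual forward pass: `RecordingImpl`, `all_to_all`, `attention`, and `distributed_forward` are hypothetical stubs that only record the call order, and a list stands in for the tensor.

```python
calls = []  # records the order in which each step runs


class RecordingImpl:
    """Hypothetical stand-in; only the two hook names come from the docs."""

    def preprocess_qkv(self, qkv, attn_metadata):
        calls.append("preprocess_qkv")
        return qkv  # default behaviour: unchanged

    def postprocess_output(self, output, attn_metadata):
        calls.append("postprocess_output")
        return output  # default behaviour: unchanged


def all_to_all(x):
    # Stub for the collective exchange; a real one would redistribute data.
    calls.append("all_to_all")
    return x


def attention(qkv):
    # Stub for the attention kernel itself.
    calls.append("attention")
    return qkv


def distributed_forward(impl, qkv, attn_metadata=None):
    qkv = all_to_all(qkv)                          # exchange first
    qkv = impl.preprocess_qkv(qkv, attn_metadata)  # hook runs AFTER all_to_all
    out = attention(qkv)
    out = impl.postprocess_output(out, attn_metadata)  # hook runs BEFORE all_to_all
    return all_to_all(out)                         # exchange back


result = distributed_forward(RecordingImpl(), [1.0, 2.0, 3.0])
```

Under this assumed sequencing, any layout change made in `preprocess_qkv` stays local to the attention kernel, provided `postprocess_output` undoes it before the second exchange.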
fastvideo.attention.backends.abstract.AttentionMetadata
dataclass
AttentionMetadata(current_timestep: int)
Attention metadata for prefill and decode batched together.
Functions
fastvideo.attention.backends.abstract.AttentionMetadata.asdict_zerocopy
Similar to dataclasses.asdict, but avoids deepcopying.
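To show what "avoids deepcopying" means in practice, the sketch below compares `dataclasses.asdict` with a shallow conversion built from `dataclasses.fields`. This is one plausible way to write such a method, not necessarily FastVideo's implementation; `MetadataSketch` and its `buffer` field are hypothetical.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import Any


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for AttentionMetadata; `buffer` is an assumption."""
    current_timestep: int
    buffer: list = field(default_factory=list)

    def asdict_zerocopy(self) -> dict[str, Any]:
        # Shallow conversion: each value is the same object held by the dataclass.
        return {f.name: getattr(self, f.name) for f in fields(self)}


meta = MetadataSketch(current_timestep=3, buffer=[0.1, 0.2])

shallow = meta.asdict_zerocopy()
deep = asdict(meta)  # recursively copies containers and other values

shares_buffer = shallow["buffer"] is meta.buffer
copied_buffer = deep["buffer"] is not meta.buffer
```

The shallow form matters when fields hold large tensors, since copying them on every conversion would waste both time and memory.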
fastvideo.attention.backends.abstract.AttentionMetadataBuilder
Abstract class for attention metadata builders.
The constructor creates the builder and stores some configuration and parameters.
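A builder of this kind could look roughly like the sketch below. The method name `build`, the `window` parameter, and all class names are hypothetical and chosen only for illustration; the documentation above confirms only that the class is abstract and that its constructor stores configuration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class WindowedMetadata:
    """Hypothetical metadata type; `window` is an assumption."""
    current_timestep: int
    window: int


class MetadataBuilderSketch(ABC):
    """Hypothetical stand-in for AttentionMetadataBuilder."""

    @abstractmethod
    def build(self, current_timestep: int) -> WindowedMetadata:
        """Produce metadata for one attention call (assumed method name)."""


class WindowedBuilder(MetadataBuilderSketch):
    def __init__(self, window: int) -> None:
        # Store configuration once so every build() call can reuse it.
        self.window = window

    def build(self, current_timestep: int) -> WindowedMetadata:
        return WindowedMetadata(current_timestep=current_timestep, window=self.window)


builder = WindowedBuilder(window=8)
metadata = builder.build(current_timestep=5)
```

Separating construction from `build` lets per-model configuration be set once while per-step values such as `current_timestep` are supplied on each call.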