# Video Sparse Attention (VSA)

A sparse attention mechanism in which each query attends only to the top-k highest-scoring blocks of keys and values rather than the full sequence.
## Installation
VSA is included in the fastvideo-kernel package. See the main Attention page for build instructions.
## Usage
```python
from fastvideo_kernel import video_sparse_attn

# q, k, v: [batch_size, num_heads, seq_len, head_dim]
# variable_block_sizes: number of valid tokens per block
# topk: number of blocks each query attends to
output = video_sparse_attn(
    q, k, v,
    variable_block_sizes=block_sizes,
    topk=32,
)
```
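To make the block-selection idea concrete, here is a minimal single-head NumPy sketch of top-k block-sparse attention. It is illustrative only: the real kernel is a fused GPU implementation, and the pooling and scoring details here (mean-pooled queries/keys, fixed block size, no batching) are simplifying assumptions, not the library's actual algorithm.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_block_sparse_attention(q, k, v, block_size, topk):
    """Toy top-k block-sparse attention for one head.

    q, k, v: [seq_len, head_dim]; seq_len must be divisible by block_size.
    Each query block attends only to the topk key blocks whose
    mean-pooled keys score highest against its mean-pooled queries.
    """
    seq_len, d = q.shape
    n_blocks = seq_len // block_size
    qb = q.reshape(n_blocks, block_size, d)
    kb = k.reshape(n_blocks, block_size, d)
    vb = v.reshape(n_blocks, block_size, d)

    # Coarse block-to-block scores from mean-pooled queries and keys.
    q_pool = qb.mean(axis=1)          # [n_blocks, d]
    k_pool = kb.mean(axis=1)          # [n_blocks, d]
    block_scores = q_pool @ k_pool.T  # [n_blocks, n_blocks]

    out = np.empty_like(qb)
    for i in range(n_blocks):
        sel = np.argsort(block_scores[i])[-topk:]  # indices of top-k key blocks
        ks = kb[sel].reshape(-1, d)                # gather selected keys
        vs = vb[sel].reshape(-1, d)                # gather selected values
        attn = softmax(qb[i] @ ks.T / np.sqrt(d))  # dense attention within the selection
        out[i] = attn @ vs
    return out.reshape(seq_len, d)
```

With `topk` equal to the total number of key blocks, the sketch reduces to ordinary dense attention, which is a useful sanity check when experimenting with block sizes.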
## Citation
If you use Video Sparse Attention in your research, please cite: