| 1-2 | Basics: Deep learning, computational graph, autodiff, ML frameworks |
| 3 | GPUs, CUDA, Collective communication |
| 4 | Graph and memory optimizations |
| 4 | Guest lecture: ML compilers |
| 5 | Data and model parallelism, auto-parallelization |
| 6 | Transformers, LLMs, MoE |
| 6 | Guest lecture: LLM pretraining and open science |
| 7 | LLM training: flash attention, quantization |
| 8 | LLM inference and serving: paged attention, continuous batching, speculative decoding |
| 9 | Guest lecture: fast inference |
| 9 | Scaling law, test-time compute, reasoning |
| 10 | LLM + X (X = RAG, search, multi-modality, etc.) |
| 10 | Guest lecture: LLM, tool use, and agents |
| 10 | Final exam reviews |
| 11 | Final exam |