| Week | Topic |
| --- | --- |
| 1-2 | Basics: deep learning, computational graphs, autodiff, ML frameworks |
| 3 | GPUs, CUDA, collective communication |
| 4 | Graph and memory optimizations |
| 4 | Guest lecture: ML compilers |
| 5 | Data and model parallelism, auto-parallelization |
| 6 | Transformers, LLMs, MoE |
| 6 | Guest lecture: LLM pretraining and open science |
| 7 | LLM training: flash attention, quantization |
| 8 | LLM inference and serving: paged attention, continuous batching, speculative decoding |
| 9 | Guest lecture: fast inference |
| 9 | Scaling laws, test-time compute, reasoning |
| 10 | LLM + X (X = RAG, search, multi-modality, etc.) |
| 10 | Guest lecture: LLMs, tool use, and agents |
| 10 | Final exam review |
| 11 | Final exam |