CSE 234: Data Systems for Machine Learning
Instructor: Hao Zhang, UC San Diego, Winter 2025
Announcements
Week 6 Announcements
- We have just released PA2, due March 4th, 2025. Please start early!
Week 1
- Jan 7
- 1 Introduction
- Slides
- Survey: Beginning of Quarter Survey (Due: end of Week 2, 1/19)
- Readings (Due 1/14)
- Required: 1.1 - MLSys: Intro, 1.2 - DNN
- Optional: 1.3 - Petuum, 1.4 - Systems Challenges for AI
- Jan 9
- 2 Basics: Modern DL, computational graph, frameworks
- Slides • Recording • Scribe Note
Week 2
- Jan 14
- 3 Basics: autodiff, ML system architecture overview
- Slides • Recording • Scribe Note
- Readings (Due 1/21)
- Jan 16
- 4 Tensor format, matmul deep dive, accelerators
- Slides • Recording • Scribe Note
Week 3
- Jan 21
- 5 GPUs and CUDA
- Slides • Recording • Scribe Note
- Readings (Due 1/28)
- Required: 3.1 - GPU Performance, 3.2 - MI300X vs H100
- Optional: 3.3 - Moore’s Law, 3.4 - The Future of Moore’s Law
- Jan 23
- 6 GPU matmul, operator compilation
- Slides • Recording • Scribe Note
Week 4
- Jan 28
- 7 Triton, graph optimization and compilation
- Slides • Recording • Scribe Note
- Readings (Due 2/4)
- Required: 4.1 - TVM, 4.2 - Triton
- Optional: 4.3 - TASO, 4.4 - DL Compiler, 4.5 - Tensor Comprehensions
- Jan 30
- 8 Memory
- Slides • Recording • Scribe Note
Week 5
- Feb 4
- Readings (Due 2/11)
- Required: 5.1 - Deep Compression, 5.2 - Quantization Survey
- Optional: 5.3 - AWQ, 5.4 - QLoRA, 5.5 - Scaling Laws for Mixed Quantization
- Feb 6
Week 6
- Feb 11
- Readings (Due 2/18)
- Required: 6.1 - ML Parallelism Blog, 6.2 - Megatron
- Optional: 6.3 - PyTorch DDP, 6.4 - Parameter Server, 6.5 - Megatron v2
- Feb 13
- 12 Parallelization - 2, collective communication
- Slides • Recording • Scribe Note
Week 7
- Feb 18
- 13 Parallelization - 3, data, inter- and intra-op parallelism
- Slides • Recording • Scribe Note
- Readings (Due 2/25)
- Required: 7.1 - GPipe, 7.2 - Alpa
- Optional: 7.3 - Megatron v3, 7.4 - PipeDream, 7.5 - Chimera, 7.6 - GShard