MIT 6.S978: Deep Generative Models, Fall 2024

6.S978 Deep Generative Models

MIT EECS, Fall 2024

Course Description

This is a seminar course that introduces concepts, formulations, and applications of deep generative models. It covers scenarios mainly in computer vision (images, videos, geometry) and related areas such as robotics, biology, and materials science. It focuses on the common paradigms and methods shared across different problems and disciplines. Core topics include variational autoencoders, autoregressive models, generative adversarial nets, and diffusion models, as well as their applications. It covers both foundational frameworks and the latest research frontiers.

This is a graduate-level course. The target audience is graduate students who are conducting (or plan to conduct) research on deep generative models.

Prereqs: DL 6.S898 (now 6.7960) or equivalent, and CV 6.8300/6.8301 or NLP 6.8610/6.8611


People

Instructor: Kaiming He
OH: Monday 11am-12pm, 45-701H

TA: Minghao Guo
OH: Friday 12:30-1:30pm, 32-262

Logistics

Lectures and class meetings: 1:00 pm - 2:30 pm every Tuesday and Thursday in 26-168

The course will be a mix of instructor-presented lectures, guest lectures, and student seminars.

The student seminars will include paper reading, presentation, and discussion.

Each seminar session will have 3 papers presented by students.

Expectations

Students will be expected to:

Attend all lectures and seminars

Complete bi-weekly problem sets

Present one paper in a seminar session: 20 min presentation + 10 min discussion and Q&A

Upload your presentation slides one day before presenting (before 1pm on Mon/Wed)

Complete a final project and a project presentation (max 2 students in a team)

Grading Policy

30%: Bi-weekly problem sets and participation (including attendance)

Possible grades: Good, Fair, Poor, None
Two lowest grades will be dropped
Late assignments not accepted (except for serious medical/life issues, with support from GradSupport for grad students)

30%: Seminars

5%: Register and upload on time.
20%: Presentation: clarity, depth, engagement.
5%: Discussion and Q&A.

40%: Final project (max 2 students in a team)

5%: Proposal
15%: Final presentation
20%: Final blog

Generative Modeling by Estimating Gradients of the Data Distribution, by Yang Song
Perspectives on diffusion, by Sander Dieleman
Diffusion Models for Video Generation, by Lilian Weng
The AI Revolution: The Road to Superintelligence, by Maarten Steinbuch
Understanding LSTM Networks, by Christopher Olah
The Unreasonable Effectiveness of Recurrent Neural Networks, by Andrej Karpathy