6.S978 Deep Generative Models

MIT EECS, Fall 2024



This schedule is preliminary and subject to change as the term evolves.


Date Topics Course Materials Assignments
Week 1
Thurs 09/05/2024 Lecture: Introduction

slides
Week 2
Tues 09/10/2024 Reading: Modeling Image Prior Reading List:
  1. D. Zoran and Y. Weiss, From Learning Models of Natural Image Patches to Whole Image Restoration, ICCV 2011 [PDF]
  2. D. Zoran and Y. Weiss, Natural Images, Gaussian Mixtures and Dead Leaves, NeurIPS 2012 [PDF]
  3. D. Ulyanov et al., Deep Image Prior, CVPR 2018 [PDF]
(Optional) Recommended readings:
  1. A. A. Efros and T. K. Leung, Texture Synthesis by Non-parametric Sampling, 1999 [PDF]
  2. C. Barnes et al., PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, 2009 [PDF]
  3. W. T. Freeman et al., Example-Based Super-Resolution, 2002 [PDF]
Assignment #1
Thurs 09/12/2024 Lecture: Variational Autoencoder (VAE)

slides
Week 3
Tues 09/17/2024 Reading: Normalizing Flows Reading List:
  1. D. Rezende and S. Mohamed, Variational Inference with Normalizing Flows, ICML 2015 [PDF]
  2. D. P. Kingma and P. Dhariwal, Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018 [PDF]
  3. J. Behrmann et al., Invertible residual networks, ICML 2019 [PDF]
(Optional) Recommended readings:
  1. Lilian Weng's post on "Flow-based Deep Generative Models" [Link]
Thurs 09/19/2024 Lecture: Autoregressive (AR) Models

slides
Week 4
Tues 09/24/2024 Reading: Autoregressive (AR) Models Reading List:
  1. Y. Bengio and S. Bengio, Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks, NIPS 1999 [PDF]
  2. A. Van Den Oord et al., Pixel Recurrent Neural Networks, ICML 2016 [PDF]
  3. D. P. Kingma et al., Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016 [PDF]
Assignment #1 Due
Assignment #2
Thurs 09/26/2024 Reading: AR and tokenizers Reading List:
  1. L. Yu et al., Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation, ICLR 2024 [PDF]
  2. K. Tian et al., Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction, arXiv [PDF]
  3. Q. Yu et al., An Image is Worth 32 Tokens for Reconstruction and Generation, arXiv [PDF]
(Optional) Recommended readings:
  1. M. Chen et al., Generative Pretraining from Pixels, ICML 2020 [PDF]
  2. A. Ramesh et al., Zero-Shot Text-to-Image Generation (DALLE1), ICML 2021 [PDF]
  3. F. Mentzer et al., Finite Scalar Quantization: VQ-VAE Made Simple, arXiv [PDF]
Week 5
Tues 10/01/2024 Reading: AR and Diffusion Reading List:
  1. T. Li et al., Autoregressive Image Generation without Vector Quantization, arXiv [PDF]
  2. C. Zhou et al., Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model, arXiv [PDF]
  3. J. Xie et al., Show-o: One Single Transformer to Unify Multimodal Understanding and Generation, arXiv [PDF]
(Optional) Recommended readings:
  1. E. Hoogeboom et al., Autoregressive Diffusion Models, ICLR 2022 [PDF]
  2. B. Chen et al., Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion, arXiv [PDF]
Thurs 10/03/2024 Lecture: Generative Adversarial Network (GAN)

slides
Week 6
Tues 10/08/2024 Reading: GAN in the era of diffusion Reading List:
  1. A. Sauer et al., StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis, ICML 2023 [PDF]
  2. M. Kang et al., Scaling up GANs for Text-to-Image Synthesis, CVPR 2023 [PDF]
  3. M. Kang et al., Distilling Diffusion Models into Conditional GANs, ECCV 2024 [PDF]
Assignment #2 Due
Assignment #3
Thurs 10/10/2024 Reading: GAN in the era of diffusion Reading List:
  1. N. Huang et al., The GAN is dead; long live the GAN! A Modern GAN Baseline, ICML 2024 [PDF]
  2. Z. Wang et al., Diffusion-GAN: Training GANs with Diffusion, ICLR 2023 [PDF]
  3. S. Asokan et al., GANs Settle Scores!, arXiv [PDF]
Week 7
Tues 10/15/2024 No class (student holiday)
Thurs 10/17/2024 Lecture: Energy-based Models, Score matching, Diffusion Models

slides
Week 8
Tues 10/22/2024 Reading: Diffusion Models Reading List:
  1. J. Ho and T. Salimans, Classifier-Free Diffusion Guidance, NeurIPS 2021 [PDF]
  2. T. Salimans and J. Ho, Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022 [PDF]
  3. E. Hoogeboom et al., Simple diffusion: End-to-end diffusion for high resolution images, ICML 2023 [PDF]
(Optional) Recommended readings:
  1. R. Rombach et al., High-Resolution Image Synthesis with Latent Diffusion Models, CVPR 2022 [PDF]
  2. T. Karras et al., Elucidating the Design Space of Diffusion-Based Generative Models, NeurIPS 2022 [PDF]
  3. A. Ramesh et al., Hierarchical Text-Conditional Image Generation with CLIP Latents, arXiv [PDF]
  4. T. Chen, On the Importance of Noise Scheduling for Diffusion Models, arXiv [PDF]
  5. Sander Dieleman's post on "Diffusion is spectral autoregression" [Link]
Assignment #3 Due
Assignment #4
Thurs 10/24/2024 Reading: Diffusion beyond Denoising Reading List:
  1. A. Bansal et al., Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise, NeurIPS 2023 [PDF]
  2. S. Rissanen et al., Generative Modelling With Inverse Heat Dissipation, ICLR 2023 [PDF]
  3. M. Delbracio et al., Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration, TMLR 2023 [PDF]
(Optional) Recommended readings:
  1. G. Daras et al., Soft Diffusion: Score Matching for General Corruptions, TMLR 2023 [PDF]
Week 9
Tues 10/29/2024 Reading: Discrete Diffusion Reading List:
  1. J. Austin et al., Structured Denoising Diffusion Models in Discrete State-Spaces, NeurIPS 2021 [PDF]
  2. S. Gong et al., DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models, ICLR 2023 [PDF]
  3. A. Lou et al., Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution, ICML 2024 [PDF]
Thurs 10/31/2024 Reading: Flow Matching 1 Reading List:
  1. Y. Lipman et al., Flow Matching for Generative Modeling, ICLR 2023 [PDF]
  2. M. S. Albergo et al., Building Normalizing Flows with Stochastic Interpolants, ICLR 2023 [PDF]
  3. X. Liu et al., Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow, ICLR 2023 [PDF]
(Optional) Recommended readings:
  1. T. Fjelde et al., Post on "An Introduction to Flow Matching" [Link]
Week 10
Tues 11/05/2024 Reading: Flow Matching 2 Reading List:
  1. P. Esser et al., Scaling Rectified Flow Transformers for High-Resolution Image Synthesis, ICML 2024 [PDF]
  2. N. Ma et al., SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers, ECCV 2024 [PDF]
  3. I. Gat et al., Discrete Flow Matching, arXiv [PDF]
Assignment #4 Due
Assignment #5
Thurs 11/07/2024 Guest Lecture: Jun-Yan Zhu - Ensuring Data Ownership in Generative Models

slides
Week 11
Tues 11/12/2024 Reading: Application - Videos Reading List:
  1. O. Bar-Tal et al., Lumiere: A Space-Time Diffusion Model for Video Generation, arXiv [PDF]
  2. J. Bruce et al., Genie: Generative Interactive Environments, arXiv [PDF]
  3. The Movie Gen team @ Meta, Movie Gen: A Cast of Media Foundation Models, arXiv [PDF]
Thurs 11/14/2024 Reading: Application - 3D and Geometry Reading List:
  1. B. Poole et al., DreamFusion: Text-to-3D using 2D Diffusion, ICLR 2023 [PDF]
  2. Y. Hong et al., LRM: Large Reconstruction Model for Single Image to 3D, ICLR 2024 [PDF]
  3. Y. Siddiqui et al., MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers, CVPR 2024 [PDF]
(Optional) Recommended readings:
  1. L. Zhang et al., CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets, SIGGRAPH 2024 [PDF]
  2. X. Wei et al., MeshLRM: Large Reconstruction Model for High-Quality Meshes, arXiv [PDF]
  3. T. Shen et al., Flexible Isosurface Extraction for Gradient-Based Mesh Optimization, SIGGRAPH 2023 [PDF]
Week 12
Tues 11/19/2024 Reading: Application - Robotics Reading List:
  1. M. Janner et al., Planning with Diffusion for Flexible Behavior Synthesis, ICML 2022 [PDF]
  2. C. Chi et al., Diffusion Policy: Visuomotor Policy Learning via Action Diffusion, RSS 2023 [PDF]
  3. S. Yang et al., UniSim: Learning Interactive Real-World Simulators, ICLR 2024 [PDF]
(Optional) Recommended readings:
  1. D. Driess et al., PaLM-E: An Embodied Multimodal Language Model, ICML 2023 [PDF]
Assignment #5 Due
Assignment #6
Thurs 11/21/2024 Guest Lecture: Yang Song - Consistency Models
Week 13
Tues 11/26/2024 Reading: Application - Material Science Reading List:
  1. W. Jin et al., Junction Tree Variational Autoencoder for Molecular Graph Generation, ICML 2018 [PDF]
  2. E. Hoogeboom et al., Equivariant Diffusion for Molecule Generation in 3D, ICML 2022 [PDF]
  3. G. Zhou et al., Uni-Mol: A Universal 3D Molecular Representation Learning Framework, ICLR 2023 [PDF]
(Optional) Recommended readings:
  1. M. Xu et al., Geometric Latent Diffusion Models for 3D Molecule Generation, ICML 2023 [PDF]
  2. M. Arts et al., Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics, arXiv [PDF]
Thurs 11/28/2024 No class (Thanksgiving)
Week 14
Tues 12/03/2024 Reading: Application - Protein and Biology Reading List:
  1. J. L. Watson et al., De novo design of protein structure and function with RFdiffusion, Nature [PDF]
  2. J. Abramson et al., Accurate structure prediction of biomolecular interactions with AlphaFold 3, Nature [PDF]
  3. J. B. Ingraham et al., Illuminating protein space with a programmable generative model, Nature [PDF]
Assignment #6 Due
Thurs 12/05/2024 Final Presentation 1
Week 15
Tues 12/10/2024 Final Presentation 2