Date |
Topics |
Course Materials |
Assignments |
Week 1 |
Thurs 09/05/2024 |
Lecture: Introduction |
slides
|
|
Week 2 |
Tues 09/10/2024 |
Reading: Modeling Image Prior |
Reading List:
-
D. Zoran and Y. Weiss, From Learning Models of Natural Image Patches to Whole Image Restoration,
ICCV 2011
[PDF]
-
D. Zoran and Y. Weiss, Natural Images, Gaussian Mixtures and Dead Leaves, NeurIPS 2012
[PDF]
-
D. Ulyanov et al., Deep Image Prior, CVPR 2018
[PDF]
(Optional) Recommended readings:
-
A. A. Efros and T. K. Leung, Texture Synthesis by Non-parametric Sampling, 1999
[PDF]
-
C. Barnes et al., PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, 2009
[PDF]
-
W. T. Freeman et al., Example-Based Super-Resolution, 2002
[PDF]
|
Assignment #1 |
Thurs 09/12/2024 |
Lecture: Variational Autoencoder (VAE) |
slides
|
|
Week 3 |
Tues 09/17/2024 |
Reading: Normalizing Flows |
Reading List:
-
D. Rezende and S. Mohamed,
Variational Inference with Normalizing Flows,
ICML 2015
[PDF]
-
D. P. Kingma and P. Dhariwal, Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018
[PDF]
-
J. Behrmann et al., Invertible residual networks, ICML 2019
[PDF]
(Optional) Recommended readings:
-
Lilian Weng's post on "Flow-based Deep Generative Models"
[Link]
|
|
Thurs 09/19/2024 |
Lecture: Autoregressive (AR) Models |
slides
|
|
Week 4 |
Tues 09/24/2024 |
Reading: Autoregressive (AR) Models |
Reading List:
-
Y. Bengio and S. Bengio,
Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks,
NeurIPS 1999
[PDF]
-
A. van den Oord et al., Pixel Recurrent Neural Networks, ICML 2016
[PDF]
-
D. P. Kingma et al., Improved Variational Inference with Inverse Autoregressive Flow,
NeurIPS 2016
[PDF]
|
Assignment #1 Due; Assignment #2 |
Thurs 09/26/2024 |
Reading: AR and tokenizers |
Reading List:
-
L. Yu et al.,
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation,
ICLR 2024
[PDF]
-
K. Tian et al., Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale
Prediction, arXiv
[PDF]
-
Q. Yu et al., An Image is Worth 32 Tokens for Reconstruction and Generation, arXiv
[PDF]
(Optional) Recommended readings:
-
M. Chen et al., Generative Pretraining from Pixels, ICML 2020
[PDF]
-
A. Ramesh et al., Zero-Shot Text-to-Image Generation (DALL-E 1), ICML 2021
[PDF]
-
F. Mentzer et al., Finite Scalar Quantization: VQ-VAE Made Simple, arXiv
[PDF]
|
|
Week 5 |
Tues 10/01/2024 |
Reading: AR and Diffusion |
Reading List:
-
T. Li et al.,
Autoregressive Image Generation without Vector Quantization,
arXiv
[PDF]
-
C. Zhou et al., Transfusion: Predict the Next Token and Diffuse Images with One
Multi-Modal Model, arXiv
[PDF]
-
J. Xie et al., Show-o: One Single Transformer to Unify Multimodal Understanding and
Generation,
arXiv
[PDF]
(Optional) Recommended readings:
-
E. Hoogeboom et al., Autoregressive Diffusion Models, ICLR 2022
[PDF]
-
B. Chen et al., Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion,
arXiv
[PDF]
|
|
Thurs 10/03/2024 |
Lecture: Generative Adversarial Network (GAN) |
slides
|
|
Week 6 |
Tues 10/08/2024 |
Reading: GAN in the era of Diffusion |
Reading List:
-
A. Sauer et al.,
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis,
ICML 2023
[PDF]
-
M. Kang et al., Scaling up GANs for Text-to-Image Synthesis, CVPR 2023
[PDF]
-
M. Kang et al., Distilling Diffusion Models into Conditional GANs,
ECCV 2024
[PDF]
|
Assignment #2 Due; Assignment #3 |
Thurs 10/10/2024 |
Reading: GAN in the era of Diffusion |
Reading List:
-
N. Huang et al.,
The GAN is dead; long live the GAN! A Modern GAN Baseline,
ICML 2024
[PDF]
-
Z. Wang et al., Diffusion-GAN: Training GANs with Diffusion, ICLR 2023
[PDF]
-
S. Asokan et al., GANs Settle Scores!,
arXiv
[PDF]
|
|
Week 7 |
Tues 10/15/2024 |
No class (student holiday) |
|
|
Thurs 10/17/2024 |
Lecture: Energy-based Models, Score Matching, Diffusion Models |
slides
|
|
Week 8 |
Tues 10/22/2024 |
Reading: Diffusion Models |
Reading List:
-
J. Ho and T. Salimans,
Classifier-Free Diffusion Guidance,
NeurIPS 2021
[PDF]
-
T. Salimans and J. Ho,
Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022
[PDF]
-
E. Hoogeboom et al.,
Simple diffusion: End-to-end diffusion for high resolution images,
ICML 2023
[PDF]
(Optional) Recommended readings:
-
R. Rombach et al., High-Resolution Image Synthesis with Latent Diffusion Models, CVPR 2022
[PDF]
-
T. Karras et al., Elucidating the Design Space of Diffusion-Based Generative Models,
NeurIPS 2022
[PDF]
-
A. Ramesh et al., Hierarchical Text-Conditional Image Generation with CLIP Latents,
arXiv
[PDF]
-
T. Chen, On the Importance of Noise Scheduling for Diffusion Models,
arXiv
[PDF]
-
Sander Dieleman's post on "Diffusion is spectral autoregression"
[Link]
|
Assignment #3 Due; Assignment #4 |
Thurs 10/24/2024 |
Reading: Diffusion beyond Denoising |
Reading List:
-
A. Bansal et al.,
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise,
NeurIPS 2023
[PDF]
-
S. Rissanen et al.,
Generative Modelling With Inverse Heat Dissipation, ICLR 2023
[PDF]
-
M. Delbracio et al.,
Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration,
TMLR 2023
[PDF]
(Optional) Recommended readings:
-
G. Daras et al., Soft Diffusion: Score Matching for General Corruptions, TMLR 2023
[PDF]
|
|
Week 9 |
Tues 10/29/2024 |
Reading: Discrete Diffusion |
Reading List:
-
J. Austin et al.,
Structured Denoising Diffusion Models in Discrete State-Spaces,
NeurIPS 2021
[PDF]
-
S. Gong et al.,
DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models, ICLR 2023
[PDF]
-
A. Lou et al.,
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution,
ICML 2024
[PDF]
|
|
Thurs 10/31/2024 |
Reading: Flow Matching 1 |
Reading List:
-
Y. Lipman et al.,
Flow Matching for Generative Modeling,
ICLR 2023
[PDF]
-
M. S. Albergo et al.,
Building Normalizing Flows with Stochastic Interpolants, ICLR 2023
[PDF]
-
X. Liu et al.,
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow,
ICLR 2023
[PDF]
(Optional) Recommended readings:
-
T. Fjelde et al., Post on "An Introduction to Flow Matching"
[Link]
|
|
Week 10 |
Tues 11/05/2024 |
Reading: Flow Matching 2 |
Reading List:
-
P. Esser et al.,
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis,
ICML 2024
[PDF]
-
N. Ma et al.,
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant
Transformers, ECCV 2024
[PDF]
-
I. Gat et al.,
Discrete Flow Matching,
arXiv
[PDF]
|
Assignment #4 Due; Assignment #5 |
Thurs 11/07/2024 |
Guest Lecture: Jun-Yan Zhu - Ensuring Data Ownership in Generative Models |
slides
|
|
Week 11 |
Tues 11/12/2024 |
Reading: Application - Videos |
Reading List:
-
O. Bar-Tal et al.,
Lumiere: A Space-Time Diffusion Model for Video Generation,
arXiv
[PDF]
-
J. Bruce et al.,
Genie: Generative Interactive Environments, arXiv
[PDF]
-
The Movie Gen team @ Meta,
Movie Gen: A Cast of Media Foundation Models,
arXiv
[PDF]
|
|
Thurs 11/14/2024 |
Reading: Application - 3D and Geometry |
Reading List:
-
B. Poole et al.,
DreamFusion: Text-to-3D using 2D Diffusion,
ICLR 2023
[PDF]
-
Y. Hong et al.,
LRM: Large Reconstruction Model for Single Image to 3D, ICLR 2024
[PDF]
-
Y. Siddiqui et al.,
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers,
CVPR 2024
[PDF]
(Optional) Recommended readings:
-
L. Zhang et al., CLAY: A Controllable Large-scale Generative Model for Creating
High-quality 3D Assets, SIGGRAPH 2024
[PDF]
-
X. Wei et al., MeshLRM: Large Reconstruction Model for High-Quality Meshes, arXiv
[PDF]
-
T. Shen et al., Flexible Isosurface Extraction for Gradient-Based Mesh Optimization,
SIGGRAPH 2023
[PDF]
|
|
Week 12 |
Tues 11/19/2024 |
Reading: Application - Robotics |
Reading List:
-
M. Janner et al.,
Planning with Diffusion for Flexible Behavior Synthesis,
ICML 2022
[PDF]
-
C. Chi et al.,
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion, RSS 2023
[PDF]
-
S. Yang et al.,
UniSim: Learning Interactive Real-World Simulators,
ICLR 2024
[PDF]
(Optional) Recommended readings:
-
D. Driess et al., PaLM-E: An Embodied Multimodal Language Model, ICML 2023
[PDF]
|
Assignment #5 Due; Assignment #6 |
Thurs 11/21/2024 |
Guest Lecture: Yang Song - Consistency Models |
|
|
Week 13 |
Tues 11/26/2024 |
Reading: Application - Material Science |
Reading List:
-
W. Jin et al.,
Junction Tree Variational Autoencoder for Molecular Graph Generation,
ICML 2018
[PDF]
-
E. Hoogeboom et al.,
Equivariant Diffusion for Molecule Generation in 3D, ICML 2022
[PDF]
-
G. Zhou et al.,
Uni-Mol: A Universal 3D Molecular Representation Learning Framework,
ICLR 2023
[PDF]
(Optional) Recommended readings:
-
M. Xu et al., Geometric Latent Diffusion Models for 3D Molecule Generation, ICML 2023
[PDF]
-
M. Arts et al., Two for One: Diffusion Models and Force Fields for Coarse-Grained
Molecular Dynamics, arXiv
[PDF]
|
|
Thurs 11/28/2024 |
No class (Thanksgiving) |
|
|
Week 14 |
Tues 12/03/2024 |
Reading: Application - Protein and Biology |
Reading List:
-
J. L. Watson et al.,
De novo design of protein structure and function with RFdiffusion,
Nature
[PDF]
-
J. Abramson et al.,
Accurate structure prediction of biomolecular interactions with AlphaFold 3, Nature
[PDF]
-
J. B. Ingraham et al.,
Illuminating protein space with a programmable generative model,
Nature
[PDF]
| Assignment #6 Due |
Thurs 12/05/2024 |
Final Presentation 1 |
|
|
Week 15 |
Tues 12/10/2024 |
Final Presentation 2 |
|
|