Understanding Deep Learning
by Simon J.D. Prince
To be published by MIT Press on December 5th, 2023.
Download draft PDF
Draft PDF Chapters 1-21
2023-08-06. CC-BY-NC-ND license
- Appendices and notebooks coming soon
- Report errata via github or contact me directly at udlbookmail@gmail.com
- Follow me on Twitter or LinkedIn for updates.
Table of contents
- Chapter 1 - Introduction
- Chapter 2 - Supervised learning
- Chapter 3 - Shallow neural networks
- Chapter 4 - Deep neural networks
- Chapter 5 - Loss functions
- Chapter 6 - Training models
- Chapter 7 - Gradients and initialization
- Chapter 8 - Measuring performance
- Chapter 9 - Regularization
- Chapter 10 - Convolutional networks
- Chapter 11 - Residual networks
- Chapter 12 - Transformers
- Chapter 13 - Graph neural networks
- Chapter 14 - Unsupervised learning
- Chapter 15 - Generative adversarial networks
- Chapter 16 - Normalizing flows
- Chapter 17 - Variational autoencoders
- Chapter 18 - Diffusion models
- Chapter 19 - Deep reinforcement learning
- Chapter 20 - Why does deep learning work?
- Chapter 21 - Deep learning and ethics
Resources for instructors
Instructor answer booklet available with proof of credentials via MIT Press
Figures in PDF (vector) / SVG (vector) / PowerPoint (images):
Instructions for editing figures / equations can be found here.
Resources for students
Answers to selected questions: PDF
Python notebooks:
- Notebook 1.1 - Background mathematics: ipynb/colab
- Notebook 2.1 - Supervised learning: ipynb/colab
- Notebook 3.1 - Shallow networks I: ipynb/colab
- Notebook 3.2 - Shallow networks II: ipynb/colab
- Notebook 3.3 - Shallow network regions: ipynb/colab
- Notebook 3.4 - Activation functions: ipynb/colab
- Notebook 4.1 - Composing networks: ipynb/colab
- Notebook 4.2 - Clipping functions: ipynb/colab
- Notebook 4.3 - Deep networks: ipynb/colab
- Notebook 5.1 - Least squares loss: ipynb/colab
- Notebook 5.2 - Binary cross-entropy loss: ipynb/colab
- Notebook 5.3 - Multiclass cross-entropy loss: ipynb/colab
- Notebook 6.1 - Line search: ipynb/colab
- Notebook 6.2 - Gradient descent: ipynb/colab
- Notebook 6.3 - Stochastic gradient descent: ipynb/colab
- Notebook 6.4 - Momentum: ipynb/colab
- Notebook 6.5 - Adam: ipynb/colab
- Notebook 7.1 - Backpropagation in toy model: ipynb/colab
- Notebook 7.2 - Backpropagation: ipynb/colab
- Notebook 7.3 - Initialization: ipynb/colab
- Notebook 8.1 - MNIST-1D performance: ipynb/colab
- Notebook 8.2 - Bias-variance trade-off: ipynb/colab
- Notebook 8.3 - Double descent: ipynb/colab
- Notebook 8.4 - High-dimensional spaces: ipynb/colab
- Notebook 9.1 - L2 regularization: ipynb/colab
- Notebook 9.2 - Implicit regularization: ipynb/colab
- Notebook 9.3 - Ensembling: ipynb/colab
- Notebook 9.4 - Bayesian approach: ipynb/colab
- Notebook 9.5 - Augmentation: ipynb/colab
- Notebook 10.1 - 1D convolution: ipynb/colab
- Notebook 10.2 - Convolution for MNIST-1D: ipynb/colab
- Notebook 10.3 - 2D convolution: ipynb/colab
- Notebook 10.4 - Downsampling & upsampling: ipynb/colab
- Notebook 10.5 - Convolution for MNIST: ipynb/colab
- Notebook 11.1 - Shattered gradients: (coming soon)
- Notebook 11.2 - Residual networks: (coming soon)
- Notebook 11.3 - Batch normalization: (coming soon)
- Notebook 12.1 - Self-attention: (coming soon)
- Notebook 12.2 - Multi-head self-attention: (coming soon)
- Notebook 12.3 - Tokenization: (coming soon)
- Notebook 12.4 - Decoding strategies: ipynb/colab
- Notebook 13.1 - Encoding graphs: (coming soon)
- Notebook 13.2 - Graph classification: (coming soon)
- Notebook 13.3 - Neighborhood sampling: (coming soon)
- Notebook 13.4 - Graph attention: (coming soon)
- Notebook 15.1 - GAN toy example: (coming soon)
- Notebook 15.2 - Wasserstein distance: (coming soon)
- Notebook 16.1 - 1D normalizing flows: (coming soon)
- Notebook 16.2 - Autoregressive flows: (coming soon)
- Notebook 16.3 - Contraction mappings: (coming soon)
- Notebook 17.1 - Latent variable models: (coming soon)
- Notebook 17.2 - Reparameterization trick: (coming soon)
- Notebook 17.3 - Importance sampling: (coming soon)
- Notebook 18.1 - Diffusion encoder: (coming soon)
- Notebook 18.2 - 1D diffusion model: (coming soon)
- Notebook 18.3 - Reparameterized model: (coming soon)
- Notebook 18.4 - Families of diffusion models: (coming soon)
- Notebook 19.1 - Markov decision processes: (coming soon)
- Notebook 19.2 - Dynamic programming: (coming soon)
- Notebook 19.3 - Monte-Carlo methods: (coming soon)
- Notebook 19.4 - Temporal difference methods: (coming soon)
- Notebook 19.5 - Control variates: (coming soon)
- Notebook 20.1 - Random data: (coming soon)
- Notebook 20.2 - Full-batch gradient descent: (coming soon)
- Notebook 20.3 - Lottery tickets: (coming soon)
- Notebook 20.4 - Adversarial attacks: (coming soon)
- Notebook 21.1 - Bias mitigation: (coming soon)
- Notebook 21.2 - Explainability: (coming soon)
Citation:
@book{prince2023understanding,
    author    = "Simon J.D. Prince",
    title     = "Understanding Deep Learning",
    publisher = "MIT Press",
    year      = 2023,
    url       = "http://udlbook.com"
}