Deep learning

Deep learning is a statistical learning approach loosely inspired by how the brain learns from data. Its core is the artificial neural network, which stacks many simple but non-linear layers that learn task-relevant representations directly from data.
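As a rough illustration of "many simple but non-linear layers", here is a minimal sketch of a forward pass through a two-layer feed-forward network in plain Python. The layer sizes, the random initialisation, and the helper names (`dense`, `relu`, `init_layer`, `forward`) are assumptions made for this example, not part of any library.

```python
import random

def relu(v):
    """Element-wise ReLU: max(0, x) for each entry."""
    return [max(0.0, x) for x in v]

def dense(x, weights, biases):
    """Affine layer: one output per row of `weights`, i.e. W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (a hypothetical initialisation)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Apply each layer in turn, with ReLU after every layer except the last."""
    for i, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if i < len(layers) - 1:
            x = relu(x)
    return x

rng = random.Random(0)                   # fixed seed so runs are repeatable
layers = [init_layer(3, 4, rng),         # 3 inputs -> 4 hidden units
          init_layer(4, 2, rng)]         # 4 hidden units -> 2 outputs
output = forward([0.5, -1.0, 2.0], layers)
print(output)
```

Each layer on its own is just a weighted sum plus a bias. It is the non-linearity between layers that stops the stack from collapsing into a single linear map. Training (gradient descent and backpropagation) is left out of this sketch.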

Sub-pages

Neuron
- Weights and biases
Activation function
- Linear activation function
- Unit step function (and sign function)
- Sigmoid
- ReLU
- Softmax
  - Temperature scaling
Neural network layer
- Decision boundary
Neural network architectures
- Feed-forward network (MLP)
- Fully-connected network
- Residual network
- Convolutional neural network (CNN)
  - Convolution
  - Pooling
  - Existing architectures
    - LeNet
    - AlexNet
    - GoogLeNet
  - Transposed convolution
- Autoencoder
  - Variational autoencoder (VAE)
  - Convolutional autoencoder
- Recurrent neural network (RNN)
  - Long short-term memory (LSTM)
  - Gated recurrent unit (GRU)
  - Sampling strategies
    - Greedy search
    - Beam search
    - Temperature scaling
- Graph neural network (GNN)
  - Deep set
  - Graph convolutional network (GCN)
  - Graph attention network (GAT)
Model performance metrics
- Error function
  - Mean squared error
  - Cross entropy (CE) and binary cross entropy (BCE)
  - Negative log likelihood (NLL)
- Classification metrics
  - Accuracy, precision, recall, F1-score, support
Gradient descent
- Vanishing and exploding gradient problem
- Backpropagation
- Optimiser
  - Stochastic gradient descent (SGD)
  - Adaptive moment estimation (Adam)
- Learning rate
Data processing
- One-hot encoding
- Dataset (splitting)
- Data augmentation
- Regularisation
  - Weight decay
  - Neuron dropout
Batch normalisation
- Layer normalisation
Transfer learning
Generative AI
- Variational autoencoders (VAE)
- KL divergence
- Generative adversarial network (GAN)
  - CycleGAN
Transformer
- Attention mechanism

Tools

Deep learning is primarily done in Python:

PyTorch
TensorFlow
Keras
Data tools
- pandas/Polars
- NumPy
- Dask
- PySpark

Resources

Dive into Deep Learning, by Zhang, et al., for an accessible introduction
Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, for a mathematically rigorous introduction
Ilya Sutskever’s 30u30 reading list
Alice’s Adventures in a Differentiable Wonderland, by Simone Scardapane

Limits

A consequence of deep learning’s dominance is that much of the time in ML work goes to data preparation: cleaning, sorting, and labelling data so that it can be used meaningfully (see data science). Another problem is that DL models can be quite accurate while giving us little insight into how or why they reached a given conclusion, which motivates explainable AI. Models also tend to latch onto correlations in the training data rather than the underlying causes, so they can generalise in the wrong direction once those correlations stop holding.

Deep learning models are also vulnerable to adversarial attacks, where specifically engineered noise that is imperceptible to us can completely change the model’s output.
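To make the idea concrete, here is a minimal sketch of one well-known attack, the fast gradient sign method (FGSM), applied to a toy logistic-regression classifier rather than a deep network. The weights, input, and `eps` value are made up for illustration. On a three-feature toy input the perturbation is not small in any perceptual sense; the point is only the mechanism of nudging each feature in the direction that increases the loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """Move each feature by eps in the direction that increases the loss.

    For binary cross entropy on a logistic model, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model parameters and input, chosen for illustration only.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.2, -0.1, 0.1], 1

x_adv = fgsm(x, y, w, b, eps=0.2)
clean = predict(x, w, b)      # above 0.5: classified as class 1
adv = predict(x_adv, w, b)    # below 0.5: classified as class 0
print(clean, adv)
```

No feature moves by more than `eps`, yet the predicted class flips. With image inputs there are thousands of features, so a far smaller `eps` per pixel can have the same effect, which is why such perturbations can go unnoticed.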

Because of biases in the training data, the model itself risks being biased. This can cause real-world discrimination, especially when DL is applied in critical applications. More broadly, modern machine learning raises a whole host of ethical problems.

jszhn
