Deep learning is a statistical learning approach loosely inspired by how the brain learns. Its core is the artificial neural network, which stacks many simple but non-linear layers that learn representations for a task directly from data.
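As a minimal sketch of what "simple but non-linear layers" can look like, the pure-Python forward pass below stacks two layers, each an affine map followed by a non-linearity. The helper names (`dense`, `relu`, `forward`) and the hand-picked weights are illustrative assumptions, not taken from any library; in a real network the weights would be learned from data rather than hard-coded.

```python
import math

def relu(v):
    """Element-wise rectified linear unit: max(0, x)."""
    return [max(0.0, x) for x in v]

def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(weights, bias, v):
    """Affine map: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

# Hypothetical hand-picked parameters (a trained network would learn these).
W1 = [[1.0, -1.0], [-1.0, 1.0]]
B1 = [0.0, 0.0]
W2 = [[1.0, 1.0]]
B2 = [-0.5]

def forward(x):
    """Two stacked layers: dense -> ReLU -> dense -> sigmoid."""
    hidden = relu(dense(W1, B1, x))
    out = dense(W2, B2, hidden)
    return sigmoid(out[0])

for point in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(point, round(forward(point), 3))
```

With these weights the network scores an input above 0.5 only when exactly one of the two inputs is 1 (an XOR-like pattern), which a single linear layer cannot represent. The non-linearity between the layers is what makes this possible.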

Sub-pages

Tools

Deep learning is primarily done in Python:

Resources

  • Dive into Deep Learning, by Zhang, et al., for an accessible introduction
  • Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, for a mathematically rigorous introduction
  • Ilya Sutskever’s 30u30 reading list
  • Alice’s Adventures in a Differentiable Wonderland, by Simone Scardapane

Limits

One consequence of deep learning’s dominance is that much of the time in ML work goes to data preparation: cleaning, sorting, and labelling data so that it is meaningful to use (see data science). Another problem is that DL models can be quite accurate while giving us no way to understand how or why they reached a given conclusion (which motivates explainable AI). Models also tend to latch onto correlations in the training data, which is problematic because a correlation does not identify a cause (i.e., a model can rely on a pattern that points in the wrong direction).

Deep learning models are also vulnerable to adversarial attacks, where specially engineered noise that is imperceptible to us can completely change the model’s output.
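To make the idea concrete, here is a hedged sketch of one well-known construction, the fast gradient sign method (FGSM), applied to a toy logistic model rather than a deep network. The model, its weights, the input, and the step size `eps` are all hypothetical values chosen for illustration. Each feature moves by only 0.015, but because the shifts all push the loss in the same direction across 1000 features, the prediction flips.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Shift every feature by eps in the direction that increases the loss.

    For cross-entropy loss on a logistic model, dLoss/dx_i = (p - y) * w_i.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model: 1000 features with small weights of alternating sign.
n = 1000
w = [0.1 if i % 2 == 0 else -0.1 for i in range(n)]
b = 0.0

# Hypothetical input with true label 1; features lie on a 0..1 scale.
x = [0.51 if i % 2 == 0 else 0.49 for i in range(n)]
y = 1

x_adv = fgsm(w, b, x, y, eps=0.015)

print("clean prediction:      ", round(predict(w, b, x), 3))
print("adversarial prediction:", round(predict(w, b, x_adv), 3))
print("largest feature change:", max(abs(a - c) for a, c in zip(x_adv, x)))
```

The same principle carries over to deep networks, where the gradient with respect to the input is obtained by backpropagation instead of the closed-form expression used here.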

Because of biases in the training data, the model itself risks being biased. This can cause real-world discrimination, especially when DL is applied in critical applications. More broadly, there is a whole host of ethical problems associated with modern machine learning.

See also