Unsupervised learning is a broad approach in statistical learning in which we try to find patterns or structure in data without being given target labels. In that sense, we are working blind.

The model must be able to find structure in the observations without human annotations. Most data is unlabelled by default, including Internet data and a wide variety of biological and chemical data.
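As a rough illustration of finding structure without annotations, the sketch below groups unlabelled 1-D values with a few rounds of k-means. K-means is only one example of an unsupervised method, chosen here for brevity; the function name `kmeans_1d`, the toy values, and the initial centres are all assumptions made for this example.

```python
def kmeans_1d(values, centers, n_iter=10):
    """Cluster 1-D values around the given initial centres (no labels used)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(n_iter):
        # Assignment step: put each value in the cluster of its nearest centre.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        centers = [
            sum(c) / len(c) if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers, clusters


# Toy, hypothetical data: two visibly separate groups, but no labels attached.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
```

The algorithm never sees which group a value "should" belong to; the grouping comes only from distances between the observations.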

Variants of unsupervised learning include self-supervised learning and semi-supervised learning. Self-supervised learning borrows the training setup that made supervised learning successful, but derives the labels from the data itself rather than from human supervision (e.g., we mask part of the input and predict the masked information). Semi-supervised learning learns from data that is mostly unlabelled but includes a small amount of human-labelled data.
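The masking idea can be sketched as a small data-preparation step: an unlabelled sequence is turned into (input, target) pairs that a supervised learner could then be trained on. This is a minimal sketch, not a full training pipeline; the function name `make_masked_examples` and the `[MASK]` placeholder are assumptions for illustration.

```python
def make_masked_examples(tokens, mask_token="[MASK]"):
    """Turn one unlabelled sequence into (masked_input, target) pairs.

    Each position is hidden in turn; the hidden token becomes the label,
    so no human annotation is needed.
    """
    examples = []
    for i, target in enumerate(tokens):
        masked = list(tokens)      # copy so the original sequence is untouched
        masked[i] = mask_token     # hide one token
        examples.append((masked, target))
    return examples


examples = make_masked_examples(["the", "cat", "sat"])
for masked_input, target in examples:
    print(masked_input, "->", target)
```

Every pair here is created from the raw sequence alone, which is what lets self-supervised methods use ordinary supervised objectives on unlabelled data.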

Sub-pages

See also