Generative AI refers to machine learning models and architectures built to generate new data. Given an input encoding, the model produces new data consistent with that input.
Basics
Generative learning is an unsupervised learning task: there is no ground-truth output to define a supervised loss against. Instead, we're interested in learning the structure and distribution of the data (contrast this with discriminative models, which learn P(y | x) directly). Here, we're interested in maximising the joint probability P(x, y) by estimating the likelihood P(x | y) and the prior P(y), from which Bayes' theorem gives the posterior P(y | x) = P(x | y) P(y) / P(x).
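As a minimal sketch of the Bayes' theorem step above, here is a toy discrete example. The prior P(y) and likelihood table P(x | y) are made-up values for illustration; the point is only how the joint, marginal, and posterior relate.

```python
p_y = {"cat": 0.6, "dog": 0.4}  # prior P(y), assumed values
p_x_given_y = {                  # likelihood P(x | y), assumed values
    "cat": {"whiskers": 0.9, "bark": 0.1},
    "dog": {"whiskers": 0.2, "bark": 0.8},
}

def p_y_given_x(x):
    """Posterior P(y | x) via Bayes: P(x | y) P(y) / P(x)."""
    joint = {y: p_x_given_y[y][x] * p_y[y] for y in p_y}  # joint P(x, y)
    p_x = sum(joint.values())                              # marginal P(x)
    return {y: joint[y] / p_x for y in joint}

posterior = p_y_given_x("bark")  # dog is far more likely given "bark"
```

A discriminative model would estimate P(y | x) directly; a generative model instead models the joint P(x, y), which is what lets it sample new data.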
Since generative models learn the underlying distribution, they can generate new samples. There are two types of models:
- Unconditional models:
    - Take only random noise as input.
    - Output new data samples without a specific category or label.
    - We have no control over the generated category/label, e.g. generating random images of faces without specifying attributes or the person's identity.
- Conditional models:
    - Take random noise together with a one-hot encoding of the target category, or an embedding generated by another model (e.g. a feature embedding from a CNN that encoded an image).
    - Output new data samples specific to the input category/label.
    - This gives the user high-level control over what the model generates: we can generate data with a specific label (e.g. images of a specific object).
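The two interfaces above can be sketched as follows. The generator itself is a stand-in (any function from an input vector to a sample); the function names and dimensions are hypothetical, chosen only to show how the inputs differ between the two model types.

```python
import random

def sample_noise(dim):
    """Draw a random noise vector z (the only input to an unconditional model)."""
    return [random.gauss(0.0, 1.0) for _ in range(dim)]

def one_hot(label, categories):
    """One-hot encode the target category for a conditional model."""
    return [1.0 if c == label else 0.0 for c in categories]

def unconditional_generate(model, noise_dim=8):
    # The model sees only noise, so we cannot choose the output category.
    return model(sample_noise(noise_dim))

def conditional_generate(model, label, categories, noise_dim=8):
    # Noise concatenated with the one-hot label steers what gets generated.
    z = sample_noise(noise_dim) + one_hot(label, categories)
    return model(z)

# Hypothetical stand-in "model": maps an input vector to a scalar sample.
toy_model = lambda z: sum(z)
```

In practice the label can equally be an embedding from another network rather than a one-hot vector; either way, the conditioning signal is simply appended to (or otherwise combined with) the noise input.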
Key concepts
- Traditional models
- Deep learning models
- Autoregressive models
- Variational autoencoder
- Generative adversarial network (GANs)
- Flow-based generative models
- Diffusion models
Resources
- Generative AI Handbook: A Roadmap for Learning Resources, by William Brown