In machine learning, hyperparameters are settings that control the behaviour of the learning algorithm itself. Unlike model parameters, they cannot be inferred or adapted during model fitting — usually because it is not appropriate to learn them on the training data: if capacity-controlling hyperparameters were optimised on the training set, the procedure would always favour maximum model capacity and overfit. Choosing good hyperparameters is a key challenge of modern machine learning.
For instance, some key hyperparameters for neural networks are:
- Batch size
- Number of layers
- Layer size
- Type of activation function
- Learning rate
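Because hyperparameters cannot be learned during fitting, they are typically chosen by training with several candidate settings and comparing performance on held-out validation data. The sketch below illustrates this with a deliberately tiny example (not any specific library's API): a grid search over the learning rate and number of epochs for a one-variable linear model trained by stochastic gradient descent. The toy task, data ranges, and grid values are all illustrative assumptions.

```python
import itertools
import random

random.seed(0)

# Hypothetical toy task: fit y = 3x + 1 (plus noise) by minimising
# mean-squared error with per-example gradient descent. The two
# hyperparameters here are the learning rate and the epoch count.
data = [(x, 3 * x + 1 + random.gauss(0, 0.1))
        for x in (i / 100 for i in range(100))]
train, valid = data[:80], data[80:]  # held-out validation split

def train_model(lr, epochs):
    """Fit parameters w, b with the given hyperparameters."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in train:
            err = (w * x + b) - y
            w -= lr * err * x  # gradient of squared error w.r.t. w
            b -= lr * err      # gradient of squared error w.r.t. b
    return w, b

def valid_loss(w, b):
    """Mean-squared error on the validation set."""
    return sum(((w * x + b) - y) ** 2 for x, y in valid) / len(valid)

# Grid search: hyperparameters are selected on validation data,
# never learned from the training loss itself.
grid = list(itertools.product([0.001, 0.01, 0.1], [5, 20, 50]))
best = min(grid, key=lambda hp: valid_loss(*train_model(*hp)))
print("best (learning rate, epochs):", best)
```

The same pattern scales up directly: for neural networks, each grid point would be a full training run with a given batch size, depth, layer size, and so on, and the grid is often replaced by random or Bayesian search because exhaustive grids grow exponentially with the number of hyperparameters.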