Weight decay (also called L2 regularisation) is a parametric regularisation technique that adds a parameter norm penalty to the training objective. It applies to traditional statistical models (like linear regression) as well as to neural networks.

Essentially: the model has a weight vector, and we penalise it more heavily for large components of that vector. This biases the learning algorithm towards distributing the weight more evenly across a larger number of features, rather than concentrating it in a few.
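
Concretely, in the common L2 form, the penalised objective adds the squared norm of the weights to the original loss $J$, with $\lambda$ controlling the penalty strength:

$$\tilde{J}(\mathbf{w}) = J(\mathbf{w}) + \frac{\lambda}{2} \lVert \mathbf{w} \rVert_2^2$$

A gradient descent step with learning rate $\eta$ then becomes

$$\mathbf{w} \leftarrow \mathbf{w} - \eta \big( \nabla J(\mathbf{w}) + \lambda \mathbf{w} \big) = (1 - \eta \lambda)\,\mathbf{w} - \eta \nabla J(\mathbf{w}),$$

which is where the name comes from: each step multiplicatively decays the weights towards zero before applying the usual gradient update.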

This drives the weights closer to the origin, which typically improves generalisation by lowering the model's variance (at the cost of a small increase in bias).
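
To make this concrete, here is a minimal numpy sketch of weight decay inside plain gradient descent on a linear regression loss. The function names, the toy data, and the values of `lam` and `lr` are all illustrative choices, not prescriptions:

```python
import numpy as np

def l2_penalised_loss(w, X, y, lam):
    """Mean squared error plus an L2 penalty on the weights."""
    residual = X @ w - y
    return np.mean(residual ** 2) + lam * np.sum(w ** 2)

def gradient_step(w, X, y, lam, lr):
    """One gradient descent step on the penalised loss."""
    grad_mse = 2 * X.T @ (X @ w - y) / len(y)
    grad_penalty = 2 * lam * w   # the decay term: pulls each weight toward 0
    return w - lr * (grad_mse + grad_penalty)

# Toy data: 5 features, only 3 of which actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, -1.0]) + rng.normal(scale=0.1, size=100)

w = np.zeros(5)
for _ in range(500):
    w = gradient_step(w, X, y, lam=0.1, lr=0.05)
print(w)  # components are shrunk toward the origin relative to the unpenalised fit
```

Increasing `lam` strengthens the shrinkage (more bias, less variance); setting it to zero recovers ordinary unregularised gradient descent.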