The negative log-likelihood (NLL) is a loss function obtained by taking the negative logarithm of the likelihood; minimising it is equivalent to maximum likelihood estimation (MLE).

Motivation and benefits

A likelihood is a product of many probabilities, each at most 1, so it can quickly underflow regular machine precision. Taking the logarithm turns the product into a sum and keeps the value within machine precision. Taking the negative converts the maximisation problem (MLE) into a minimisation problem (NLL).
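A minimal sketch of the underflow problem, using an assumed toy dataset of 1,000 samples that each have probability 0.05:

```python
import math

# Product of 1,000 modest probabilities underflows float64,
# while the sum of their logarithms stays well within machine precision.
probs = [0.05] * 1000

product = 1.0
for p in probs:
    product *= p
print(product)  # 0.05 ** 1000 is about 1e-1301, far below float64's minimum, so this underflows to 0.0

log_likelihood = sum(math.log(p) for p in probs)
print(log_likelihood)  # about -2995.7, easily representable
```

Once the product has underflowed to 0.0 the likelihood carries no gradient information at all, whereas the log-likelihood remains a finite, differentiable number.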

Note also that differentiating the raw likelihood involves the product rule, which yields a sum of n terms that each contain a product of n − 1 factors. The NLL's derivative is simpler: the logarithm turns the product into a sum, so each sample's term can be differentiated independently and the results simply added.
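As a sketch of the two derivatives, writing p_i(θ) for the per-sample probabilities (an assumed notation):

```latex
% Derivative of the raw likelihood: the product rule yields n terms,
% each itself containing a product of n - 1 factors.
\frac{\partial}{\partial\theta} \prod_{i=1}^{n} p_i(\theta)
  = \sum_{i=1}^{n} \frac{\partial p_i}{\partial\theta} \prod_{j \neq i} p_j(\theta)

% Derivative of the log-likelihood: the product becomes a sum,
% so each sample contributes one independent term.
\frac{\partial}{\partial\theta} \sum_{i=1}^{n} \log p_i(\theta)
  = \sum_{i=1}^{n} \frac{1}{p_i(\theta)} \frac{\partial p_i}{\partial\theta}
```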

In code

In PyTorch, we can use torch.nn.NLLLoss(). Note that it expects log-probabilities as input, typically produced by torch.nn.LogSoftmax.
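A minimal usage sketch, with made-up logits for a batch of two samples over three classes:

```python
import torch
import torch.nn as nn

# NLLLoss expects log-probabilities, so pair it with LogSoftmax.
# (nn.CrossEntropyLoss fuses the two and takes raw logits directly.)
log_softmax = nn.LogSoftmax(dim=1)
loss_fn = nn.NLLLoss()

logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.3, 1.5, 0.2]])  # batch of 2, 3 classes
targets = torch.tensor([0, 1])            # correct class indices

loss = loss_fn(log_softmax(logits), targets)
print(loss.item())  # mean negative log-likelihood over the batch
```

Because LogSoftmax followed by NLLLoss is exactly cross-entropy, the same result can be obtained in one step with nn.CrossEntropyLoss applied to the raw logits.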