Maximum likelihood estimation (MLE) is a statistical approach used for fitting non-linear learning models. We try to estimate parameters $\theta$ such that a likelihood function is maximised:

$$\hat{\theta} = \arg\max_\theta \ell(\theta), \qquad \ell(\theta) = \prod_{i=1}^{n} p(y_i \mid x_i; \theta)$$

The intuition behind this is that we want estimates such that the predicted probability is as close to 1 as possible for one set of observations (e.g., people who defaulted on their debt) and as close to 0 as possible for the other set (those who didn't).

MLE is used for fitting logistic regression models.
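As a rough illustration, here is a minimal Python sketch; the toy data, variable names, and the brute-force grid search are all made up for illustration, not how a real fit works:

```python
import numpy as np

# Hypothetical toy data: x is a single predictor, y = 1 means "defaulted".
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x + 0.5 * rng.normal(size=200) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def likelihood(b0, b1):
    # Product over observations: p(x_i) where y_i = 1, 1 - p(x_i) where y_i = 0.
    p = sigmoid(b0 + b1 * x)
    return np.prod(np.where(y == 1, p, 1 - p))

# Crude grid search for the maximiser (a real fit uses a proper optimiser).
grid = np.linspace(-5.0, 5.0, 101)
b0_hat, b1_hat = max(((b0, b1) for b0 in grid for b1 in grid),
                     key=lambda ab: likelihood(*ab))
print(b0_hat, b1_hat)
```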

Log likelihood

We run into a problem when we have very many (potentially billions of) independent observations: independence makes the likelihood a product of that many probabilities, each between 0 and 1, so computing it directly is impractical because the product underflows floating-point precision. Instead, we take the log-likelihood, which turns the product into a sum:

$$\log \ell(\theta) = \sum_{i=1}^{n} \log p(y_i \mid x_i; \theta)$$
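To see the precision problem concretely, here is a small sketch (the per-observation probabilities are randomly made up): the raw product of a million probabilities underflows to zero, while the sum of their logs is an ordinary floating-point number:

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.uniform(0.4, 0.9, size=1_000_000)  # a million per-observation probabilities

print(np.prod(p))         # 0.0 -- the raw product underflows double precision
print(np.sum(np.log(p)))  # a large negative but perfectly representable number
```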

Since many learning problems are framed as minimising an error function, we can turn this computation into the minimisation of a loss by taking the negative log-likelihood, $-\log \ell(\theta)$. More on that over there.
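A sketch of that idea, reusing the made-up toy logistic model from above: minimise the negative log-likelihood with a generic optimiser:

```python
import numpy as np
from scipy.optimize import minimize

# Same hypothetical toy data as before.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x + 0.5 * rng.normal(size=200) > 0).astype(float)

def neg_log_likelihood(beta):
    b0, b1 = beta
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    p = np.clip(p, 1e-12, 1 - 1e-12)  # keep the logs finite
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

result = minimize(neg_log_likelihood, x0=np.zeros(2))
print(result.x)  # the fitted (b0, b1) that minimise the loss
```

Minimising this loss yields the same estimates as maximising the likelihood directly, but stays numerically stable.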