In information theory, we define the uncertainty (also called the self-information or surprisal) of an event as a measure of how surprised we are when it happens. An uncertainty of 0 means the event is certain to happen. An uncertainty of $\infty$ means it will never happen. Numerically, for an event with probability $p$, it's given by:

$$I = -\log_2 p$$

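A minimal sketch of this in Python (the function name `self_information` is illustrative, not standard):

```python
import math

def self_information(p: float) -> float:
    """Uncertainty (surprisal, in bits) of an event with probability p."""
    return -math.log2(p)

assert self_information(1.0) == 0.0   # certain event: no surprise
assert self_information(0.5) == 1.0   # fair coin flip: 1 bit
assert self_information(0.25) == 2.0  # quarter chance: 2 bits
```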
For a random variable $X$, we often want the average uncertainty, i.e., the expected value of the self-information. In the discrete case, this is given by:

$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$

This is the Shannon entropy, which describes how much uncertainty there is in a random variable. It is also a theoretical lower bound on the average number of bits needed to losslessly encode the data (a bound that may not be exactly achievable in practice).
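As a sketch, the discrete formula above can be computed directly (the `entropy` helper is illustrative; terms with zero probability are skipped, since $p \log p \to 0$ as $p \to 0$):

```python
import math

def entropy(probs) -> float:
    """Shannon entropy (in bits) of a discrete distribution given as
    a list of probabilities; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

assert entropy([0.5, 0.5]) == 1.0  # fair coin: 1 bit of uncertainty
assert entropy([1.0, 0.0]) == 0.0  # certain outcome: no uncertainty
assert entropy([0.9, 0.1]) < 1.0   # biased coin: less than a fair one
```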

We use the base-2 logarithm to keep in line with other concepts in information theory, especially when problems involve numbers of bits. More generally, the base matches the number of values a symbol can take (for bits, 2). With the natural logarithm, the unit is the nat instead of the bit.
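The choice of base only rescales the result by a constant, as this sketch shows (the `entropy` helper with a `base` parameter is illustrative):

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """Entropy of a discrete distribution in the given log base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

h_bits = entropy([0.5, 0.5])               # in bits (base 2)
h_nats = entropy([0.5, 0.5], base=math.e)  # in nats (natural log)
# the two units differ only by a constant factor of ln(2)
assert math.isclose(h_nats, h_bits * math.log(2))
```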

The differential entropy, which extends the entropy to continuous random variables with density $f$, is defined by:

$$h(X) = -\int f(x) \log_2 f(x) \, dx$$

Unlike the discrete entropy, it can be negative.
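As a numerical sanity check (a sketch, assuming a standard normal; the `normal_pdf` helper and the integration range are illustrative), we can approximate the integral with a Riemann sum and compare it with the known closed form for a Gaussian:

```python
import math

def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Left Riemann sum of -f(x) * log2(f(x)) over [-8, 8); the tails beyond
# this range are numerically negligible for the standard normal.
dx = 0.001
h = -sum(normal_pdf(x) * math.log2(normal_pdf(x)) * dx
         for x in (i * dx for i in range(-8000, 8000)))

# Closed form for a Gaussian: 0.5 * log2(2 * pi * e * sigma^2), sigma = 1.
closed_form = 0.5 * math.log2(2 * math.pi * math.e)
assert abs(h - closed_form) < 1e-3
```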

See also