The probability distribution of a random variable is the function that describes the probabilities for the set of possible values. We define two main types of distributions:

  • The probability mass function (PMF) describes a discrete random variable: it gives the probability of each possible value, $p_X(x) = P(X = x)$.
  • The probability density function (PDF) describes a continuous random variable: the probability that the variable falls in an interval is the area under the PDF over that interval. We define the PDF as the derivative of the CDF (at places where it’s differentiable): $f_X(x) = \frac{d}{dx} F_X(x)$, where $F_X$ is the CDF.
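
The relationship between the PDF and the CDF can be checked numerically. The sketch below is only an illustration: the exponential distribution with rate 1 is an assumed example (it does not come from the text above), and `numerical_derivative` is a hypothetical helper that uses a central finite difference.

```python
import math

def exp_cdf(x, rate=1.0):
    """CDF of an exponential distribution: F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def exp_pdf(x, rate=1.0):
    """PDF of the same distribution: f(x) = rate * exp(-rate * x) for x >= 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def numerical_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# At points where the CDF is differentiable, its slope should match the PDF.
for x in (0.5, 1.0, 2.0):
    approx = numerical_derivative(exp_cdf, x)
    print(f"x={x}: dF/dx ~ {approx:.6f}, f(x) = {exp_pdf(x):.6f}")
```

The two printed columns should agree to within the finite-difference error.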

We can rigorously build up the theory behind random variables and probability distributions with measure theory. Special probability distributions are determined by special random variables with particular properties.

Fundamentals

The cumulative distribution function (CDF) specifies the probability that a random variable is less than or equal to a given value:

$$F_X(x) = P(X \le x)$$

where $X$ is a random variable, and $x$ is an index. From the fundamental axioms of probability, $F_X$ is monotonically non-decreasing (since it accumulates the probabilities).
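
As a small illustration of how a CDF accumulates probability, the sketch below builds the CDF of a fair six-sided die from its PMF. The die is an assumed example, and `pmf` and `cdf` are names chosen for this sketch.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative example).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x, pmf=pmf):
    """P(X <= x): sum the PMF over all values that do not exceed x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

# Evaluate the CDF on a grid that extends past both ends of the support.
values = [cdf(x) for x in range(0, 8)]
print(values)

# The CDF never decreases as x grows.
print(all(a <= b for a, b in zip(values, values[1:])))
```

The values start at 0 below the support, step up by 1/6 at each face, and stay at 1 from the largest face onwards.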

We also define:

For a probability density function $f_X$, the CDF $F_X$ is recovered by integration:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

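We define the conditional CDF of $X$ given $A$ (an event with $P(A) > 0$) as:

$$F_{X \mid A}(x) = P(X \le x \mid A) = \frac{P(\{X \le x\} \cap A)}{P(A)}$$

The conditional PDF of $X$ given $A$ is defined as:

$$f_{X \mid A}(x) = \frac{d}{dx} F_{X \mid A}(x)$$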
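
A conditional CDF can be sketched for a discrete case, assuming the conditioning is on an event with non-zero probability. The fair die and the event "the roll is even" are illustrative assumptions, and `prob`, `conditional_cdf` and `is_even` are hypothetical names for this sketch.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative example).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event, pmf=pmf):
    """Probability of an event, given as a predicate on the outcome."""
    return sum((p for value, p in pmf.items() if event(value)), Fraction(0))

def conditional_cdf(x, given, pmf=pmf):
    """P(X <= x | A) = P(X <= x and A) / P(A), for an event A with P(A) > 0."""
    p_given = prob(given, pmf)
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda value: value <= x and given(value), pmf) / p_given

def is_even(value):
    return value % 2 == 0

# Given an even roll, only 2, 4 and 6 remain possible, each with probability 1/3.
print([conditional_cdf(x, is_even) for x in range(1, 7)])
```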

Computations

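The mean (or expected value) of $X$ is:

$$E[X] = \sum_x x\,p_X(x) \ \text{(discrete)}, \qquad E[X] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx \ \text{(continuous)}$$

where $p_X$ is the PMF and $f_X$ is the PDF.

The variance is:

$$\operatorname{Var}(X) = E\big[(X - E[X])^2\big] = E[X^2] - (E[X])^2$$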
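
For a discrete random variable, both quantities reduce to weighted sums over the PMF. The sketch below uses a fair six-sided die as an assumed example; `expectation` is a hypothetical helper that computes the expected value of any function of the variable.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative example).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(g, pmf):
    """E[g(X)]: sum of g(x) * P(X = x) over all possible values x."""
    return sum((g(x) * p for x, p in pmf.items()), Fraction(0))

mean = expectation(lambda x: x, pmf)
# Variance as the expected squared deviation from the mean.
variance = expectation(lambda x: (x - mean) ** 2, pmf)
# Equivalent shortcut: E[X^2] - (E[X])^2.
shortcut = expectation(lambda x: x ** 2, pmf) - mean ** 2

print(mean, variance, shortcut)  # 7/2 35/12 35/12
```

Exact fractions are used here so that the two variance formulas can be compared without floating-point rounding.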

The median is the point $m$ where:

$$P(X \le m) = \frac{1}{2}$$

The $k$-th percentile (where $0 < k < 100$) is the point $x_k$ where:

$$P(X \le x_k) = \frac{k}{100}$$

The median is the 50th percentile.
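
For a continuous distribution with a strictly increasing CDF, a percentile can be found by searching for the point where the CDF reaches the target level. The sketch below uses bisection. It assumes an exponential distribution with rate 1 (an illustrative choice, not taken from the text) and a support that starts at `lo`; `percentile` is a hypothetical helper.

```python
import math

def exp_cdf(x, rate=1.0):
    """CDF of an exponential distribution: F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def percentile(cdf, k, lo=0.0, hi=1.0, tol=1e-12):
    """Find x with cdf(x) = k / 100 by bisection, for 0 < k < 100.

    Assumes cdf is continuous and increasing, with cdf(lo) below the target.
    """
    if not 0 < k < 100:
        raise ValueError("k must lie strictly between 0 and 100")
    target = k / 100
    # Grow the upper bound until it brackets the target level.
    while cdf(hi) < target:
        hi *= 2
    # Halve the bracket until it is narrower than the tolerance.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

median = percentile(exp_cdf, 50)
print(median, math.log(2))  # the exact median of this distribution is ln 2
```

For a discrete distribution an exact solution may not exist, so a common convention (not covered by this sketch) is to take the smallest value whose CDF is at least the target level.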

When we compute the CDF, we should be cognisant of whether we’re able to do less work. For example, a direct computation might need a big summation, which can often be shortened by summing over the complement event instead and subtracting the result from 1: $P(X \le x) = 1 - P(X > x)$.
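
The saving can be seen with a binomial random variable, used here as an assumed example. With 20 trials, computing the CDF at 18 directly needs 19 terms, while the complement needs only 2. The function names below are hypothetical.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def cdf_direct(x, n, p):
    """P(X <= x) by summing the x + 1 terms k = 0..x."""
    return sum((binom_pmf(k, n, p) for k in range(0, x + 1)), Fraction(0))

def cdf_via_complement(x, n, p):
    """P(X <= x) = 1 - P(X > x), summing only the n - x terms k = x+1..n."""
    return 1 - sum((binom_pmf(k, n, p) for k in range(x + 1, n + 1)), Fraction(0))

n, p = 20, Fraction(1, 6)
direct = cdf_direct(18, n, p)            # 19 terms
shortcut = cdf_via_complement(18, n, p)  # 2 terms
print(direct == shortcut)  # True
```

Which side is cheaper depends on where the evaluation point sits: the complement wins when it is close to the upper end of the support.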
