Continuous Probability Distributions

A continuous probability distribution is a representation of a variable that can take a continuous range of values.


A continuous probability distribution is a probability distribution that has a probability density function. Mathematicians also call such a distribution “absolutely continuous,” since its cumulative distribution function is absolutely continuous with respect to the Lebesgue measure \lambda. If the distribution of \text{X} is continuous, then \text{X} is called a continuous random variable. There are many examples of continuous probability distributions: normal, uniform, chi-squared, and others.

Intuitively, a continuous random variable is one that can take a continuous range of values, as opposed to a discrete random variable, whose set of possible values is at most countable. While for a discrete distribution an event with probability zero is impossible (e.g., rolling three and a half on a standard die is impossible, and has probability zero), this is not so in the case of a continuous random variable.

For example, if one measures the width of an oak leaf, the result of 3.5 cm is possible; however, it has probability zero because there are uncountably many other potential values, even between 3 cm and 4 cm. Each of these individual outcomes has probability zero, yet the probability that the outcome falls into the interval (3 cm, 4 cm) is nonzero. This apparent paradox is resolved by the fact that the probability that \text{X} attains some value within an infinite set, such as an interval, cannot be found by naively adding the probabilities of the individual values. Formally, each value has an infinitesimally small probability, which statistically is equivalent to zero.

The definition states that a continuous probability distribution must possess a density; equivalently, its cumulative distribution function must be absolutely continuous. This requirement is stronger than simple continuity of the cumulative distribution function, and there is a special class of distributions, the singular distributions, which are neither absolutely continuous nor discrete nor a mixture of the two. An example is given by the Cantor distribution. Such singular distributions, however, are almost never encountered in practice.

Probability Density Functions

In probability theory, a probability density function is a function that describes the relative likelihood for a random variable to take on a given value. The probability that the random variable falls within a particular region is given by the integral of the density over that region. A probability density function is nonnegative everywhere, and its integral over the entire space is equal to one.

Unlike a probability, a probability density function can take on values greater than one. For example, the uniform distribution on the interval \left[0, \frac{1}{2}\right] has probability density \text{f}(\text{x}) = 2 for 0 \leq \text{x} \leq \frac{1}{2} and \text{f}(\text{x}) = 0 elsewhere. The standard normal distribution has probability density function:

\displaystyle \text{f}(\text{x}) = \frac{1}{\sqrt{2\pi}}\text{e}^{-\frac{1}{2}\text{x}^2}.
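A short numerical check makes both claims concrete: the uniform density on [0, 1/2] takes the value 2 (greater than one), yet both it and the standard normal density integrate to one. This is a minimal sketch using a simple midpoint rule and only the Python standard library; the function names are illustrative.

```python
import math

# Uniform density on [0, 1/2]: f(x) = 2 on the interval, 0 elsewhere.
def uniform_pdf(x):
    return 2.0 if 0 <= x <= 0.5 else 0.0

# Standard normal density: f(x) = exp(-x^2/2) / sqrt(2*pi).
def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def integrate(f, a, b, n=100_000):
    """Midpoint-rule numerical integration of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(uniform_pdf(0.25))                            # density value 2.0 > 1
print(round(integrate(uniform_pdf, 0, 0.5), 4))     # total probability ≈ 1.0
print(round(integrate(normal_pdf, -10, 10), 4))     # total probability ≈ 1.0
```

The density exceeding one is no contradiction: it is the area under the curve, not the height, that must equal one.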



The continuous uniform distribution, or rectangular distribution, is a family of symmetric probability distributions such that for each member of the family all intervals of the same length on the distribution’s support are equally probable. The support is defined by the two parameters, \text{a} and \text{b}, which are its minimum and maximum values. The distribution is often abbreviated \text{U}(\text{a}, \text{b}). It is the maximum entropy probability distribution for a random variate \text{X} under no constraint other than that it is contained in the distribution’s support.

The probability that a uniformly distributed random variable falls within any interval of fixed length is independent of the location of the interval itself (but it is dependent on the interval size), so long as the interval is contained in the distribution’s support.

To see this, if \text{X} \sim \text{U}(\text{a}, \text{b}), its probability density function is:

\displaystyle {\text{f}(\text{x}) = \begin{cases} \frac { 1 }{ \text{b}-\text{a} } &\text{for } \text{a}\le \text{x}\le \text{b} \\ 0 & \text{for } \text{x} < \text{a} \text{ or } \text{x} > \text{b} \end{cases}}

So if [\text{x}, \text{x}+\text{d}] is a subinterval of [\text{a}, \text{b}] with fixed \text{d}>0, then the probability:

\displaystyle \text{P}\left(\text{X} \in [\text{x}, \text{x}+\text{d}]\right) = \int_{\text{x}}^{\text{x}+\text{d}} \frac{\text{dy}}{\text{b}-\text{a}} = \frac{\text{d}}{\text{b}-\text{a}}

is independent of \text{x}. This fact motivates the distribution’s name.
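This location-independence can also be checked empirically. The sketch below (plain Python, illustrative variable names) draws from U(0, 10) and estimates the probability of landing in intervals of length 2 placed at several locations inside the support.

```python
import random

random.seed(0)

# X ~ U(a, b) with a = 0, b = 10; intervals of fixed length d = 2.
a, b, d = 0.0, 10.0, 2.0
n = 200_000
xs = [random.uniform(a, b) for _ in range(n)]

for x in (0.0, 3.0, 7.5):
    # Fraction of draws landing in [x, x + d], which stays inside [a, b].
    frac = sum(x <= v <= x + d for v in xs) / n
    print(x, round(frac, 2))   # each ≈ d/(b - a) = 0.2
```

Each estimate is close to 0.2 regardless of where the interval sits, as the formula predicts.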

Applications of the Uniform Distribution

When a \text{p}-value is used as a test statistic for a simple null hypothesis, and the distribution of the test statistic is continuous, then the \text{p}-value is uniformly distributed between 0 and 1 if the null hypothesis is true. The \text{p}-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. One often “rejects the null hypothesis” when the \text{p}-value is less than the predetermined significance level, which is often 0.05 or 0.01, indicating that the observed result would be highly unlikely under the null hypothesis. Many common statistical tests, such as chi-squared tests or Student’s \text{t}-test, produce test statistics which can be interpreted using \text{p}-values.
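If the null distribution of the test statistic is known, this uniformity can be demonstrated by simulation. The sketch below assumes a simple two-sided z-test of a zero mean on N(0, 1) data, so the null hypothesis holds by construction; roughly 5% of the simulated p-values then fall below 0.05, and roughly half fall below 0.5.

```python
import math
import random

random.seed(0)

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_pvalue(n=30):
    """Two-sided z-test of H0: mean = 0 on n draws from N(0, 1),
    so the null hypothesis is true in every replication."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    z = sum(xs) / n * math.sqrt(n)   # sample mean / (sigma / sqrt(n)), sigma = 1
    return 2 * (1 - normal_cdf(abs(z)))

pvals = [one_pvalue() for _ in range(20_000)]

# Under H0 the p-values are Uniform(0, 1).
print(round(sum(p < 0.05 for p in pvals) / len(pvals), 3))   # ≈ 0.05
print(round(sum(p < 0.5 for p in pvals) / len(pvals), 3))    # ≈ 0.5
```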

Sampling from a Uniform Distribution

There are many applications in which it is useful to run simulation experiments. Many programming languages have the ability to generate pseudo-random numbers which are effectively distributed according to the uniform distribution.

If \text{u} is a value sampled from the standard uniform distribution, then the value \text{a}+(\text{b}-\text{a})\text{u} follows the uniform distribution parametrized by \text{a} and \text{b}.
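This transformation can be sketched in a few lines of Python (the helper name is illustrative):

```python
import random

random.seed(42)

def uniform_ab(a, b):
    """Sample from U(a, b) by rescaling a standard uniform draw."""
    u = random.random()          # standard uniform on [0, 1)
    return a + (b - a) * u

samples = [uniform_ab(2, 5) for _ in range(100_000)]
print(round(min(samples), 2), round(max(samples), 2))   # within [2, 5]
print(round(sum(samples) / len(samples), 2))            # mean ≈ (2 + 5)/2 = 3.5
```

The rescaling shifts the unit interval to start at \text{a} and stretches it by the factor \text{b}-\text{a}.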

Sampling from an Arbitrary Distribution

The uniform distribution is useful for sampling from arbitrary distributions. A general method is the inverse transform sampling method, which uses the cumulative distribution function (CDF) of the target random variable. This method is very useful in theoretical work. Since simulations using this method require inverting the CDF of the target variable, alternative methods have been devised for the cases where the CDF is not known in closed form. One such method is rejection sampling.
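For a distribution whose CDF does invert in closed form, inverse transform sampling is direct. The sketch below uses the exponential distribution as a worked example: its CDF \text{F}(\text{x}) = 1 - \text{e}^{-\lambda \text{x}} inverts to \text{F}^{-1}(\text{u}) = -\ln(1-\text{u})/\lambda.

```python
import math
import random

random.seed(1)

def sample_exponential(lam):
    """Inverse transform sampling: if U ~ U(0, 1) and F(x) = 1 - exp(-lam*x),
    then F^{-1}(U) = -ln(1 - U)/lam is Exponential(lam)."""
    u = random.random()
    return -math.log(1 - u) / lam

lam = 2.0
xs = [sample_exponential(lam) for _ in range(100_000)]
print(round(sum(xs) / len(xs), 2))   # sample mean ≈ 1/lam = 0.5
```

Using 1 - u rather than u avoids taking the logarithm of zero, since `random.random()` returns values in [0, 1).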

The normal distribution is an important example where the inverse transform method is not efficient, because its CDF cannot be inverted in closed form. However, there is an exact method, the Box–Muller transformation, which uses two independent uniform random variables to produce two independent standard normally distributed random variables.
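A minimal sketch of the Box–Muller transformation in plain Python (the function name is illustrative); the sample mean and variance of the output come out close to 0 and 1, as expected for standard normals:

```python
import math
import random

random.seed(7)

def box_muller():
    """Convert two independent U(0, 1) draws into two independent N(0, 1) draws."""
    u1, u2 = random.random(), random.random()
    r = math.sqrt(-2 * math.log(1 - u1))   # radius, via inverse transform
    theta = 2 * math.pi * u2               # uniformly distributed angle
    return r * math.cos(theta), r * math.sin(theta)

zs = []
for _ in range(50_000):
    z1, z2 = box_muller()
    zs.extend([z1, z2])

mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs) - mean ** 2
print(round(mean, 2), round(var, 2))   # ≈ 0 and ≈ 1
```

Geometrically, the pair (r, theta) places a point in the plane whose Cartesian coordinates are independent standard normals.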


Imagine that the amount of time, in minutes, that a person must wait for a bus is uniformly distributed between 0 and 15 minutes. What is the probability that a person waits fewer than 12.5 minutes?

Let \text{X} be the number of minutes a person must wait for a bus. Here \text{a}=0 and \text{b}=15, so \text{X} \sim \text{U}(0, 15). The probability density function is written as:

\text{f}(\text{x}) = \frac{1}{15-0} = \frac{1}{15} for 0 \leq \text{x} \leq 15

We want to find \text{P}(\text{X} < 12.5). Since the density is constant on the support, this probability is the length of the interval times the density:

\text{P}(\text{X} < 12.5) = 12.5 \cdot \frac{1}{15} \approx 0.8333

The Exponential Distribution

Key Points

- The exponential distribution is often concerned with the amount of time until some specific event occurs.
- Exponential variables can also be used to model situations where certain events occur with a constant probability per unit length, such as the distance between mutations on a DNA strand.
- Values for an exponential random variable occur in such a way that there are fewer large values and more small values.
- An important property of the exponential distribution is that it is memoryless.

Key Terms

- Erlang distribution: The distribution of the sum of several independent exponentially distributed variables.
- Poisson process: A stochastic process in which events occur continuously and independently of one another.
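The bus-wait probability can also be checked by simulation: draw waiting times uniformly from [0, 15] and count the fraction below 12.5 minutes (plain Python sketch, illustrative names).

```python
import random

random.seed(3)

# Waiting time uniformly distributed on [0, 15] minutes.
n = 200_000
waits = [random.uniform(0, 15) for _ in range(n)]

# Fraction of simulated waits shorter than 12.5 minutes.
frac = sum(w < 12.5 for w in waits) / n
print(round(frac, 2))   # ≈ 12.5/15 ≈ 0.83
```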
