Probability distribution
Overview

In probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values.

For a more precise definition one needs to distinguish between discrete and continuous random variables. In the discrete case, one can easily assign a probability to each possible value: when throwing a fair die, each of the six values 1 to 6 has probability 1/6. In contrast, when a random variable takes values from a continuum, probabilities are nonzero only if they refer to intervals of nonzero length: in quality control one might demand that the probability of a "500 g" package containing between 500 g and 510 g should be no less than 98%.

If a total order is defined for the random variable, the cumulative distribution function gives the probability that the random variable is not larger than a given value; it is the integral of the non-cumulative distribution.
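As a minimal sketch of the discrete case, the die example can be written out with exact fractions. The names `pmf` and `cdf` are chosen here for illustration; for a discrete variable the "integral" of the non-cumulative distribution reduces to a sum.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """Probability that the die shows a value not larger than x."""
    return sum(p for face, p in pmf.items() if face <= x)

print(sum(pmf.values()))  # total probability
print(cdf(3))             # P(X <= 3)
```

Exact fractions avoid floating-point rounding, so the total probability comes out as exactly 1.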

## Terminology

As probability theory is used in quite diverse applications, terminology is not uniform and sometimes confusing. The following terms are used for non-cumulative probability distribution functions:
• Probability mass, Probability mass function, p.m.f.: for discrete random variables.
• Categorical distribution: for discrete random variables with a finite set of values.
• Probability density, Probability density function, p.d.f.: most often reserved for continuous random variables.

The following terms are somewhat ambiguous as they can refer to non-cumulative or cumulative distributions, depending on authors' preferences:
• Probability distribution function: Continuous or discrete, non-cumulative or cumulative.
• Probability function: Even more ambiguous, can mean any of the above, or anything else.

Finally,
• Probability distribution: either the same as probability distribution function, or understood as something more fundamental underlying an actual mass or density function.

### Basic terms

• Mode: most frequently occurring value in a distribution
• Tail: region of least frequently occurring values in a distribution

## Discrete probability distribution

A discrete probability distribution shall be understood as a probability distribution characterized by a probability mass function. Thus, the distribution of a random variable X is discrete, and X is then called a discrete random variable, if

$$\sum_u \Pr(X = u) = 1 \qquad (1)$$

as u runs through the set of all possible values of X. It follows that such a random variable can assume only a finite or countably infinite number of values.

In cases more frequently considered, this set of possible values is a topologically discrete set in the sense that all its points are isolated points. But there are discrete random variables for which this countable set is dense on the real line (for example, a distribution over rational numbers).

Among the most well-known discrete probability distributions that are used for statistical modeling are the Poisson distribution, the Bernoulli distribution, the binomial distribution, the geometric distribution, and the negative binomial distribution. In addition, the discrete uniform distribution is commonly used in computer programs that make equal-probability random selections between a number of choices.

### Cumulative density

Equivalently to the above, a discrete random variable can be defined as a random variable whose cumulative distribution function (cdf) increases only by jump discontinuities; that is, its cdf increases only where it "jumps" to a higher value, and is constant between those jumps. The points where jumps occur are precisely the values which the random variable may take. The number of such jumps may be finite or countably infinite. The set of locations of such jumps need not be topologically discrete; for example, the cdf might jump at each rational number.

### Delta-function representation

Consequently, a discrete probability distribution is often represented as a generalized probability density function involving Dirac delta functions, which substantially unifies the treatment of continuous and discrete distributions. This is especially useful when dealing with probability distributions involving both a continuous and a discrete part.
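As a sketch of this representation, suppose X takes the values $u_i$ with probabilities $p_i$ (the symbol $p_i$ is introduced here for illustration only). The generalized density then places a delta spike of weight $p_i$ at each possible value:

```latex
f(x) = \sum_i p_i \, \delta(x - u_i),
\qquad p_i = \Pr(X = u_i),
\qquad \int_{-\infty}^{\infty} f(x)\,dx = \sum_i p_i = 1 .
```

Integrating this density over an interval picks up exactly the weights of the values that lie inside it, which matches the jumps of the cdf.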

### Indicator-function representation

For a discrete random variable X, let u0, u1, ... be the values it can take with non-zero probability. Denote

$$\Omega_i = \{\omega : X(\omega) = u_i\}, \qquad i = 0, 1, 2, \dots$$

These are disjoint sets, and by formula (1)

$$\Pr\Big(\bigcup_i \Omega_i\Big) = \sum_i \Pr(\Omega_i) = \sum_i \Pr(X = u_i) = 1 .$$

It follows that the probability that X takes any value except for u0, u1, ... is zero, and thus one can write X as

$$X = \sum_i u_i \, 1_{\Omega_i}$$

except on a set of probability zero, where $1_A$ is the indicator function of A. This may serve as an alternative definition of discrete random variables.

## Continuous probability distribution

A continuous probability distribution shall be understood as a probability distribution that has a probability density function. Mathematicians also call such a distribution absolutely continuous, since its cumulative distribution function is absolutely continuous with respect to the Lebesgue measure λ. If the distribution of X is continuous, then X is called a continuous random variable. There are many examples of continuous probability distributions: normal, uniform, chi-squared, and others.

Intuitively, a continuous random variable is one which can take a continuous range of values, as opposed to a discrete distribution, where the set of possible values for the random variable is at most countable. While for a discrete distribution an event with probability zero is impossible (e.g. rolling 3½ on a standard die is impossible, and has probability zero), this is not so in the case of a continuous random variable. For example, if one measures the width of an oak leaf, the result of 3½ cm is possible; however, it has probability zero because there are uncountably many other potential values even between 3 cm and 4 cm. Each of these individual outcomes has probability zero, yet the probability that the outcome will fall into the interval (3 cm, 4 cm) is nonzero. This apparent paradox is resolved by the fact that the probability that X attains some value within an infinite set, such as an interval, cannot be found by naively adding the probabilities for individual values. Formally, each value has an infinitesimally small probability, which statistically is equivalent to zero.

Formally, if X is a continuous random variable, then it has a probability density function ƒ(x), and therefore its probability to fall into a given interval, say [a, b], is given by the integral

$$\Pr[a \le X \le b] = \int_a^b f(x)\,dx$$

In particular, the probability for X to take any single value a (that is, a ≤ X ≤ a) is zero, because an integral with coinciding upper and lower limits is always equal to zero.
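The interval probability can be illustrated with the "500 g" package example from the introduction. The sketch below assumes, purely for illustration, that package weights follow a normal distribution with mean 505 g and standard deviation 2 g; those parameters and the helper name `interval_probability` are not from the text. The integral of the density over [a, b] is evaluated as a difference of cdf values.

```python
from statistics import NormalDist

# Hypothetical filling process: weights assumed normal with mean 505 g, sd 2 g.
weight = NormalDist(mu=505.0, sigma=2.0)

def interval_probability(dist, a, b):
    """P(a <= X <= b) as a difference of cdf values, i.e. the integral of the pdf over [a, b]."""
    return dist.cdf(b) - dist.cdf(a)

p = interval_probability(weight, 500.0, 510.0)
print(f"P(500 <= X <= 510) = {p:.4f}")
print(interval_probability(weight, 505.0, 505.0))  # single point: 0.0
```

Under these assumed parameters the 98% requirement is met, while any single exact weight has probability zero.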

The definition states that a continuous probability distribution must possess a density, or equivalently, that its cumulative distribution function be absolutely continuous. This requirement is stronger than simple continuity of the cdf, and there is a special class of distributions, singular distributions, which are neither continuous nor discrete nor their mixture. An example is given by the Cantor distribution. Such singular distributions, however, are never encountered in practice.

Note on terminology: some authors use the term "continuous distribution" to denote a distribution with a continuous cdf. Thus, their definition includes both the (absolutely) continuous and singular distributions.

By one convention, a probability distribution is called continuous if its cumulative distribution function is continuous and, therefore, the probability measure of singletons is $\Pr[X = x] = 0$ for all $x$.

Another convention reserves the term continuous probability distribution for absolutely continuous distributions. These distributions can be characterized by a probability density function: a non-negative Lebesgue integrable function ƒ defined on the real numbers such that

$$F(x) = \Pr[X \le x] = \int_{-\infty}^{x} f(t)\,dt$$

Discrete distributions and some continuous distributions (like the Cantor distribution) do not admit such a density.

### Probability distributions of real-valued random variables

Because a probability distribution Pr on the real line is determined by the probability of a real-valued random variable X being in a half-open interval (-∞, x], the probability distribution is completely characterized by its cumulative distribution function:

$$F(x) = \Pr[X \le x] \qquad \text{for all } x \in \mathbb{R}.$$

### Terminology

The support of a distribution is the smallest closed interval/set whose complement has probability zero. It may be understood as the points or elements that are actual members of the distribution.

### Some properties

• The probability density function of the sum of two independent random variables is the convolution of each of their density functions.
• The probability density function of the difference of two independent random variables is the cross-correlation of their density functions.
• Probability distributions are not a vector space: they are not closed under linear combinations, as these do not preserve non-negativity or total integral 1. They are, however, closed under convex combination, thus forming a convex subset of the space of functions (or measures).
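The first and third properties can be sketched in the discrete setting, where the convolution of densities becomes a convolution of probability mass functions. The functions `convolve` and `mixture` are illustrative names, and using dice instead of continuous densities is a simplification made here.

```python
from fractions import Fraction

def convolve(pmf_a, pmf_b):
    """Distribution of A + B for independent A and B given as {value: probability} dicts."""
    result = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            result[a + b] = result.get(a + b, 0) + pa * pb
    return result

def mixture(weight, pmf_a, pmf_b):
    """Convex combination of two distributions with weights weight and 1 - weight."""
    values = set(pmf_a) | set(pmf_b)
    return {v: weight * pmf_a.get(v, 0) + (1 - weight) * pmf_b.get(v, 0) for v in values}

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve(die, die)
print(two_dice[7])  # probability that two dice sum to 7
```

A mixture such as `mixture(Fraction(1, 3), die, two_dice)` again sums to 1, whereas an arbitrary linear combination (for example, a plain difference of two mass functions) generally does not.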

## Random number generation

A frequent problem in statistical simulations (Monte Carlo method) is the generation of pseudo-random numbers that are distributed in a given way. Most algorithms are based on a pseudorandom number generator that produces numbers X that are uniformly distributed in the interval [0,1). These X are then transformed to some u(X) that satisfy a given distribution f(u).
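One common choice of transformation, not named in the text above, is inverse transform sampling: apply the inverse of the target cdf to the uniform number. The sketch below assumes an exponential target distribution with a hypothetical rate parameter, for which the inverse cdf has the closed form u(x) = -ln(1 - x) / rate.

```python
import math
import random

def exponential_from_uniform(x, rate):
    """Map a uniform number x in [0, 1) to an exponentially distributed value u(x)."""
    return -math.log(1.0 - x) / rate

def sample_exponential(n, rate, seed=0):
    """Draw n values by transforming uniform pseudo-random numbers."""
    rng = random.Random(seed)
    return [exponential_from_uniform(rng.random(), rate) for _ in range(n)]

samples = sample_exponential(100_000, rate=2.0)
print(sum(samples) / len(samples))  # should be close to 1 / rate = 0.5
```

Using 1 - x rather than x keeps the argument of the logarithm strictly positive, since the generator can return 0 but never 1.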

## Kolmogorov definition

In the measure-theoretic formalization of probability theory, a random variable is defined as a measurable function X from a probability space $(\Omega, \mathcal{F}, P)$ to a measurable space $(\mathcal{X}, \mathcal{A})$. A probability distribution is the pushforward measure $X_{*}P = P X^{-1}$ on $(\mathcal{X}, \mathcal{A})$.
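On a finite probability space the pushforward can be computed directly, which may make the definition more concrete. The sketch below assumes a hypothetical space of two fair coin tosses and takes X to be the number of heads; the mass that the distribution assigns to a value x is the probability of the preimage of {x} under X.

```python
from fractions import Fraction
from itertools import product

# Finite probability space: two fair coin tosses, each outcome with probability 1/4.
omega = list(product("HT", repeat=2))
P = {w: Fraction(1, 4) for w in omega}

def X(w):
    """Random variable: number of heads in the outcome w."""
    return w.count("H")

def pushforward(P, X):
    """Distribution of X: the mass of each value x is P(X^{-1}({x}))."""
    dist = {}
    for w, p in P.items():
        dist[X(w)] = dist.get(X(w), 0) + p
    return dist

print(pushforward(P, X))
```

On a finite space every subset is measurable, so the measurability requirement on X holds automatically in this example.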

## Applications

The concept of the probability distribution and the random variables which it describes underlies the mathematical discipline of probability theory and the science of statistics. There is spread or variability in almost any value that can be measured in a population (e.g. height of people, durability of a metal, sales growth, traffic flow, etc.); almost all measurements are made with some intrinsic error; in physics many processes are described probabilistically, from the kinetic properties of gases to the quantum mechanical description of fundamental particles. For these and many other reasons, simple numbers are often inadequate for describing a quantity, while probability distributions are often more appropriate.

As a more specific example of an application, the cache language models and other statistical language models used in natural language processing to assign probabilities to the occurrence of particular words and word sequences do so by means of probability distributions.

## Common probability distributions

The following is a list of some of the most common probability distributions, grouped by the type of process that they are related to. For a more complete list, see list of probability distributions, which groups by the nature of the outcome being considered (discrete, continuous, multivariate, etc.)

Note also that all of the univariate distributions below are singly peaked; that is, it is assumed that the values cluster around a single point. In practice, actually observed quantities may cluster around multiple values. Such quantities can be modeled using a mixture distribution.

### Related to real-valued quantities that grow linearly (e.g. errors, offsets)

• Normal distribution (Gaussian distribution), for a single such quantity; the most common continuous distribution

### Related to positive real-valued quantities that grow exponentially (e.g. prices, incomes, populations)

• Log-normal distribution, for a single such quantity whose log is normally distributed
• Pareto distribution, for a single such quantity whose log is exponentially distributed; the prototypical power law distribution

### Related to real-valued quantities that are assumed to be uniformly distributed over a (possibly unknown) region

• Discrete uniform distribution, for a finite set of values (e.g. the outcome of a fair die)
• Continuous uniform distribution, for continuously distributed values

### Related to Bernoulli trials (yes/no events, with a given probability)

• Basic distributions:
  • Bernoulli distribution, for the outcome of a single Bernoulli trial (e.g. success/failure, yes/no)
  • Binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed total number of independent occurrences
  • Negative binomial distribution, for binomial-type observations but where the quantity of interest is the number of failures before a given number of successes occurs
  • Geometric distribution, for binomial-type observations but where the quantity of interest is the number of failures before the first success; a special case of the negative binomial distribution
• Related to sampling schemes over a finite population:
  • Hypergeometric distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed number of total occurrences, using sampling without replacement
  • Beta-binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed number of total occurrences, sampling using a Polya urn scheme (in some sense, the "opposite" of sampling without replacement)
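The relationships among the basic distributions in this list can be sketched with their mass functions. The function names below are illustrative, and the negative binomial is parameterized as in the list above (number of failures before a given number of successes).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k successes in n independent Bernoulli trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def negative_binomial_pmf(k, r, p):
    """P(k failures occur before the r-th success)."""
    return comb(k + r - 1, k) * p**r * (1 - p)**k

def geometric_pmf(k, p):
    """P(k failures before the first success): the negative binomial with r = 1."""
    return negative_binomial_pmf(k, 1, p)

print(binomial_pmf(1, 1, 0.3))   # n = 1 reduces to a Bernoulli trial
print(geometric_pmf(2, 0.5))     # 0.5 * 0.5**2
```

Setting n = 1 in the binomial recovers the Bernoulli distribution, and setting r = 1 in the negative binomial recovers the geometric distribution.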

### Related to categorical outcomes (events with K possible outcomes, with a given probability for each outcome)

• Categorical distribution, for a single categorical outcome (e.g. yes/no/maybe in a survey); a generalization of the Bernoulli distribution
• Multinomial distribution, for the number of each type of categorical outcome, given a fixed number of total outcomes; a generalization of the binomial distribution
• Multivariate hypergeometric distribution, similar to the multinomial distribution, but using sampling without replacement; a generalization of the hypergeometric distribution

### Related to events in a Poisson process (events that occur independently with a given rate)

• Poisson distribution, for the number of occurrences of a Poisson-type event in a given period of time
• Exponential distribution, for the time before the next Poisson-type event occurs
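
The link between the two distributions can be sketched in code: if waiting times between events are exponential with a given rate, the number of events in an interval of length t is Poisson with mean rate × t. The example below is a rough illustration using only the Python standard library; the helper names, the rate of 3, and the seed are arbitrary choices for the sketch.

```python
import math
import random


def poisson_pmf(k, rate, t=1.0):
    """P(exactly k events in an interval of length t) for a Poisson process."""
    mu = rate * t
    return math.exp(-mu) * mu**k / math.factorial(k)


def exponential_survival(t, rate):
    """P(waiting time until the next event exceeds t)."""
    return math.exp(-rate * t)


def count_events(rate, t, rng):
    """Count events in [0, t] by accumulating exponential inter-arrival times."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(rate)
        if elapsed > t:
            return count
        count += 1


# "No events in [0, t]" and "the first waiting time exceeds t" are the same event.
p_no_events = poisson_pmf(0, rate=2.5, t=0.8)
p_wait_longer = exponential_survival(0.8, rate=2.5)

# Simulated counts per unit interval should average close to the rate.
rng = random.Random(0)
mean_count = sum(count_events(3.0, 1.0, rng) for _ in range(20000)) / 20000
```

The two probabilities are equal by construction, and `mean_count` should land near 3, subject to sampling noise.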

### Useful for hypothesis testing related to normally distributed outcomes

• Chi-squared distribution, the distribution of a sum of squared standard normal variables; useful e.g. for inference regarding the sample variance of normally distributed samples (see chi-squared test)
• Student's t distribution, the distribution of the ratio of a standard normal variable and the square root of a scaled chi squared variable; useful for inference regarding the mean of normally distributed samples with unknown variance (see Student's t-test)
• F-distribution, the distribution of the ratio of two scaled chi squared variables; useful e.g. for inferences that involve comparing variances or involving R-squared (the squared correlation coefficient)
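
The three constructions above can be written directly as sampling recipes. This is a simulation sketch under stated assumptions: "scaled" is taken to mean division by the degrees of freedom, the numerator and denominator are drawn independently, and the function names, sample size, and seed are arbitrary.

```python
import math
import random


def chi_squared_draw(k, rng):
    """Sum of k squared independent standard normal variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def student_t_draw(k, rng):
    """Standard normal divided by the square root of a chi-squared variable over k."""
    z = rng.gauss(0.0, 1.0)
    return z / math.sqrt(chi_squared_draw(k, rng) / k)


def f_draw(d1, d2, rng):
    """Ratio of two independent chi-squared variables, each divided by its degrees of freedom."""
    return (chi_squared_draw(d1, rng) / d1) / (chi_squared_draw(d2, rng) / d2)


def sample_mean(draw, n, rng):
    """Average of n independent draws."""
    return sum(draw(rng) for _ in range(n)) / n


rng = random.Random(0)
chi2_mean = sample_mean(lambda r: chi_squared_draw(4, r), 20000, rng)   # theory: 4
t_mean = sample_mean(lambda r: student_t_draw(10, r), 20000, rng)       # theory: 0
f_mean = sample_mean(lambda r: f_draw(5, 10, r), 20000, rng)            # theory: 10 / (10 - 2)
```

The sample means should fall close to the theoretical values noted in the comments, with the usual Monte Carlo error.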

### Useful as conjugate prior distributions in Bayesian inference

• Beta distribution, for a single probability (real number between 0 and 1); conjugate to the Bernoulli distribution and binomial distribution
• Gamma distribution, for a non-negative scaling parameter; conjugate to the rate parameter of a Poisson distribution or exponential distribution, the precision (inverse variance) of a normal distribution, etc.
• Dirichlet distribution, for a vector of probabilities that must sum to 1; conjugate to the categorical distribution and multinomial distribution; generalization of the beta distribution
• Wishart distribution, for a symmetric non-negative definite matrix; conjugate to the inverse of the covariance matrix of a multivariate normal distribution; generalization of the gamma distribution
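
Conjugacy means that the posterior stays in the same family as the prior, so updating reduces to arithmetic on the parameters. The sketch below shows the standard updates for three of the pairs listed above. The function names are hypothetical, and the gamma prior is assumed to use the shape/rate parameterization.

```python
from math import exp, lgamma, log


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at 0 < x < 1."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(x) + (b - 1) * log(1 - x))


def beta_binomial_update(a, b, successes, failures):
    """Beta(a, b) prior on a success probability, updated with Bernoulli/binomial data."""
    return a + successes, b + failures


def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior on a Poisson rate, updated with observed counts."""
    return shape + sum(counts), rate + len(counts)


def dirichlet_categorical_update(alpha, counts):
    """Dirichlet(alpha) prior on category probabilities, updated with category counts."""
    return [a + c for a, c in zip(alpha, counts)]


posterior_beta = beta_binomial_update(2, 3, successes=7, failures=3)   # (9, 6)

# Prior density times likelihood should be proportional to the posterior density,
# so this ratio should be the same at every x.
ratios = [
    beta_pdf(x, 2, 3) * x**7 * (1 - x) ** 3 / beta_pdf(x, *posterior_beta)
    for x in (0.2, 0.5, 0.8)
]
```

The entries of `ratios` should agree up to rounding, which is a numerical check that the updated beta distribution is the posterior.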

### See also

• Moment-generating function
• Copula (statistics)
• Histogram
• Likelihood function
• List of statistical topics
• Riemann–Stieltjes integral application to probability theory