Hellinger distance - AbsoluteAstronomy.com

Probability theory

Probability theory is the branch of mathematics concerned with analysis of random phenomena. The central objects of probability theory are random variables, stochastic processes, and events: mathematical abstractions of non-deterministic events or measured quantities that may either be single...

and statistics

Mathematical statistics

Mathematical statistics is the study of statistics from a mathematical standpoint, using probability theory as well as other branches of mathematics such as linear algebra and analysis...

, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f-divergence

F-divergence

In probability theory, an ƒ-divergence is a function Df that measures the difference between two probability distributions P and Q...

. The Hellinger distance is defined in terms of the Hellinger integral

Hellinger integral

In mathematics, the Hellinger integral is an integral introduced by that is a special case of the Kolmogorov integral. It is used to define the Hellinger distance in probability theory.-References:...

, which was introduced by Ernst Hellinger

Ernst Hellinger

Ernst David Hellinger was a German mathematician.-Early years:Ernst Hellinger was born on September 30, 1883 in Striegau, Silesia, Germany to Emil and Julie Hellinger. He grew up in Breslau, attended school and graduated from the Gymnasium there in 1902...

.

To define the Hellinger distance in terms of measure theory, let P and Q denote two probability measure

Probability measure

In mathematics, a probability measure is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as countable additivity...

s that are absolutely continuous

Absolute continuity

In mathematics, the relationship between the two central operations of calculus, differentiation and integration, stated by fundamental theorem of calculus in the framework of Riemann integration, is generalized in several directions, using Lebesgue integration and absolute continuity...

with respect to a third probability measure λ. The square of the Hellinger distance between P and Q is defined as the quantity

Here, dP / dλ and dQ / dλ are the Radon–Nikodym derivatives of P and Q respectively. This definition does not depend on λ, so the Hellinger distance between P and Q does not change if λ is replaced with a different probability measure with respect to which both P and Q are absolutely continuous. For compactness, the above formula is often written as

To define the Hellinger distance in terms of elementary probability theory, we take λ to be Lebesgue measure

Lebesgue measure

In measure theory, the Lebesgue measure, named after French mathematician Henri Lebesgue, is the standard way of assigning a measure to subsets of n-dimensional Euclidean space. For n = 1, 2, or 3, it coincides with the standard measure of length, area, or volume. In general, it is also called...

, so that dP / dλ and dQ / dλ are simply probability density function

Probability density function

In probability theory, a probability density function , or density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point. The probability for the random variable to fall within a particular region is given by the...

s. If we denote the densities as f and g, respectively, the squared Hellinger distance can be expressed as a standard calculus integral

where the second form can be obtained by expanding the square and using the fact that the integral of a probability density over its domain must be one.

The Hellinger distance H(P, Q) satisfies the property (derivable from the Cauchy-Schwarz inequality)

The maximum distance 1 is achieved when P assigns probability zero to every set to which Q assigns a positive probability, and vice versa.

Sometimes the factor 1/2 in front of the integral is omitted, in which case the Hellinger distance ranges from zero to the square root of two.

The Hellinger distance is related to the Bhattacharyya coefficient

Bhattacharyya distance

In statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions. It is closely related to the Bhattacharyya coefficient which is a measure of the amount of overlap between two statistical samples or populations. Both measures are named after A...

as it can be defined as

Examples

The squared Hellinger distance between two normal distributions

and

is:

The squared Hellinger distance between two exponential distribution

Exponential distribution

In probability theory and statistics, the exponential distribution is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e...

and

is:

The squared Hellinger distance between two Weibull distributions

and

(where

is a common shape parameter and

are the scale parameters respectively):

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.