Hellinger distance
Encyclopedia
In probability
Probability theory
Probability theory is the branch of mathematics concerned with analysis of random phenomena. The central objects of probability theory are random variables, stochastic processes, and events: mathematical abstractions of non-deterministic events or measured quantities that may either be single...

 and statistics
Mathematical statistics
Mathematical statistics is the study of statistics from a mathematical standpoint, using probability theory as well as other branches of mathematics such as linear algebra and analysis...

, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f-divergence
F-divergence
In probability theory, an ƒ-divergence is a function Df that measures the difference between two probability distributions P and Q...

. The Hellinger distance is defined in terms of the Hellinger integral
Hellinger integral
In mathematics, the Hellinger integral is an integral introduced by that is a special case of the Kolmogorov integral. It is used to define the Hellinger distance in probability theory.-References:...

, which was introduced by Ernst Hellinger
Ernst Hellinger
Ernst David Hellinger was a German mathematician.-Early years:Ernst Hellinger was born on September 30, 1883 in Striegau, Silesia, Germany to Emil and Julie Hellinger. He grew up in Breslau, attended school and graduated from the Gymnasium there in 1902...

.

To define the Hellinger distance in terms of measure theory, let P and Q denote two probability measure
Probability measure
In mathematics, a probability measure is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as countable additivity...

s that are absolutely continuous
Absolute continuity
In mathematics, the relationship between the two central operations of calculus, differentiation and integration, stated by fundamental theorem of calculus in the framework of Riemann integration, is generalized in several directions, using Lebesgue integration and absolute continuity...

 with respect to a third probability measure λ. The square of the Hellinger distance between P and Q is defined as the quantity


Here, dP /  and dQ / dλ are the Radon–Nikodym derivatives of P and Q respectively. This definition does not depend on λ, so the Hellinger distance between P and Q does not change if λ is replaced with a different probability measure with respect to which both P and Q are absolutely continuous. For compactness, the above formula is often written as


To define the Hellinger distance in terms of elementary probability theory, we take λ to be Lebesgue measure
Lebesgue measure
In measure theory, the Lebesgue measure, named after French mathematician Henri Lebesgue, is the standard way of assigning a measure to subsets of n-dimensional Euclidean space. For n = 1, 2, or 3, it coincides with the standard measure of length, area, or volume. In general, it is also called...

, so that dP /  and dQ / dλ are simply probability density function
Probability density function
In probability theory, a probability density function , or density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point. The probability for the random variable to fall within a particular region is given by the...

s. If we denote the densities as f and g, respectively, the squared Hellinger distance can be expressed as a standard calculus integral


where the second form can be obtained by expanding the square and using the fact that the integral of a probability density over its domain must be one.

The Hellinger distance H(PQ) satisfies the property (derivable from the Cauchy-Schwarz inequality)


The maximum distance 1 is achieved when P assigns probability zero to every set to which Q assigns a positive probability, and vice versa.

Sometimes the factor 1/2 in front of the integral is omitted, in which case the Hellinger distance ranges from zero to the square root of two.

The Hellinger distance is related to the Bhattacharyya coefficient
Bhattacharyya distance
In statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions. It is closely related to the Bhattacharyya coefficient which is a measure of the amount of overlap between two statistical samples or populations. Both measures are named after A...

  as it can be defined as

Examples

The squared Hellinger distance between two normal distributions and is:


The squared Hellinger distance between two exponential distribution
Exponential distribution
In probability theory and statistics, the exponential distribution is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e...

s and is:


The squared Hellinger distance between two Weibull distributions and (where is a common shape parameter and are the scale parameters respectively):
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK