
Invariant estimator
In statistics, the concept of being an invariant estimator is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities. Strictly speaking, "invariant" would mean that the estimates themselves are unchanged when both the measurements and the parameters are transformed in a compatible way, but the meaning has been extended to allow the estimates to change in appropriate ways with such transformations. The term equivariant estimator is used in formal mathematical contexts that include a precise description of the relation of the way the estimator changes in response to changes to the dataset and parameterisation: this corresponds to the use of "equivariance" in more general mathematics.
Background
In statistical inference, there are several approaches to estimation theory that can be used to decide immediately what estimators should be used according to those approaches. For example, ideas from Bayesian inference would lead directly to Bayesian estimators. Similarly, the theory of classical statistical inference can sometimes lead to strong conclusions about what estimator should be used. However, the usefulness of these theories depends on having a fully prescribed statistical model and may also depend on having a relevant loss function to determine the estimator. Thus a Bayesian analysis might be undertaken, leading to a posterior distribution for relevant parameters, but the use of a specific utility or loss function may be unclear. Ideas of invariance can then be applied to the task of summarising the posterior distribution. In other cases, statistical analyses are undertaken without a fully defined statistical model, or the classical theory of statistical inference cannot be readily applied because the family of models being considered is not amenable to such treatment. In addition to these cases where general theory does not prescribe an estimator, the concept of invariance of an estimator can be applied when seeking estimators of alternative forms, either for the sake of simplicity of application of the estimator or so that the estimator is robust.
The concept of invariance is sometimes used on its own as a way of choosing between estimators, but this is not necessarily definitive. For example, a requirement of invariance may be incompatible with the requirement that the estimator be mean-unbiased; on the other hand, the criterion of median-unbiasedness is defined in terms of the estimator's sampling distribution and so is invariant under many transformations.
One use of the concept of invariance is where a class or family of estimators is proposed and a particular formulation must be selected amongst these. One procedure is to impose relevant invariance properties and then to find the formulation within this class that has the best properties, leading to what is called the optimal invariant estimator.
Some classes of invariant estimators
There are several types of transformations that are usefully considered when dealing with invariant estimators. Each gives rise to a class of estimators which are invariant to those particular types of transformation.
- Shift invariance: Notionally, estimates of a location parameter should be invariant to simple shifts of the data values. If all data values are increased by a given amount, the estimate should change by the same amount. When considering estimation using a weighted average, this invariance requirement immediately implies that the weights should sum to one. While the same result is often derived from a requirement for unbiasedness, the use of "invariance" does not require that a mean value exists and makes no use of any probability distribution at all.
- Scale invariance: Note that this is a topic not directly covered by scale invariance as the term is used in physics and mathematics, where it refers to objects or laws that are unchanged when scales of length, energy, or other variables are multiplied by a common factor.
- Parameter-transformation invariance: Here, the transformation applies to the parameters alone. The concept is that essentially the same inference should be made from data and a model involving a parameter θ as would be made from the same data if the model used a parameter φ, where φ is a one-to-one transformation of θ, φ = h(θ). According to this type of invariance, results from transformation-invariant estimators should also be related by φ = h(θ). Maximum likelihood estimators have this property.
- Permutation invariance: Where a set of data values can be represented by a statistical model in which they are outcomes from independent and identically distributed random variables, it is reasonable to impose the requirement that any estimator of any property of the common distribution should be permutation-invariant: specifically, that the estimator, considered as a function of the set of data values, should not change if items of data are swapped within the dataset.
The combination of permutation invariance and location invariance for estimating a location parameter from an independent and identically distributed dataset using a weighted average implies that the weights should be identical and sum to one (see the sketch below). Of course, estimators other than a weighted average may be preferable.
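To make these two constraints concrete, the following Python sketch (an added illustration with hypothetical helper names, not drawn from the article itself) checks numerically that a weighted average is shift-equivariant exactly when its weights sum to one, and permutation-invariant exactly when its weights are identical.

```python
import numpy as np

def weighted_average(x, w):
    """Weighted average of the data values x with weights w."""
    return np.dot(w, x)

rng = np.random.default_rng(0)
x = rng.normal(size=5)
c = 3.7  # an arbitrary shift of all data values

equal_w = np.full(5, 1 / 5)                        # identical, sum to one
unequal_w = np.array([0.4, 0.3, 0.2, 0.05, 0.05])  # sum to one, not identical
bad_w = np.array([0.5, 0.5, 0.5, 0.0, 0.0])        # sum to 1.5

# Shift equivariance: the estimate moves by c exactly when the weights sum to one.
assert np.isclose(weighted_average(x + c, equal_w), weighted_average(x, equal_w) + c)
assert not np.isclose(weighted_average(x + c, bad_w), weighted_average(x, bad_w) + c)

# Permutation invariance: the estimate is unchanged under reordering of the data
# exactly when the weights are identical.
perm = np.array([1, 0, 2, 3, 4])  # swap the first two observations
assert np.isclose(weighted_average(x[perm], equal_w), weighted_average(x, equal_w))
assert not np.isclose(weighted_average(x[perm], unequal_w), weighted_average(x, unequal_w))
```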
Optimal invariant estimators
Under this setting, we are given a set of measurements $x$ which contains information about an unknown parameter $\theta$. The measurements $x$ are modelled as a vector random variable having a probability density function $f(x|\theta)$ which depends on a parameter vector $\theta$.
The problem is to estimate $\theta$ given $x$. The estimate, denoted by $a$, is a function of the measurements and belongs to a set $A$. The quality of the result is defined by a loss function $L = L(a, \theta)$ which determines a risk function $R = R(a, \theta) = \operatorname{E}[L(a, \theta) \mid \theta]$. The sets of possible values of $x$, $\theta$, and $a$ are denoted by $X$, $\Theta$, and $A$, respectively.
Definition
An invariant estimator is an estimator which obeys the following two rules:
- Principle of Rational Invariance: The action taken in a decision problem should not depend on transformations of the measurements used.
- Invariance Principle: If two decision problems have the same formal structure (in terms of $X$, $\Theta$, $f(x|\theta)$ and $L$), then the same decision rule should be used in each problem.
To define an invariant or equivariant estimator formally, some definitions related to groups of transformations are needed first. Let $X$ denote the set of possible data-samples. A group of transformations of $X$, to be denoted by $G$, is a set of (measurable) 1:1 and onto transformations of $X$ into itself, which satisfies the following conditions:
- If $g_1 \in G$ and $g_2 \in G$ then $g_1 g_2 \in G$.
- If $g \in G$ then $g^{-1} \in G$, where $g^{-1}(g(x)) = x$. (That is, each transformation has an inverse within the group.)
- $e \in G$ (i.e. there is an identity transformation $e(x) = x$).
A sketch of these conditions for the shift group used in the examples below follows.
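For concreteness, here is a minimal Python sketch (an added illustration; the function g is my own notation) of the additive group $G = \{g_c : g_c(x) = x + c\}$ acting componentwise on datasets, with numerical checks of the three group conditions.

```python
import numpy as np

def g(c):
    """The shift transformation g_c(x) = x + c, acting componentwise."""
    return lambda x: x + c

x = np.array([1.0, 2.5, -0.3])

# Closure: composing two shifts is again a shift.
assert np.allclose(g(2.0)(g(3.0)(x)), g(5.0)(x))

# Inverse: g_{-c} undoes g_c, so each transformation has an inverse in the group.
assert np.allclose(g(-2.0)(g(2.0)(x)), x)

# Identity: g_0 leaves every dataset unchanged.
assert np.allclose(g(0.0)(x), x)
```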
Datasets $x_1$ and $x_2$ in $X$ are equivalent if $x_1 = g(x_2)$ for some $g \in G$. All the equivalent points form an equivalence class. Such an equivalence class is called an orbit (in $X$). The $x_0$ orbit, $X(x_0)$, is the set $X(x_0) = \{g(x_0) : g \in G\}$. If $X$ consists of a single orbit then $g$ is said to be transitive.
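For example (a worked illustration under the shift group), the orbit of a two-observation dataset is a line, so the shift group is transitive on $\mathbb{R}^1$ but not on $\mathbb{R}^2$:

```latex
% The orbit of x_0 = (1, 2) under the shift group G = \{g_c\} is the line
\[
  X(x_0) = \{(1 + c,\; 2 + c) : c \in \mathbb{R}\}.
\]
% Every point of R^1 is a shift of every other point, so the group is
% transitive there; on R^2 the datasets (1, 2) and (1, 3) lie on
% different orbits.
```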
A family of densities $F$ is said to be invariant under the group $G$ if, for every $g \in G$ and $\theta \in \Theta$ there exists a unique $\theta^* \in \Theta$ such that $Y = g(x)$ has density $f(y|\theta^*)$. $\theta^*$ will be denoted $\bar{g}(\theta)$.
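As a concrete case (an added worked step): for a location family, the induced transformation $\bar{g}$ is itself a shift.

```latex
% If X has density f(x - \theta) and Y = g_c(X) = X + c, then
\[
  f_Y(y \mid \theta^*) = f\bigl(y - (\theta + c)\bigr),
\]
% so the family is invariant under G with \bar{g}_c(\theta) = \theta + c.
```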
If $F$ is invariant under the group $G$ then the loss function $L(\theta, a)$ is said to be invariant under $G$ if for every $g \in G$ and $a \in A$ there exists an $a^* \in A$ such that $L(\theta, a) = L(\bar{g}(\theta), a^*)$ for all $\theta \in \Theta$. The transformed value $a^*$ will be denoted by $\tilde{g}(a)$.
In the above, $G = \{g\}$ is a group of transformations from $X$ to itself, $\bar{G} = \{\bar{g}\}$ is a group of transformations from $\Theta$ to itself, and $\tilde{G} = \{\tilde{g}\}$ is a group of transformations from $A$ to itself.
An estimation problem is invariant (equivariant) under $G$ if there exist three such groups $G$, $\bar{G}$, $\tilde{G}$ as defined above.
For an estimation problem that is invariant under $G$, estimator $\delta(x)$ is an invariant estimator under $G$ if, for all $x \in X$ and $g \in G$,
$$\delta(g(x)) = \tilde{g}(\delta(x)).$$
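A minimal numerical check of this definition (illustrative only), using the shift group, for which $\tilde{g}_c(a) = a + c$: both the sample mean and the sample median satisfy $\delta(g_c(x)) = \tilde{g}_c(\delta(x))$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=4.0, size=7)
c = 2.5  # the shift applied by g_c; here the action on estimates is a + c

for delta in (np.mean, np.median):
    # delta(g_c(x)) should equal the transformed estimate delta(x) + c
    assert np.isclose(delta(x + c), delta(x) + c)
```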
Properties
- The risk function of an invariant estimator, $\delta$, is constant on orbits of $\Theta$. Equivalently $R(\theta, \delta) = R(\bar{g}(\theta), \delta)$ for all $\theta \in \Theta$ and $\bar{g} \in \bar{G}$.
- The risk function of an invariant estimator with transitive $\bar{g}$ is constant.
For a given problem, the invariant estimator with the lowest risk is termed the "best invariant estimator". A best invariant estimator cannot always be achieved. A special case for which it can be achieved is the case when $\bar{g}$ is transitive.
Example: Location parameter
Suppose $\theta$ is a location parameter, so that the density of $X$ is of the form $f(x - \theta)$. For $\Theta = A = \mathbb{R}^1$ and $L = L(a - \theta)$, the problem is invariant under $g = \bar{g} = \tilde{g} = \{g_c : g_c(x) = x + c, \; c \in \mathbb{R}\}$. The invariant estimator in this case must satisfy
$$\delta(x + c) = \delta(x) + c \quad \text{for all } c \in \mathbb{R},$$
thus it is of the form $\delta(x) = x + K$ ($K \in \mathbb{R}$). $\bar{g}$ is transitive on $\Theta$, so the risk does not vary with $\theta$: that is, $R(\theta, \delta) = R(0, \delta) = \operatorname{E}[L(X + K) \mid \theta = 0]$. The best invariant estimator is the one that brings the risk $R(\theta, \delta)$ to a minimum.
In the case that $L$ is the squared error, the minimizing choice is $K = -\operatorname{E}[X \mid \theta = 0]$, giving $\delta(x) = x - \operatorname{E}[X \mid \theta = 0]$.
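As a worked instance (added here for illustration), take a shifted exponential density, for which the constant $\operatorname{E}[X \mid \theta = 0]$ is easy to evaluate:

```latex
% Shifted exponential: f(x - \theta) = e^{-(x - \theta)} for x \ge \theta.
% With \theta = 0 we get E[X | \theta = 0] = \int_0^\infty x e^{-x}\,dx = 1,
% so the best invariant estimator under squared error is
\[
  \delta(x) = x - \operatorname{E}[X \mid \theta = 0] = x - 1 .
\]
```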
Pitman estimator
The estimation problem is that $X = (X_1, \dots, X_n)$ has density $f(x_1 - \theta, \dots, x_n - \theta)$, where $\theta$ is a parameter to be estimated, and where the loss function is $L(|a - \theta|)$. This problem is invariant under the following (additive) transformation groups:
$$G = \{g_c : g_c(x) = (x_1 + c, \dots, x_n + c), \; c \in \mathbb{R}^1\}$$
$$\bar{G} = \{g_c : g_c(\theta) = \theta + c, \; c \in \mathbb{R}^1\}$$
$$\tilde{G} = \{g_c : g_c(a) = a + c, \; c \in \mathbb{R}^1\}$$
The best invariant estimator $\delta(x)$ is the one that minimizes
$$\frac{\int_{-\infty}^{\infty} L(\delta(x) - \theta)\, f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}{\int_{-\infty}^{\infty} f(x_1 - \theta, \dots, x_n - \theta) \, d\theta},$$
and this is Pitman's estimator (1939).
For the squared error loss case, the result is
$$\delta(x) = \frac{\int_{-\infty}^{\infty} \theta\, f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}{\int_{-\infty}^{\infty} f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}.$$
If $x \sim N(\theta 1_n, I)$ (i.e. a multivariate normal distribution with independent, unit-variance components) then
$$\delta_{\text{Pitman}} = \delta_{ML} = \frac{\sum_{i=1}^{n} x_i}{n}.$$
If $x \sim C(\theta 1_n, I \sigma^2)$ (independent components having a Cauchy distribution with scale parameter $\sigma$) then $\delta_{\text{Pitman}} \neq \delta_{ML}$. However, the result is
$$\delta_{\text{Pitman}} = \sum_{k=1}^{n} x_k \left[ \frac{\operatorname{Re}\{w_k\}}{\sum_{m=1}^{n} \operatorname{Re}\{w_m\}} \right], \qquad n > 1,$$
with
$$w_k = \prod_{j \neq k} \left[ \frac{1}{(x_k - x_j)^2 + 4\sigma^2} \right] \left[ 1 - \frac{2\sigma}{x_k - x_j}\, i \right].$$
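The squared-error Pitman estimator can also be evaluated numerically as a ratio of two integrals. The sketch below (my own illustration; pitman_estimator is a hypothetical helper, and scipy is assumed to be available) reproduces the sample mean in the normal case and, in the Cauchy case, generally departs from it.

```python
import numpy as np
from scipy import integrate
from scipy.stats import cauchy, norm

def pitman_estimator(x, f):
    """Squared-error Pitman estimator: a ratio of two integrals over theta.

    f(u) is the density of a single observation when theta = 0; the joint
    density at shift theta is the product of f(x_i - theta).
    """
    x = np.asarray(x, dtype=float)
    joint = lambda theta: np.prod(f(x - theta))
    num, _ = integrate.quad(lambda t: t * joint(t), -np.inf, np.inf)
    den, _ = integrate.quad(joint, -np.inf, np.inf)
    return num / den

x = [0.3, 1.2, -0.5, 0.9]
print(pitman_estimator(x, norm.pdf), np.mean(x))    # normal case: agree
print(pitman_estimator(x, cauchy.pdf), np.mean(x))  # Cauchy case: generally differ
```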