Nuisance variable
In statistics, a nuisance parameter is any parameter which is not of immediate interest but which must be accounted for in the analysis of those parameters which are of interest. The classic example of a nuisance parameter is the variance, σ², of a normal distribution, when the mean, μ, is of primary interest.
Nuisance parameters are often variances, but not always; for example, in an errors-in-variables model, the unknown true location of each observation is a nuisance parameter. In general, any parameter which intrudes on the analysis of another may be considered a nuisance parameter. A parameter may also cease to be a "nuisance" if it becomes the object of study, as the variance of a distribution may be.
Theoretical statistics
The general treatment of nuisance parameters can be broadly similar between frequentist and Bayesian approaches to theoretical statistics. It relies on an attempt to partition the likelihood function into components representing information about the parameters of interest and information about the other (nuisance) parameters. This can involve ideas about sufficient statistics and ancillary statistics. When this partition can be achieved it may be possible to complete a Bayesian analysis for the parameters of interest by determining their joint posterior distribution algebraically. The partition allows frequentist theory to develop general estimation approaches in the presence of nuisance parameters. If the partition cannot be achieved it may still be possible to make use of an approximate partition.
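One case where such a partition is exact may help make the idea concrete. The setup below is an assumed illustration rather than part of the original text: observations x_1, ..., x_n are independent draws from a normal distribution, the variance σ² is the parameter of interest, and the mean μ is the nuisance parameter. The sample mean and sample variance are independent, so their joint density splits into a factor carrying μ and a factor that is free of μ:

```latex
% Assumed illustration: x_1, ..., x_n i.i.d. N(\mu, \sigma^2),
% with \sigma^2 of interest and \mu the nuisance parameter.
\begin{align*}
f(\bar{x}, s^2 \mid \mu, \sigma^2)
  &= f(\bar{x} \mid \mu, \sigma^2)\, f(s^2 \mid \sigma^2), \\
\bar{x} \mid \mu, \sigma^2 &\sim N\!\left(\mu, \tfrac{\sigma^2}{n}\right), \qquad
\frac{(n-1)\, s^2}{\sigma^2} \sim \chi^2_{n-1}.
\end{align*}
```

Inference about σ² can then be based on the second factor alone, without any reference to μ.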
In some special cases, it is possible to formulate methods that circumvent the presence of nuisance parameters. The t-test provides a practically useful test because the test statistic, and its distribution under the null hypothesis, do not depend on the unknown variance. It is a case where use can be made of a pivotal quantity. However, in other cases no such circumvention is known.
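As a rough illustration of why the t statistic sidesteps the unknown variance, the following Python sketch simulates normal samples from the same random seed with two very different values of σ and compares the resulting one-sample t statistics. The function names `t_statistic` and `simulate_t`, the sample size, and the seed are assumptions made for this example.

```python
import math
import random
import statistics


def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (s / math.sqrt(n))


def simulate_t(sigma, mu=0.0, n=10, reps=2000, seed=0):
    """Draw `reps` normal samples with the given sigma; return their t statistics."""
    rng = random.Random(seed)
    return [
        t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
        for _ in range(reps)
    ]


# Same seed, very different values of the nuisance parameter sigma.
# The t statistics agree up to floating-point rounding, because sigma
# cancels between the numerator and the denominator.
t_small = simulate_t(sigma=0.5)
t_large = simulate_t(sigma=50.0)
max_gap = max(abs(a - b) for a, b in zip(t_small, t_large))
```

Here `max_gap` should be negligibly small, which reflects the fact that the distribution of the statistic under the null hypothesis is the same whatever the value of σ.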
Practical statistics
Practical approaches to statistical analysis treat nuisance parameters somewhat differently in frequentist and Bayesian methodologies.
A general approach in a frequentist analysis can be based on maximum likelihood-ratio tests. These provide both significance tests and confidence intervals for the parameters of interest which are approximately valid for moderate to large sample sizes and which take account of the presence of nuisance parameters.
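A minimal sketch of the likelihood-ratio approach, under the assumption of a normal model with the mean μ of interest and the variance σ² as the nuisance parameter: for each candidate μ the nuisance parameter is replaced by its maximum-likelihood estimate (a profile likelihood), and the resulting statistic is referred to a chi-square distribution with one degree of freedom. The function names and the data values are hypothetical.

```python
import math


def profile_loglik(sample, mu):
    """Normal log-likelihood at mu with sigma^2 replaced by its MLE for that mu.

    Additive constants that do not depend on mu are dropped.
    """
    n = len(sample)
    sigma2_hat = sum((x - mu) ** 2 for x in sample) / n  # nuisance maximised out
    return -0.5 * n * math.log(sigma2_hat)


def lr_test(sample, mu0):
    """Likelihood-ratio statistic for H0: mu = mu0 and its chi-square(1) p-value."""
    mu_hat = sum(sample) / len(sample)  # unrestricted MLE of the mean
    w = 2.0 * (profile_loglik(sample, mu_hat) - profile_loglik(sample, mu0))
    w = max(w, 0.0)  # guard against tiny negative values from rounding
    p_value = math.erfc(math.sqrt(w / 2.0))  # P(chi-square with 1 df > w)
    return w, p_value


data = [4.9, 5.6, 5.1, 6.2, 5.8, 5.3, 4.7, 6.0]  # made-up illustrative values
w, p = lr_test(data, mu0=5.0)
```

The p-value is only approximate, in line with the text: the chi-square reference distribution is a large-sample result. An approximate confidence interval for μ could be obtained from the same functions by collecting the values of `mu0` for which the test does not reject.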
In Bayesian analysis, a generally applicable approach creates random samples from the joint posterior distribution of all the parameters: see Markov chain Monte Carlo. Given these, the joint distribution of only the parameters of interest can be readily found by marginalizing over the nuisance parameters. However, this approach may not always be computationally efficient if some or all of the nuisance parameters can be eliminated on a theoretical basis.
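The sampling-then-marginalizing idea can be sketched with a small Gibbs sampler, one simple kind of Markov chain Monte Carlo. The example assumes a normal model with the prior p(μ, σ²) proportional to 1/σ², chosen here only because it gives closed-form conditional distributions. The function name `gibbs_normal`, the number of draws, the seed, and the data values are all hypothetical.

```python
import random


def gibbs_normal(sample, draws=5000, burn_in=500, seed=1):
    """Gibbs sampler for (mu, sigma^2) in a normal model.

    Assumes the prior p(mu, sigma^2) proportional to 1 / sigma^2.
    """
    rng = random.Random(seed)
    n = len(sample)
    xbar = sum(sample) / n
    mu = xbar  # starting value for the chain
    mu_draws, sigma2_draws = [], []
    for step in range(burn_in + draws):
        # sigma^2 | mu, data ~ Inverse-Gamma(n / 2, ss / 2)
        ss = sum((x - mu) ** 2 for x in sample)
        sigma2 = 1.0 / rng.gammavariate(n / 2.0, 2.0 / ss)
        # mu | sigma^2, data ~ Normal(xbar, sigma^2 / n)
        mu = rng.gauss(xbar, (sigma2 / n) ** 0.5)
        if step >= burn_in:
            mu_draws.append(mu)
            sigma2_draws.append(sigma2)
    return mu_draws, sigma2_draws


data = [4.9, 5.6, 5.1, 6.2, 5.8, 5.3, 4.7, 6.0]  # made-up illustrative values

# Keeping only the mu draws marginalises over the nuisance parameter sigma^2.
mu_draws, _ = gibbs_normal(data)
mu_sorted = sorted(mu_draws)
posterior_mean = sum(mu_draws) / len(mu_draws)
interval_95 = (
    mu_sorted[int(0.025 * len(mu_sorted))],
    mu_sorted[int(0.975 * len(mu_sorted))],
)
```

In this particular model the marginal posterior of μ is known in closed form (a Student t distribution with n − 1 degrees of freedom centred at the sample mean), so the simulation is more work than necessary. That is an instance of the efficiency caveat above.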