Interval estimation
Encyclopedia
In statistics
, interval estimation is the use of sample
data
to calculate an interval
of possible (or probable) values of an unknown population parameter, in contrast to point estimation
, which is a single number. Neyman (1937) identified interval estimation ("estimation by interval") as distinct from point estimation ("estimation by unique estimate"). In doing so, he recognised that then-recent work quoting results in the form of an estimate
plus-or-minus a standard deviation
indicated that interval estimation was actually the problem statisticians really had in mind.
The most prevalent forms of interval estimation are:
Other common approaches to interval estimation, which are encompassed by statistical theory, are:
There is a third approach to statistical inference
, namely fiducial inference
, that also considers interval estimation. Non-statistical methods that can lead to interval estimates include fuzzy logic
.
An interval estimate is one type of outcome of a statistical analysis. Some other types of outcome are point estimates
and decisions
.
Severini (1991) discusses conditions under which credible intervals and confidence intervals will produce similar results, and also discusses both the coverage probabilities of credible intervals and the posterior probabilities associated with confidence intervals.
approach also fails to provide an answer that can be expressed as straightforward simple formulae, but modern computational methods of Bayesian analysis do allow essentially exact solutions to be found. Thus study of the problem can be used to elucidate the differences between the frequentist and Bayesian
approaches to interval estimation.
Statistics
Statistics is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments....
, interval estimation is the use of sample
Sampling (statistics)
In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population....
data
Data
The term data refers to qualitative or quantitative attributes of a variable or set of variables. Data are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which...
to calculate an interval
Interval (mathematics)
In mathematics, a interval is a set of real numbers with the property that any number that lies between two numbers in the set is also included in the set. For example, the set of all numbers satisfying is an interval which contains and , as well as all numbers between them...
of possible (or probable) values of an unknown population parameter, in contrast to point estimation
Point estimation
In statistics, point estimation involves the use of sample data to calculate a single value which is to serve as a "best guess" or "best estimate" of an unknown population parameter....
, which is a single number. Neyman (1937) identified interval estimation ("estimation by interval") as distinct from point estimation ("estimation by unique estimate"). In doing so, he recognised that then-recent work quoting results in the form of an estimate
Estimator
In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule and its result are distinguished....
plus-or-minus a standard deviation
Standard deviation
Standard deviation is a widely used measure of variability or diversity used in statistics and probability theory. It shows how much variation or "dispersion" there is from the average...
indicated that interval estimation was actually the problem statisticians really had in mind.
The most prevalent forms of interval estimation are:
- confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
s (a frequentist method); and - credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s (a BayesianBayesian probabilityBayesian probability is one of the different interpretations of the concept of probability and belongs to the category of evidential probabilities. The Bayesian interpretation of probability can be seen as an extension of logic that enables reasoning with propositions, whose truth or falsity is...
method).
Other common approaches to interval estimation, which are encompassed by statistical theory, are:
- Tolerance intervalTolerance intervalA tolerance interval is a statistical interval within which, with some confidence level, a specified proportion of a population falls.A tolerance interval can be seen as a statistical version of a probability interval. If we knew a population's exact parameters, we would be able to compute a range...
s - Prediction intervalPrediction intervalIn statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed...
s - used mainly in Regression AnalysisRegression analysisIn statistics, regression analysis includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables... - Likelihood intervals
There is a third approach to statistical inference
Statistical inference
In statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation...
, namely fiducial inference
Fiducial inference
Fiducial inference is one of a number of different types of statistical inference. These are rules, intended for general application, by which conclusions can be drawn from samples of data. In modern statistical practice, attempts to work with fiducial inference have fallen out of fashion in...
, that also considers interval estimation. Non-statistical methods that can lead to interval estimates include fuzzy logic
Fuzzy logic
Fuzzy logic is a form of many-valued logic; it deals with reasoning that is approximate rather than fixed and exact. In contrast with traditional logic theory, where binary sets have two-valued logic: true or false, fuzzy logic variables may have a truth value that ranges in degree between 0 and 1...
.
An interval estimate is one type of outcome of a statistical analysis. Some other types of outcome are point estimates
Point estimation
In statistics, point estimation involves the use of sample data to calculate a single value which is to serve as a "best guess" or "best estimate" of an unknown population parameter....
and decisions
Decision theory
Decision theory in economics, psychology, philosophy, mathematics, and statistics is concerned with identifying the values, uncertainties and other issues relevant in a given decision, its rationality, and the resulting optimal decision...
.
Discussion
The scientific problems associated with interval estimation may be summarised as follows:- When interval estimates are reported, they should have a commonly-held interpretation in the scientific community and more widely. In this regard, credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s are held to be most readily understood by the general public. Interval estimates derived from fuzzy logicFuzzy logicFuzzy logic is a form of many-valued logic; it deals with reasoning that is approximate rather than fixed and exact. In contrast with traditional logic theory, where binary sets have two-valued logic: true or false, fuzzy logic variables may have a truth value that ranges in degree between 0 and 1...
have much more application-specific meanings. - For commonly occurring situations there should be sets of standard procedures that can be used, subject to the checking and validity of any required assumptions. This applies for both confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
s and credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s. - For more novel situations there should be guidance on how interval estimates can be formulated. In this regard confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
s and credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s have a similar standing but there are differences:
-
-
- credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s can readily deal with prior information, while confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
s cannot. - confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
s are more flexible and can be used practically in more situations than credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s: one area where credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s suffer in comparison is in dealing with non-parametric models (see non-parametric statisticsNon-parametric statisticsIn statistics, the term non-parametric statistics has at least two different meanings:The first meaning of non-parametric covers techniques that do not rely on data belonging to any particular distribution. These include, among others:...
).
- credible interval
-
- There should be ways of testing the performance of interval estimation procedures. This arises because many such procedures involve approximations of various kinds and there is a need to check that the actual performance of a procedure is close to what is claimed. The use of stochastic simulations makes this is straightforward in the case of confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
s, but it is somewhat more problematic for credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s where prior information needs to taken properly into account. Checking of credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
s can be done for situations representing no-prior-information but the check involves checking the long-run frequency properties of the procedures.
Severini (1991) discusses conditions under which credible intervals and confidence intervals will produce similar results, and also discusses both the coverage probabilities of credible intervals and the posterior probabilities associated with confidence intervals.
See also
The Behrens–Fisher problem. This has played an important role in the development of the theory behind applicable statistical methodologies. This problem is one of the simplest to state but which is not easily solved. The task of specifying interval estimates for this problem is one where a frequentist approach fails to provide an exact solution, although some approximations are available. The BayesianBayesian probability
Bayesian probability is one of the different interpretations of the concept of probability and belongs to the category of evidential probabilities. The Bayesian interpretation of probability can be seen as an extension of logic that enables reasoning with propositions, whose truth or falsity is...
approach also fails to provide an answer that can be expressed as straightforward simple formulae, but modern computational methods of Bayesian analysis do allow essentially exact solutions to be found. Thus study of the problem can be used to elucidate the differences between the frequentist and Bayesian
Bayesian probability
Bayesian probability is one of the different interpretations of the concept of probability and belongs to the category of evidential probabilities. The Bayesian interpretation of probability can be seen as an extension of logic that enables reasoning with propositions, whose truth or falsity is...
approaches to interval estimation.