Coverage probability
In statistics, the coverage probability of a confidence interval is the proportion of the time that the interval contains the true value of interest. For example, suppose our interest is in the mean number of months that people with a particular type of cancer remain in remission following successful treatment with chemotherapy. The confidence interval aims to contain the unknown mean remission duration with a given probability. This probability is the "confidence level" or "confidence coefficient" of the constructed interval, which is effectively the "nominal coverage probability" of the procedure for constructing confidence intervals. The nominal coverage probability is often set at 0.95. The coverage probability, by contrast, is the actual probability that the interval contains the true mean remission duration in this example.
If all assumptions used in deriving a confidence interval are met, the nominal coverage probability will equal the coverage probability (termed "true" or "actual" coverage probability for emphasis). If any assumptions are not met, the actual coverage probability could be either less than or greater than the nominal coverage probability. When the actual coverage probability is greater than the nominal coverage probability, the interval is termed "conservative"; when it is less than the nominal coverage probability, the interval is termed "anti-conservative" or "permissive."
A discrepancy between the coverage probability and the nominal coverage probability frequently occurs when approximating a discrete distribution with a continuous one. The construction of binomial confidence intervals is a classic example where coverage probabilities rarely equal nominal levels. For the binomial case, several techniques for constructing intervals have been created. The Wilson (or score) confidence interval is one well-known construction based on the normal distribution. Other constructions include the Wald, exact, Agresti-Coull, and likelihood intervals. While the Wilson interval may not be the most conservative choice, it produces average coverage probabilities that are close to nominal levels while still producing a comparatively narrow confidence interval.
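Because a binomial count can take only finitely many values, the actual coverage probability of a binomial interval can be computed exactly by summing the probabilities of every outcome whose interval contains the true proportion. The following is a minimal sketch of that calculation for the Wald and Wilson intervals at a nominal level of 0.95. The function names (wald_interval, wilson_interval, exact_coverage) and the example values n = 20 and p = 0.1 are illustrative assumptions, not part of any particular library.

```python
import math

# Assumed constant: 0.975 quantile of the standard normal (nominal 95% level).
Z95 = 1.959963984540054


def wald_interval(x, n, z=Z95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(x, n, z=Z95):
    """Wilson (score) interval for a binomial proportion."""
    p_hat = x / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def exact_coverage(interval, n, p):
    """Sum the binomial probabilities of all counts x whose interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


if __name__ == "__main__":
    n, p = 20, 0.1  # hypothetical sample size and true proportion
    print("Wald  :", exact_coverage(wald_interval, n, p))
    print("Wilson:", exact_coverage(wilson_interval, n, p))
```

With these example settings the Wald interval covers the true proportion noticeably less often than the nominal 0.95, while the Wilson interval stays close to it. Repeating the calculation over a grid of p values shows how the coverage of each construction oscillates with p.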
The "probability" in coverage probability is interpreted with respect to a set of hypothetical repetitions of the entire data collection and analysis procedure. In these hypothetical repetitions, independent data sets following the same probability distribution as the actual data are considered, and a confidence interval is computed from each of these data sets. The coverage probability is the fraction of these computed confidence intervals that contain the true parameter value.
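The hypothetical repetitions can be imitated by simulation: draw many independent data sets from a known distribution, compute an interval from each, and count how often the interval contains the true mean. The sketch below does this for the remission example under loudly stated assumptions: remission durations are taken to be exponentially distributed with a true mean of 12 months (an invented value, not data), and the interval is the usual normal-approximation interval using the sample standard deviation. All function names are hypothetical.

```python
import math
import random
import statistics


def normal_approx_interval(sample, level=0.95):
    """Sample mean +/- z * s / sqrt(n), using a standard normal quantile."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    mean = statistics.fmean(sample)
    half = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half, mean + half


def simulated_coverage(true_mean, n, repetitions, seed=0):
    """Fraction of simulated data sets whose interval contains true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repetitions):
        # Assumption: durations are exponential with the given mean.
        sample = [rng.expovariate(1 / true_mean) for _ in range(n)]
        lo, hi = normal_approx_interval(sample)
        if lo <= true_mean <= hi:
            hits += 1
    return hits / repetitions


if __name__ == "__main__":
    for n in (10, 200):
        print(n, simulated_coverage(true_mean=12.0, n=n, repetitions=5000))
```

Because the normal approximation is poor for small samples of skewed data, the simulated coverage at n = 10 falls short of the nominal 0.95 (an anti-conservative interval), while at n = 200 it comes close to the nominal level. The simulated fraction is itself an estimate and varies slightly with the seed and the number of repetitions.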