Neyman-Pearson lemma

In statistics, the Neyman-Pearson lemma, named after Jerzy Neyman and Egon Pearson, states that when performing a hypothesis test between two point hypotheses H0: θ = θ0 and H1: θ = θ1, the likelihood-ratio test which rejects H0 in favour of H1 when

Λ(x) = L(θ0 | x) / L(θ1 | x) ≤ η,

where

P(Λ(X) ≤ η | H0) = α,

is the most powerful test of size α for a threshold η. If the test is most powerful for all θ1 ∈ Θ1, it is said to be uniformly most powerful (UMP) for alternatives in the set Θ1.
In practice, the likelihood ratio is often used directly to construct tests — see Likelihood-ratio test. However, it can also be used to suggest particular test statistics that might be of interest, or to suggest simplified tests — for this, one considers algebraic manipulation of the ratio to see whether there are key statistics in it related to the size of the ratio (i.e. whether a large statistic corresponds to a small ratio or to a large one).
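The rejection rule above can be sketched in code. The following is a minimal illustration, not part of the original article: the normal densities with known unit variance, the sample values, and the threshold η are all invented for the example.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

def likelihood_ratio(data, theta0, theta1):
    """Lambda(x) = L(theta0 | x) / L(theta1 | x) for i.i.d. N(theta, 1) data."""
    l0 = math.prod(normal_pdf(x, theta0, 1.0) for x in data)
    l1 = math.prod(normal_pdf(x, theta1, 1.0) for x in data)
    return l0 / l1

def np_test(data, theta0, theta1, eta):
    """Reject H0: theta = theta0 in favour of H1: theta = theta1
    when the likelihood ratio falls at or below the threshold eta."""
    return likelihood_ratio(data, theta0, theta1) <= eta

# Data lying close to theta1 = 2 makes the ratio very small:
sample = [1.9, 2.3, 1.7, 2.1]
print(likelihood_ratio(sample, 0.0, 2.0))  # far below 1
print(np_test(sample, 0.0, 2.0, eta=1.0))  # True: reject H0
```

In practice the threshold η is not chosen arbitrarily but calibrated so that the probability of rejection under H0 equals the desired size α.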
Proof

Define the rejection region of the null hypothesis for the Neyman-Pearson (NP) test as

R_NP = { x : L(θ0 | x) ≤ η L(θ1 | x) }.

Any other test will have a different rejection region that we define as R_A. Furthermore, define the function of region R and parameter θ

P(R, θ) = ∫_R L(θ | x) dx,

where this is the probability of the data falling in region R, given parameter θ.

For both tests to have significance level α, it must be true that

α = P(R_NP, θ0) = P(R_A, θ0).

However, it is useful to break these down into integrals over distinct regions, given by

P(R_NP, θ0) = P(R_NP ∩ R_A, θ0) + P(R_NP ∩ R_A^c, θ0)

and

P(R_A, θ0) = P(R_NP ∩ R_A, θ0) + P(R_NP^c ∩ R_A, θ0).

Setting these equal to α and equating the above two expressions yields

P(R_NP ∩ R_A^c, θ0) = P(R_NP^c ∩ R_A, θ0).

Comparing the powers of the two tests, which are P(R_NP, θ1) and P(R_A, θ1), one can see that

P(R_NP, θ1) ≥ P(R_A, θ1)   if and only if   P(R_NP ∩ R_A^c, θ1) ≥ P(R_NP^c ∩ R_A, θ1).

Now, by the definition of R_NP (inside R_NP we have L(θ1 | x) ≥ L(θ0 | x)/η, and outside it L(θ1 | x) < L(θ0 | x)/η),

P(R_NP ∩ R_A^c, θ1) ≥ (1/η) P(R_NP ∩ R_A^c, θ0) = (1/η) P(R_NP^c ∩ R_A, θ0) ≥ P(R_NP^c ∩ R_A, θ1).

Hence the inequality holds.
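The conclusion of the proof can be checked numerically. The sketch below is an illustration, not part of the original article: for a single observation X ~ N(θ, 1) and the point hypotheses θ0 = 0 versus θ1 = 2, the likelihood ratio is decreasing in x, so the NP test rejects for large x. It is compared against an arbitrary two-sided test of the same size.

```python
from statistics import NormalDist

std = NormalDist()          # N(0, 1), the null distribution
alpha = 0.05
theta1 = 2.0

# NP test: L(0|x)/L(2|x) = exp(2 - 2x) is decreasing in x, so
# "ratio <= eta" is equivalent to "x >= c" for a matching cutoff c.
c_np = std.inv_cdf(1 - alpha)            # one-sided critical value, ~1.645

# A competing test of the same size that rejects when |x| >= c.
c_alt = std.inv_cdf(1 - alpha / 2)       # ~1.960

# Both tests have size alpha under H0 ...
size_np = 1 - std.cdf(c_np)
size_alt = 2 * (1 - std.cdf(c_alt))

# ... but the NP test has strictly higher power under H1.
shifted = NormalDist(mu=theta1)
power_np = 1 - shifted.cdf(c_np)
power_alt = (1 - shifted.cdf(c_alt)) + shifted.cdf(-c_alt)

print(round(power_np, 3), round(power_alt, 3))  # 0.639 vs 0.516
```

Any other rejection region of size 0.05 would likewise come out no more powerful than the one-sided NP region, which is exactly what the lemma guarantees.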
Example

Let X1, …, Xn be a random sample from the N(μ, σ²) distribution where the mean μ is known, and suppose that we wish to test H0: σ² = σ0² against H1: σ² = σ1². The likelihood for this set of normally distributed data is

L(σ² | x) = (2πσ²)^(−n/2) exp( −Σ_{i=1}^n (x_i − μ)² / (2σ²) ).

We can compute the likelihood ratio to find the key statistic in this test and its effect on the test's outcome:

Λ(x) = L(σ0² | x) / L(σ1² | x) = (σ0² / σ1²)^(−n/2) exp( −(1/2)(σ0^(−2) − σ1^(−2)) Σ_{i=1}^n (x_i − μ)² ).

This ratio only depends on the data through Σ_{i=1}^n (x_i − μ)². Therefore, by the Neyman-Pearson lemma, the most powerful test of this type of hypothesis for these data will depend only on Σ_{i=1}^n (x_i − μ)². Also, by inspection, we can see that if σ1² > σ0², then Λ(x) is a decreasing function of Σ_{i=1}^n (x_i − μ)². So we should reject H0 if Σ_{i=1}^n (x_i − μ)² is sufficiently large. The rejection threshold depends on the size α of the test. In this example, the test statistic can be shown to be a scaled chi-square distributed random variable, and an exact critical value can be obtained.
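The resulting test can be sketched in code (an illustration, not from the original article): with μ known, T = Σ(x_i − μ)² satisfies T/σ0² ~ χ²_n under H0, so H0 is rejected when T exceeds σ0² times the upper-α chi-square quantile. Since the Python standard library has no chi-square quantile function, the Wilson-Hilferty approximation is used here in place of an exact table lookup; the sample data are invented.

```python
from statistics import NormalDist

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile function."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3

def variance_np_test(data, mu, sigma0_sq, alpha=0.05):
    """Most powerful test of H0: sigma^2 = sigma0_sq against a larger
    alternative sigma1_sq > sigma0_sq, with the mean mu known:
    reject H0 when T = sum((x - mu)^2) is sufficiently large."""
    t = sum((x - mu) ** 2 for x in data)
    critical = sigma0_sq * chi2_quantile(1 - alpha, len(data))
    return t > critical

# Invented data whose spread is much larger than sigma0^2 = 1:
sample = [3.1, -2.8, 4.0, -3.5, 2.9, -3.3, 3.6, -2.7, 3.2, -3.0]
print(variance_np_test(sample, mu=0.0, sigma0_sq=1.0))  # True: reject H0
```

For n = 10 and α = 0.05 the approximate critical value is about 18.3, close to the exact χ²_10 table value of 18.307; with exact chi-square quantiles the test achieves size α exactly.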
External links
- Cosma Shalizi, a professor of statistics at Carnegie Mellon University, gives an intuitive derivation of the Neyman-Pearson Lemma using ideas from economics