False positive paradox
The false positive paradox is a statistical result where false positive tests are more probable than true positive tests, occurring when the overall population has a low incidence of a condition and the incidence rate is lower than the false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population (see Bayes' theorem). When the incidence, the proportion of those who have a given condition, is lower than the test's false positive rate, even tests that have a very low chance of giving a false positive in an individual case will give more false than true positives overall. So, in a society with very few infected people, proportionately fewer than the test gives false positives, there will actually be more who test positive for a disease incorrectly and don't have it than those who test positive accurately and do. The paradox has surprised many.
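
The condition described above can be written out with Bayes' theorem. As a sketch, let p be the incidence, f the false positive rate, and s the test's sensitivity (its true positive rate); these symbols are introduced here for illustration and are not part of the original text.

```latex
P(\text{condition} \mid \text{positive}) = \frac{s\,p}{s\,p + f\,(1 - p)},
\qquad
\text{false positives outnumber true positives when } f\,(1 - p) > s\,p .
```

For a perfectly sensitive test (s = 1) the inequality becomes p < f / (1 + f), which is approximately p < f when f is small: the incidence is lower than the false positive rate.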

It is especially counter-intuitive when interpreting a positive result in a test on a low-incidence population after having dealt with positive results drawn from a high-incidence population. If the false positive rate of the test is higher than the proportion of the new population with the condition, then a test administrator whose experience has been drawn from testing in a high-incidence population may conclude from experience that a positive test result usually indicates a positive subject, when in fact a false positive is far more likely to have occurred.

Not adjusting to the scarcity of the condition in the new population, and concluding that a positive test result probably indicates a positive subject even though population incidence is below the false positive rate, is a "base rate fallacy".

High-incidence population

Imagine running an HIV test on population A, in which 200 out of 10,000 (2%) are infected. The test has a false positive rate of 0.0004 (0.04%) and a false negative rate of zero. The expected outcome of a million tests on population A would be:

Unhealthy and test indicates disease (true positive)

1,000,000 × (200/10000) = 20,000 people would receive a true positive


Healthy and test indicates disease (false positive)

1,000,000 × (9800/10000) × .0004 = 392 people would receive a false positive

(The remaining 979,608 tests are correctly negative.)


So, in population A, a person receiving a positive test could be over 98% confident (20,000/20,392) that it correctly indicates infection.
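
The arithmetic for population A can be reproduced with a short script. This is a minimal sketch: the function name `expected_counts` and its parameters are hypothetical, and it assumes, as the example does, a false negative rate of zero. Exact fractions are used so the counts come out as whole numbers.

```python
from fractions import Fraction


def expected_counts(n_tests, prevalence, false_positive_rate):
    """Expected outcome counts, assuming a false negative rate of zero."""
    infected = n_tests * prevalence
    healthy = n_tests - infected
    true_positives = infected  # every infected person tests positive
    false_positives = healthy * false_positive_rate
    true_negatives = healthy - false_positives
    return true_positives, false_positives, true_negatives


# Population A: 200 infected per 10,000; false positive rate 0.0004.
tp, fp, tn = expected_counts(
    1_000_000, Fraction(200, 10_000), Fraction(4, 10_000)
)
print(tp, fp, tn)  # 20000 392 979608

# Share of positive results that are correct (a little over 98%).
print(float(tp / (tp + fp)))
```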

Low-incidence population

Now consider the same test applied to population B, in which only 1 person in 10,000 (0.01%) is infected. The expected outcome of a million tests on population B would be:


Unhealthy and test indicates disease (true positive)

1,000,000 × (1/10,000) = 100 people would receive a true positive


Healthy and test indicates disease (false positive)

1,000,000 × (9999/10,000) × .0004 ≈ 400 people would receive a false positive

(The remaining 999,500 tests are correctly negative.)


In population B, only 100 of the 500 total people with a positive test result are actually infected. So, the probability of actually being infected after you are told you are infected is only 20% (100/500) for a test that otherwise appears to be "over 99.95% accurate".
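
The contrast between the two populations can also be checked directly from the rates, without counting tests. The sketch below applies Bayes' theorem; the function name `positive_predictive_value` and the `sensitivity` parameter are illustrative assumptions, with sensitivity defaulting to 1.0 because the example test has no false negatives.

```python
def positive_predictive_value(prevalence, false_positive_rate, sensitivity=1.0):
    """Probability of infection given a positive result (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test (false positive rate 0.0004), two different populations.
ppv_a = positive_predictive_value(0.02, 0.0004)    # population A: 2% infected
ppv_b = positive_predictive_value(0.0001, 0.0004)  # population B: 0.01% infected

print(f"population A: {ppv_a:.1%}")  # population A: 98.1%
print(f"population B: {ppv_b:.1%}")  # population B: 20.0%
```

Only the prevalence argument changes between the two calls, which is the point of the paradox: the test is identical, but the meaning of a positive result is not.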

A tester with experience of population A might find it a paradox that in population B, a result that had almost always indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a life-threatening test result.

See also

  • Prosecutor's fallacy, a mistake in reasoning that involves ignoring a low prior probability.
  • Simpson's paradox, another common error in statistical reasoning dealing with comparing groups.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 