Statistical syllogism
A statistical syllogism is a non-deductive syllogism. It argues from a generalization true for the most part to a particular case (in contrast to induction, which argues from particular cases to generalizations).
Introduction
Statistical syllogisms may use qualifying words like "most", "frequently", "almost never", "rarely", etc., or may have a statistical generalization as one or both of their premises.
For example:
- Almost all people are taller than 26 inches
- Bob is a person
- Therefore, Bob is taller than 26 inches
Premise 1 (the major premise) is a generalization, and the argument attempts to draw a conclusion from that generalization. In contrast to a deductive syllogism, the premises logically support or confirm the conclusion rather than strictly implying it: it is possible for the premises to be true and the conclusion false, but it is not likely.
General form:
- X proportion of F are G
- I is an F
- Therefore, I is a G
In the abstract form above, F is called the "reference class", G the "attribute class", and I the individual object. So, in the earlier example, "(things that are) taller than 26 inches" is the attribute class and "people" is the reference class.
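The schema can be sketched in code. The following is a minimal illustration, not a standard formalization: the class name `StatisticalSyllogism`, its fields, and the 0.5 threshold for choosing between the positive and negative conclusion are assumptions made for this example, and the figure 0.99 is an assumed stand-in for "almost all", since the example above gives no number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatisticalSyllogism:
    """Hypothetical container for 'X proportion of F are G; I is an F'."""
    proportion: float       # X, the proportion of F that are G (0 to 1)
    reference_class: str    # F
    attribute_class: str    # G
    individual: str         # I

    def __post_init__(self):
        if not 0.0 <= self.proportion <= 1.0:
            raise ValueError("proportion must lie between 0 and 1")

    def conclusion(self) -> str:
        """Return the more probable conclusion about the individual."""
        if self.proportion >= 0.5:
            return f"{self.individual} is {self.attribute_class}"
        return f"{self.individual} is not {self.attribute_class}"

    def strength(self) -> float:
        """Support for the conclusion: X, or 1 - X for the negative form."""
        return max(self.proportion, 1.0 - self.proportion)


# 0.99 is an assumed value for "almost all"; it is not taken from the article.
bob = StatisticalSyllogism(0.99, "people", "taller than 26 inches", "Bob")
print(bob.conclusion(), f"(strength {bob.strength():.2f})")
```

The `strength` value makes explicit that the conclusion is supported only to the degree given by the major premise, never guaranteed.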
Unlike many other forms of syllogism, a statistical syllogism is inductive, so when evaluating this kind of argument it is important to consider how strong or weak it is, along with the other rules of induction (as opposed to deduction).
Two dicto simpliciter fallacies can occur in statistical syllogisms. They are "accident" and "converse accident". Faulty generalization fallacies can also affect any argument premise that uses a generalization. A problem with applying the statistical syllogism in real cases is the reference class problem: given that a particular case I is a member of very many reference classes F, in which the proportion of attribute G may differ widely, how should one decide which class to use in applying the statistical syllogism?
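The reference class problem can be made concrete with a small computation. The sketch below uses toy records invented purely for illustration (they are not real data), and the helper `proportion` and the field names are hypothetical. It shows the same individual receiving different values of X depending on which reference class F is chosen.

```python
from fractions import Fraction

# Toy records, invented for illustration only; not real mortality data.
records = [
    {"english": True,  "consumptive": False, "survived": True},
    {"english": True,  "consumptive": False, "survived": True},
    {"english": True,  "consumptive": False, "survived": True},
    {"english": True,  "consumptive": True,  "survived": False},
    {"english": False, "consumptive": True,  "survived": False},
    {"english": False, "consumptive": True,  "survived": True},
    {"english": False, "consumptive": False, "survived": True},
    {"english": False, "consumptive": False, "survived": False},
]


def proportion(records, in_class, has_attribute):
    """Proportion of members of a reference class that have the attribute."""
    members = [r for r in records if in_class(r)]
    if not members:
        raise ValueError("empty reference class")
    return Fraction(sum(1 for r in members if has_attribute(r)), len(members))


def survived(r):
    return r["survived"]


# One individual who is both English and consumptive belongs to all three
# classes, yet each class yields a different proportion for the major premise.
as_english = proportion(records, lambda r: r["english"], survived)
as_consumptive = proportion(records, lambda r: r["consumptive"], survived)
as_both = proportion(
    records, lambda r: r["english"] and r["consumptive"], survived
)
print(as_english, as_consumptive, as_both)
```

Narrower classes are more specific to the individual but contain fewer cases, which is one reason the choice of class has no obvious answer.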
The importance of the statistical syllogism was urged by Henry E. Kyburg, Jr., who argued that all statements of probability could be traced to a direct inference. For example, when taking off in an airplane, our confidence (but not certainty) that we will land safely is based on our knowledge that the vast majority of flights do land safely.
The widespread use of confidence intervals in statistics is often justified using a statistical syllogism, in such words as "Were this procedure to be repeated on multiple samples, the calculated confidence interval (which would differ for each sample) would encompass the true population parameter 90% of the time." The inference from what would mostly happen in multiple samples to the confidence we should have in the particular sample involves a statistical syllogism.
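The repeated-sampling claim can be checked by simulation. The sketch below rests on stated assumptions that are not part of the article: a normal population with known standard deviation, a 90% interval for the mean, and arbitrary illustrative values for the mean, spread, sample size, and random seed. The function name `coverage` is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(trials=2000, n=30, mu=10.0, sigma=2.0, level=0.90, seed=0):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval sample_mean +/- half_width covers mu exactly when
        # the sample mean lies within half_width of mu.
        if abs(sample_mean - mu) <= half_width:
            hits += 1
    return hits / trials


print(f"Observed coverage: {coverage():.3f}")
```

The observed fraction should be close to 0.90. Moving from that long-run frequency to confidence in one particular interval is the statistical syllogism step described above.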
History
Ancient writers on logic and rhetoric approved arguments from "what happens for the most part". For example, Aristotle writes "that which people know to happen or not to happen, or to be or not to be, mostly in a particular way, is likely, for example, that the envious are malevolent or that those who are loved are affectionate."
The ancient Jewish law of the Talmud used a "follow the majority" rule to resolve cases of doubt.
From the invention of insurance in the 14th century, insurance rates were based on estimates (often intuitive) of the frequencies of the events insured against, which involves an implicit use of a statistical syllogism. John Venn pointed out in 1876 that this leads to a reference class problem: deciding which of the classes containing the individual case should be used to take frequencies. He writes, “It is obvious that every single thing or event has an indefinite number of properties or attributes observable in it, and might therefore be considered as belonging to an indefinite number of different classes of things”, which leads to problems with how to assign probabilities to a single case, for example the probability that John Smith, a consumptive Englishman aged fifty, will live to sixty-one.
In the 20th century, clinical trials were designed to find the proportion of cases of disease cured by a drug, so that the drug could be applied confidently to an individual patient with the disease.
The problem of induction
The statistical syllogism was used by Donald Cary Williams and David Stove in their attempt to give a logical solution to the problem of induction. They put forward the argument, which has the form of a statistical syllogism:
- The great majority of large samples of a population approximately match the population (in proportion)
- This is a large sample from a population
- Therefore, this sample approximately matches the population
If the population is, say, a large number of balls which are black or white but in an unknown proportion, and one takes a large sample and finds they are all white, then it is likely, using this statistical syllogism, that the population is all or nearly all white. That is an example of inductive reasoning.
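The major premise of this argument is a mathematical claim that can be checked numerically. The sketch below makes assumptions that are not in the article: sampling with replacement (so the count of white balls is binomial), a sample size of 1000, and "approximately match" read as within 0.05 of the population proportion. The function name `fraction_matching` is hypothetical.

```python
import math


def fraction_matching(p, n=1000, eps=0.05):
    """Probability that a sample proportion lies within eps of p.

    Assumes n independent draws with replacement from a population in
    which a proportion p of the balls are white (binomial model).
    """
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) <= eps:
            total += math.comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


# Whatever the unknown population proportion is, most large samples match it.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p}: fraction of matching samples = {fraction_matching(p):.4f}")
```

Under these assumptions the fraction is high for every population proportion tried, which is why the premise can be asserted without knowing the population in advance.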
Legal examples
Statistical syllogisms may be used as legal evidence but it is usually believed that a legal decision should not be based solely on them. For example, in L. Jonathan Cohen's "gatecrasher paradox", 499 tickets to a rodeo have been sold and 1000 people are observed in the stands. The rodeo operator sues a random attendee for non-payment of the entrance fee. The statistical syllogism:
- 501 of the 1000 attendees have not paid
- The defendant is an attendee
- Therefore, on the balance of probabilities the defendant has not paid
is a sound one, but it is felt to be unjust to burden a defendant with membership of a class, without evidence that bears directly on the defendant.