Philosophy of statistics
The philosophy of statistics involves the meaning, justification, utility, use and abuse of statistics and its methodology, and ethical and epistemological issues involved in the consideration of choice and interpretation of data and methods of statistics.
- Foundations of statistics involves issues in theoretical statistics, its goals and optimization methods to meet these goals, parametric assumptions or lack thereof considered in nonparametric statistics, model selection for the underlying probability distribution, and interpretation of the meaning of inferences made using statistics, related to the philosophy of probability and the philosophy of science. Discussion of the selection of the goals and the meaning of optimization, in foundations of statistics, is the subject of the philosophy of statistics. Selection of distribution models, and of the means of selection, is the subject of the philosophy of statistics, whereas the mathematics of optimization is the subject of nonparametric statistics.
- David Cox makes the point that any kind of interpretation of evidence is in fact a statistical model, although it is known through Ian Hacking's work that many are ignorant of this subtlety.
- Issues involving sample size, such as cost and efficiency, are common, for example in polling and pharmaceutical research.
- Extra-mathematical considerations in the design of experiments, and the need to accommodate them, arise in most actual experiments.
- The motivation and justification of data analysis and experimental design, as part of the scientific method, are considered.
- Distinctions between induction and logical deduction relevant to inferences from data and evidence arise, such as when frequentist interpretations are compared with degrees of certainty derived from Bayesian inference. However, the difference between induction and ordinary reasoning is not generally appreciated.
- Leo Breiman exposed the diversity of thinking in his article on 'The Two Cultures', making the point that statistics has several kinds of inference to make, modelling and prediction amongst them.
- Issues in the philosophy of statistics arise throughout the history of statistics. Causality considerations arise with interpretations of, and definitions of, correlation, and in the theory of measurement.
- Objectivity in statistics is often confused with truth, whereas it is better understood as replicability, which then needs to be defined in the particular case. Theodore Porter develops this as the path pursued when trust has evaporated and is replaced with criteria.
- Ethical issues associated with epistemology and medical applications arise from potential abuse of statistics, such as selecting a method or a transformation of the data in order to arrive at different probability conclusions for the same data set. For example, it is unclear what a statistical inference means when applied to a single person, such as one cancer patient, when there is no frequentist interpretation for that patient to adopt.
- Campaigns for statistical literacy must wrestle with the problem that most interesting questions around individual risk are very difficult to determine or interpret, even with the computer power currently available.
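
The contrast drawn above between frequentist interpretations and Bayesian degrees of certainty can be made concrete with a small sketch. The example below is illustrative only: the data (7 successes in 10 trials), the function names, the Wald interval and the uniform prior are all assumptions chosen for simplicity, not methods taken from the sources discussed here. It shows the same data summarised in two ways: a point estimate with an approximate 95% confidence interval, and a posterior mean with a posterior probability that the underlying proportion exceeds one half.

```python
import math


def frequentist_summary(successes, trials, z=1.96):
    """Maximum-likelihood estimate and an approximate 95% Wald interval.

    The interval describes the long-run behaviour of the procedure,
    not a degree of belief about this particular data set.
    """
    p_hat = successes / trials
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat, (max(0.0, p_hat - half_width), min(1.0, p_hat + half_width))


def bayesian_summary(successes, trials, grid_size=10_000):
    """Posterior mean and P(p > 0.5) under an assumed uniform prior.

    Uses a simple grid approximation over midpoints of [0, 1], so no
    third-party library is needed.
    """
    failures = trials - successes
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    # Unnormalised posterior: binomial likelihood times a flat prior.
    weights = [p ** successes * (1 - p) ** failures for p in grid]
    total = sum(weights)
    posterior_mean = sum(p * w for p, w in zip(grid, weights)) / total
    prob_above_half = sum(w for p, w in zip(grid, weights) if p > 0.5) / total
    return posterior_mean, prob_above_half


if __name__ == "__main__":
    # Hypothetical data: 7 successes in 10 trials.
    estimate, interval = frequentist_summary(7, 10)
    mean, prob = bayesian_summary(7, 10)
    print(f"MLE {estimate:.3f}, Wald interval {interval[0]:.3f} to {interval[1]:.3f}")
    print(f"Posterior mean {mean:.3f}, P(p > 0.5 | data) {prob:.3f}")
```

The two summaries answer different questions. Only the second assigns a probability directly to a statement about the unknown proportion, and it does so at the price of a prior that must itself be justified.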
See also
- Philosophy of probability
- Philosophy of science
- Scientific method
- History of statistics
Further reading
- Efron, Bradley; Morris, Carl (1977). Stein's Paradox in Statistics. Scientific American, 236 (5), 119–127.
- Efron, Bradley (1979). Computers and the theory of statistics: thinking the unthinkable. SIAM Review, 21 (4), 460–480.
- Good, I. J. (1988). The Interface Between Statistics and Philosophy of Science. Statistical Science, 3 (4), 386–397.
- Hacking, Ian (2006). The Emergence of Probability, 2nd Ed, Cambridge University Press, ISBN 0521685575.
- Hacking, Ian (1964). "On the Foundations of Statistics", The British Journal for the Philosophy of Science, 15 (57), 1–26.
- Hacking, Ian (1990). The Taming of Chance, Cambridge University Press, ISBN 0521388849.
- Mayo, Deborah (1996). Error and the growth of experimental knowledge, University Of Chicago Press, ISBN 0226511987.
- Porter, Theodore M (1995). Trust in Numbers, Princeton University Press, ISBN 0691037760.
- Savage, L.J. (1972). The Foundations of Statistics, Dover (2003 Edition), ISBN 0486623491.