Unsolved problems in statistics
There are many longstanding unsolved problems in mathematics for which a solution has still not been found. The unsolved problems in statistics are generally of a different flavor; according to John Tukey, "difficulties in identifying problems have delayed statistics far more than difficulties in solving problems." A list of "one or two open problems" (in fact 22 of them) was given by David Cox.

Inference and testing

  • How to detect and correct for systematic errors, especially in sciences where random errors are large (a situation Tukey termed uncomfortable science).
  • The Graybill-Deal estimator is often used to estimate the common mean of two normal populations with unknown and possibly unequal variances. Though this estimator is generally unbiased, its admissibility remains to be shown.
  • Meta-analysis: Though independent p-values can be combined using Fisher's method, techniques are still being developed to handle the case of dependent p-values.
  • Behrens–Fisher problem: Yuri Linnik showed in 1966 that there is no uniformly most powerful test for the difference of two means when the variances are unknown and possibly unequal. That is, there is no exact test (meaning that, if the means are in fact equal, one that rejects the null hypothesis with probability exactly α) that is also the most powerful for all values of the variances (which are thus nuisance parameters). Though there are many approximate solutions (such as Welch's t-test), the problem continues to attract attention as one of the classic problems in statistics.
  • Multiple comparisons: There are various ways to adjust p-values to compensate for the simultaneous or sequential testing of hypotheses. Of particular interest is how to simultaneously control the overall error rate, preserve statistical power, and incorporate the dependence between tests into the adjustment. These issues are especially relevant when the number of simultaneous tests can be very large, as is increasingly the case in the analysis of data from DNA microarrays.
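
As an illustration of the meta-analysis item above, the following is a minimal sketch of Fisher's method for independent p-values, assuming the standard formulation in which the statistic -2 Σ ln(p_i) follows a chi-squared distribution with 2k degrees of freedom under the null hypothesis. The function names `fisher_combine` and `chi2_sf_even_df` are hypothetical and chosen only for this example; the sketch does not address the open case of dependent p-values.

```python
import math


def chi2_sf_even_df(x, k):
    """P(X > x) for a chi-squared variable with 2k degrees of freedom.

    For an even number of degrees of freedom the survival function has the
    closed form exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    half = x / 2.0
    term = 1.0   # current term (x/2)^j / j!, starting at j = 0
    total = 1.0  # running sum of the terms
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent p-values; returns (statistic, combined p-value)."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    # Fisher's statistic: -2 times the sum of the log p-values.
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_sf_even_df(statistic, len(p_values))


if __name__ == "__main__":
    stat, combined = fisher_combine([0.05, 0.05])
    print(f"statistic = {stat:.4f}, combined p-value = {combined:.4f}")
```

With a single p-value the method returns that p-value unchanged, which is a convenient sanity check. The independence assumption is essential: applying this sketch to correlated tests would generally overstate the evidence.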

Experimental design

  • As the theory of Latin squares is a cornerstone in the design of experiments, solving the problems in Latin squares could have immediate applicability to experimental design.
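
For readers unfamiliar with the object itself, the following is a minimal sketch of one simple way to build a Latin square (the cyclic construction, where cell (i, j) holds (i + j) mod n) together with a check of the defining property. The function names `cyclic_latin_square` and `is_latin_square` are hypothetical, and this construction is only one of many; it says nothing about the open problems mentioned above.

```python
def cyclic_latin_square(n):
    """Return an n x n Latin square on symbols 0..n-1 via cyclic shifts."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Row i is the sequence 0..n-1 rotated left by i positions.
    return [[(i + j) % n for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """Check that every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    # Columns are only inspected once all rows are known to have length n.
    return rows_ok and all(
        {square[i][j] for i in range(n)} == symbols for j in range(n)
    )


if __name__ == "__main__":
    for row in cyclic_latin_square(4):
        print(row)
```

In an experimental design, the rows and columns would typically correspond to two blocking factors and the symbols to treatments, so that each treatment appears once in every row and every column.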

Problems of a more philosophical nature

  • Sunrise problem: What is the probability that the sun will rise tomorrow?
  • Doomsday argument: How valid is the probabilistic argument that claims to predict the future lifetime of the human race given only an estimate of the total number of humans born so far?
  • Exchange paradox: This paradox arises within the subjectivistic interpretation of probability theory, more specifically within Bayesian decision theory. It is still an open problem among subjectivists, as no consensus has been reached. Examples include:
    • The two envelopes problem
    • The necktie paradox
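
To make the sunrise problem in the list above concrete, the following is a minimal sketch of one classical answer, Laplace's rule of succession, which is not stated in this article and is included here only as an illustration. It assumes independent trials with an unknown success probability and a uniform prior on that probability; under those assumptions, after s successes in n trials the probability of success on the next trial is (s + 1) / (n + 2). The function name `rule_of_succession` is hypothetical. Whether these assumptions are appropriate is exactly what makes the problem philosophical.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Probability of success on the next trial under a uniform prior.

    Assumes independent Bernoulli trials with an unknown success
    probability; this is an illustrative assumption, not a settled answer.
    """
    if trials < 0 or not (0 <= successes <= trials):
        raise ValueError("require 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


if __name__ == "__main__":
    # Example: the sun has been observed to rise on each of 10 days.
    print(rule_of_succession(10, 10))
```

With no observations at all the rule gives 1/2, and the estimate approaches 1 as the run of observed sunrises grows without ever reaching it.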

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 