List of statistical topics

0–9

  • 1.96
    1.96
    1.96 is the approximate value of the 97.5th percentile of the standard normal distribution, used in probability and statistics. About 95% of the area under a normal curve lies within roughly 1.96 standard deviations of the mean, and due to the central limit theorem, this number is therefore used in the... (a quick numerical check follows below)

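    A minimal numerical check of these two facts, using only Python's standard library (the values are computed, not assumed):

      from statistics import NormalDist

      nd = NormalDist()  # standard normal distribution
      # 97.5th percentile (upper 2.5% point)
      print(round(nd.inv_cdf(0.975), 4))                  # 1.96
      # central probability mass within +/-1.96 standard deviations
      print(round(nd.cdf(1.96) - nd.cdf(-1.96), 4))       # 0.95
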
  • 2SLS (two-stage least squares) — redirects to instrumental variable
    Instrumental variable
    In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables is used to estimate causal relationships when controlled experiments are not feasible....

  • 3SLS — redirects to Three-stage least squares
  • 68-95-99.7 rule
    68-95-99.7 rule
    In statistics, the 68-95-99.7 rule, or three-sigma rule, or empirical rule, states that for a normal distribution about 68%, 95%, and 99.7% of the values lie within one, two, and three standard deviations of the mean, respectively... (see the sketch below)

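    A minimal sketch verifying the three percentages with Python's standard library:

      from statistics import NormalDist

      nd = NormalDist()  # standard normal
      for k in (1, 2, 3):
          inside = nd.cdf(k) - nd.cdf(-k)   # probability within k standard deviations
          print(k, round(inside, 4))        # 1 0.6827 / 2 0.9545 / 3 0.9973
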
  • 100-year flood
    100-year flood
    A one-hundred-year flood is calculated to be the level of flood water expected to be equaled or exceeded once every 100 years on average. The 100-year flood is more accurately referred to as the 1% annual exceedance probability flood, since it is a flood that has a 1% chance of being equaled or exceeded...


A

  • A posteriori probability (disambiguation)
  • A priori probability
    A priori probability
    The term a priori probability is used in distinguishing the ways in which values for probabilities can be obtained. In particular, an "a priori probability" is derived purely by deductive reasoning...

  • A priori (statistics)
    A priori (statistics)
    In statistics, a priori knowledge is prior knowledge about a population, rather than that estimated by recent observation. It is common in Bayesian inference to make inferences conditional upon this knowledge, and the integration of a priori knowledge is the central difference between the Bayesian...

  • Abductive reasoning
    Abductive reasoning
    Abduction is a kind of logical inference described by Charles Sanders Peirce as "guessing". The term refers to the process of arriving at an explanatory hypothesis. Peirce said that to abduce a hypothetical explanation a from an observed surprising circumstance b is to surmise that a may be true...

  • Absolute deviation
    Absolute deviation
    In statistics, the absolute deviation of an element of a data set is the absolute difference between that element and a given point. Typically the point from which the deviation is measured is a measure of central tendency, most often the median or sometimes the mean of the data set: D_i = |x_i - m|... (a worked example follows below)

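    A short worked example of the formula above (the data values are arbitrary); it also summarizes the deviations with the median absolute deviation:

      from statistics import median

      data = [2, 2, 3, 4, 14]
      m = median(data)                          # central point, here the median (3)
      deviations = [abs(x - m) for x in data]   # D_i = |x_i - m|
      print(deviations)                         # [1, 1, 0, 1, 11]
      print(median(deviations))                 # median absolute deviation: 1
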
  • Absolute risk reduction
    Absolute risk reduction
    In epidemiology, the absolute risk reduction or risk difference is the decrease in risk of a given activity or treatment in relation to a control activity or treatment. It is the inverse of the number needed to treat... (a small numeric example follows below)

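    A small numeric illustration with made-up risks (20% in the control group, 15% in the treatment group):

      risk_control = 0.20
      risk_treatment = 0.15

      arr = risk_control - risk_treatment   # absolute risk reduction (risk difference)
      nnt = 1 / arr                         # number needed to treat = 1 / ARR
      print(round(arr, 2), round(nnt, 1))   # 0.05 20.0
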
  • ABX test
    ABX test
    An ABX test is a method of comparing two kinds of sensory stimuli to identify detectable differences. A subject is presented with two known samples, A and B, and one unknown sample X, for three samples total. X is randomly selected from A and B, and the subject identifies X as being either A or B...

  • Accelerated failure time model
    Accelerated failure time model
    In the statistical area of survival analysis, an accelerated failure time model is a parametric model that provides an alternative to the commonly used proportional hazards models...

  • Acceptable quality limit
    Acceptable quality limit
    The acceptable quality limit is the worst tolerable process average, expressed as a percentage or ratio, that is still considered acceptable: that is, it is at an acceptable quality level...

  • Acceptance sampling
    Acceptance sampling
    Acceptance sampling uses statistical sampling to determine whether to accept or reject a production lot of material. It has been a common quality control technique used in industry and particularly the military for contracts and procurement. It is usually done as products leave the factory, or in...

  • Accidental sampling
    Accidental sampling
    Accidental sampling is a type of nonprobability sampling which involves the sample being drawn from that part of the population which is close to hand. That is, a sample population selected because it is readily available and convenient...

  • Accuracy and precision
    Accuracy and precision
    In the fields of science, engineering, industry and statistics, the accuracy of a measurement system is the degree of closeness of measurements of a quantity to that quantity's actual value. The precision of a measurement system, also called reproducibility or repeatability, is the degree to which...

  • Accuracy paradox
    Accuracy paradox
    The accuracy paradox for predictive analytics states that predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy...

  • Acquiescence bias
    Acquiescence bias
    Acquiescence bias is a category of response bias in which respondents to a survey have a tendency to agree with all the questions or to indicate a positive connotation. Acquiescence is sometimes referred to as "yea-saying" and is the tendency of a respondent to agree with a statement when in doubt...

  • Actuarial science
    Actuarial science
    Actuarial science is the discipline that applies mathematical and statistical methods to assess risk in the insurance and finance industries. Actuaries are professionals who are qualified in this field through education and experience...

  • ADAPA
    ADAPA
    ADAPA is intrinsically a predictive decisioning platform. It combines the power of predictive analytics and business rules to facilitate the tasks of managing and designing automated decisioning systems...

     – software
  • Adapted process
    Adapted process
    In the study of stochastic processes, an adapted process is one that cannot "see into the future". An informal interpretation is that X is adapted if and only if, for every realisation and every n, Xn is known at time n...

  • Adaptive estimator
    Adaptive estimator
    In statistics, an adaptive estimator is an estimator in a parametric or semiparametric model with nuisance parameters such that the presence of these nuisance parameters does not affect efficiency of estimation...

  • Additive Markov chain
    Additive Markov chain
    In probability theory, an additive Markov chain is a Markov chain with an additive conditional probability function. Here the process is a discrete-time Markov chain of order m and the transition probability to a state at the next time is a sum of functions, each depending on the next state and one...

  • Additive model
    Additive model
    In statistics, an additive model is a nonparametric regression method. It was suggested by Jerome H. Friedman and Werner Stuetzle and is an essential part of the ACE algorithm. The AM uses a one dimensional smoother to build a restricted class of nonparametric regression models. Because of this,...

  • Additive smoothing
    Additive smoothing
    In statistics, additive smoothing, also called Laplace smoothing, or Lidstone smoothing, is a technique used to smooth categorical data... (a small worked example follows below)

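    A small worked example, assuming the usual additive-smoothing formula (x_i + alpha) / (N + alpha * d), with made-up counts and alpha = 1 (Laplace smoothing):

      counts = {"red": 3, "green": 1, "blue": 0}
      alpha = 1
      N = sum(counts.values())   # total observations
      d = len(counts)            # number of categories

      smoothed = {c: (x + alpha) / (N + alpha * d) for c, x in counts.items()}
      print(smoothed)            # "blue" gets 1/7 instead of probability zero
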
  • Additive white Gaussian noise
    Additive white Gaussian noise
    Additive white Gaussian noise is a channel model in which the only impairment to communication is a linear addition of wideband or white noise with a constant spectral density and a Gaussian distribution of amplitude. The model does not account for fading, frequency selectivity, interference,...

  • Adjusted Rand index — redirects to Rand index
    Rand index
    The Rand index or Rand measure in statistics, and in particular in data clustering, is a measure of the similarity between two data clusterings...

     (subsection)
  • ADMB
    ADMB
    ADMB or AD Model Builder is a free and open source software suite for non-linear statistical modeling. It was created by David Fournier and now being developed by the ADMB Project, a creation of the non-profit ADMB Foundation...

     – software
  • Admissible decision rule
    Admissible decision rule
    In statistical decision theory, an admissible decision rule is a rule for making a decision such that there isn't any other rule that is always "better" than it, in a specific sense defined below....

  • Age adjustment
    Age adjustment
    In epidemiology and demography, age adjustment, also called age standardisation, is a technique used to better allow populations to be compared when the age profiles of the populations are quite different....

  • Age-standardized mortality rate
  • Age stratification
    Age stratification
    In critical sociology, age stratification refers to the hierarchical ranking of people into age groups within a society. Age stratification based on an ascribed status is a major source of inequality, and thus may lead to ageism...

  • Aggregate data
    Aggregate data
    In statistics, aggregate data describes data combined from several measurements. In economics, aggregate data or data aggregates describes high-level data that is composed of a multitude or combination of other more individual data...

  • Aggregate pattern
    Aggregate pattern
    An Aggregate pattern can refer to concepts in either statistics or computer programming. Both uses deal with considering a large case as composed of smaller, simpler pieces...

  • Akaike information criterion
    Akaike information criterion
    The Akaike information criterion is a measure of the relative goodness of fit of a statistical model. It was developed by Hirotsugu Akaike, under the name of "an information criterion", and was first published by Akaike in 1974... (a minimal formula sketch follows below)

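    A minimal sketch of the usual formula AIC = 2k - 2 ln(L^), where k is the number of estimated parameters and L^ the maximized likelihood; the two models and their log-likelihoods below are hypothetical:

      def aic(k, log_likelihood):
          # Akaike information criterion: 2k - 2 ln(L^); lower is preferred
          return 2 * k - 2 * log_likelihood

      print(round(aic(k=3, log_likelihood=-120.5), 1))   # 247.0
      print(round(aic(k=5, log_likelihood=-118.9), 1))   # 247.8 -> the simpler model wins here
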
  • Algebra of random variables
    Algebra of random variables
    In the algebraic axiomatization of probability theory, the primary concept is not that of probability of an event, but rather that of a random variable. Probability distributions are determined by assigning an expectation to each random variable...

  • Algebraic statistics
    Algebraic statistics
    Algebraic statistics is the use of algebra to advance statistics. Algebra has been useful for experimental design, parameter estimation, and hypothesis testing....

  • Algorithmic inference
    Algorithmic inference
    Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst...

  • Algorithms for calculating variance
    Algorithms for calculating variance
    Algorithms for calculating variance play a major role in statistical computing. A key problem in the design of good algorithms for this problem is that formulas for the variance may involve sums of squares, which can lead to numerical instability as well as to arithmetic overflow when dealing with... (a stable one-pass sketch follows below)

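    One commonly cited numerically stable approach is Welford's one-pass (online) algorithm; a minimal sketch:

      def online_variance(data):
          # Welford's online algorithm: single pass, no explicit sum of squares
          n, mean, m2 = 0, 0.0, 0.0
          for x in data:
              n += 1
              delta = x - mean
              mean += delta / n
              m2 += delta * (x - mean)      # running sum of squared deviations
          return m2 / (n - 1) if n > 1 else float("nan")

      print(online_variance([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))  # ~4.571 (sample variance)
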
  • All-pairs testing
    All-pairs testing
    All-pairs testing or pairwise testing is a combinatorial software testing method that, for each pair of input parameters to a system, tests all possible discrete combinations of those parameters...

  • Allan variance
    Allan variance
    The Allan variance, also known as two-sample variance, is a measure of frequency stability in clocks, oscillators and amplifiers. It is named after David W. Allan. It is expressed mathematically as \sigma_y^2(\tau)...

  • Alignments of random points
    Alignments of random points
    Alignments of random points, as shown by statistics, can be found when a large number of random points are marked on a bounded flat surface. This might be used to show that ley lines exist due to chance alone. One precise definition which expresses the generally accepted meaning of "alignment"...

  • Almost surely
    Almost surely
    In probability theory, one says that an event happens almost surely if it happens with probability one. The concept is analogous to the concept of "almost everywhere" in measure theory...

  • Alpha beta filter
    Alpha beta filter
    An alpha beta filter is a simplified form of observer for estimation, data smoothing and control applications. It is closely related to Kalman filters and to linear state observers used in control theory...

  • Alternative hypothesis
  • Analyse-it
    Analyse-it
    Analyse-it is a statistical analysis add-in for Microsoft Excel. Analyse-it is the successor to Astute, developed in 1992 for Excel 4 and the first statistical analysis add-in for Microsoft Excel...

     – software
  • Analysis of categorical data
    Analysis of categorical data
    This is a list of statistical procedures which can be used for the analysis of categorical data, also known as data on the nominal scale and as categorical variables: the categorical distribution (general model), stratified analysis, the chi-squared test...

  • Analysis of covariance
  • Analysis of molecular variance
    Analysis of molecular variance
    Analysis of molecular variance , is a statistical model for the molecular variation in a single species, typically biological. The name and model are inspired by ANOVA. The method was developed by Laurent Excoffier, Peter Smouse and Joseph Quattro at Rutgers University in 1992.Since developing...

  • Analysis of rhythmic variance
    Analysis of rhythmic variance
    In statistics, analysis of rhythmic variance is a method for detecting rhythms in biological time series, published by Peter Celec. It is a procedure for detecting cyclic variations in biological time series and quantification of their probability...

  • Analysis of variance
    Analysis of variance
    In statistics, analysis of variance is a collection of statistical models, and their associated procedures, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation...

  • Analytic and enumerative statistical studies
    Analytic and enumerative statistical studies
    Analytic and enumerative statistical studies are two types of scientific studies: In any statistical study the ultimate aim is to provide a rational basis for action. Enumerative and analytic studies differ by where the action is taken...

  • Ancestral graph
    Ancestral graph
    An ancestral graph is a graph with three types of edges: directed edge, bidirected edge, and undirected edge such that it can be decomposed into three parts: an undirected subgraph, a directed subgraph, and directed edges pointing from the undirected subgraph to the directed subgraph. An ancestral...

  • Anchor test
    Anchor test
    In psychometrics, an anchor test is a common set of test items administered in combination with two or more alternative forms of the test with the aim of establishing the equivalence of the test scores on the alternative forms. The purpose of the anchor test is to provide a baseline for an...

  • Ancillary statistic
    Ancillary statistic
    In statistics, an ancillary statistic is a statistic whose sampling distribution does not depend on which of the probability distributions among those being considered is the distribution of the statistical population from which the data were taken...

  • ANCOVA
    ANCOVA
    In statistics, analysis of covariance is a general linear model with a continuous outcome variable and two or more predictor variables where at least one is continuous and at least one is categorical. ANCOVA is a merger of ANOVA and regression for continuous variables...

     – redirects to Analysis of covariance
  • Anderson–Darling test
  • ANOVA
  • ANOVA on ranks
    ANOVA on ranks
    In statistics, one purpose for the analysis of variance is to analyze differences in means between groups. The test statistic, F, assumes independence of observations, homogeneous variances, and population normality...

  • ANOVA-simultaneous component analysis
    ANOVA-simultaneous component analysis
    ASCA, ANOVA-SCA, or analysis of variance – simultaneous component analysis is a method that partitions variation and enables interpretation of these partitions by SCA, a method that is similar to PCA. This method is a multi or even megavariate extension of ANOVA. The variation partitioning is...

  • Anomaly detection
    Anomaly detection
    Anomaly detection, also referred to as outlier detection, refers to detecting patterns in a given data set that do not conform to an established normal behavior...

  • Anomaly time series
    Anomaly time series
    In atmospheric sciences and some other applications of statistics, an anomaly time series is the time series of deviations of a quantity from some mean. Similarly a standardized anomaly series contains values of deviations divided by a standard deviation...

  • Anscombe transform
    Anscombe transform
    In statistics, the Anscombe transform, named after Francis Anscombe, is a variance-stabilizing transformation that transforms a random variable with a Poisson distribution into one with an approximately standard Gaussian distribution. The Anscombe transform is widely used in photon-limited imaging... (the transform formula is sketched below)

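    A minimal sketch of the transform itself, A(x) = 2 * sqrt(x + 3/8), applied to a few count values:

      import math

      def anscombe(x):
          # variance-stabilizing transform for Poisson counts
          return 2.0 * math.sqrt(x + 3.0 / 8.0)

      print([round(anscombe(x), 3) for x in (0, 1, 4, 9, 16)])
      # [1.225, 2.345, 4.183, 6.124, 8.093]
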
  • Anscombe's quartet
    Anscombe's quartet
    Anscombe's quartet comprises four datasets that have identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven points. They were constructed in 1973 by the statistician F.J...

  • Antecedent variable
    Antecedent variable
    In statistics and social sciences, an antecedent variable is a variable that can help to explain the apparent relationship between other variables that are nominally in a cause and effect relationship...

  • Antithetic variates
    Antithetic variates
    The antithetic variates method is a variance reduction technique used in Monte Carlo methods. Considering that the error reduction in the simulated signal has a square root convergence, a very large number of sample paths is required to obtain an accurate result...

  • Approximate Bayesian computation
    Approximate Bayesian computation
    Approximate Bayesian computation is a family of computational techniques in Bayesian statistics. These simulation techniques operate on summary data to make broad inferences with less computation than might be required if all available data were analyzed in detail...

  • Arcsine distribution
  • Area chart
    Area chart
    An area chart or area graph graphically displays quantitative data. It is based on the line chart. The area between the axis and the line is commonly emphasized with colors, textures and hatchings...

  • Area compatibility factor
  • ARGUS distribution
  • Arithmetic mean
    Arithmetic mean
    In mathematics and statistics, the arithmetic mean, often referred to as simply the mean or average when the context is clear, is a method to derive the central tendency of a sample space...

  • Armitage–Doll multistage model of carcinogenesis
  • Arrival theorem
  • Artificial neural network
    Artificial neural network
    An artificial neural network, usually called a neural network, is a mathematical model or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes...

  • Ascertainment bias
  • ASReml
    ASReml
    ASReml is a statistical software package for fitting linear mixed models using restricted maximum likelihood, a technique commonly used in plant and animal breeding and quantitative genetics as well as other fields...

      – software
  • Association (statistics)
    Association (statistics)
    In statistics, an association is any relationship between two measured quantities that renders them statistically dependent. The term "association" refers broadly to any such relationship, whereas the narrower term "correlation" refers to a linear relationship between two quantities.There are many...

  • Association mapping
    Association mapping
    Association mapping, also known as "linkage disequilibrium mapping", is a method of mapping quantitative trait loci that takes advantage of historic linkage disequilibrium to link phenotypes to genotypes. Association mapping is based on the idea that traits that have entered a population...

  • Association scheme
    Association scheme
    The theory of association schemes arose in statistics, in the theory of experimental design for the analysis of variance. In mathematics, association schemes belong to both algebra and combinatorics. Indeed, in algebraic combinatorics, association schemes provide a unified approach to many topics,...

  • Assumed mean
    Assumed mean
    In statistics the assumed mean is a method for calculating the arithmetic mean and standard deviation of a data set. It simplifies calculating accurate values by hand. Its interest today is chiefly historical but it can be used to quickly estimate these statistics...

  • Asymptotic distribution
    Asymptotic distribution
    In mathematics and statistics, an asymptotic distribution is a hypothetical distribution that is in a sense the "limiting" distribution of a sequence of distributions...

  • Asymptotic equipartition property
    Asymptotic equipartition property
    In information theory the asymptotic equipartition property is a general property of the output samples of a stochastic source. It is fundamental to the concept of typical set used in theories of compression....

     (information theory)
  • Asymptotic normality – redirects to Asymptotic distribution
    Asymptotic distribution
    In mathematics and statistics, an asymptotic distribution is a hypothetical distribution that is in a sense the "limiting" distribution of a sequence of distributions...

  • Asymptotic relative efficiency redirects to Efficiency (statistics)
    Efficiency (statistics)
    In statistics, an efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors...

  • Asymptotic theory (statistics)
    Asymptotic theory (statistics)
    In statistics, asymptotic theory, or large sample theory, is a generic framework for assessment of properties of estimators and statistical tests...

  • Atkinson index
    Atkinson index
    The Atkinson index is a measure of income inequality developed by British economist Anthony Barnes Atkinson...

  • Attack rate
    Attack rate
    In epidemiology, an attack rate is the cumulative incidence of infection in a group of people observed over a period of time during an epidemic, usually in relation to foodborne illness....

  • Augmented Dickey–Fuller test
  • Aumann's agreement theorem
    Aumann's agreement theorem
    Aumann's agreement theorem says that two people acting rationally and with common knowledge of each other's beliefs cannot agree to disagree...

  • Autocorrelation
    Autocorrelation
    Autocorrelation is the cross-correlation of a signal with itself. Informally, it is the similarity between observations as a function of the time separation between them...

    • Autocorrelation plot redirects to Correlogram
      Correlogram
      In the analysis of data, a correlogram is an image of correlation statistics. For example, in time series analysis, a correlogram, also known as an autocorrelation plot, is a plot of the sample autocorrelations r_h versus h...

  • Autocovariance
    Autocovariance
    In statistics, given a real stochastic process X, the autocovariance is the covariance of the process with itself, i.e. the covariance of the variable against a time-shifted version of itself...

  • Autoregressive conditional duration
    Autoregressive conditional duration
    In financial econometrics, an autoregressive conditional duration model considers irregularly spaced and autocorrelated intertrade durations. ACD is analogous to GARCH...

  • Autoregressive conditional heteroskedasticity
    Autoregressive conditional heteroskedasticity
    In econometrics, AutoRegressive Conditional Heteroskedasticity models are used to characterize and model observed time series. They are used whenever there is reason to believe that, at any point in a series, the terms will have a characteristic size, or variance...

  • Autoregressive fractionally integrated moving average
    Autoregressive fractionally integrated moving average
    In statistics, autoregressive fractionally integrated moving average models are time series models that generalize ARIMA models by allowing non-integer values of the differencing parameter and are useful in modeling time series with long memory...

  • Autoregressive integrated moving average
    Autoregressive integrated moving average
    In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average model is a generalization of an autoregressive moving average model. These models are fitted to time series data either to better understand the data or to predict future points...

  • Autoregressive model
    Autoregressive model
    In statistics and signal processing, an autoregressive model is a type of random process which is often used to model and predict various types of natural phenomena...

  • Autoregressive moving average model
    Autoregressive moving average model
    In statistics and signal processing, autoregressive–moving-average models, sometimes called Box–Jenkins models after the iterative Box–Jenkins methodology usually used to estimate them, are typically applied to autocorrelated time series data.Given a time series of data Xt, the ARMA model is a...

  • Auxiliary particle filter
    Auxiliary particle filter
    The auxiliary particle filter is a particle filtering algorithm introduced by Pitt and Shephard in 1999 to improve some deficiencies of the sequential importance resampling algorithm when dealing with tailed observation densities....

  • Average
    Average
    In mathematics, an average, or central tendency of a data set is a measure of the "middle" value of the data set. Average is one form of central tendency. Not all central tendencies should be considered definitions of average....

  • Average treatment effect
  • Averaged one-dependence estimators
  • Azuma's inequality

B

  • BA model
    BA model
    The Barabási–Albert model is an algorithm for generating random scale-free networks using a preferential attachment mechanism. Scale-free networks are widely observed in natural and man-made systems, including the Internet, the world wide web, citation networks, and some social...

     – model for a random network
  • Backfitting algorithm
    Backfitting algorithm
    In statistics, the backfitting algorithm is a simple iterative procedure used to fit a generalized additive model. It was introduced in 1985 by Leo Breiman and Jerome Friedman along with generalized additive models...

  • Balance equation
    Balance equation
    In probability theory, a balance equation is an equation that describes the probability flux associated with a Markov chain in and out of states or set of states...

  • Balanced incomplete block design redirects to Block design
  • Balanced repeated replication
    Balanced repeated replication
    Balanced repeated replication is a statistical technique for estimating the sampling variability of a statistic obtained by stratified sampling. Outline of the technique: select balanced half-samples from the full sample...

  • Balding–Nichols model
  • Banburismus
    Banburismus
    Banburismus was a cryptanalytic process developed by Alan Turing at Bletchley Park in England during the Second World War. It was used by Bletchley Park's Hut 8 to help break German Kriegsmarine messages enciphered on Enigma machines. The process used sequential conditional probability to infer...

     — related to Bayesian networks
  • Bapat–Beg theorem
  • Bar chart
    Bar chart
    A bar chart or bar graph is a chart with rectangular bars with lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally....

  • Barabási–Albert model
  • Barber–Johnson diagram
  • Barnard's test
    Barnard's test
    In statistics, Barnard's test is an exact test of the null hypothesis of independence of rows and columns in a contingency table. It is an alternative to Fisher's exact test but is more time-consuming to compute...

  • Barnardisation
    Barnardisation
    Barnardisation is a method of disclosure control for tables of counts that involves randomly adding or subtracting 1 from some cells in the table....

  • Barnes interpolation
    Barnes interpolation
    Barnes interpolation, named after Stanley L. Barnes, is the interpolation of unstructured data points from a set of measurements of an unknown function in two dimensions into an analytic function of two variables...

  • Bartlett's method
  • Bartlett's test
    Bartlett's test
    In statistics, Bartlett's test is used to test if k samples are from populations with equal variances. Equal variances across samples is called homoscedasticity or homogeneity of variances. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups...

  • Base rate
    Base rate
    In probability and statistics, base rate generally refers to the class probabilities unconditioned on featural evidence, frequently also known as prior probabilities...

  • Baseball statistics
    Baseball statistics
    Statistics play an important role in summarizing baseball performance and evaluating players in the sport.Since the flow of a baseball game has natural breaks to it, and normally players act individually rather than performing in clusters, the sport lends itself to easy record-keeping and statistics...

  • Basu's theorem
    Basu's theorem
    In statistics, Basu's theorem states that any boundedly complete sufficient statistic is independent of any ancillary statistic. This is a 1955 result of Debabrata Basu....

  • Bates distribution
    Bates distribution
    In probability and statistics, the Bates distribution is a probability distribution of the mean of a number of statistically independent uniformly distributed random variables on the unit interval...

  • Baum–Welch algorithm
  • Bayes' rule
    Bayes' rule
    In probability theory and applications, Bayes' rule relates the odds of event A_1 to event A_2, before and after conditioning on event B. The relationship is expressed in terms of the Bayes factor, \Lambda. Bayes' rule is derived from and closely related to Bayes' theorem...

  • Bayes' theorem
    Bayes' theorem
    In probability theory and applications, Bayes' theorem relates the conditional probabilities P(A|B) and P(B|A). It is commonly used in science and engineering. The theorem is named for Thomas Bayes... (a numeric example follows after this entry)

    • Evidence under Bayes theorem
      Evidence under Bayes theorem
      Bayes' theorem provides a way of updating the probability of an event in the light of new information. In the evidence law context, for example, it could be used as a way of updating the probability that a genetic sample found at the scene of the crime came from the defendant in light of a genetic...

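    A standard numeric illustration of Bayes' theorem, P(D|+) = P(+|D) P(D) / P(+), using made-up figures for a diagnostic test:

      p_disease = 0.01            # prior P(D)
      p_pos_given_d = 0.99        # sensitivity P(+|D)
      p_pos_given_not_d = 0.05    # false-positive rate P(+|not D)

      p_pos = p_pos_given_d * p_disease + p_pos_given_not_d * (1 - p_disease)
      p_d_given_pos = p_pos_given_d * p_disease / p_pos
      print(round(p_d_given_pos, 3))   # 0.167: a positive result is far from conclusive
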
  • Bayes estimator
    Bayes estimator
    In estimation theory and decision theory, a Bayes estimator or a Bayes action is an estimator or decision rule that minimizes the posterior expected value of a loss function. Equivalently, it maximizes the posterior expectation of a utility function...

  • Bayes factor
    Bayes factor
    In statistics, the use of Bayes factors is a Bayesian alternative to classical hypothesis testing. Bayesian model comparison is a method of model selection based on Bayes factors...

  • Bayes linear statistics
  • Bayesian — disambiguation
  • Bayesian additive regression kernels
    Bayesian additive regression kernels
    Bayesian additive regression kernels is a non-parametric statistical model for regression and statistical classification.The unknown mean function is represented as a weighted sum of kernel functions, which is constructed by a prior using...

  • Bayesian average
    Bayesian average
    A Bayesian average is a method of estimating the mean of a population consistent with Bayesian interpretation, where instead of estimating the mean strictly from the available data set, other existing information related to that data set may also be incorporated into the calculation in order to...

  • Bayesian brain
    Bayesian brain
    Bayesian brain is a term that is used to refer to the ability of the nervous system to operate in situations of uncertainty in a fashion that is close to the optimal prescribed by Bayesian statistics. This term is used in behavioural sciences and neuroscience and studies associated with this term...

  • Bayesian econometrics
    Bayesian econometrics
    Bayesian econometrics is a branch of econometrics which applies Bayesian principles to economic modelling. Bayesianism is based on a degree-of-belief interpretation of probability, as opposed to a relative-frequency interpretation....

  • Bayesian experimental design
    Bayesian experimental design
    Bayesian experimental design provides a general probability-theoretical framework from which other theories on experimental design can be derived. It is based on Bayesian inference to interpret the observations/data acquired during the experiment...

  • Bayesian game
    Bayesian game
    In game theory, a Bayesian game is one in which information about characteristics of the other players is incomplete. Following John C. Harsanyi's framework, a Bayesian game can be modelled by introducing Nature as a player in a game...

  • Bayesian inference
    Bayesian inference
    In statistics, Bayesian inference is a method of statistical inference. It is often used in science and engineering to determine model parameters, make predictions about unknown variables, and to perform model selection...

  • Bayesian inference in phylogeny
    Bayesian inference in phylogeny
    Bayesian inference in phylogeny generates a posterior distribution for a parameter, composed of a phylogenetic tree and a model of evolution, based on the prior for that parameter and the likelihood of the data, generated by a multiple alignment. The Bayesian approach has become more popular due...

  • Bayesian information criterion
  • Bayesian linear regression
    Bayesian linear regression
    In statistics, Bayesian linear regression is an approach to linear regression in which the statistical analysis is undertaken within the context of Bayesian inference...

  • Bayesian model comparison — redirects to Bayes factor
    Bayes factor
    In statistics, the use of Bayes factors is a Bayesian alternative to classical hypothesis testing. Bayesian model comparison is a method of model selection based on Bayes factors...

  • Bayesian multivariate linear regression
  • Bayesian network
    Bayesian network
    A Bayesian network, Bayes network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph. For example, a Bayesian network could represent the probabilistic...

  • Bayesian probability
    Bayesian probability
    Bayesian probability is one of the different interpretations of the concept of probability and belongs to the category of evidential probabilities. The Bayesian interpretation of probability can be seen as an extension of logic that enables reasoning with propositions, whose truth or falsity is...

  • Bayesian search theory
    Bayesian search theory
    Bayesian search theory is the application of Bayesian statistics to the search for lost objects. It has been used several times to find lost sea vessels, for example the USS Scorpion. The usual procedure is as follows:...

  • Bayesian spam filtering
    Bayesian spam filtering
    Bayesian spam filtering is a statistical technique of e-mail filtering. It makes use of a naive Bayes classifier to identify spam e-mail. Bayesian classifiers work by correlating the use of tokens with spam and non-spam e-mails and then using Bayesian inference to calculate a probability that an...

  • Bayesian statistics
    Bayesian statistics
    Bayesian statistics is that subset of the entire field of statistics in which the evidence about the true state of the world is expressed in terms of degrees of belief or, more specifically, Bayesian probabilities...

  • Bayesian VAR
    Bayesian VAR
    Bayesian Vector Autoregression is a term which indicates that Bayesian methods are used to estimate a vector autoregression. In that respect, the difference with standard VAR models lies in the fact that the model parameters are treated as random variables, and prior probabilities are assigned to...

     — Bayesian Vector Autoregression
  • BCMP network
    BCMP network
    In queueing theory, a discipline within the mathematical theory of probability, a BCMP network is a class of queueing network for which a product form equilibrium distribution exists. It is named after the authors of the paper where the network was first described: Baskett, Chandy, Muntz and Palacios...

     – queueing theory
  • Bean machine
    Bean machine
    The bean machine, also known as the quincunx or Galton box, is a device invented by Sir Francis Galton to demonstrate the central limit theorem, in particular that the normal distribution is approximate to the binomial distribution....

  • Behrens–Fisher problem
  • Belief propagation
    Belief propagation
    Belief propagation is a message passing algorithm for performing inference on graphical models, such as Bayesian networks and Markov random fields. It calculates the marginal distribution for each unobserved node, conditional on any observed nodes...

  • Belt transect
    Belt transect
    Belt transects are used in biology to investigate the distribution of organisms in relation to a certain area, such as the seashore or a meadow. The technique records all the species found between two lines, how far they are from a certain place or area, and how many of them there are...

  • Benford's law
    Benford's law
    Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is distributed in a specific, non-uniform way...

  • Bennett's inequality
    Bennett's inequality
    In probability theory, Bennett's inequality provides an upper bound on the probability that the sum of independent random variables deviates from its expected value by more than any specified amount...

  • Berkson error model
    Berkson error model
    The Berkson error model is a description of random error in measurement. Unlike classical error, Berkson error causes little or no bias in the measurement. It was proposed by Joseph Berkson in a paper entitled Are there two regressions?, published in 1950.An example of Berkson error arises in...

  • Berkson's paradox
    Berkson's paradox
    Berkson's paradox or Berkson's fallacy is a result in conditional probability and statistics which is counter-intuitive for some people, and hence a veridical paradox. It is a complicating factor arising in statistical tests of proportions...

  • Berlin procedure
    Berlin procedure
    The so-called Berlin procedure is a mathematical procedure for time series decomposition and seasonal adjustment of monthly and quarterly economic time series. The mathematical foundations of the procedure were developed in the 1960s at the Technical University of Berlin and the German Institute...

  • Bernoulli distribution
  • Bernoulli process
    Bernoulli process
    In probability and statistics, a Bernoulli process is a finite or infinite sequence of binary random variables, so it is a discrete-time stochastic process that takes only two values, canonically 0 and 1. The component Bernoulli variables Xi are identical and independent...

  • Bernoulli sampling
    Bernoulli sampling
    In the theory of finite population sampling, Bernoulli sampling is a sampling process where each element of the population that is sampled is subjected to an independent Bernoulli trial which determines whether the element becomes part of the sample during the drawing of a single sample...

  • Bernoulli scheme
    Bernoulli scheme
    In mathematics, the Bernoulli scheme or Bernoulli shift is a generalization of the Bernoulli process to more than two possible outcomes. Bernoulli schemes are important in the study of dynamical systems, as most such systems exhibit a repellor that is the product of the Cantor set and a smooth...

  • Bernoulli trial
    Bernoulli trial
    In the theory of probability and statistics, a Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure"....

  • Bernstein inequalities (probability theory)
  • Bernstein–von Mises theorem
    Bernstein–von Mises theorem
    In Bayesian inference, the Bernstein–von Mises theorem provides the basis for the important result that the posterior distribution for unknown quantities in any problem is effectively independent of the prior distribution once the amount of information supplied by a sample of data is large...

  • Berry–Esseen theorem
    Berry–Esséen theorem
    The central limit theorem in probability theory and statistics states that under certain circumstances the sample mean, considered as a random quantity, becomes more normally distributed as the sample size is increased...

  • Bertrand's ballot theorem
  • Bertrand's box paradox
    Bertrand's box paradox
    Bertrand's box paradox is a classic paradox of elementary probability theory. It was first posed by Joseph Bertrand in his Calcul des probabilités, published in 1889. There are three boxes: a box containing two gold coins,...

  • Bessel process
    Bessel process
    In mathematics, a Bessel process, named after Friedrich Bessel, is a type of stochastic process. The n-dimensional Bessel process is the real-valued process X given by X_t = \| W_t \|,...

  • Bessel's correction
    Bessel's correction
    In statistics, Bessel's correction, named after Friedrich Bessel, is the use of n − 1 instead of n in the formula for the sample variance and sample standard deviation, where n is the number of observations in a sample: it corrects the bias in the estimation of the population variance,... (see the sketch below)

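    A minimal sketch of the n versus n - 1 divisors using the standard library (the data values are arbitrary):

      from statistics import pvariance, variance

      data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
      print(round(pvariance(data), 3))  # 4.0   -> divides by n (population variance)
      print(round(variance(data), 3))   # 4.571 -> divides by n - 1 (Bessel's correction)
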
  • Best linear unbiased prediction
    Best linear unbiased prediction
    In statistics, best linear unbiased prediction is used in linear mixed models for the estimation of random effects. BLUP was derived by Charles Roy Henderson in 1950 but the term "best linear unbiased predictor" seems not to have been used until 1962...

  • Beta (finance)
  • Beta-binomial distribution
  • Beta-binomial model
  • Beta distribution
  • Beta function – for incomplete beta function
  • Beta negative binomial distribution
  • Beta prime distribution
  • Beverton–Holt model
  • Bhatia–Davis inequality
    Bhatia–Davis inequality
    In mathematics, the Bhatia–Davis inequality, named after Rajendra Bhatia and Chandler Davis, is an upper bound on the variance of any bounded probability distribution on the real line....

  • Bhattacharya coefficient redirects to Bhattacharyya distance
    Bhattacharyya distance
    In statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions. It is closely related to the Bhattacharyya coefficient which is a measure of the amount of overlap between two statistical samples or populations. Both measures are named after A...

  • Bias (statistics)
    Bias (statistics)
    A statistic is biased if it is calculated in such a way that it is systematically different from the population parameter of interest. The following lists some types of, or aspects of, bias which should not be considered mutually exclusive:...

  • Bias of an estimator
    Bias of an estimator
    In statistics, bias of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called unbiased. Otherwise the estimator is said to be biased.In ordinary English, the term bias is...

  • Biased random walk (biochemistry)
    Biased random walk (biochemistry)
    In cell biology, a biased random walk enables bacteria to search for food and flee from harm. Bacteria propel themselves with the aid of flagella in a process called chemotaxis, and a typical bacteria trajectory has many characteristics of a random walk. They move forward for a certain distance,...

  • Biased sample – redirects to Sampling bias
  • Biclustering
    Biclustering
    Biclustering, co-clustering, or two-mode clustering is a data mining technique which allows simultaneous clustering of the rows and columns of a matrix....

  • Big O in probability notation
    Big O in probability notation
    The order in probability notation is used in probability theory and statistical theory in direct parallel to the big-O notation which is standard in mathematics...

  • Bienaymé–Chebyshev inequality
    Chebyshev's inequality
    In probability theory, Chebyshev’s inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean: the precise statement being that no more than 1/k² of the distribution’s values can be more than k standard deviations away from the mean... (an empirical check follows below)

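    An empirical check of the 1/k^2 bound on a deliberately skewed (exponential) sample; the sample is simulated with a fixed seed, so the observed fractions are illustrative only:

      import random
      from statistics import mean, pstdev

      random.seed(1)
      data = [random.expovariate(1.0) for _ in range(50_000)]
      mu, sigma = mean(data), pstdev(data)

      for k in (2, 3, 4):
          frac = sum(abs(x - mu) > k * sigma for x in data) / len(data)
          print(k, round(frac, 4), "<=", round(1 / k**2, 4))  # observed tail fraction vs bound
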
  • Bills of Mortality
    Bills of Mortality
    The London Bills of Mortality were the main source of mortality statistics, designed to monitor deaths from the plague from the 17th century to the 1830s. They were used mainly as a way of warning about plague epidemics...

  • Bimodal distribution
    Bimodal distribution
    In statistics, a bimodal distribution is a continuous probability distribution with two different modes. These appear as distinct peaks in the probability density function, as shown in Figure 1....

  • Binary classification
    Binary classification
    Binary classification is the task of classifying the members of a given set of objects into two groups on the basis of whether they have some property or not. Some typical binary classification tasks are...

  • Bingham distribution
    Bingham distribution
    In statistics, the Bingham distribution, named after Christopher Bingham, is an antipodally symmetric probability distribution on the n-sphere...

  • Binomial distribution
  • Binomial proportion confidence interval
    Binomial proportion confidence interval
    In statistics, a binomial proportion confidence interval is a confidence interval for a proportion in a statistical population. It uses the proportion estimated in a statistical sample and allows for sampling error. There are several formulas for a binomial confidence interval, but all of them rely... (one of the simplest is sketched below)

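    A sketch of one of the simplest formulas, the normal-approximation (Wald) interval p ± z·sqrt(p(1-p)/n); note it can be inaccurate for small n or proportions near 0 or 1:

      import math
      from statistics import NormalDist

      def wald_interval(successes, n, confidence=0.95):
          z = NormalDist().inv_cdf(0.5 + confidence / 2)   # 1.96 for 95%
          p = successes / n
          half_width = z * math.sqrt(p * (1 - p) / n)
          return p - half_width, p + half_width

      low, high = wald_interval(40, 100)
      print(round(low, 3), round(high, 3))   # 0.304 0.496
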
  • Binomial regression
    Binomial regression
    In statistics, binomial regression is a technique in which the response is the result of a series of Bernoulli trials, or a series of one of two possible disjoint outcomes...

  • Binomial test
    Binomial test
    In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories...

  • Bioinformatics
    Bioinformatics
    Bioinformatics is the application of computer science and information technology to the field of biology and medicine. Bioinformatics deals with algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, software...

  • Biometrics (statistics) — redirects to Biostatistics
    Biostatistics
    Biostatistics is the application of statistics to a wide range of topics in biology...

  • Biostatistics
    Biostatistics
    Biostatistics is the application of statistics to a wide range of topics in biology...

  • Biplot
    Biplot
    Biplots are a type of exploratory graph used in statistics, a generalization of the simple two-variable scatterplot. A biplot allows information on both samples and variables of a data matrix to be displayed graphically. Samples are displayed as points while variables are displayed either as...

  • Birnbaum–Saunders distribution
  • Birth-death process
    Birth-death process
    The birth–death process is a special case of continuous-time Markov process where the states represent the current size of a population and where the transitions are limited to births and deaths...

  • Bispectrum
    Bispectrum
    In mathematics, in the area of statistical analysis, the bispectrum is a statistic used to search for nonlinear interactions. The Fourier transform of the second-order cumulant, i.e., the autocorrelation function, is the traditional power spectrum...

  • Bivariate analysis
    Bivariate analysis
    Bivariate analysis is one of the simplest forms of quantitative analysis. It involves the analysis of two variables, for the purpose of determining the empirical relationship between them...

  • Bivariate von Mises distribution
    Bivariate von Mises distribution
    In probability theory and statistics, the bivariate von Mises distribution is a probability distribution describing values on a torus. It may be thought of as an analogue on the torus of the bivariate normal distribution. The distribution belongs to the field of directional statistics. The general...

  • Black–Scholes
  • Bland–Altman plot
  • Blind deconvolution
    Blind deconvolution
    In image processing and applied mathematics, blind deconvolution is a deconvolution technique that permits recovery of the target scene from a single or set of "blurred" images in the presence of a poorly determined or unknown point spread function ....

  • Blind experiment
  • Block design
  • Blocking (statistics)
    Blocking (statistics)
    In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups that are similar to one another. For example, an experiment is designed to test a new drug on patients. There are two levels of the treatment, drug, and placebo, administered to male...

  • BMDP
    BMDP
    BMDP is a statistical package developed in 1961 at UCLA. Based on the older BIMED program for biomedical applications, it used keyword parameters in the input instead of fixed-format cards, so the letter P was added to the letters BMD, although the name was later defined as being an abbreviation...

     – software
  • Bochner's theorem
    Bochner's theorem
    In mathematics, Bochner's theorem characterizes the Fourier transform of a positive finite Borel measure on the real line...

  • Bonferroni correction
    Bonferroni correction
    In statistics, the Bonferroni correction is a method used to counteract the problem of multiple comparisons. It was developed and introduced by Italian mathematician Carlo Emilio Bonferroni... (see the sketch below)

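    A minimal sketch of the usual form of the correction: with m tests and a desired family-wise error rate alpha, each individual test is performed at level alpha / m (the p-values here are made up):

      alpha = 0.05
      p_values = [0.003, 0.012, 0.045, 0.20, 0.76]
      m = len(p_values)

      threshold = alpha / m                          # 0.01 for these five tests
      rejected = [p <= threshold for p in p_values]
      print(round(threshold, 4), rejected)           # 0.01 [True, False, False, False, False]
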
  • Bonferroni inequalities – redirects to Boole's inequality
    Boole's inequality
    In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events...

  • Boole's inequality
    Boole's inequality
    In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events...

  • Boolean analysis
    Boolean analysis
    Boolean analysis was introduced by Flament. The goal of a Boolean analysis is to detect deterministic dependencies between the items of a questionnaire or similar data-structures in observed response patterns. These deterministic dependencies have the form of logical formulas connecting the items...

  • Bootstrap aggregating
    Bootstrap aggregating
    Bootstrap aggregating is a machine learning ensemble meta-algorithm to improve machine learning of statistical classification and regression models in terms of stability and classification accuracy. It also reduces variance and helps to avoid overfitting. Although it is usually applied to decision...

  • Bootstrap error-adjusted single-sample technique
    Bootstrap error-adjusted single-sample technique
    In statistics, the bootstrap error-adjusted single-sample technique is a non-parametric method that is intended to allow an assessment to be made of the validity of a single sample. It is based on estimating a probability distribution representing what can be expected from valid samples...

  • Bootstrapping (statistics)
    Bootstrapping (statistics)
    In statistics, bootstrapping is a computer-based method for assigning measures of accuracy to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using only very simple methods... (a resampling sketch follows below)

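    A minimal resampling sketch for the mean of a small made-up sample, using the simple percentile method for an approximate 95% interval:

      import random
      from statistics import mean

      random.seed(0)
      data = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 7.2, 5.0, 4.4, 6.3]

      # resample with replacement many times and record the statistic each time
      boot_means = sorted(mean(random.choices(data, k=len(data))) for _ in range(10_000))
      low, high = boot_means[249], boot_means[9749]    # 2.5th and 97.5th percentiles
      print(round(mean(data), 2), round(low, 2), round(high, 2))  # observed mean and interval
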
  • Bootstrapping populations
  • Borel–Cantelli lemma
  • Bose–Mesner algebra
    Bose–Mesner algebra
    In mathematics, a Bose–Mesner algebra is a set of matrices, together with set of rules for combining those matrices, such that certain conditions apply...

  • Box–Behnken design
  • Box–Cox distribution
  • Box–Cox transformation – redirects to Power transform
    Power transform
    In statistics, the power transform is a family of functions that are applied to create a rank-preserving transformation of data using power functions. This is a useful data processing technique used to stabilize variance, make the data more normal distribution-like, improve the correlation...

  • Box–Jenkins
  • Box–Muller transform
  • Box–Pierce test
  • Box plot
    Box plot
    In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their five-number summaries: the smallest observation, lower quartile, median, upper quartile, and largest observation...

  • Branching process
    Branching process
    In probability theory, a branching process is a Markov process that models a population in which each individual in generation n produces some random number of individuals in generation n + 1, according to a fixed probability distribution that does not vary from individual to...

  • Bregman divergence
    Bregman divergence
    In mathematics, the Bregman divergence or Bregman distance is similar to a metric, but does not satisfy the triangle inequality nor symmetry. There are two ways in which Bregman divergences are important. Firstly, they generalize squared Euclidean distance to a class of distances that all share...

  • Breusch–Godfrey test
    Breusch–Godfrey test
    In statistics, the Breusch–Godfrey test is used to assess the validity of some of the modelling assumptions inherent in applying regression-like models to observed data series...

  • Breusch–Pagan statistic – redirects to Breusch–Pagan test
  • Breusch–Pagan test
  • Brown–Forsythe test
  • Brownian bridge
    Brownian bridge
    A Brownian bridge is a continuous-time stochastic process B whose probability distribution is the conditional probability distribution of a Wiener process W given the condition that B(0) = B(T) = 0. The expected value of the bridge is zero, with variance t(T − t)/T, implying that the most...

  • Brownian excursion
    Brownian excursion
    In probability theory a Brownian excursion process is a stochastic process that is closely related to a Wiener process. Realisations of Brownian excursion processes are essentially just realisations of a Wiener process selected to satisfy certain conditions...

  • Brownian motion
    Brownian motion
    Brownian motion or pedesis is the presumably random drifting of particles suspended in a fluid or the mathematical model used to describe such random movements, which is often called a particle theory.The mathematical model of Brownian motion has several real-world applications...

  • Brownian tree
    Brownian tree
    A Brownian tree, whose name is derived from Robert Brown via Brownian motion, is a form of computer art that was briefly popular in the 1990s, when home computers started to have sufficient power to simulate Brownian motion...

  • Bruck–Ryser–Chowla theorem
  • Burke's theorem
    Burke's theorem
    In probability theory, Burke's theorem is a theorem in queueing theory, proved by Paul J. Burke while working at Bell Telephone Laboratories, which states that for an M/M/1, M/M/m or M/M/∞ queue in the steady state with arrivals forming a Poisson process with rate parameter λ: the departure process is a Poisson...

  • Burr distribution
  • Business statistics
    Business statistics
    Business statistics is the science of good decision making in the face of uncertainty and is used in many disciplines such as financial analysis, econometrics, auditing, production and operations including services improvement, and marketing research....

  • Bühlmann model
    Bühlmann model
    The Bühlmann model is a random effects model used in credibility theory in actuarial science to determine the appropriate premium for a group of insurance contracts....

  • Buzen's algorithm
    Buzen's algorithm
    In queueing theory, a discipline within the mathematical theory of probability, Buzen's algorithm is an algorithm for calculating the normalization constant G in the Gordon–Newell theorem. This method was first proposed by Jeffrey P. Buzen in 1973. Once G is computed the probability distributions...
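
    A sketch of the convolution recursion behind Buzen's algorithm for a closed network of single-server (load-independent) stations; the relative utilizations X[m] (visit ratio times mean service time) below are illustrative values, not from the source:

      def buzen_G(X, N):
          """Return [G(0), ..., G(N)] given relative utilizations X[0..M-1]."""
          g = [1.0] + [0.0] * N              # network with zero stations
          for x in X:                        # fold stations in one at a time
              for n in range(1, N + 1):
                  g[n] += x * g[n - 1]       # g(n, m) = g(n, m-1) + X_m * g(n-1, m)
          return g

      print(buzen_G([0.5, 0.4, 0.6], N=3))   # normalization constants G(0)..G(3), illustrative demands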

  • BV4.1 (software)
    BV4.1 (software)
    The application software BV4.1 is a user-friendly tool for decomposing and seasonally adjusting monthly or quarterly economic time series by the so-called Berlin procedure. It is being developed by the Federal Statistical Office of Germany...


C

  • c-chart
  • Càdlàg
    Càdlàg
    In mathematics, a càdlàg , RCLL , or corlol function is a function defined on the real numbers that is everywhere right-continuous and has left limits everywhere...

  • Calculating demand forecast accuracy
    Calculating Demand Forecast Accuracy
    Calculating demand forecast accuracy is the process of determining the accuracy of forecasts made regarding customer demand for a product....

  • Calculus of predispositions
    Calculus of predispositions
    Calculus of predispositions is a basic part of predispositioning theory and belongs to the indeterministic procedures. “The key component of any indeterministic procedure is the evaluation of a position...

  • CalEst
    CalEst
    CalEst is a statistics package which also includes probability functions as well as tutorials to enhance the learning of Statistics and Probability...

     – software
  • Calibrated probability assessment
    Calibrated probability assessment
    Calibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty. In other words, when a calibrated person says they are "80% confident" in each of 100 predictions they...

  • Calibration (probability) – subjective probability, redirects to Calibrated probability assessment
    Calibrated probability assessment
    Calibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty. In other words, when a calibrated person says they are "80% confident" in each of 100 predictions they...

  • Calibration (statistics)
    Calibration (statistics)
    There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. Thus "calibration" can mean...

     – the statistical calibration problem
  • Cancer cluster
    Cancer cluster
    Cancer cluster is a term used by epidemiologists, statisticians, and public health workers to define an occurrence of a greater-than-expected number of cancer cases within a group of people in a geographic area over a period of time....

  • Candlestick chart
    Candlestick chart
    A candlestick chart is a style of bar-chart used primarily to describe price movements of a security, derivative, or currency over time. It is a combination of a line-chart and a bar-chart, in that each bar represents the range of price movement over a given time interval. It is most often used in...

  • Canonical analysis
    Canonical analysis
    In statistics, canonical analysis belongs to the family of regression methods for data analysis. Regression analysis quantifies a relationship between a predictor variable and a criterion variable by the coefficient of correlation r, coefficient of determination r², and the standard regression...

  • Canonical correlation
    Canonical correlation
    In statistics, canonical correlation analysis, introduced by Harold Hotelling, is a way of making sense of cross-covariance matrices. If we have two sets of variables, x_1, \dots, x_n and y_1, \dots, y_m, and there are correlations among the variables, then canonical correlation analysis will...

  • Canopy clustering algorithm
    Canopy clustering algorithm
    The canopy clustering algorithm is an unsupervised pre-clustering algorithm, often used as preprocessing step for the K-means algorithm or the Hierarchical clustering algorithm....

  • Cantor distribution
  • Carpet plot
    Carpet plot
    A carpet plot is any of a few different specific types of diagram. Probably the more common plot referred to as a carpet plot is one that illustrates the interacting behaviour of two independent variables, which among other things facilitates interpolation...

  • Cartogram
    Cartogram
    A cartogram is a map in which some thematic mapping variable – such as travel time or Gross National Product – is substituted for land area or distance. The geometry or space of the map is distorted in order to convey the information of this alternate variable...

  • Case-control
    Case-control
    A case-control study is a type of study design in epidemiology. Case-control studies are used to identify factors that may contribute to a medical condition by comparing subjects who have that condition with patients who do not have the condition but are otherwise similar. Case-control studies are...

     – redirects to Case-control study
  • Case-control study
  • Catastro of Ensenada
    Catastro of Ensenada
    In 1749 a large-scale census and statistical investigation was conducted in the Crown of Castile. It included population, territorial properties, buildings, cattle, offices, all kinds of revenue and trades, and even geographical information from each place...

     – a census of part of Spain
  • Categorical data
    Categorical data
    In statistics, categorical data is that part of an observed dataset that consists of categorical variables, or for data that has been converted into that form, for example as grouped data...

  • Categorical distribution
    Categorical distribution
    In probability theory and statistics, a categorical distribution is a probability distribution that describes the result of a random event that can take on one of K possible outcomes, with the probability of each outcome separately specified...

  • Categorical variable
  • Cauchy distribution
    Cauchy distribution
    The Cauchy–Lorentz distribution, named after Augustin Cauchy and Hendrik Lorentz, is a continuous probability distribution. As a probability distribution, it is known as the Cauchy distribution, while among physicists, it is known as the Lorentz distribution, Lorentz function, or Breit–Wigner...

  • Cauchy–Schwarz inequality
    Cauchy–Schwarz inequality
    In mathematics, the Cauchy–Schwarz inequality , is a useful inequality encountered in many different settings, such as linear algebra, analysis, probability theory, and other areas...

  • Causal Markov condition
    Causal Markov condition
    The Markov condition for a Bayesian network states that any node in a Bayesian network is conditionally independent of its non-descendants, given its parents. A node is conditionally independent of the entire network, given its Markov blanket....

  • Ceiling effect
    Ceiling effect
    The term ceiling effect has two distinct meanings, referring to the level at which an independent variable no longer has an effect on a dependent variable, or to the level above which variance in an independent variable is no longer measured or estimated...

  • Censored regression model
    Censored regression model
    Censored regression models commonly arise in econometrics in cases where the variable of interest is only observable under certain conditions. A common example is labor supply. Data are frequently available on the hours worked by employees, and a labor supply model estimates the relationship between...

  • Censoring (clinical trials)
    Censoring (clinical trials)
    The term censoring is used in clinical trials to refer to mathematically removing a patient from the survival curve at the end of their follow-up time. Censoring a patient reduces the sample size available for analysis after the time of censoring...

  • Censoring (statistics)
    Censoring (statistics)
    In statistics, engineering, and medical research, censoring occurs when the value of a measurement or observation is only partially known.For example, suppose a study is conducted to measure the impact of a drug on mortality. In such a study, it may be known that an individual's age at death is at...

  • Centering matrix
    Centering matrix
    In mathematics and multivariate statistics, the centering matrix is a symmetric and idempotent matrix, which when multiplied with a vector has the same effect as subtracting the mean of the components of the vector from every component....
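
    A small sketch of the definition C = I - J/n (J the all-ones matrix), checking the mean-subtraction and idempotence properties (NumPy assumed; vector values illustrative):

      import numpy as np

      n = 4
      C = np.eye(n) - np.ones((n, n)) / n
      v = np.array([1.0, 2.0, 3.0, 10.0])   # illustrative vector
      print(C @ v)                          # identical to v - v.mean()
      print(np.allclose(C @ C, C))          # idempotent
      print(np.allclose(C, C.T))            # symmetric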

  • Centerpoint (geometry)
    Centerpoint (geometry)
    In statistics and computational geometry, the notion of centerpoint is a generalization of the median to data in higher-dimensional Euclidean space...

     — Tukey median redirects here
  • Central composite design
    Central composite design
    In statistics, a central composite design is an experimental design, useful in response surface methodology, for building a second order model for the response variable without needing to use a complete three-level factorial experiment....

  • Central limit theorem
    Central limit theorem
    In probability theory, the central limit theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed. The central limit theorem has a number of variants. In its common...

    • Central limit theorem (illustration) — redirects to Illustration of the central limit theorem
      Illustration of the central limit theorem
      This article gives two concrete illustrations of the central limit theorem. Both involve the sum of independent and identically-distributed random variables and show how the probability distribution of the sum approaches the normal distribution as the number of terms in the sum increases.The first...

    • Central limit theorem for directional statistics
      Central limit theorem for directional statistics
      In probability theory, the central limit theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed....

    • Lyapunov's central limit theorem
    • Martingale central limit theorem
      Martingale central limit theorem
      In probability theory, the central limit theorem says that, under certain conditions, the sum of many independent identically-distributed random variables, when scaled appropriately, converges in distribution to a standard normal distribution...
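
    A minimal simulation sketch of the classical central limit theorem covered by the entries above, assuming NumPy (the exponential source distribution and the sample size are arbitrary illustrative choices):

      import numpy as np

      rng = np.random.default_rng(2)
      n, reps = 100, 20_000                              # illustrative sample size and replications
      x = rng.exponential(scale=1.0, size=(reps, n))     # mean 1, variance 1
      z = (x.mean(axis=1) - 1.0) * np.sqrt(n)            # standardized sample means
      print(z.mean(), z.std())                           # close to 0 and 1
      print(np.mean(np.abs(z) <= 1.96))                  # close to 0.95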

  • Central moment
    Central moment
    In probability theory and statistics, central moments form one set of values by which the properties of a probability distribution can be usefully characterised...

  • Central tendency
    Central tendency
    In statistics, the term central tendency relates to the way in which quantitative data is clustered around some value. A measure of central tendency is a way of specifying a central value...

  • Census
    Census
    A census is the procedure of systematically acquiring and recording information about the members of a given population. It is a regularly occurring and official count of a particular population. The term is used mostly in connection with national population and housing censuses; other common...

  • Cepstrum
    Cepstrum
    A cepstrum is the result of taking the Fourier transform of the logarithm of the spectrum of a signal. There is a complex cepstrum, a real cepstrum, a power cepstrum, and phase cepstrum....

  • CHAID
    CHAID
    CHAID is a type of decision tree technique, based upon adjusted significance testing. The technique was developed in South Africa and was published in 1980 by Gordon V. Kass, who had completed a PhD thesis on this topic...

     — CHi-squared Automatic Interaction Detector
  • Chain rule for Kolmogorov complexity
    Chain rule for Kolmogorov complexity
    The chain rule for Kolmogorov complexity is an analogue of the chain rule for information entropy, which states: H(X,Y) = H(X) + H(Y|X). That is, the combined randomness of two sequences X and Y is the sum of the randomness of X plus whatever randomness is left in Y once we know X. This follows immediately from the...

  • Challenge-dechallenge-rechallenge
    Challenge-dechallenge-rechallenge
    Challenge-dechallenge-rechallenge is a medical testing protocol in which a medicine or drug is administered, withdrawn, then re-administered, while being monitored for adverse effects at each stage...

  • Change detection
    Change detection
    In statistical analysis, change detection tries to identify changes in the probability distribution of a stochastic process or time series. In general the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times...

    • Change detection (GIS)
      Change detection (GIS)
      Change detection for GIS is a process that measures how the attributes of a particular area have changed between two or more time periods. Change detection often involves comparing aerial photographs or satellite imagery of the area taken at different times...

  • Chapman–Kolmogorov equation
  • Chapman–Robbins bound
    Chapman–Robbins bound
    In statistics, the Chapman–Robbins bound or Hammersley–Chapman–Robbins bound is a lower bound on the variance of estimators of a deterministic parameter. It is a generalization of the Cramér–Rao bound; compared to the Cramér–Rao bound, it is both tighter and applicable to a wider range of problems...

  • Characteristic function (probability theory)
    Characteristic function (probability theory)
    In probability theory and statistics, the characteristic function of any random variable completely defines its probability distribution. Thus it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative...

  • Chauvenet's criterion
    Chauvenet's criterion
    In statistical theory, Chauvenet's criterion is a means of assessing whether one piece of experimental data — an outlier — from a set of observations is likely to be spurious....

  • Chebyshev center
    Chebyshev center
    In geometry, the Chebyshev center of a bounded set Q having non-empty interior is the center of the minimal-radius ball enclosing the entire set Q, or, alternatively, the center of the largest inscribed ball of Q....

  • Chebyshev's inequality
    Chebyshev's inequality
    In probability theory, Chebyshev’s inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean — the precise statement being that no more than 1/k² of the distribution’s values can be more than k standard deviations away from the mean...
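
    An empirical check of the 1/k² bound, sketched with NumPy on an arbitrary (here exponential) sample:

      import numpy as np

      rng = np.random.default_rng(3)
      x = rng.exponential(scale=1.0, size=100_000)       # illustrative sample
      mu, sigma = x.mean(), x.std()
      for k in (2.0, 3.0):
          frac = np.mean(np.abs(x - mu) >= k * sigma)
          print(k, frac, "<=", 1.0 / k ** 2)             # the bound always holds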

  • Checking if a coin is biased — redirects to Checking whether a coin is fair
  • Checking whether a coin is fair
  • Cheeger bound
    Cheeger bound
    In mathematics, the Cheeger bound is a bound of the second largest eigenvalue of the transition matrix of a finite-state, discrete-time, reversible stationary Markov chain. It can be seen as a special case of Cheeger inequalities in expander graphs....

  • Chemometrics
    Chemometrics
    Chemometrics is the science of extracting information from chemical systems by data-driven means. It is a highly interfacial discipline, using methods frequently employed in core data-analytic disciplines such as multivariate statistics, applied mathematics, and computer science, in order to...

  • Chernoff bound
    Chernoff bound
    In probability theory, the Chernoff bound, named after Herman Chernoff, gives exponentially decreasing bounds on tail distributions of sums of independent random variables...

     – a special case of Chernoff's inequality
  • Chernoff face
  • Chernoff's distribution
    Chernoff's distribution
    In probability theory, Chernoff's distribution, named after Herman Chernoff, is the probability distribution of the random variable that maximizes W(s) - s² over all real s, where W is a "two-sided" Wiener process satisfying W(0) = 0...

  • Chernoff's inequality
  • Chi distribution
  • Chi-squared distribution
  • Chi-squared test
  • Chinese restaurant process
  • Choropleth map
    Choropleth map
    A choropleth map (Greek χώρος "area/region" + πληθαίνω "multiply") is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita...

  • Chow test
    Chow test
    The Chow test is a statistical and econometric test of whether the coefficients in two linear regressions on different data sets are equal. The Chow test was invented by economist Gregory Chow. In econometrics, the Chow test is most commonly used in time series analysis to test for the presence of...

  • Chronux
    Chronux
    Chronux is an open-source software package developed for the loading, visualization and analysis of a variety of modalities / formats of neurobiological time series data...

     – software
  • Circular distribution
  • Circular error probable
    Circular error probable
    In the military science of ballistics, circular error probable is an intuitive measure of a weapon system's precision...

  • Circular statistics
    Circular statistics
    Directional statistics is the subdiscipline of statistics that deals with directions, axes or rotations in Rn...

     – redirects to Directional statistics
  • Circular uniform distribution
    Circular uniform distribution
    In probability theory and directional statistics, a circular uniform distribution is a probability distribution on the unit circle whose density is uniform for all angles....

  • Clark–Ocone theorem
  • Class membership probabilities
    Class membership probabilities
    In general problems of classification, class membership probabilities reflect the uncertainty with which a given individual item can be assigned to any given class. Although statistical classification methods by definition generate such probabilities, applications of classification in machine...

  • Classic data sets
  • Classical definition of probability
    Classical definition of probability
    The classical definition of probability is identified with the works of Pierre-Simon Laplace. As stated in his Théorie analytique des probabilités, this definition is essentially a consequence of the principle of indifference...

  • Classical test theory
    Classical test theory
    Classical test theory is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological...

      - psychometrics
  • Classification rule
    Classification rule
    Given a population whose members can be potentially separated into a number of different sets or classes, a classification rule is a procedure in which the elements of the population set are each assigned to one of the classes. A perfect test is such that every element in the population is assigned...

  • Classifier (mathematics)
  • Climate ensemble
    Climate ensemble
    In physics, a statistical ensemble is a large set of copies of a system, considered all at once; each copy of the system representing a different possible detailed realisation of the system, consistent with the system's observed macroscopic properties....

  • Clinical significance
    Clinical significance
    In medicine and psychology, clinical significance refers to either of two related but slightly dissimilar concepts whereby certain findings or differences, even if measurable or statistically confirmed, may or may not have additional significance, either by being of a magnitude that conveys...

  • Clinical study design
  • Clinical trial
    Clinical trial
    Clinical trials are a set of procedures in medical research and drug development that are conducted to allow safety and efficacy data to be collected for health interventions...

  • Clinical utility of diagnostic tests
    Clinical utility of diagnostic tests
    The clinical utility of a diagnostic test is its capacity to rule a diagnosis in or out and to make it possible to adopt or reject a therapeutic action. It can be integrated into clinical prediction rules for specific diseases or outcomes....

  • Cliodynamics
    Cliodynamics
    Cliodynamics is a new multidisciplinary area of research focused on the mathematical modeling of historical dynamics. The term was originally coined by Peter...

  • Closed testing procedure
    Closed testing procedure
    In statistics, the closed testing procedure is a general method for performing more than one hypothesis test simultaneously....

  • Cluster analysis
  • Cluster analysis (in marketing)
    Cluster analysis (in marketing)
    Cluster analysis is a class of statistical techniques that can be applied to data that exhibit “natural” groupings. Cluster analysis sorts through the raw data and groups them into clusters. A cluster is a group of relatively homogeneous cases or observations. Objects in a cluster are similar to...

  • Cluster randomised controlled trial
    Cluster randomised controlled trial
    A cluster randomised controlled trial is a type of randomised controlled trial in which groups of subjects are randomised...

  • Cluster sampling
    Cluster sampling
    Cluster sampling is a sampling technique used when "natural" groupings are evident in a statistical population. It is often used in marketing research. In this technique, the total population is divided into these groups and a sample of the groups is selected. Then the required information is...

  • Cluster-weighted modeling
    Cluster-weighted modeling
    In statistics, cluster-weighted modeling is an algorithm-based approach to non-linear prediction of outputs from inputs based on density estimation using a set of models that are each notionally appropriate in a sub-region of the input space...

  • Clustering high-dimensional data
    Clustering high-dimensional data
    Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, where DNA microarray technology can produce a large number of measurements at once, and...

  • CMA-ES
    CMA-ES
    CMA-ES stands for Covariance Matrix Adaptation Evolution Strategy. Evolution strategies are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continuous optimization problems. They belong to the class of evolutionary algorithms and evolutionary computation...

     (Covariance Matrix Adaptation Evolution Strategy)
  • Coalescent theory
    Coalescent theory
    In genetics, coalescent theory is a retrospective model of population genetics. It attempts to trace all alleles of a gene shared by all members of a population to a single ancestral copy, known as the most recent common ancestor...

  • Cochran's C test
    Cochran's C test
    In statistics, Cochran's C test, named after William G. Cochran, is a one-sided upper limit variance outlier test. The C test is used to decide if a single estimate of a variance is significantly larger than a group of variances with which the single estimate is supposed to be comparable...

  • Cochran's Q test
  • Cochran's theorem
    Cochran's theorem
    In statistics, Cochran's theorem, devised by William G. Cochran, is a theorem used to justify results relating to the probability distributions of statistics that are used in the analysis of variance....

  • Cochran-Armitage test for trend
    Cochran-Armitage test for trend
    The Cochran-Armitage test for trend, named for William Cochran and Peter Armitage, is used in categorical data analysis when the aim is to assess the presence of an association between a variable with two categories and a variable with k categories. It modifies the chi-squared test to...

  • Cochran–Mantel–Haenszel statistics
    Cochran–Mantel–Haenszel statistics
    In statistics, the Cochran–Mantel–Haenszel statistics are a collection of test statistics used in the analysis of stratified categorical data. They are named after William G. Cochran, Nathan Mantel and William Haenszel. One of these test statistics is the Cochran–Mantel–Haenszel test, which allows...

  • Cochrane–Orcutt estimation
  • Coding (social sciences)
    Coding (social sciences)
    Coding refers to an analytical process in which data, in either quantitative or qualitative form, are categorised to facilitate analysis....

  • Coefficient of coherence — redirects to Coherence (statistics)
    Coherence (statistics)
    In probability theory and statistics, coherence can have two meanings.*When dealing with personal probability assessments, or supposed probabilities derived in nonstandard ways, it is a property of self-consistency across a whole set of such assessments...

  • Coefficient of determination
    Coefficient of determination
    In statistics, the coefficient of determination R2 is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model...

  • Coefficient of dispersion
    Coefficient of dispersion
    In probability theory and statistics, the index of dispersion, dispersion index, coefficient of dispersion, or variance-to-mean ratio, like the coefficient of variation, is a normalized measure of the dispersion of a probability distribution: it is a measure used to quantify whether a set of...

  • Coefficient of variation
    Coefficient of variation
    In probability theory and statistics, the coefficient of variation is a normalized measure of dispersion of a probability distribution. It is also known as unitized risk or the variation coefficient. The absolute value of the CV is sometimes known as relative standard deviation, which is...
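
    A one-line computation of the ratio sigma/mu on illustrative data (NumPy assumed; meaningful only for ratio-scale data with a nonzero mean):

      import numpy as np

      x = np.array([12.0, 15.0, 9.0, 11.0, 13.0])   # illustrative measurements
      cv = x.std(ddof=1) / x.mean()
      print(round(cv, 3), f"{100 * cv:.1f}%")        # CV and its percentage form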

  • Cognitive pretesting
    Cognitive pretesting
    Cognitive interviewing is a field research method used primarily in pre-testing survey instruments developed in collaboration by psychologists and survey researchers. It allows survey researchers to collect verbal information regarding survey responses and is used in evaluating whether the...

  • Cohen's class distribution function
    Cohen's class distribution function
    Bilinear time–frequency distributions, or quadratic time–frequency distributions, arise in a subfield of signal analysis and signal processing called time–frequency signal processing, and in the statistical analysis of time series data...

     – a time-frequency distribution function
  • Cohen's kappa
    Cohen's kappa
    Cohen's kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative items. It is generally thought to be a more robust measure than simple percent agreement calculation since κ takes into account the agreement occurring by chance. Some...
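
    A minimal sketch of the formula kappa = (p_o - p_e) / (1 - p_e) from a two-rater contingency table (NumPy assumed; the counts are illustrative):

      import numpy as np

      def cohens_kappa(table):
          table = np.asarray(table, dtype=float)
          n = table.sum()
          p_o = np.trace(table) / n                                     # observed agreement
          p_e = (table.sum(axis=1) * table.sum(axis=0)).sum() / n ** 2  # chance agreement
          return (p_o - p_e) / (1.0 - p_e)

      # Rows: rater A's categories; columns: rater B's categories (illustrative counts).
      print(cohens_kappa([[20, 5], [10, 15]]))    # 0.4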

  • Coherence (statistics)
    Coherence (statistics)
    In probability theory and statistics, coherence can have two meanings.*When dealing with personal probability assessments, or supposed probabilities derived in nonstandard ways, it is a property of self-consistency across a whole set of such assessments...

  • Cohort (statistics)
    Cohort (statistics)
    In statistics and demography, a cohort is a group of subjects who have shared a particular experience during a particular time span. Cohorts may be tracked over extended periods in a cohort study. The cohort can be modified by censoring, i.e...

  • Cohort effect
    Cohort effect
    The term cohort effect is used in social science to describe variations in the characteristics of an area of study over time among individuals who are defined by some shared temporal experience or common life experience, such as year of birth, or year of exposure to radiation.Cohort effects are...

  • Cohort study
    Cohort study
    A cohort study or panel study is a form of longitudinal study used in medicine, social science, actuarial science, and ecology. It is an analysis of risk factors and follows a group of people who do not have the disease, and uses correlations to determine the absolute risk of subject contraction...

  • Cointegration
    Cointegration
    Cointegration is a statistical property of time series variables. Two or more time series are cointegrated if they share a common stochastic drift....

  • Collectively exhaustive events
  • Collider (epidemiology)
    Collider (epidemiology)
    In epidemiology, a collider is a variable which is the effect of two other variables. It is known as a collider because, in graphical models, the other variables lead to it in such a way that their arrow heads appear to collide on the same node...

  • Combinatorial data analysis
    Combinatorial data analysis
    Combinatorial data analysis is the study of data sets where the arrangement of objects is important. CDA can be used either to determine how well a given combinatorial construct reflects the observed data, or to search for a suitable combinatorial construct that does fit the data....

  • Combinatorial design
    Combinatorial design
    Combinatorial design theory is the part of combinatorial mathematics that deals with the existence and construction of systems of finite sets whose intersections have specified numerical properties....

  • Combinatorial meta-analysis
  • Common mode failure
  • Common-cause and special-cause
    Common-cause and special-cause
    Common- and special-causes are the two distinct origins of variation in a process, as defined in the statistical thinking and methods of Walter A. Shewhart and W. Edwards Deming...

  • Comparing means
    Comparing means
    The following tables provide guidance to the selection of the proper parametric or non-parametric statistical tests for a given data set....

  • Comparison of general and generalized linear models
  • Comparison of statistical packages
    Comparison of statistical packages
    The following tables compare general and technical information for a number of statistical analysis packages, including basic information about each product...

  • Comparisonwise error rate
  • Complementary event
    Complementary event
    In probability theory, the complement of any event A is the event [not A], i.e. the event that A does not occur. The event A and its complement [not A] are mutually exclusive and exhaustive. Generally, there is only one event B such that A and B are both mutually exclusive and...

  • Complete-linkage clustering
    Complete-linkage clustering
    In cluster analysis, complete linkage or farthest neighbour is a method of calculating distances between clusters in agglomerative hierarchical clustering...

  • Complete spatial randomness
    Complete spatial randomness
    Complete spatial randomness describes a point process whereby point events occur within a given study area in a completely random fashion. Such a process is often modeled using only one parameter, i.e. the density of points ρ within the defined area...

  • Completely randomized design
    Completely randomized design
    In the design of experiments, completely randomized designs are for studying the effects of one primary factor without the need to take other nuisance variables into account. This article describes completely randomized designs that have one primary factor. The experiment compares the values of a...

  • Completeness (statistics)
    Completeness (statistics)
    In statistics, completeness is a property of a statistic in relation to a model for a set of observed data. In essence, it is a condition which ensures that the parameters of the probability distribution representing the model can all be estimated on the basis of the statistic: it ensures that the...

  • Compositional data
    Compositional data
    In statistics, compositional data are quantitative descriptions of the parts of some whole, conveying exclusively relative information. This definition, given by John Aitchison, has several consequences:...

  • Composite bar chart
    Composite bar chart
    Composite bar charts are bar charts which always total 100, but each element is shown as a percentage of the bar allowing different sample sizes to be more easily compared....

  • Compound Poisson distribution
    Compound Poisson distribution
    In probability theory, a compound Poisson distribution is the probability distribution of the sum of a "Poisson-distributed number" of independent identically-distributed random variables...

  • Compound Poisson process
    Compound Poisson process
    A compound Poisson process is a continuous-time stochastic process with jumps. The jumps arrive randomly according to a Poisson process and the size of the jumps is also random, with a specified probability distribution...

  • Compound probability distribution
    Compound probability distribution
    In probability theory, a compound probability distribution is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution F with an unknown parameter θ that is distributed according to some other distribution G, and then...

  • Computational formula for the variance
  • Computational learning theory
    Computational learning theory
    In theoretical computer science, computational learning theory is a mathematical field related to the analysis of machine learning algorithms.-Overview:Theoretical results in machine learning mainly deal with a type of...

  • Computational statistics
    Computational statistics
    Computational statistics, or statistical computing, is the interface between statistics and computer science. It is the area of computational science specific to the mathematical science of statistics....

  • Computer experiment
    Computer experiment
    In the scientific context, a computer experiment refers to mathematical modeling using computer simulation. It has become common to call such experiments in silico...

  • Concordance correlation coefficient
    Concordance correlation coefficient
    In statistics, the concordance correlation coefficient measures the agreement between two variables, e.g., to evaluate reproducibility or for inter-rater reliability....

  • Concordant pair
  • Concrete illustration of the central limit theorem
  • Concurrent validity
    Concurrent validity
    Concurrent validity is a parameter used in sociology, psychology, and other psychometric or behavioral sciences. Concurrent validity is demonstrated where a test correlates well with a measure that has previously been validated. The two measures may be for the same construct, or for different, but...

  • Conditional change model
    Conditional change model
    The conditional change model in statistics is the analytic procedure in which change scores are regressed on baseline values, together with the explanatory variables of interest. The method has some substantial advantages over the usual two-sample t-test recommended in textbooks....

  • Conditional distribution — redirects to Conditional probability distribution
  • Conditional expectation
    Conditional expectation
    In probability theory, a conditional expectation is the expected value of a real random variable with respect to a conditional probability distribution....

  • Conditional independence
    Conditional independence
    In probability theory, two events R and B are conditionally independent given a third event Y precisely if the occurrence or non-occurrence of R and the occurrence or non-occurrence of B are independent events in their conditional probability distribution given Y...

  • Conditional probability
    Conditional probability
    In probability theory, the "conditional probability of A given B" is the probability of A if B is known to occur. It is commonly notated P(A|B), and sometimes P_B(A). P(A|B) can be visualised as the probability of event A when the sample space is restricted to event B...

  • Conditional probability distribution
  • Conditional random field
    Conditional random field
    A conditional random field is a statistical modelling method often applied in pattern recognition.More specifically it is a type of discriminative undirected probabilistic graphical model. It is used to encode known relationships between observations and construct consistent interpretations...

  • Conditional variance
    Conditional variance
    In probability theory and statistics, a conditional variance is the variance of a conditional probability distribution. Particularly in econometrics, the conditional variance is also known as the scedastic function or skedastic function...

  • Conditionality principle
    Conditionality principle
    The conditionality principle is a Fisherian principle of statistical inference that Allan Birnbaum formally defined and studied in his 1962 JASA article. Together with the sufficiency principle, Birnbaum's version of the principle implies the famous likelihood principle...

  • Confidence band
    Confidence band
    A confidence band is used in statistical analysis to represent the uncertainty in an estimate of a curve or function based on limited or noisy data. Confidence bands are often used as part of the graphical presentation of results in a statistical analysis...

  • Confidence distribution
    Confidence distribution
    In statistics, the concept of a confidence distribution has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest...

  • Confidence interval
    Confidence interval
    In statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval, in principle different from sample to sample, that frequently includes the parameter of interest, if the...

  • Confidence region
    Confidence region
    In statistics, a confidence region is a multi-dimensional generalization of a confidence interval. It is a set of points in an n-dimensional space, often represented as an ellipsoid around a point which is an estimated solution to a problem, although other shapes can occur.The confidence region is...

  • Configural frequency analysis
    Configural frequency analysis
    Configural frequency analysis is a method of exploratory data analysis. The goal of a configural frequency analysis is to detect patterns in the data that occur significantly more or significantly less often than expected by chance...

  • Confirmation bias
    Confirmation bias
    Confirmation bias is a tendency for people to favor information that confirms their preconceptions or hypotheses regardless of whether the information is true. David Perkins coined the term "myside bias", referring to a preference for "my" side of an issue...

  • Confirmatory factor analysis
    Confirmatory factor analysis
    In statistics, confirmatory factor analysis is a special form of factor analysis. It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct. In contrast to exploratory factor analysis, where all loadings are free to vary,...

  • Confounding
    Confounding
    In statistics, a confounding variable is an extraneous variable in a statistical model that correlates with both the dependent variable and the independent variable...

  • Confounding factor
  • Confusion of the inverse
    Confusion of the inverse
    Confusion of the inverse, also called the conditional probability fallacy, is a logical fallacy in which a conditional probability is equated with its inverse: that is, given two events A and B, the probability P(A|B) is assumed to be approximately equal to P(B|A). In one study, physicians...

  • Conjoint analysis
    Conjoint analysis
    Conjoint analysis, also called multi-attribute compositional models or stated preference analysis, is a statistical technique that originated in mathematical psychology. Today it is used in many of the social sciences and applied sciences including marketing, product management, and operations...

    • Conjoint analysis (in healthcare)
      Conjoint analysis (in healthcare)
      Pharmaceutical manufacturers need increasingly detailed market information they can rely on to make the right decisions and to identify the most promising market opportunities. They can obtain great benefits from understanding physicians’ prescription...

    • Conjoint analysis (in marketing)
      Conjoint analysis (in marketing)
      Conjoint analysis is a statistical technique used in market research to determine how people value different features that make up an individual product or service....

  • Conjugate prior
    Conjugate prior
    In Bayesian probability theory, if the posterior distribution p(θ|x) is in the same family as the prior probability distribution p(θ), the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood...
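
    The standard Beta–binomial pairing makes the idea concrete: a Beta(a, b) prior on a binomial success probability gives a Beta posterior in closed form (the hyperparameters and data below are illustrative):

      a, b = 2.0, 2.0              # illustrative prior hyperparameters
      successes, failures = 7, 3   # illustrative observed binomial data

      a_post, b_post = a + successes, b + failures
      print(f"posterior: Beta({a_post}, {b_post}), "
            f"mean = {a_post / (a_post + b_post):.3f}")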

  • Consensus-based assessment
  • Consensus forecast
  • Consistency (statistics)
    Consistency (statistics)
    In statistics, consistency of procedures such as confidence intervals or hypothesis tests involves their behaviour as the number of items in the data-set to which they are applied increases indefinitely...

     (disambiguation)
  • Consistent estimator
    Consistent estimator
    In statistics, a sequence of estimators for parameter θ0 is said to be consistent if this sequence converges in probability to θ0...

  • Constant elasticity of substitution
    Constant Elasticity of Substitution
    In economics, constant elasticity of substitution is a property of some production functions and utility functions. More precisely, it refers to a particular type of aggregator function which combines two or more types of consumption, or two or more types of productive inputs into an aggregate...

  • Constant false alarm rate
    Constant false alarm rate
    Constant false alarm rate detection refers to a common form of adaptive algorithm used in radar systems to detect target returns against a background of noise, clutter and interference.Other detection algorithms are not adaptive...

  • Constraint (information theory)
    Constraint (information theory)
    Constraint in information theory refers to the degree of statistical dependence between or among variables.Garner provides a thorough discussion of various forms of constraint with application to pattern recognition and psychology....

  • Consumption distribution
    Consumption distribution
    In economics, the consumption distribution is an alternative to the income distribution for judging economic inequality, comparing levels of consumption rather than income or wealth....

  • Contact process (mathematics)
  • Content validity
    Content validity
    In psychometrics, content validity refers to the extent to which a measure represents all facets of a given social construct. For example, a depression scale may lack content validity if it only assesses the affective dimension of depression but fails to take into account the behavioral dimension...

  • Contiguity (probability theory)
    Contiguity (probability theory)
    In probability theory, two sequences of probability measures are said to be contiguous if asymptotically they share the same support. Thus the notion of contiguity extends the concept of absolute continuity to the sequences of measures....

  • Contingency table
    Contingency table
    In statistics, a contingency table is a type of table in a matrix format that displays the frequency distribution of the variables...

  • Continuity correction
    Continuity correction
    In probability theory, if a random variable X has a binomial distribution with parameters n and p, i.e., X is distributed as the number of "successes" in n independent Bernoulli trials with probability p of success on each trial, then...
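
    A sketch comparing the exact binomial tail with the normal approximation, with and without the usual 0.5 correction (standard library only; the values of n, p and k are illustrative):

      from math import comb, erf, sqrt

      n, p, k = 40, 0.3, 10                # illustrative parameters
      exact = sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))
      mu, sigma = n * p, sqrt(n * p * (1 - p))
      Phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF

      print(exact)                         # exact P(X <= k)
      print(Phi((k + 0.5 - mu) / sigma))   # with continuity correction
      print(Phi((k - mu) / sigma))         # without correction, noticeably worse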

  • Continuous distribution — redirects to Continuous probability distribution
  • Continuous mapping theorem
    Continuous mapping theorem
    In probability theory, the continuous mapping theorem states that continuous functions are limit-preserving even if their arguments are sequences of random variables. A continuous function, in Heine’s definition, is such a function that maps convergent sequences into convergent sequences: if x_n → x...

  • Continuous probability distribution
  • Continuous stochastic process
    Continuous stochastic process
    In probability theory, a continuous stochastic process is a type of stochastic process that may be said to be "continuous" as a function of its "time" or index parameter. Continuity is a nice property for a process to have, since it implies that the process is well-behaved in some sense, and,...

  • Continuous-time Markov process
  • Continuous-time stochastic process
    Continuous-time stochastic process
    In probability theory and statistics, a continuous-time stochastic process, or a continuous-space-time stochastic process is a stochastic process for which the index variable takes a continuous set of values, as contrasted with a discrete-time process for which the index variable takes only...

  • Contrast (statistics)
    Contrast (statistics)
    In statistics, particularly analysis of variance, a contrast is a linear combination of two or more factor level means whose coefficients add up to zero. A simple contrast is the difference between two means...

  • Control chart
    Control chart
    Control charts, also known as Shewhart charts or process-behaviour charts, in statistical process control are tools used to determine whether or not a manufacturing or business process is in a state of statistical control....

  • Control event rate
    Control event rate
    In epidemiology and biostatistics, the control event rate is a measure of how often a particular statistical event occurs within the scientific control group of an experiment....

  • Control limits
    Control limits
    Control limits, also known as natural process limits, are horizontal lines drawn on a statistical process control chart, usually at a distance of ±3 standard deviations of the plotted statistic from the statistic's mean....

  • Control variate
    Control variate
    The control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity....
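
    A small sketch of the technique for estimating E[exp(U)] with U ~ Uniform(0, 1), using U itself (known mean 1/2) as the control variate (NumPy assumed; sample size illustrative):

      import numpy as np

      rng = np.random.default_rng(4)
      u = rng.uniform(size=100_000)              # illustrative sample size
      f = np.exp(u)                              # plain Monte Carlo samples

      c = -np.cov(f, u)[0, 1] / u.var()          # estimated optimal coefficient
      f_cv = f + c * (u - 0.5)                   # adjusted samples, same expectation

      print(f.mean(), f.var())                   # naive estimator
      print(f_cv.mean(), f_cv.var())             # same target (e - 1), smaller variance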

  • Controlling for a variable
    Controlling for a variable
    Controlling for a variable refers to the deliberate varying of the experimental conditions in order to see the impact of a specific variable when predicting the outcome variable. Controlling tends to reduce the experimental error...

  • Convergence of measures
    Convergence of measures
    In mathematics, more specifically measure theory, there are various notions of the convergence of measures. Three of the most common notions of convergence are described below....

  • Convergence of random variables
    Convergence of random variables
    In probability theory, there exist several different notions of convergence of random variables. The convergence of sequences of random variables to some limit random variable is an important concept in probability theory, and its applications to statistics and stochastic processes...

  • Convex hull
    Convex hull
    In mathematics, the convex hull or convex envelope for a set of points X in a real vector space V is the minimal convex set containing X....

  • Convolution of probability distributions
    Convolution of probability distributions
    The convolution of probability distributions arises in probability theory and statistics as the operation in terms of probability distributions that corresponds to the addition of independent random variables and, by extension, to forming linear combinations of random variables...

  • Convolution random number generator
  • Conway–Maxwell–Poisson distribution
  • Cook's distance
    Cook's distance
    In statistics, Cook's distance is a commonly used estimate of the influence of a data point when doing least squares regression analysis. In a practical ordinary least squares analysis, Cook's distance can be used in several ways: to indicate data points that are particularly worth checking for...

  • Cophenetic correlation
    Cophenetic correlation
    In statistics, and especially in biostatistics, cophenetic correlation is a measure of how faithfully a dendrogram preserves the pairwise distances between the original unmodeled data points...

  • Copula (statistics)
    Copula (statistics)
    In probability theory and statistics, a copula can be used to describe the dependence between random variables. Copulas derive their name from linguistics....

  • Correct sampling
    Correct sampling
    During sampling of particulate materials, correct sampling is defined in Gy's sampling theory as a sampling scenario in which all particles in a population have the same probability of ending up in the sample....

  • Correction for attenuation
    Correction for attenuation
    Correction for attenuation is a statistical procedure, due to Spearman, to "rid a correlation coefficient from the weakening effect of measurement error", a phenomenon also known as regression dilution. In measurement and statistics, it is also called disattenuation...

  • Correlate summation analysis
    Correlate summation analysis
    Correlate summation analysis is a data mining method. It is designed to find the variables that are most covariant with all of the other variables being studied, relative to clustering. Aggregate correlate summation is the product of the totaled negative logarithm of the p-values for all of the...

  • Correlation
    Correlation
    In statistics, dependence refers to any statistical relationship between two random variables or two sets of data. Correlation refers to any of a broad class of statistical relationships involving dependence....

  • Correlation and dependence
  • Correlation does not imply causation
    Correlation does not imply causation
    "Correlation does not imply causation" is a phrase used in science and statistics to emphasize that correlation between two variables does not automatically imply that one causes the other. The opposite assumption, that correlation proves causation (related to "ignoring a common cause" and questionable cause), is a...

  • Correlation clustering
    Correlation clustering
    In machine learning, correlation clustering or cluster editing operates in a scenario where the relationships between the objects are known instead of the actual representations of the objects...

  • Correlation function
    Correlation function
    A correlation function is the correlation between random variables at two different points in space or time, usually as a function of the spatial or temporal distance between the points...

    • Correlation function (astronomy)
      Correlation function (astronomy)
      In astronomy, a correlation function describes the distribution of galaxies in the universe. By default, correlation function refers to the two-point autocorrelation function. For a given distance, the two-point autocorrelation function is a function of one variable which describes the...

    • Correlation function (quantum field theory)
      Correlation function (quantum field theory)
      In quantum field theory, the matrix element computed by inserting a product of operators between two states, usually the vacuum states, is called a correlation function....

    • Correlation function (statistical mechanics)
      Correlation function (statistical mechanics)
      In statistical mechanics, the correlation function is a measure of the order in a system, as characterized by a mathematical correlation function, and describes how microscopic variables at different positions are correlated....

  • Correlation implies causation
    Correlation implies causation
    "Correlation does not imply causation" is a phrase used in science and statistics to emphasize that correlation between two variables does not automatically imply that one causes the other. The opposite assumption, that correlation proves causation (related to "ignoring a common cause" and questionable cause), is a...

  • Correlation inequality
    Correlation inequality
    In probability and statistics, a correlation inequality is one of a number of inequalities satisfied by the correlation functions of a model. Such inequalities are of particular use in statistical mechanics and in percolation theory....

  • Correlation ratio
    Correlation ratio
    In statistics, the correlation ratio is a measure of the relationship between the statistical dispersion within individual categories and the dispersion across the whole population or sample. The measure is defined as the ratio of two standard deviations representing these types of variation...

  • Correlogram
    Correlogram
    In the analysis of data, a correlogram is an image of correlation statistics. For example, in time series analysis, a correlogram, also known as an autocorrelation plot, is a plot of the sample autocorrelations r_h versus h....

  • Correspondence analysis
    Correspondence analysis
    Correspondence analysis is a multivariate statistical technique proposed by Hirschfeld and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical rather than continuous data...

  • Cosmic variance
    Cosmic variance
    Cosmic variance is the statistical uncertainty inherent in observations of the universe at extreme distances. It is based on the idea that it is only possible to observe part of the universe at one particular time, so it is difficult to make statistical statements about cosmology on the scale of...

  • Cost-of-living index
    Cost-of-living index
    Cost of living is the cost of maintaining a certain standard of living. Changes in the cost of living over time are often operationalized in a cost of living index. Cost of living calculations are also used to compare the cost of maintaining a certain standard of living in different geographic areas...

  • Count data
  • Counternull
    Counternull
    In statistics, and especially in the statistical analysis of psychological data, the counternull is a statistic used to aid the understanding and presentation of research results...

  • Counting process
  • Covariance
    Covariance
    In probability theory and statistics, covariance is a measure of how much two variables change together. Variance is a special case of the covariance when the two variables are identical...
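
    For reference, the standard definition (not quoted from this entry) can be written in LaTeX as:

      \operatorname{Cov}(X, Y) = \mathbb{E}\!\left[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])\right]
                               = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y],
      \qquad
      \operatorname{Var}(X) = \operatorname{Cov}(X, X).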

  • Covariance and correlation
    Covariance and correlation
    In probability theory and statistics, the mathematical descriptions of covariance and correlation are very similar. Both describe the degree of similarity between two random variables or sets of random variables....

  • Covariance intersection
    Covariance intersection
    Covariance intersection is an algorithm for combining two or more estimates of state variables in a Kalman filter when the correlation between them is unknown...

  • Covariance matrix
    Covariance matrix
    In probability theory and statistics, a covariance matrix is a matrix whose element in the i, j position is the covariance between the i th and j th elements of a random vector...

  • Covariance function
    Covariance function
    In probability theory and statistics, covariance is a measure of how much two variables change together and the covariance function describes the variance of a random variable process or field...

  • Covariate
    Covariate
    In statistics, a covariate is a variable that is possibly predictive of the outcome under study. A covariate may be of direct interest or it may be a confounding or interacting variable....

  • Cover's theorem
    Cover's Theorem
    Cover's Theorem is a statement in computational learning theory and is one of the primary theoretical motivations for the use of non-linear kernel methods in machine learning applications...

  • Coverage probability
    Coverage probability
    In statistics, the coverage probability of a confidence interval is the proportion of the time that the interval contains the true value of interest. For example, suppose our interest is in the mean number of months that people with a particular type of cancer remain in remission following...
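
    A minimal simulation sketch (illustrative only; the sample size, distribution and seed are arbitrary assumptions) that estimates the coverage probability of the usual normal-theory 95% interval for a mean:

      import numpy as np

      rng = np.random.default_rng(1)
      n, trials, true_mean = 30, 10_000, 5.0
      z = 1.96  # approximate 97.5th percentile of the standard normal
      hits = 0
      for _ in range(trials):
          sample = rng.normal(loc=true_mean, scale=2.0, size=n)
          half_width = z * sample.std(ddof=1) / np.sqrt(n)
          if abs(sample.mean() - true_mean) <= half_width:
              hits += 1
      print(f"estimated coverage: {hits / trials:.3f}")  # close to, though slightly below, 0.95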

  • Cox process
    Cox process
    A Cox process , also known as a doubly stochastic Poisson process or mixed Poisson process, is a stochastic process which is a generalization of a Poisson process...

  • Cox's theorem
    Cox's theorem
    Cox's theorem, named after the physicist Richard Threlkeld Cox, is a derivation of the laws of probability theory from a certain set of postulates. This derivation justifies the so-called "logical" interpretation of probability. As the laws of probability derived by Cox's theorem are applicable to...

  • Cox–Ingersoll–Ross model
  • Cramér–Rao bound
  • Cramér–von Mises criterion
  • Cramér's theorem
    Cramér's theorem
    In mathematical statistics, Cramér's theorem is one of several theorems of Harald Cramér, a Swedish statistician and probabilist...

  • Cramér's V
    Cramér's V
    In statistics, Cramér's V is a popular measure of association between two nominal variables, giving a value between 0 and +1...
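
    A small illustrative sketch (assuming SciPy is available; the contingency table is hypothetical) of computing Cramér's V from a table of counts via the chi-squared statistic:

      import numpy as np
      from scipy.stats import chi2_contingency

      def cramers_v(table):
          """Cramér's V for an r x c contingency table of counts."""
          table = np.asarray(table, dtype=float)
          chi2 = chi2_contingency(table)[0]
          n = table.sum()
          r, c = table.shape
          return np.sqrt(chi2 / (n * (min(r, c) - 1)))

      # Hypothetical 2 x 3 table of counts
      print(round(cramers_v([[10, 20, 30], [20, 20, 10]]), 3))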

  • Craps principle
    Craps principle
    In probability theory, the craps principle is a theorem about event probabilities under repeated iid trials. Let E_1 and E_2 denote two mutually exclusive events which might occur on a given trial...
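
    In symbols (a standard statement of the result, not quoted from this entry): if E_1 and E_2 are mutually exclusive and the trials are i.i.d., the probability that E_1 occurs before E_2 is

      \Pr(E_1 \text{ before } E_2) \;=\; \frac{\Pr(E_1)}{\Pr(E_1) + \Pr(E_2)}.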

  • Credible interval
    Credible interval
    In Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...

  • Cricket statistics
    Cricket statistics
    Cricket is a sport that generates a large number of statistics.Statistics are recorded for each player during a match, and aggregated over a career. At the professional level, statistics for Test cricket, one-day internationals, and first-class cricket are recorded separately...

  • Crime statistics
    Crime statistics
    Crime statistics attempt to provide statistical measures of the crime in societies. Given that crime is usually secretive by nature, measurements of it are likely to be inaccurate....

  • Critical region — redirects to Statistical hypothesis testing
    Statistical hypothesis testing
    A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study . In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold...

  • Cromwell's rule
    Cromwell's rule
    Cromwell's rule, named by statistician Dennis Lindley, states that one should avoid using prior probabilities of 0 or 1, except when applied to statements that are logically true or false...

  • Cronbach's α
    Cronbach's alpha
    Cronbach's α is a coefficient of reliability. It is commonly used as a measure of the internal consistency or reliability of a psychometric test score for a sample of examinees. It was first named alpha by Lee Cronbach in 1951, as he had intended to continue with further coefficients...
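
    An illustrative sketch (not from the source entry; the score matrix is hypothetical) of the usual computation of Cronbach's α from a respondents-by-items score matrix:

      import numpy as np

      def cronbach_alpha(scores):
          """Cronbach's alpha; scores has shape (n_respondents, k_items)."""
          scores = np.asarray(scores, dtype=float)
          k = scores.shape[1]
          item_variances = scores.var(axis=0, ddof=1)
          total_variance = scores.sum(axis=1).var(ddof=1)
          return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

      # Hypothetical scores for 5 respondents on 3 items
      scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [3, 3, 4], [5, 4, 5]]
      print(round(cronbach_alpha(scores), 3))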

  • Cross-correlation
    Cross-correlation
    In signal processing, cross-correlation is a measure of similarity of two waveforms as a function of a time-lag applied to one of them. This is also known as a sliding dot product or sliding inner-product. It is commonly used for searching a long-duration signal for a shorter, known feature...

  • Cross-covariance
  • Cross-entropy method
    Cross-entropy method
    The cross-entropy method, attributed to Reuven Rubinstein, is a general Monte Carlo approach to combinatorial and continuous multi-extremal optimization and importance sampling. The method originated from the field of rare event simulation, where...

  • Cross-sectional data
    Cross-sectional data
    Cross-sectional data or cross section in statistics and econometrics is a type of one-dimensional data set. Cross-sectional data refers to data collected by observing many subjects at the same point of time, or without regard to differences in time...

  • Cross-sectional regression
    Cross-sectional regression
    A Cross-sectional regression is a type of regression model in which the explained and explanatory variables are associated with one period or point in time...

  • Cross-sectional study
    Cross-sectional study
    Cross-sectional studies form a class of research methods that involve observation of all of a population, or a representative subset, at one specific point in time...

  • Cross-spectrum
    Cross-spectrum
    In time series analysis, the cross-spectrum is used as part of a frequency domain analysis of the cross correlation or cross covariance between two time series...

  • Cross tabulation
    Cross tabulation
    Cross tabulation is the process of creating a contingency table from the multivariate frequency distribution of statistical variables. Heavily used in survey research, cross tabulations can be produced by a range of statistical packages, including some that are specialised for the task. Survey...

  • Cross-validation (statistics)
  • Crystal Ball function
    Crystal Ball function
    The Crystal Ball function, named after the Crystal Ball Collaboration , is a probability density function commonly used to model various lossy processes in high-energy physics. It consists of a Gaussian core portion and a power-law low-end tail, below a certain threshold...

     - a probability distribution
  • Cumulant
    Cumulant
    In probability theory and statistics, the cumulants κ_n of a probability distribution are a set of quantities that provide an alternative to the moments of the distribution. The moments determine the cumulants in the sense that any two probability distributions whose moments are identical will have...

  • Cumulant generating function — redirects to cumulant
    Cumulant
    In probability theory and statistics, the cumulants κ_n of a probability distribution are a set of quantities that provide an alternative to the moments of the distribution. The moments determine the cumulants in the sense that any two probability distributions whose moments are identical will have...

  • Cumulative distribution function
    Cumulative distribution function
    In probability theory and statistics, the cumulative distribution function , or just distribution function, describes the probability that a real-valued random variable X with a given probability distribution will be found at a value less than or equal to x. Intuitively, it is the "area so far"...
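
    In standard notation (not quoted from this entry), the cumulative distribution function of a real-valued random variable X is

      F_X(x) = \Pr(X \le x), \qquad x \in \mathbb{R},

    a non-decreasing, right-continuous function with limits 0 and 1 as x tends to minus and plus infinity respectively.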

  • Cumulative frequency analysis
    Cumulative frequency analysis
    Cumulative frequency analysis is the application of estimation theory to exceedance probability. The complement, the non-exceedance probability, concerns the frequency of occurrence of values of a phenomenon staying below a reference value. The phenomenon may be time or space dependent...

  • Cumulative incidence
    Cumulative incidence
    Cumulative incidence or incidence proportion is a measure of frequency, as in epidemiology, where it is a measure of disease frequency during a period of time...

  • Cunningham function
    Cunningham function
    In statistics, the Cunningham function or Pearson–Cunningham function ω_{m,n} is a generalisation of a special function introduced by and studied in the form here by...

  • CURE data clustering algorithm
    CURE data clustering algorithm
    CURE is an efficient data clustering algorithm for large databases that is more robust to outliers and identifies clusters having non-spherical shapes and wide variances in size...

  • Curve fitting
    Curve fitting
    Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function...

  • CUSUM
    CUSUM
    In statistical quality control, the CUSUM is a sequential analysis technique due to E. S. Page of the University of Cambridge. It is typically used for monitoring change detection...
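
    A minimal sketch of one common tabular CUSUM formulation (a simplification, not necessarily Page's original presentation; the in-control mean, reference value k and decision limit h are arbitrary choices):

      import numpy as np

      def tabular_cusum(x, mu0, k, h):
          """Indices where an upper or lower one-sided CUSUM exceeds the decision limit h."""
          c_plus = c_minus = 0.0
          alarms = []
          for i, xi in enumerate(x):
              c_plus = max(0.0, c_plus + xi - (mu0 + k))    # accumulates upward deviations beyond k
              c_minus = max(0.0, c_minus + (mu0 - k) - xi)  # accumulates downward deviations beyond k
              if c_plus > h or c_minus > h:
                  alarms.append(i)
          return alarms

      rng = np.random.default_rng(2)
      data = np.concatenate([rng.normal(0, 1, 50), rng.normal(1.5, 1, 50)])  # mean shifts upward at i = 50
      print(tabular_cusum(data, mu0=0.0, k=0.5, h=5.0))  # alarms shortly after the shift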

  • Cuzick–Edwards test
  • Cyclostationary process

D

  • d'
    D'
    The sensitivity index or d' is a statistic used in signal detection theory. It provides the separation between the means of the signal and the noise distributions, in units of the standard deviation of the noise distribution....

  • d-separation
  • D'Agostino's K-squared test
    D'Agostino's K-squared test
    In statistics, D’Agostino’s K² test is a goodness-of-fit measure of departure from normality; that is, the test aims to establish whether or not the given sample comes from a normally distributed population...

  • Dagum distribution
  • DAP
    DAP (software)
    Dap is a statistics and graphics program, that performs data management, analysis, and graphical visualization tasks which are commonly required in statistical consulting practice....

     — open source software
  • Data analysis
    Data analysis
    Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making...

  • Data assimilation
    Data assimilation
    Applications of data assimilation arise in many fields of geosciences, perhaps most importantly in weather forecasting and hydrology. Data assimilation proceeds by analysis cycles...

  • Data binning
    Data binning
    Data binning is a data pre-processing technique used to reduce the effects of minor observation errors. The original data values which fall in a given small interval, a bin, are replaced by a value representative of that interval, often the central value...
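
    A small illustrative sketch (values and bin count are arbitrary) that replaces each value by the centre of the equal-width bin it falls into, as the entry describes:

      import numpy as np

      def bin_to_centers(values, n_bins=5):
          """Replace each value by the centre of the equal-width bin containing it."""
          values = np.asarray(values, dtype=float)
          edges = np.linspace(values.min(), values.max(), n_bins + 1)
          centers = (edges[:-1] + edges[1:]) / 2
          idx = np.clip(np.digitize(values, edges[1:-1]), 0, n_bins - 1)
          return centers[idx]

      print(bin_to_centers([1.0, 1.2, 3.7, 4.9, 5.0], n_bins=2))  # [2. 2. 4. 4. 4.]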

  • Data classification (business intelligence)
    Data classification (Business Intelligence)
    In business intelligence, data classification has close ties to data clustering, but where data clustering is descriptive, data classification is predictive. In essence data classification consists of using variables with known values to predict the unknown or future values of other variables. It...

  • Data cleansing
    Data cleansing
    Data cleansing, data cleaning, or data scrubbing is the process of detecting and correcting corrupt or inaccurate records from a record set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, irrelevant, etc...

  • Data clustering
    Data clustering
    Cluster analysis or clustering is the task of assigning a set of objects into groups so that the objects in the same cluster are more similar to each other than to those in other clusters....

  • Data collection
    Data collection
    Data collection is a term used to describe a process of preparing and collecting data, for example, as part of a process improvement or similar project. The purpose of data collection is to obtain information to keep on record, to make decisions about important issues, to pass information on to...

  • Data Desk
    Data Desk
    Data Desk is a software program for visual data analysis, visual data exploration, and statistics. It carries out exploratory data analysis and standard statistical analyses by means of dynamically linked graphic data displays that update any change simultaneously. Data Desk was developed...

     – software
  • Data dredging
    Data dredging
    Data dredging is the inappropriate use of data mining to uncover misleading relationships in data. Data-snooping bias is a form of statistical bias that arises from this misuse of statistics...

  • Data generating process (disambiguation)
  • Data mining
    Data mining
    Data mining , a relatively young and interdisciplinary field of computer science is the process of discovering new patterns from large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics and database systems...

  • Data reduction
    Data reduction
    Data reduction is the transformation of numerical or alphabetical digital information, derived empirically or experimentally, into a corrected, ordered, and simplified form...

  • Data point
    Data point
    In statistics, a data point is a set of measurements on a single member of a statistical population, or a subset of those measurements for a given individual...

  • Data quality assurance
    Data quality assurance
    Data quality assurance is the process of profiling the data to discover inconsistencies and other anomalies in the data, and performing data cleansing activities...

  • Data set
    Data set
    A data set is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to a given member of the data set in question, listing its values for each of the variables, such as the height and weight of an object or values of random numbers. Each...

  • Data-snooping bias
  • Data transformation (statistics)
    Data transformation (statistics)
    In statistics, data transformation refers to the application of a deterministic mathematical function to each point in a data set — that is, each data point zi is replaced with the transformed value yi = f, where f is a function...

  • Data visualization
    Data visualization
    Data visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information"....

  • DataDetective
    DataDetective
    DataDetective is a data mining platform developed by Sentient Information Systems. Since 1992, the software has been applied in organizations that need to retrieve patterns and relations from their typically large databases...

      – software
  • Dataplot
    Dataplot
    Dataplot is a public-domain software system for scientific visualization and statistical analysis. It was developed at the National Institute of Standards and Technology...

      – software
  • Davies–Bouldin index
    Davies–Bouldin index
    The Davies–Bouldin index, introduced in 1979, is a metric for evaluating clustering algorithms. It is an internal evaluation scheme, in which the validation of how well the clustering has been done is made using quantities and features inherent to the dataset...

  • Davis distribution
  • De Finetti's game
  • De Finetti's theorem
    De Finetti's theorem
    In probability theory, de Finetti's theorem explains why exchangeable observations are conditionally independent given some latent variable to which an epistemic probability distribution would then be assigned...

  • de Moivre's law
    De Moivre's law
    De Moivre's law is a survival model applied in actuarial science, named for Abraham de Moivre. It is a simple law of mortality based on a linear survival function. De Moivre's law has a single parameter ω called the ultimate age...

  • De Moivre–Laplace theorem
    De Moivre–Laplace theorem
    In probability theory, the de Moivre–Laplace theorem is a normal approximation to the binomial distribution. It is a special case of the central limit theorem...
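
    A standard statement of the local approximation (not quoted from this entry): for X ~ Binomial(n, p) and large n,

      \Pr(X = k) \;\approx\; \frac{1}{\sqrt{2\pi\, n p (1-p)}}\,
      \exp\!\left(-\frac{(k - np)^2}{2\, n p (1-p)}\right).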

  • Decision boundary
    Decision boundary
    In a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class...

  • Decision theory
    Decision theory
    Decision theory in economics, psychology, philosophy, mathematics, and statistics is concerned with identifying the values, uncertainties and other issues relevant in a given decision, its rationality, and the resulting optimal decision...

  • Decomposition of time series
  • Deep sampling
    Deep sampling
    Deep sampling is a variation of statistical sampling in which precision is sacrificed for insight. Small numbers of samples are taken, with each sample containing much information. The samples are taken approximately uniformly over the resource of interest, such as time or space...

  • Degenerate distribution
  • Degrees of freedom (statistics)
    Degrees of freedom (statistics)
    In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the...

  • Delphi method
    Delphi method
    The Delphi method is a structured communication technique, originally developed as a systematic, interactive forecasting method which relies on a panel of experts.In the standard version, the experts answer questionnaires in two or more rounds...

  • Delta method
    Delta method
    In statistics, the delta method is a method for deriving an approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator...
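
    A standard univariate statement (not quoted from this entry): if T_n is asymptotically normal and g is differentiable at θ with g'(θ) ≠ 0, then

      \sqrt{n}\,\bigl(T_n - \theta\bigr) \xrightarrow{d} N(0, \sigma^2)
      \quad\Longrightarrow\quad
      \sqrt{n}\,\bigl(g(T_n) - g(\theta)\bigr) \xrightarrow{d} N\!\bigl(0, \sigma^2\,[g'(\theta)]^2\bigr).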

  • Demand forecasting
    Demand forecasting
    Demand forecasting is the activity of estimating the quantity of a product or service that consumers will purchase. Demand forecasting involves techniques including both informal methods, such as educated guesses, and quantitative methods, such as the use of historical sales data or current data...

  • Deming regression
    Deming regression
    In statistics, Deming regression, named after W. Edwards Deming, is an errors-in-variables model which tries to find the line of best fit for a two-dimensional dataset...

  • Demographics
    Demographics
    Demographics are the most recent statistical characteristics of a population. These types of data are used widely in sociology , public policy, and marketing. Commonly examined demographics include gender, race, age, disabilities, mobility, home ownership, employment status, and even location...

  • Demography
    Demography
    Demography is the statistical study of human population. It can be a very general science that can be applied to any kind of dynamic human population, that is, one that changes over time or space...

    • Demographic statistics
      Demographic statistics
      Among the kinds of data that national leaders need are the demographic statistics of their population. Records of births, deaths, marriages, immigration and emigration, and a regular census of population provide information that is key to making sound decisions about national policy. A useful summary...

  • Dendrogram
    Dendrogram
    A dendrogram is a tree diagram frequently used to illustrate the arrangement of the clusters produced by hierarchical clustering...

  • Density estimation
    Density estimation
    In probability and statistics, density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function...

  • Dependent and independent variables
    Dependent and independent variables
    The terms "dependent variable" and "independent variable" are used in similar but subtly different ways in mathematics and statistics as part of the standard terminology in those subjects...

  • Descriptive research
    Descriptive research
    Descriptive research, also known as statistical research, describes data and characteristics about the population or phenomenon being studied. Descriptive research answers the questions who, what, where, when, why and how...

  • Descriptive statistics
    Descriptive statistics
    Descriptive statistics quantitatively describe the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics , in that descriptive statistics aim to summarize a data set, rather than use the data to learn about the population that the data are...

  • Design effect
    Design effect
    In statistics, the design effect is an adjustment used in some kinds of studies, such as cluster randomised trials, to allow for the design structure. The adjustment inflates the variance of parameter estimates, and therefore their standard errors, which is necessary to allow for correlations among...

  • Design matrix
    Design matrix
    In statistics, a design matrix is a matrix of explanatory variables, often denoted by X, that is used in certain statistical models, e.g., the general linear model....

  • Design of experiments
    Design of experiments
    In general usage, design of experiments or experimental design is the design of any information-gathering exercises where variation is present, whether under the full control of the experimenter or not. However, in statistics, these terms are usually used for controlled experiments...

    • The Design of Experiments
      The Design of Experiments
      The Design of Experiments is a 1935 book by the British statistician R.A. Fisher, which effectively founded the field of design of experiments. The book has been highly influential...

       (book by Fisher)
  • Detailed balance
    Detailed balance
    The principle of detailed balance is formulated for kinetic systems which are decomposed into elementary processes : At equilibrium, each elementary process should be equilibrated by its reverse process....

  • Detection theory
    Detection theory
    Detection theory, or signal detection theory, is a means to quantify the ability to discern between information-bearing energy patterns and random energy patterns that distract from the information...

  • Determining the number of clusters in a data set
    Determining the number of clusters in a data set
    Determining the number of clusters in a data set, a quantity often labeled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem....

  • Detrended correspondence analysis
    Detrended Correspondence Analysis
    Detrended correspondence analysis is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in large, species-rich but usually sparse data matrices that typify ecological community data. For example, Hill and Gauch analyse the data of a vegetation...

  • Detrended fluctuation analysis
    Detrended fluctuation analysis
    In stochastic processes, chaos theory and time series analysis, detrended fluctuation analysis is a method for determining the statistical self-affinity of a signal. It is useful for analysing time series that appear to be long-memory processes...

  • Deviance (statistics)
  • Deviance information criterion
    Deviance information criterion
    The deviance information criterion is a hierarchical modeling generalization of the AIC and BIC . It is particularly useful in Bayesian model selection problems where the posterior distributions of the models have been obtained by Markov chain Monte Carlo simulation...

  • Deviation (statistics)
    Deviation (statistics)
    In mathematics and statistics, deviation is a measure of difference for interval and ratio variables between the observed value and the mean. The sign of the deviation reports the direction of that difference...

  • Deviation analysis (disambiguation)
  • DFFITS — a regression diagnostic
  • Dickey–Fuller test
  • Difference in differences
    Difference in differences
    Difference in differences is a quasi-experimental technique used in econometrics that measures the effect of a treatment at a given period in time. It is often used to measure the change induced by a particular treatment or event, though may be subject to certain biases...

  • Differential entropy
    Differential entropy
    Differential entropy is a concept in information theory that extends the idea of entropy, a measure of average surprisal of a random variable, to continuous probability distributions...
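
    In standard notation (not quoted from this entry), for a random variable X with density f,

      h(X) = -\int f(x)\,\log f(x)\,dx.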

  • Diffusion process
    Diffusion process
    In probability theory, a branch of mathematics, a diffusion process is a solution to a stochastic differential equation. It is a continuous-time Markov process with continuous sample paths....

  • Diffusion-limited aggregation
    Diffusion-limited aggregation
    Diffusion-limited aggregation is the process whereby particles undergoing a random walk due to Brownian motion cluster together to form aggregates of such particles. This theory, proposed by Witten and Sander in 1981, is applicable to aggregation in any system where diffusion is the primary means...

  • Dimension reduction
  • Dilution assay
    Dilution assay
    The term dilution assay is generally used to designate a special type of bioassay in which one or more preparations are administered to experimental units at different dose levels inducing a measurable biological response. The dose levels are prepared by dilution in a diluent that is inert in...

  • Direct relationship
    Direct relationship
    In mathematics and statistics, a positive or direct relationship is a relationship between two variables in which change in one variable is associated with a change in the other variable in the same direction. For example all linear relationships with a positive slope are direct relationships...

  • Directional statistics
  • Dirichlet distribution
  • Dirichlet process
    Dirichlet process
    In probability theory, a Dirichlet process is a stochastic process that can be thought of as a probability distribution whose domain is itself a random distribution...

  • Disattenuation
  • Discrepancy function
    Discrepancy function
    A discrepancy function is a mathematical function which describes how closely a structural model conforms to observed data. Larger values of the discrepancy function indicate a poor fit of the model to data. In general, the parameter estimates for a given model are chosen so as to make the...

  • Discrete choice
    Discrete choice
    In economics, discrete choice problems involve choices between two or more discrete alternatives, such as entering or not entering the labor market, or choosing between modes of transport. Such choices contrast with standard consumption models in which the quantity of each good consumed is assumed...

  • Discrete choice analysis
  • Discrete distribution
  • Discrete phase-type distribution
    Discrete phase-type distribution
    The discrete phase-type distribution is a probability distribution that results from a system of one or more inter-related geometric distributions occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stochastic process...

  • Discrete probability distribution
  • Discrete time
    Discrete time
    Discrete time is the discontinuity of a function's time domain that results from sampling a variable at a finite interval. For example, consider a newspaper that reports the price of crude oil once every day at 6:00AM. The newspaper is described as sampling the cost at a frequency of once per 24...

  • Discretization of continuous features
    Discretization of continuous features
    In statistics and machine learning, discretization refers to the process of converting or partitioning continuous attributes, features or variables to discretized or nominal attributes/features/variables/intervals. This can be useful when creating probability mass functions – formally, in density...

  • Discriminant function analysis
    Discriminant function analysis
    Discriminant function analysis is a statistical analysis to predict a categorical dependent variable by one or more continuous or binary independent variables. It is different from an ANOVA or MANOVA, which is used to predict one or multiple continuous dependent variables by one or more...

  • Discriminative model
    Discriminative model
    Discriminative models are a class of models used in machine learning for modeling the dependence of an unobserved variable y on an observed variable x...

  • Disorder problem
    Disorder problem
    In the study of stochastic processes in mathematics, a disorder problem has been formulated by Kolmogorov. Specifically, the problem is to use ongoing observations of a stochastic process to decide whether or not to raise an alarm that the probabilistic properties of the process have changed...

  • Distance correlation
    Distance correlation
    In statistics and in probability theory, distance correlation is a measure of statistical dependence between two random variables or two random vectors of arbitrary, not necessarily equal dimension. Its important property is that this measure of dependence is zero if and only if the random...

  • Distributed lag
    Distributed lag
    In statistics and econometrics, a distributed lag model is a model for time series data in which a regression equation is used to predict current values of a dependent variable based on both the current values of an explanatory variable and the lagged values of this explanatory variable.The...

  • Divergence (statistics)
    Divergence (statistics)
    In statistics and information geometry, divergence or a contrast function is a function which establishes the “distance” of one probability distribution to the other on a statistical manifold...

  • Diversity index
    Diversity index
    A diversity index is a statistic which is intended to measure the diversity of a set consisting of various types of objects. Diversity indices can be used in many fields of study to assess the diversity of any population in which each member belongs to a unique group, type or species...

  • Divisia index
    Divisia index
    A Divisia index is a theoretical construct to create index number series for continuous-time data on prices and quantities of goods exchanged.It is designed to incorporate quantity and price changes over time from subcomponents which are measured in different units -- e.g...

  • Divisia monetary aggregates index
  • Dixon's Q test
  • Dominating decision rule
    Dominating decision rule
    In decision theory, a decision rule is said to dominate another if the performance of the former is sometimes better, and never worse, than that of the latter....

  • Donsker's theorem
    Donsker's theorem
    In probability theory, Donsker's theorem, named after M. D. Donsker, identifies a certain stochastic process as a limit of empirical processes. It is sometimes called the functional central limit theorem....

  • Doob decomposition theorem
    Doob decomposition theorem
    In the theory of discrete time stochastic processes, a part of the mathematical theory of probability, the Doob decomposition theorem gives a unique decomposition of any submartingale as the sum of a martingale and an increasing predictable process. The theorem was proved by and is named for J. L....

  • Doob martingale
    Doob martingale
    A Doob martingale is a mathematical construction of a stochastic process which approximates a given random variable and has the martingale property with respect to the given filtration...

  • Doob's martingale convergence theorems
    Doob's martingale convergence theorems
    In mathematics — specifically, in stochastic analysis — Doob's martingale convergence theorems are a collection of results on the long-time limits of supermartingales, named after the American mathematician Joseph Leo Doob....

  • Doob's martingale inequality
    Doob's martingale inequality
    In mathematics, Doob's martingale inequality is a result in the study of stochastic processes. It gives a bound on the probability that a stochastic process exceeds any given value over a given interval of time...

  • Doob–Meyer decomposition theorem
  • Doomsday argument
    Doomsday argument
    The Doomsday argument is a probabilistic argument that claims to predict the number of future members of the human species given only an estimate of the total number of humans born so far...

  • Dot plot (bioinformatics)
    Dot plot (bioinformatics)
    A dot plot is a graphical method that allows the comparison of two biological sequences and the identification of regions of close similarity between them. It is a kind of recurrence plot...

  • Dot plot (statistics)
    Dot plot (statistics)
    A dot chart or dot plot is a statistical chart consisting of data points plotted on a simple scale, typically using filled in circles. There are two common, yet very different, versions of the dot chart. The first is described by Wilkinson as a graph that has been used in hand-drawn graphs to...

  • Double counting (fallacy)
    Double counting (fallacy)
    Double counting is a fallacy in which, when counting events or occurrences in probability or in other areas, a solution counts events two or more times, resulting in an erroneous number of events or occurrences which is higher than the true result...

  • Double exponential distribution — disambiguation
  • Double mass analysis
    Double mass analysis
    Double mass analysis is a commonly used data analysis approach for investigating the behaviour of records made of hydrological or meteorological data at a number of locations. It is used to determine whether there is a need for corrections to the data to account for changes in data collection...

  • Doubly stochastic model
    Doubly stochastic model
    In statistics, a doubly stochastic model is a type of model that can arise in many contexts, but in particular in modelling time-series and stochastic processes....

  • Drift rate — redirects to Stochastic drift
    Stochastic drift
    In probability theory, stochastic drift is the change of the average value of a stochastic process. A related term is the drift rate which is the rate at which the average changes. This is in contrast to the random fluctuations about this average value...

  • Dudley's theorem
    Dudley's theorem
    In probability theory, Dudley’s theorem is a result relating the expected upper bound and regularity properties of a Gaussian process to its entropy and covariance structure. The result was proved in a landmark 1967 paper of Richard M...

  • Dummy variable (statistics)
  • Duncan's new multiple range test
    Duncan's new multiple range test
    In statistics, Duncan's new multiple range test is a multiple comparison procedure developed by David B. Duncan in 1955. Duncan's MRT belongs to the general class of multiple comparison procedures that use the studentized range statistic q_r to compare sets of means. Duncan's new multiple range test...

  • Durbin test
    Durbin test
    In the analysis of designed experiments, the Friedman test is the most common non-parametric test for complete block designs. The Durbin test is a nonparametric test for balanced incomplete designs that reduces to the Friedman test in the case of a complete block design. In a randomized...

  • Durbin–Watson statistic
  • Dutch book
    Dutch book
    In gambling a Dutch book or lock is a set of odds and bets which guarantees a profit, regardless of the outcome of the gamble. It is associated with probabilities implied by the odds not being coherent....

  • Dvoretzky–Kiefer–Wolfowitz inequality
    Dvoretzky–Kiefer–Wolfowitz inequality
    In the theory of probability and statistics, the Dvoretzky–Kiefer–Wolfowitz inequality predicts how close an empirically determined distribution function will be to the distribution function from which the empirical samples are drawn...

  • Dyadic distribution
    Dyadic distribution
    A dyadic distribution is a specific type of discrete or categorical probability distribution that is of some theoretical importance in data compression...

  • Dynamic Bayesian network
    Dynamic Bayesian network
    A dynamic Bayesian network is a Bayesian network that represents sequences of variables. These sequences are often time series or sequences of symbols. The hidden Markov model can be considered as a simple dynamic Bayesian network...

  • Dynamic factor

E

  • E-statistic
  • Earth mover's distance
    Earth Mover's Distance
    In computer science, the earth mover's distance is a measure of the distance between two probability distributions over a region D. In mathematics, this is known as the Wasserstein metric...

  • Ecological correlation
    Ecological correlation
    In statistics, an ecological correlation is a correlation between two variables that are group means, in contrast to a correlation between two variables that describe individuals. For example, one might study the correlation between physical activity and weight among sixth-grade children...

  • Ecological fallacy
    Ecological fallacy
    An ecological fallacy is a logical fallacy in the interpretation of statistical data in an ecological study, whereby inferences about the nature of specific individuals are based solely upon aggregate statistics collected for the group to which those individuals belong...

  • Ecological study
    Ecological study
    An ecological study is an epidemiological study in which the unit of analysis is a population rather than an individual. For instance, an ecological study may look at the association between smoking and lung cancer deaths in different countries...

  • Econometrics
    Econometrics
    Econometrics has been defined as "the application of mathematics and statistical methods to economic data" and described as the branch of economics "that aims to give empirical content to economic relations." More precisely, it is "the quantitative analysis of actual economic phenomena based on...

  • Econometric model
    Econometric model
    Econometric models are statistical models used in econometrics. An econometric model specifies the statistical relationship that is believed to hold between the various economic quantities pertaining to a particular economic phenomenon under study...

  • Econometric software – a list of software articles
  • Economic data
    Economic data
    Economic data or economic statistics may refer to data describing an actual economy, past or present. These are typically found in time-series form, that is, covering more than one time period, or as cross-sectional data covering a single time period...

  • Economic epidemiology
    Economic epidemiology
    Economic epidemiology is a field at the intersection of epidemiology and economics. Its premise is to incorporate incentives for healthy behavior and their attendant behavioral responses into an epidemiological context to better understand how diseases are transmitted...

  • Economic statistics
    Economic statistics
    Economic statistics is a topic in applied statistics that concerns the collection, processing, compilation, dissemination, and analysis of economic data. It is also common to call the data themselves 'economic statistics', but for this usage see economic data. The data of concern to economic ...

  • Eddy covariance
    Eddy covariance
    The eddy covariance technique is a key atmospheric flux measurement technique to measure and calculate vertical turbulent fluxes within atmospheric boundary layers...

  • Edgeworth series
    Edgeworth series
    The Gram–Charlier A series and the Edgeworth series are series that approximate a probability distribution in terms of its cumulants...

  • Effect size
    Effect size
    In statistics, an effect size is a measure of the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity...

  • Efficiency (statistics)
    Efficiency (statistics)
    In statistics, an efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors...

  • Efficient estimator
  • Ehrenfest model
    Ehrenfest model
    The Ehrenfest model of diffusion was proposed by Paul Ehrenfest to explain the second law of thermodynamics. The model considers N particles in two containers. Particles independently change container at a rate λ...

  • Eigenpoll
    Eigenpoll
    An eigenpoll is a type of statistical survey which gathers knowledge from the community. It differs from opinion polls by finding the best solution, rather than finding the most popular opinion...

  • Elastic map
    Elastic map
    Elastic maps provide a tool for nonlinear dimensionality reduction. By their construction, they are a system of elastic springs embedded in the data space. This system approximates a low-dimensional manifold...

  • Elliptical distribution
    Elliptical distribution
    In probability and statistics, an elliptical distribution is any member of a broad family of probability distributions that generalize the multivariate normal distribution and inherit some of its properties...

  • Ellsberg paradox
    Ellsberg paradox
    The Ellsberg paradox is a paradox in decision theory and experimental economics in which people's choices violate the expected utility hypothesis. An alternate viewpoint is that expected utility theory does not properly describe actual human choices...

  • Elston–Stewart algorithm
  • Empirical
    Empirical
    The word empirical denotes information gained by means of observation or experimentation. Empirical data are data produced by an experiment or observation....

  • Empirical Bayes method
    Empirical Bayes method
    Empirical Bayes methods are procedures for statistical inference in which the prior distribution is estimated from the data. This approach stands in contrast to standard Bayesian methods, for which the prior distribution is fixed before any data are observed...

  • Empirical distribution function
    Empirical distribution function
    In statistics, the empirical distribution function, or empirical cdf, is the cumulative distribution function associated with the empirical measure of the sample. This cdf is a step function that jumps up by 1/n at each of the n data points. The empirical distribution function estimates the true...
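
    A minimal sketch (the sample is hypothetical) of the step function described above, with F_n(x) equal to the number of observations less than or equal to x divided by n:

      import numpy as np

      def ecdf(sample):
          """Return a function F_n with F_n(x) = (number of observations <= x) / n."""
          xs = np.sort(np.asarray(sample, dtype=float))
          n = len(xs)
          return lambda x: np.searchsorted(xs, x, side="right") / n

      F = ecdf([3.1, 0.2, 1.7, 2.8, 0.9])
      print(F(1.0), F(2.8), F(10.0))  # 0.4 0.8 1.0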

  • Empirical measure
    Empirical measure
    In probability theory, an empirical measure is a random measure arising from a particular realization of a sequence of random variables. The precise definition is found below. Empirical measures are relevant to mathematical statistics....

  • Empirical orthogonal functions
    Empirical orthogonal functions
    In statistics and signal processing, the method of empirical orthogonal function analysis is a decomposition of a signal or data set in terms of orthogonal basis functions which are determined from the data. It is the same as performing a principal components analysis on the data, except that the...

  • Empirical probability
    Empirical probability
    Empirical probability, also known as relative frequency, or experimental probability, is the ratio of the number of "favorable" outcomes to the total number of trials, not in a sample space but in an actual sequence of experiments...

  • Empirical process
    Empirical process
    The study of empirical processes is a branch of mathematical statistics and a sub-area of probability theory. It is a generalization of the central limit theorem for empirical measures...

  • Empirical statistical laws
    Empirical statistical laws
    An empirical statistical law or a law of statistics represents a type of behaviour that has been found across a number of datasets and, indeed, across a range of types of data sets. Many of these observed regularities have been formulated and proved as statistical or probabilistic theorems, and the term...

  • Endogeneity (economics)
    Endogeneity (economics)
    In an econometric model, a parameter or variable is said to be endogenous when there is a correlation between the parameter or variable and the error term. Endogeneity can arise as a result of measurement error, autoregression with autocorrelated errors, simultaneity, omitted variables, and sample...

  • End point of clinical trials
    End point of clinical trials
    An endpoint is something which is measured in a clinical trial or study. Measuring the selected endpoints is the goal of a trial. The response rate and survival are examples of the endpoints....

  • Energy distance
    Energy distance
    Energy distance is a statistical distance between probability distributions. If X and Y are independent random vectors in Rd, with cumulative distribution functions F and G respectively, then the energy distance between the distributions F and G is defined by D²(F, G) = 2E‖X − Y‖ − E‖X − X'‖ − E‖Y − Y'‖, where X, X' are independent and identically...

  • Energy statistics
    Energy statistics
    Energy statistics refers to collecting, compiling, analyzing and disseminating data on commodities such as coal, crude oil, natural gas, electricity, or renewable energy sources , when they are used for the energy they contain...

  • Encyclopedia of Statistical Sciences
    Encyclopedia of Statistical Sciences
    The Encyclopedia of Statistical Sciences is the largest-ever encyclopaedia of statistics. It is published by John Wiley & Sons.The first edition, in nine volumes, was edited by Norman Lloyd Johnson and Samuel Kotz and appeared in 1982. The second edition, in 16 volumes, was published in 2006. ...

     (book)
  • Engineering statistics
    Engineering statistics
    Engineering statistics combines engineering and statistics. Design of experiments, for example, is a methodology for formulating scientific and engineering problems using statistical models; the protocol specifies a randomization procedure for the experiment and specifies the primary data-analysis,...

  • Engineering tolerance
  • Engset calculation
  • Ensemble forecasting
    Ensemble forecasting
    Ensemble forecasting is a numerical prediction method that is used to attempt to generate a representative sample of the possible future states of a dynamical system...

  • Ensemble Kalman filter
    Ensemble Kalman filter
    The ensemble Kalman filter is a recursive filter suitable for problems with a large number of variables, such as discretizations of partial differential equations in geophysical models...

  • Entropy (information theory)
  • Entropy estimation
    Entropy estimation
    Estimating the differential entropy of a system or process, given some observations, is useful in various science/engineering applications, such as Independent Component Analysis, image analysis, genetic analysis, speech recognition, manifold learning, and time delay estimation...

  • Entropy power inequality
    Entropy power inequality
    In mathematics, the entropy power inequality is a result in probability theory that relates to so-called "entropy power" of random variables. It shows that the entropy power of suitably well-behaved random variables is a superadditive function. The entropy power inequality was proved in 1948 by...

  • Environmental statistics
    Environmental statistics
    Environmental statistics is the application of statistical methods to environmental science. It covers procedures for dealing with questions concerning both the natural environment in its undisturbed state and the interaction of humanity with the environment...

  • Epi Info
    Epi Info
    Epi Info is public domain statistical software for epidemiology developed by Centers for Disease Control and Prevention in Atlanta, Georgia ....

     — software
  • Epidata
    Epidata
    EpiData refers to a group of applications used in combination for creating documented data structures and analysing quantitative data. The EpiData Association, which created the software, was founded in 1999 and is based in Denmark...

     — software
  • Epidemic model
    Epidemic model
    An epidemic model is a simplified means of describing the transmission of communicable disease through individuals. The outbreak and spread of disease have been questioned and studied for many years...

  • Epidemiological methods
    Epidemiological methods
    The science of epidemiology has matured significantly from the times of Hippocrates and John Snow. The techniques for gathering and analyzing epidemiological data vary depending on the type of disease being monitored but each study will have overarching similarities....

  • Epilogism
    Epilogism
    Epilogism is a style of inference invented by the ancient Empiric school of medicine. It is a theory-free method of looking at history by accumulating facts with minimal generalization and being conscious of the side effects of making causal claims. Epilogism is an inference which moves entirely...

  • Epitome (image processing)
    Epitome (image processing)
    In image processing, an epitome is a condensed digital representation of the essential statistical properties of ordered datasets, such as matrices representing images, audio signals, videos, or genetic sequences...

  • Epps effect
    Epps effect
    In econometrics and time series analysis, the Epps effect, named after T. W. Epps, is the phenomenon that the empirical correlation between the returns of two different stocks decreases as the sampling frequency of data increases. The phenomenon is caused by non-synchronous/asynchronous...

  • Equating
    Equating
    Test equating traditionally refers to the statistical process of determining comparable scores on different forms of an exam. It can be accomplished using either classical test theory or item response theory....

     – test equating
  • Equipossible
    Equipossible
    Equipossibility is a philosophical concept in possibility theory that is a precursor to the notion of equiprobability in probability theory. It is used to distinguish what can occur in a probability experiment...

  • Equiprobable
    Equiprobable
    Equiprobability is a philosophical concept in probability theory that allows one to assign equal probabilities to outcomes when they are judged to be equipossible or to be "equally likely" in some sense...

  • Erdős–Rényi model
    Erdős–Rényi model
    In graph theory, the Erdős–Rényi model, named for Paul Erdős and Alfréd Rényi, is either of two models for generating random graphs, including one that sets an edge between each pair of nodes with equal probability, independently of the other edges...
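
    An illustrative sketch (parameters and seed are arbitrary) of sampling the G(n, p) variant, in which each pair of nodes is joined independently with probability p, as a symmetric adjacency matrix:

      import numpy as np

      def gnp_adjacency(n, p, seed=None):
          """Sample an undirected G(n, p) random graph as a 0/1 adjacency matrix."""
          rng = np.random.default_rng(seed)
          upper = np.triu(rng.random((n, n)) < p, k=1)  # each pair included independently with prob. p
          return (upper | upper.T).astype(int)

      A = gnp_adjacency(6, 0.4, seed=3)
      print(A.sum() // 2, "edges")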

  • Erlang distribution
  • Ergodic theory
    Ergodic theory
    Ergodic theory is a branch of mathematics that studies dynamical systems with an invariant measure and related problems. Its initial development was motivated by problems of statistical physics....

  • Ergodicity
    Ergodicity
    In mathematics, the term ergodic is used to describe a dynamical system which, broadly speaking, has the same behavior averaged over time as averaged over space. In physics the term is used to imply that a system satisfies the ergodic hypothesis of thermodynamics. The word ergodic is...

  • Error bar
    Error bar
    Error bars are a graphical representation of the variability of data and are used on graphs to indicate the error, or uncertainty in a reported measurement. They give a general idea of how accurate a measurement is, or conversely, how far from the reported value the true value might be...

  • Error correction model
    Error correction model
    An error correction model is a dynamical system with the characteristics that the deviation of the current state from its long-run relationship will be fed into its short-run dynamics....

  • Error function
    Error function
    In mathematics, the error function is a special function of sigmoid shape which occurs in probability, statistics and partial differential equations...

  • Errors and residuals in statistics
    Errors and residuals in statistics
    In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...

  • Errors-in-variables models
    Errors-in-variables models
    In statistics and econometrics, errors-in-variables models or measurement errors models are regression models that account for measurement errors in the independent variables...

  • An Essay towards solving a Problem in the Doctrine of Chances
    An Essay towards solving a Problem in the Doctrine of Chances
    An Essay towards solving a Problem in the Doctrine of Chances is a work on the mathematical theory of probability by the Reverend Thomas Bayes, published in 1763, two years after its author's death. It included a statement of a special case of what is now called Bayes' theorem. In 18th-century...

  • Estimating equations
    Estimating equations
    In statistics, the method of estimating equations is a way of specifying how the parameters of a statistical model should be estimated. This can be thought of as a generalisation of many classical methods --- the method of moments, least squares, and maximum likelihood --- as well as some recent...

  • Estimation
    Estimation
    Estimation is the calculated approximation of a result which is usable even if input data may be incomplete or uncertain.In statistics,*estimation theory and estimator, for topics involving inferences about probability distributions...

  • Estimation theory
    Estimation theory
    Estimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the...

  • Estimation of covariance matrices
    Estimation of covariance matrices
    In statistics, sometimes the covariance matrix of a multivariate random variable is not known but has to be estimated. Estimation of covariance matrices then deals with the question of how to approximate the actual covariance matrix on the basis of a sample from the multivariate distribution...

  • Estimation of signal parameters via rotational invariance techniques
  • Estimator
    Estimator
    In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule and its result are distinguished....

  • Etemadi's inequality
    Etemadi's inequality
    In probability theory, Etemadi's inequality is a so-called "maximal inequality", an inequality that gives a bound on the probability that the partial sums of a finite collection of independent random variables exceed some specified bound...

  • Ethical problems using children in clinical trials
    Ethical problems using children in clinical trials
    In health care, a clinical trial is a comparison test of a medication or other medical treatment, versus a placebo, other medications or devices, or the standard medical treatment for a patient's condition...

  • Event (probability theory)
    Event (probability theory)
    In probability theory, an event is a set of outcomes to which a probability is assigned. Typically, when the sample space is finite, any subset of the sample space is an event...

  • Event study
    Event study
    An event study is a statistical method to assess the impact of an event on the value of a firm. For example, the announcement of a merger between two business entities can be analyzed to see whether investors believe the merger will create or destroy value...

  • Evidence under Bayes theorem
    Evidence under Bayes theorem
    Bayes' theorem provides a way of updating the probability of an event in the light of new information. In the evidence law context, for example, it could be used as a way of updating the probability that a genetic sample found at the scene of the crime came from the defendant in light of a genetic...

  • Evolutionary data mining
    Evolutionary data mining
    Evolutionary data mining, or genetic data mining is an umbrella term for any data mining using evolutionary algorithms. While it can be used for mining data from DNA sequences, it is not limited to biological contexts and can be used in any classification-based prediction scenario, which helps...

  • Ewens's sampling formula
    Ewens's sampling formula
    In population genetics, Ewens' sampling formula describes the probabilities associated with counts of how many different alleles are observed a given number of times in a sample...

  • EWMA chart
    EWMA chart
    In statistical quality control, the EWMA chart is a type of control chart used to monitor either variables or attributes-type data using the monitored business or industrial process's entire history of output...

  • Exact statistics
    Exact statistics
    Exact statistics, such as that described in exact test, is a branch of statistics that was developed to provide more accurate results pertaining to statistical testing and interval estimation by eliminating procedures based on asymptotic and approximate statistical methods...

  • Exact test
    Exact test
    In statistics, an exact test is a test where all assumptions upon which the derivation of the distribution of the test statistic is based are met, as opposed to an approximate test, in which the approximation may be made as close as desired by making the sample size big enough...

  • Examples of Markov chains
    Examples of Markov chains
    A game of snakes and ladders or any other game whose moves are determined entirely by dice is a Markov chain, indeed, an absorbing Markov chain. This is in contrast to card games such as blackjack, where the cards represent a 'memory' of the past moves. To see the...

  • Excess risk
    Excess risk
    In statistics, excess risk is a measure of the association between a specified risk factor and a specified outcome...

  • Exchange paradox
  • Exchangeable random variables
  • Expander walk sampling
    Expander walk sampling
    In the mathematical discipline of graph theory, the expander walk sampling theorem states that sampling vertices in an expander graph by doing a random walk is almost as good as sampling the vertices independently from a uniform distribution....

  • Expectation-maximization algorithm
    Expectation-maximization algorithm
    In statistics, an expectation–maximization algorithm is an iterative method for finding maximum likelihood or maximum a posteriori estimates of parameters in statistical models, where the model depends on unobserved latent variables...
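
    As a concrete illustration of the iteration described above, here is a minimal sketch (assuming NumPy) that fits a two-component one-dimensional Gaussian mixture by EM, treating the component labels as the unobserved latent variables; the function name, data and starting values are illustrative, not taken from the source.

        import numpy as np

        def em_gaussian_mixture(x, iters=50):
            # Illustrative starting values for weights, means and variances.
            w = np.array([0.5, 0.5])
            mu = np.array([x.min(), x.max()], dtype=float)
            var = np.array([x.var(), x.var()])
            for _ in range(iters):
                # E-step: responsibility of each component for each observation.
                dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
                resp = w * dens
                resp /= resp.sum(axis=1, keepdims=True)
                # M-step: re-estimate parameters from the responsibility-weighted data.
                nk = resp.sum(axis=0)
                w = nk / len(x)
                mu = (resp * x[:, None]).sum(axis=0) / nk
                var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
            return w, mu, var

        rng = np.random.default_rng(0)
        data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
        print(em_gaussian_mixture(data))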

  • Expectation propagation
    Expectation propagation
    Expectation propagation is a technique in Bayesian machine learning, developed by Thomas Minka.EP finds approximations to a probability distribution. It uses an iterative approach that leverages the factorization structure of the target distribution. It differs from other Bayesian approximation...

  • Expected utility hypothesis
    Expected utility hypothesis
    In economics, game theory, and decision theory the expected utility hypothesis is a theory of utility in which "betting preferences" of people with regard to uncertain outcomes are represented by a function of the payouts, the probabilities of occurrence, risk aversion, and the different utility...

  • Expected value
    Expected value
    In probability theory, the expected value of a random variable is the weighted average of all possible values that this random variable can take on...
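
    As a one-line illustration of the weighted average described above, using a fair six-sided die (the values and probabilities below are just an example):

        # E[X] = sum_i x_i * P(X = x_i) for a fair six-sided die
        values = [1, 2, 3, 4, 5, 6]
        probs = [1 / 6] * 6
        expected = sum(x * p for x, p in zip(values, probs))
        print(expected)  # 3.5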

  • Expected value of sample information
    Expected value of sample information
    In decision theory, the expected value of sample information is the expected increase in utility that you could obtain from gaining access to a sample of additional observations before making a decision. The additional information obtained from the sample may allow you to make a more informed,...

  • Experiment
    Experiment
    An experiment is a methodical procedure carried out with the goal of verifying, falsifying, or establishing the validity of a hypothesis. Experiments vary greatly in their goal and scale, but always rely on repeatable procedure and logical analysis of the results...

  • Experimental design diagram
    Experimental Design Diagram
    Experimental Design Diagram is a diagram used by scientists to design an experiment. This diagram helps to identify the essential components of an experiment...

  • Experimental event rate
    Experimental event rate
    In epidemiology and biostatistics, the experimental event rate is a measure of how often a particular statistical event occurs within the experimental group of an experiment...

  • Experimental research design
  • Experimental uncertainty analysis
    Experimental uncertainty analysis
    The purpose of this introductory article is to discuss the experimental uncertainty analysis of a derived quantity, based on the uncertainties in the experimentally measured quantities that are used in some form of mathematical relationship to calculate that derived quantity...

  • Experimental techniques — redirects to Experimental research design
  • Experimenter's bias
    Experimenter's bias
    In experimental science, experimenter's bias is subjective bias towards a result expected by the human experimenter. David Sackett, in a useful review of biases in clinical studies, states that biases can occur in any one of seven stages of research:...

  • Experimentwise error rate
    Experimentwise error rate
    In statistics, during multiple comparisons testing, experimentwise error rate is the probability of at least one false rejection of the null hypothesis over an entire experiment. The α that is assigned applies to all of the hypothesis tests as a whole, not individually as in the comparisonwise...

  • Explained sum of squares
    Explained sum of squares
    In statistics, the explained sum of squares is a quantity used in describing how well a model, often a regression model, represents the data being modelled...

  • Explained variation
    Explained variation
    In statistics, explained variation or explained randomness measures the proportion to which a mathematical model accounts for the variation of a given data set...

  • Explanatory variable
  • Exploratory data analysis
    Exploratory data analysis
    In statistics, exploratory data analysis is an approach to analysing data sets to summarize their main characteristics in easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis...

  • Exponential dispersion model
    Exponential dispersion model
    Exponential dispersion models are statistical models in which the probability distribution is of a special form. This class of models represents a generalisation of the exponential family of models which themselves play an important role in statistical theory because they have a special structure...

  • Exponential distribution
    Exponential distribution
    In probability theory and statistics, the exponential distribution is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e...

  • Exponential family
    Exponential family
    In probability and statistics, an exponential family is an important class of probability distributions sharing a certain form, specified below. This special form is chosen for mathematical convenience, on account of some useful algebraic properties, as well as for generality, as exponential...

  • Exponential-logarithmic distribution
  • Exponential power distribution — redirects to Generalized normal distribution
  • Exponential random numbers — redirect to subsection of Exponential distribution
    Exponential distribution
    In probability theory and statistics, the exponential distribution is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e...

  • Exponential smoothing
    Exponential smoothing
    Exponential smoothing is a technique that can be applied to time series data, either to produce smoothed data for presentation, or to make forecasts. The time series data themselves are a sequence of observations. The observed phenomenon may be an essentially random process, or it may be an...
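
    A minimal sketch of simple (single) exponential smoothing, assuming NumPy; the recursion s_t = α·x_t + (1 − α)·s_{t−1} is standard, while the series, the smoothing factor and the function name are illustrative:

        import numpy as np

        def exponential_smoothing(series, alpha=0.3):
            # s_t = alpha * x_t + (1 - alpha) * s_{t-1}, initialised at the first observation.
            smoothed = [series[0]]
            for x in series[1:]:
                smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
            return np.array(smoothed)

        y = np.array([3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0])
        s = exponential_smoothing(y)
        print(s)        # smoothed series
        print(s[-1])    # can serve as a one-step-ahead forecast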

  • Exponentiated Weibull distribution
    Exponentiated Weibull distribution
    In statistics, the exponentiated Weibull family of probability distributions was introduced by Mudholkar and Srivastava as an extension of the Weibull family obtained by adding a second shape parameter....

  • Exposure variable
  • Extended Kalman filter
    Extended Kalman filter
    In estimation theory, the extended Kalman filter is the nonlinear version of the Kalman filter which linearizes about the current mean and covariance...

  • Extended negative binomial distribution
    Extended negative binomial distribution
    In probability and statistics the extended negative binomial distribution is a discrete probability distribution extending the negative binomial distribution. It is a truncated version of the negative binomial distribution for which estimation methods have been studied. In the context of actuarial...

  • Extensions of Fisher's method
    Extensions of Fisher's method
    In statistics, extensions of Fisher's method are a group of approaches that allow approximately valid statistical inferences to be made when the assumptions required for the direct application of Fisher's method are not valid...

  • External validity
    External validity
    External validity is the validity of generalized inferences in scientific studies, usually based on experiments as experimental validity....

  • Extrapolation domain analysis
    Extrapolation domain analysis
    Extrapolation domain analysis is a methodology for identifying geographical areas that seem suitable for adoption of innovative ecosystem management practices on the basis of sites exhibiting similarity in conditions such as climatic, land use and socio-economic indicators...

  • Extreme value theory
    Extreme value theory
    Extreme value theory is a branch of statistics dealing with the extreme deviations from the median of probability distributions. The general theory sets out to assess the type of probability distributions generated by processes...

  • Extremum estimator
    Extremum estimator
    In statistics and econometrics, extremum estimators is a wide class of estimators for parametric models that are calculated through maximization of a certain objective function, which depends on the data...


F

  • F-distribution
  • F-divergence
    F-divergence
    In probability theory, an ƒ-divergence is a function D_f that measures the difference between two probability distributions P and Q...

  • F-statistics
    F-statistics
    In population genetics, F-statistics describe the level of heterozygosity in a population; more specifically the degree of a reduction in heterozygosity when compared to Hardy–Weinberg expectation...

     – population genetics
  • F-test
    F-test
    An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis.It is most often used when comparing statistical models that have been fit to a data set, in order to identify the model that best fits the population from which the data were sampled. ...

  • F-test of equality of variances
    F-test of equality of variances
    In statistics, an F-test for the null hypothesis that two normal populations have the same variance is sometimes used, although it needs to be used with caution as it can be sensitive to the assumption that the variables have this distribution....

  • F1 score
    F1 Score
    In statistics, the F1 score is a measure of a test's accuracy. It considers both the precision p and the recall r of the test to compute the score: p is the number of correct results divided by the number of all returned results and r is the number of correct results divided by the number of...
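
    The score is the harmonic mean of precision and recall, F1 = 2pr / (p + r); a minimal sketch from binary labels (the label vectors below are illustrative):

        def f1_score(y_true, y_pred):
            tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
            fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
            fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
            precision = tp / (tp + fp)
            recall = tp / (tp + fn)
            return 2 * precision * recall / (precision + recall)

        print(f1_score([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1]))  # 0.75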

  • Factor analysis
    Factor analysis
    Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved, uncorrelated variables called factors. In other words, it is possible, for example, that variations in three or four observed variables...

  • Factor regression model
  • Factor graph
    Factor graph
    In probability theory and its applications, a factor graph is a particular type of graphical model, with applications in Bayesian inference, that enables efficient computation of marginal distributions through the sum-product algorithm...

  • Factorial code
    Factorial code
    Most real world data sets consist of data vectors whose individual components are not statistically independent, that is, they are redundant in the statistical sense. Then it is desirable to create a factorial code of the data, i...

  • Factorial experiment
    Factorial experiment
    In statistics, a full factorial experiment is an experiment whose design consists of two or more factors, each with discrete possible values or "levels", and whose experimental units take on all possible combinations of these levels across all such factors. A full factorial design may also be...

  • Factorial moment
  • Factorial moment generating function
  • Failure rate
    Failure rate
    Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is often denoted by the Greek letter λ and is important in reliability engineering....

  • Fair coin
    Fair coin
    In probability theory and statistics, a sequence of independent Bernoulli trials with probability 1/2 of success on each trial is metaphorically called a fair coin. One for which the probability is not 1/2 is called a biased or unfair coin...

  • Falconer's formula
    Falconer's formula
    Falconer's formula is used in twin studies to determine the genetic heritability of a trait based on the difference between twin correlations. The formula is hb² = 2(rmz − rdz), where hb² is the broad sense heritability, rmz is the identical twin correlation, and rdz is the fraternal twin correlation...
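
    A one-line worked illustration of the formula, with hypothetical twin correlations:

        # Broad-sense heritability via Falconer's formula: h2 = 2 * (r_mz - r_dz)
        r_mz, r_dz = 0.70, 0.45   # hypothetical identical- and fraternal-twin correlations
        h2 = 2 * (r_mz - r_dz)
        print(h2)  # 0.5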

  • False discovery rate
    False discovery rate
    False discovery rate control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. In a list of rejected hypotheses, FDR controls the expected proportion of incorrectly rejected null hypotheses...

  • False negative
    Type I and type II errors
    In statistical test theory the notion of statistical error is an integral part of hypothesis testing. The test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or...

  • False positive
    Type I and type II errors
    In statistical test theory the notion of statistical error is an integral part of hypothesis testing. The test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or...

  • False positive rate
    False positive rate
    When performing multiple comparisons in a statistical analysis, the false positive rate is the probability of falsely rejecting the null hypothesis for a particular test among all the tests performed...

  • False positive paradox
    False positive paradox
    The false positive paradox is a statistical result where false positive tests are more probable than true positive tests, occurring when the overall population has a low incidence of a condition and the incidence rate is lower than the false positive rate...

  • Familywise error rate
    Familywise error rate
    In statistics, familywise error rate is the probability of making one or more false discoveries, or type I errors, among all the hypotheses when performing multiple pairwise tests...

  • Fan chart (time series)
    Fan chart (time series)
    In time series analysis, a fan chart is a chart that joins a simple line chart for observed past data with ranges for possible values of future data, together with a line showing a central estimate or most likely value for the future outcomes...

  • Fano factor
  • Fast Fourier transform
    Fast Fourier transform
    A fast Fourier transform is an efficient algorithm to compute the discrete Fourier transform and its inverse. "The FFT has been called the most important numerical algorithm of our lifetime ." There are many distinct FFT algorithms involving a wide range of mathematics, from simple...

  • Fast Kalman filter
    Fast Kalman filter
    The fast Kalman filter, devised by Antti Lange, is an extension of the Helmert-Wolf blocking method from geodesy to real-time applications of Kalman filtering such as satellite imaging of the Earth...

  • FastICA
    FastICA
    FastICA is an efficient and popular algorithm for independent component analysis invented by Aapo Hyvärinen at Helsinki University of Technology. The algorithm is based on a fixed-point iteration scheme maximizing non-Gaussianity as a measure of statistical independence...

     – fast independent component analysis
  • Fat tail
    Fat tail
    A fat-tailed distribution is a probability distribution that, like the heavy-tailed distributions, exhibits extremely large skewness or kurtosis. This comparison is often made relative to the ubiquitous normal distribution, which itself is an example of an...

  • Feasible generalized least squares
  • Feature extraction
    Feature extraction
    In pattern recognition and in image processing, feature extraction is a special form of dimensionality reduction. When the input data to an algorithm are too large to be processed and are suspected to be highly redundant, the input data will be transformed into a reduced representation...

  • Feller process
    Feller process
    In probability theory relating to stochastic processes, a Feller process is a particular kind of Markov process. Let X be a locally compact topological space with a countable base...

  • Feller's coin-tossing constants
    Feller's coin-tossing constants
    Feller's coin-tossing constants are a set of numerical constants which describe asymptotic probabilities that in n independent tosses of a fair coin, no run of k consecutive heads appears....

  • Feller-continuous process
    Feller-continuous process
    In mathematics, a Feller-continuous process is a continuous-time stochastic process for which the expected value of suitable statistics of the process at a given time in the future depends continuously on the initial condition of the process...

  • Felsenstein's tree peeling algorithm
    Felsenstein's Tree Peeling Algorithm
    In statistical genetics, Felsenstein's tree-pruning algorithm, due to Joseph Felsenstein, is an algorithm for computing the likelihood of an evolutionary tree from nucleic acid sequence data...

     — statistical genetics
  • Fides (reliability)
    Fides (reliability)
    Fides is a guide allowing estimated reliability calculation for electronic components and systems. The reliability prediction is generally expressed in FIT or MTBF...

  • Fiducial inference
    Fiducial inference
    Fiducial inference is one of a number of different types of statistical inference. These are rules, intended for general application, by which conclusions can be drawn from samples of data. In modern statistical practice, attempts to work with fiducial inference have fallen out of fashion in...

  • Field experiment
    Field experiment
    A field experiment applies the scientific method to experimentally examine an intervention in the real world rather than in the laboratory...

  • Fieller's theorem
    Fieller's theorem
    In statistics, Fieller's theorem allows the calculation of a confidence interval for the ratio of two means...

  • File drawer problem
  • Filtering problem (stochastic processes)
    Filtering problem (stochastic processes)
    In the theory of stochastic processes, the filtering problem is a mathematical model for a number of filtering problems in signal processing and the like. The general idea is to form some kind of "best estimate" for the true value of some system, given only some observations of that system...

  • Financial econometrics
    Financial econometrics
    People working in the finance industry often use econometric techniques in a range of activities. For example in support of portfolio management, risk management and in the analysis of securities...

  • Financial models with long-tailed distributions and volatility clustering
  • Finite-dimensional distribution
    Finite-dimensional distribution
    In mathematics, finite-dimensional distributions are a tool in the study of measures and stochastic processes. A lot of information can be gained by studying the "projection" of a measure onto a finite-dimensional vector space...

  • First-hitting-time model
    First-hitting-time model
    In statistics, first-hitting-time models are a sub-class of survival models. The first hitting time, also called first passage time, of a set A with respect to an instance of a stochastic process is the time until the stochastic process first enters A....

  • First-in-man study
    First-in-man study
    A first-in-man study is a clinical trial where a medical procedure, previously developed and assessed through in vitro or animal testing, or through mathematical modelling is tested on human subjects for the first time....

  • Fishburn–Shepp inequality
    Fishburn–Shepp inequality
    In combinatorial mathematics, the Fishburn–Shepp inequality is an inequality for the number of extensions of partial orders to linear orders. It states that if x, y, and z are incomparable elements of a finite poset, then...

  • Fisher consistency
    Fisher consistency
    In statistics, Fisher consistency, named after Ronald Fisher, is a desirable property of an estimator asserting that if the estimator were calculated using the entire population rather than a sample, the true value of the estimated parameter would be obtained...

  • Fisher information
    Fisher information
    In mathematical statistics and information theory, the Fisher information is the variance of the score. In Bayesian statistics, the asymptotic distribution of the posterior mode depends on the Fisher information and not on the prior...

  • Fisher information metric
    Fisher information metric
    In information geometry, the Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space....

  • Fisher kernel
    Fisher kernel
    In statistical classification, the Fisher kernel, named in honour of Sir Ronald Fisher, is a function that measures the similarity of two objects on the basis of sets of measurements for each object and a statistical model...

  • Fisher transformation
    Fisher transformation
    In statistics, hypotheses about the value of the population correlation coefficient ρ between variables X and Y can be tested using the Fisher transformation applied to the sample correlation coefficient r...
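
    A minimal sketch of the usual test, assuming NumPy and SciPy: the sample correlation r is mapped to z = arctanh(r), which is approximately normal with standard error 1/√(n − 3). The values of r, ρ0 and n below are illustrative:

        import numpy as np
        from scipy.stats import norm

        r, rho0, n = 0.45, 0.30, 50        # sample correlation, null value, sample size
        z = np.arctanh(r)                  # Fisher transformation of r
        z0 = np.arctanh(rho0)              # transformed null value
        se = 1.0 / np.sqrt(n - 3)          # approximate standard error of z
        stat = (z - z0) / se
        p_value = 2 * norm.sf(abs(stat))   # two-sided p-value
        print(stat, p_value)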

  • Fisher's exact test
    Fisher's exact test
    Fisher's exact test is a statistical significance test used in the analysis of contingency tables where sample sizes are small. It is named after its inventor, R. A...

  • Fisher's inequality
    Fisher's inequality
    In combinatorial mathematics, Fisher's inequality, named after Ronald Fisher, is a necessary condition for the existence of a balanced incomplete block design satisfying certain prescribed conditions....

  • Fisher's linear discriminator
  • Fisher's method
    Fisher's Method
    In statistics, Fisher's method, also known as Fisher's combined probability test, is a technique for data fusion or "meta-analysis". It was developed by and named for Ronald Fisher...

  • Fisher's noncentral hypergeometric distribution
    Fisher's noncentral hypergeometric distribution
    In probability theory and statistics, Fisher's noncentral hypergeometric distribution is a generalization of the hypergeometric distribution where sampling probabilities are modified by weight factors...

  • Fisher's z-distribution
  • Fisher-Tippett distribution — redirects to Generalized extreme value distribution
  • Fisher–Tippett–Gnedenko theorem
  • Five-number summary
    Five-number summary
    The five-number summary is a descriptive statistic that provides information about a set of observations. It consists of the five most important sample percentiles: the sample minimum, the lower quartile or first quartile...
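
    A minimal sketch using NumPy percentiles (the data are illustrative; note that quartile conventions vary slightly between textbooks and software):

        import numpy as np

        data = np.array([7, 15, 36, 39, 40, 41])
        summary = (data.min(),
                   np.percentile(data, 25),   # lower quartile
                   np.median(data),
                   np.percentile(data, 75),   # upper quartile
                   data.max())
        print(summary)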

  • Fixed effects estimator
    Fixed effects estimator
    In econometrics and statistics, a fixed effects model is a statistical model that represents the observed quantities in terms of explanatory variables that are treated as if the quantities were non-random. This is in contrast to random effects models and mixed models in which either all or some of...

     and Fixed effects estimation — redirect to Fixed effects model
  • FLAME clustering
    FLAME clustering
    Fuzzy clustering by Local Approximation of MEmberships is a data clustering algorithm that defines clusters in the dense parts of a dataset and performs cluster assignment solely based on the neighborhood relationships among objects...

  • Fleiss' kappa
    Fleiss' kappa
    Fleiss' kappa is a statistical measure for assessing the reliability of agreement between a fixed number of raters when assigning categorical ratings to a number of items or classifying items. This contrasts with other kappas such as Cohen's kappa, which only work when assessing the agreement...

  • Fleming-Viot process
    Fleming-Viot process
    In probability theory, a Fleming–Viot process is a member of a particular subset of probability-measure valued Markov processes on compact metric spaces, as defined in the 1979 paper by Wendell Helms Fleming and Michel Viot...

  • Flood risk assessment
    Flood risk assessment
    A flood risk assessment is an assessment of the risk of flooding, particularly in relation to residential, commercial and industrial land use...

  • Floor effect
    Floor effect
    In statistics, the term floor effect refers to the situation in which data cannot take on a value lower than some particular number, called the floor. An example of this is when an IQ test is given to young children who have either been given training or been given no training...

  • FNN algorithm
    FNN algorithm
    The false nearest neighbor algorithm is an algorithm for estimating the embedding dimension....

      (false nearest neighbour algorithm)
  • Focused information criterion
    Focused information criterion
    In statistics, the focused information criterion is a method for selecting the most appropriate model among a set of competitors for a given data set...

  • Fokker–Planck equation
  • Folded normal distribution
    Folded Normal Distribution
    The folded normal distribution is a probability distribution related to the normal distribution. Given a normally distributed random variable X with mean μ and variance σ², the random variable Y = |X| has a folded normal distribution. Such a case may be encountered if only the magnitude of some...

  • Forecast bias
    Forecast bias
    A forecast bias occurs when there are consistent differences between actual outcomes and previously generated forecasts of those quantities; that is, forecasts may have a general tendency to be too high or too low...

  • Forecast error
    Forecast error
    In statistics, a forecast error is the difference between the actual or real and the predicted or forecast value of a time series or any other phenomenon of interest....

  • Forecast skill
    Forecast skill
    Skill in forecasting is a scaled representation of forecast error that relates the forecast accuracy of a particular forecast model to some reference model....

  • Forecasting
    Forecasting
    Forecasting is the process of making statements about events whose actual outcomes have not yet been observed. A commonplace example might be estimation for some variable of interest at some specified future date. Prediction is a similar, but more general term...

  • Forest plot
    Forest plot
    A forest plot is a graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. It was developed for use in medical research as a means of graphically representing a meta-analysis of the results of...

  • Fork-join queue
    Fork-join queue
    In queueing theory, a discipline within the mathematical theory of probability, a fork-join queue is a queue where incoming jobs are split on arrival for service by numerous servers and joined before departure. The model is often used for parallel computations or systems where products need to be...

  • Formation matrix
    Formation matrix
    In statistics and information theory, the expected formation matrix of a likelihood function L is the matrix inverse of the Fisher information matrix of L, while the observed formation matrix of L is the inverse of the observed information matrix of L.Currently, no notation for dealing with...

  • Forward measure
    Forward measure
    In finance, a T-forward measure is a pricing measure absolutely continuous with respect to a risk-neutral measure but rather than using the money market as numeraire, it uses a bond with maturity T...

  • Foster's theorem
    Foster's theorem
    In probability theory, Foster's theorem, named after F. G. Foster, is used to draw conclusions about the positive recurrence of Markov chains with countable state spaces...

  • Foundations of statistics
    Foundations of statistics
    Foundations of statistics is the usual name for the epistemological debate in statistics over how one should conduct inductive inference from data...

  • Founders of statistics
    Founders of statistics
    Statistics is the theory and application of mathematics to the scientific method including hypothesis generation, experimental design, sampling, data collection, data summarization, estimation, prediction and inference from those results to the population from which the experimental sample was drawn...

  • Fourier analysis
  • Fraction of variance unexplained
    Fraction of variance unexplained
    In statistics, the fraction of variance unexplained in the context of a regression task is the fraction of variance of the regressand Y which cannot be explained, i.e., which is not correctly predicted, by the explanatory variables X....

  • Fractional Brownian motion
  • Fractional factorial design
  • Fréchet distribution
  • Fréchet mean
    Fréchet mean
    The Fréchet mean is the point x that minimizes the Fréchet function, in cases where such a unique minimizer exists. The value at a point p of the Fréchet function associated to a random point X on a complete metric space is the expected squared distance from p to X...

  • Free statistical software
    Free statistical software
    In this article, the word free generally means that the software can be legally obtained without paying any money. Just a few of the software packages mentioned here are also free in the sense of free speech: they are not only open source but also free software in the sense that the source code of the software...

  • Freedman's paradox
    Freedman's paradox
    In statistical analysis, Freedman's paradox, named after David Freedman, describes a problem in model selection whereby predictor variables with no explanatory power can appear artificially important. Freedman demonstrated that this is a common occurrence when the number of variables is similar to...

  • Freedman–Diaconis rule
  • Freidlin–Wentzell theorem
  • Frequency (statistics)
    Frequency (statistics)
    In statistics the frequency of an event i is the number n_i of times the event occurred in the experiment or the study. These frequencies are often graphically represented in histograms...

  • Frequency distribution
    Frequency distribution
    In statistics, a frequency distribution is an arrangement of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of...

  • Frequency domain
    Frequency domain
    In electronics, control systems engineering, and statistics, frequency domain is a term used to describe the domain for analysis of mathematical functions or signals with respect to frequency, rather than time....

  • Frequency probability
    Frequency probability
    Frequency probability is the interpretation of probability that defines an event's probability as the limit of its relative frequency in a large number of trials. The development of the frequentist account was motivated by the problems and paradoxes of the previously dominant viewpoint, the...

  • Frequentist inference
    Frequentist inference
    Frequentist inference is one of a number of possible ways of formulating generally applicable schemes for making statistical inferences: that is, for drawing conclusions from statistical samples. An alternative name is frequentist statistics...

  • Friedman test
    Friedman test
    The Friedman test is a non-parametric statistical test developed by the U.S. economist Milton Friedman. Similar to the parametric repeated measures ANOVA, it is used to detect differences in treatments across multiple test attempts. The procedure involves ranking each row together, then...

  • Friendship paradox
    Friendship paradox
    The friendship paradox is the phenomenon first observed by the sociologist Scott L. Feld in 1991 that most people have fewer friends than their friends have, on average. It can be explained as a form of sampling bias in which people with greater numbers of friends have an increased likelihood of...

  • Frisch–Waugh–Lovell theorem
  • Fully crossed design
  • Function approximation
    Function approximation
    The need for function approximations arises in many branches of applied mathematics, and computer science in particular. In general, a function approximation problem asks us to select a function among a well-defined class that closely matches a target function in a task-specific way. One can...

  • Functional data analysis
    Functional data analysis
    Functional data analysis is a branch of statistics that analyzes data providing information about curves, surfaces or anything else varying over a continuum...

  • Funnel plot
    Funnel plot
    A funnel plot is a useful graph designed to check the existence of publication bias in systematic reviews and meta-analyses. It assumes that the largest studies will be near the average, and small studies will be spread on both sides of the average...

  • Fuzzy logic
    Fuzzy logic
    Fuzzy logic is a form of many-valued logic; it deals with reasoning that is approximate rather than fixed and exact. In contrast with traditional logic theory, where binary sets have two-valued logic: true or false, fuzzy logic variables may have a truth value that ranges in degree between 0 and 1...

  • Fuzzy measure theory
    Fuzzy measure theory
    Fuzzy measure theory considers a number of special classes of measures, each of which is characterized by a special property. Some of the measures used in this theory are plausibility and belief measures, fuzzy set membership function and the classical probability measures...

  • FWL theorem
    FWL theorem
    In econometrics, the Frisch–Waugh–Lovell theorem is named after the econometricians Ragnar Frisch, Frederick V. Waugh, and Michael C. Lovell.The Frisch–Waugh–Lovell theorem states that if the regression we are concerned with is:...

     — relating regression and projection

G

  • G-network
  • G-test
    G-test
    In statistics, G-tests are likelihood-ratio or maximum likelihood statistical significance tests that are increasingly being used in situations where chi-squared tests were previously recommended....

  • Galbraith plot
    Galbraith plot
    In statistics, a Galbraith plot is one way of displaying several estimates of the same quantity that have different standard errors...

  • Gallagher Index
    Gallagher Index
    The Gallagher Index is used to measure the disproportionality of an electoral outcome, that is, the difference between the percentage of votes received and the percentage of seats a party gets in the resulting legislature. This is especially useful for comparing proportionality across electoral...

  • Galton–Watson process
  • Galton's problem
    Galton's problem
    Galton’s problem, named after Sir Francis Galton, is the problem of drawing inferences from cross-cultural data, due to the statistical phenomenon now called autocorrelation. The problem is now recognized as a general one that applies to all nonexperimental studies and to experimental design as well...

  • Gambler's fallacy
    Gambler's fallacy
    The Gambler's fallacy, also known as the Monte Carlo fallacy, and also referred to as the fallacy of the maturity of chances, is the belief that if deviations from expected behaviour are observed in repeated independent trials of some random process, future deviations in the opposite direction are...

  • Gambler's ruin
    Gambler's ruin
    The term gambler's ruin is used for a number of related statistical ideas. The original meaning is that a gambler who raises his bet to a fixed fraction of bankroll when he wins, but does not reduce it when he loses, will eventually go broke, even if he has a positive expected value on each bet...

  • Gambling and information theory
    Gambling and information theory
    Statistical inference might be thought of as gambling theory applied to the world around. The myriad applications for logarithmic information measures tell us precisely how to take the best guess in the face of partial information. In that sense, information theory might be considered a formal...

  • Game of chance
    Game of chance
    A game of chance is a game whose outcome is strongly influenced by some randomizing device, and upon which contestants may or may not wager money or anything of monetary value...

  • Gamma distribution
  • Gamma test (statistics)
    Gamma test (statistics)
    In statistics, a gamma test tests the strength of association of the cross tabulated data when both variables are measured at the ordinal level. It makes no adjustment for either table size or ties. Values range from −1 to +1...

  • Gamma process
  • Gamma variate
  • GAUSS (software)
    GAUSS (software)
    GAUSS is a matrix programming language for mathematics and statistics, developed and marketed by Aptech Systems. Its primary purpose is the solution of numerical problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization...

  • Gauss's inequality
    Gauss's inequality
    In probability theory, Gauss's inequality gives an upper bound on the probability that a unimodal random variable lies more than any given distance from its mode....

  • Gauss–Kuzmin distribution
  • Gauss–Markov process
    Gauss–Markov process
    Gauss–Markov stochastic processes are stochastic processes that satisfy the requirements for both Gaussian processes and Markov processes. The stationary Gauss–Markov process is a very special case because it is unique, except for some trivial exceptions...

  • Gauss–Markov theorem
    Gauss–Markov theorem
    In statistics, the Gauss–Markov theorem, named after Carl Friedrich Gauss and Andrey Markov, states that in a linear regression model in which the errors have expectation zero and are uncorrelated and have equal variances, the best linear unbiased estimator of the coefficients is given by the...

  • Gauss–Newton algorithm
  • Gaussian function
  • Gaussian isoperimetric inequality
  • Gaussian measure
    Gaussian measure
    In mathematics, Gaussian measure is a Borel measure on finite-dimensional Euclidean space Rn, closely related to the normal distribution in statistics. There is also a generalization to infinite-dimensional spaces...

  • Gaussian noise
    Gaussian noise
    Gaussian noise is statistical noise that has its probability density function equal to that of the normal distribution, which is also known as the Gaussian distribution. In other words, the values that the noise can take on are Gaussian-distributed. A special case is white Gaussian noise, in which...

  • Gaussian process
    Gaussian process
    In probability theory and statistics, a Gaussian process is a stochastic process whose realisations consist of random values associated with every point in a range of times such that each such random variable has a normal distribution...

  • Gaussian process emulator
    Gaussian process emulator
    In statistics, Gaussian process emulator is one name for a general type of statistical model that has been used in contexts where the problem is to make maximum use of the outputs of a complicated computer-based simulation model. Each run of the simulation model is computationally expensive and...

  • Gaussian q-distribution
    Gaussian q-distribution
    In mathematical physics and probability and statistics, the Gaussian q-distribution is a family of probability distributions that includes, as limiting cases, the uniform distribution and the normal distribution...

  • Geary's C
    Geary's C
    Geary's C is a measure of spatial autocorrelation. Like autocorrelation, spatial autocorrelation means that adjacent observations of the same phenomenon are correlated. However, autocorrelation is about proximity in time. Spatial autocorrelation is about proximity in space...

  • GEH
    GEH
    The GEH Statistic is a formula used in traffic engineering, traffic forecasting, and traffic modelling to compare two sets of traffic volumes. The GEH formula gets its name from Geoffrey E. Havers, who invented it in the 1970s while working as a transport planner in London, England. Although its...

     — a statistic comparing modelled and observed counts
  • General linear model
    General linear model
    The general linear model is a statistical linear model. It may be written as Y = XB + U, where Y is a matrix with a series of multivariate measurements, X is a matrix that might be a design matrix, B is a matrix containing parameters that are usually to be estimated and U is a matrix containing errors or...
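
    A minimal sketch, assuming NumPy, of estimating B in Y = XB + U by ordinary least squares on synthetic data (all names and values below are illustrative):

        import numpy as np

        rng = np.random.default_rng(1)
        n, p, k = 100, 3, 2                              # observations, predictors, responses
        X = rng.normal(size=(n, p))                      # design matrix
        B_true = rng.normal(size=(p, k))                 # true coefficient matrix
        Y = X @ B_true + 0.1 * rng.normal(size=(n, k))   # Y = XB + U

        B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)    # least-squares estimate of B
        print(B_hat)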

  • General matrix notation of a VAR(p)
  • Generalizability theory
    Generalizability theory
    Generalizability theory, or G Theory, is a statistical framework for conceptualizing, investigating, and designing reliable observations. It is used to determine the reliability of measurements under specific conditions. It is particularly useful for assessing the reliability of performance...

  • Generalized additive model
    Generalized additive model
    In statistics, the generalized additive model is a statistical model developed by Trevor Hastie and Rob Tibshirani for blending properties of generalized linear models with additive models....

  • Generalized additive model for location, scale and shape
    Generalized additive model for location, scale and shape
    In statistics, the generalized additive model for location, scale and shape is a class of statistical model that provides extended capabilities compared to the simpler generalized linear models and generalized additive models. These simpler models allow the typical values of a quantity being modelled...

  • Generalized canonical correlation
    Generalized canonical correlation
    In statistics, the generalized canonical correlation analysis , is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. While a conventional CCA generalizes Principal component analysis to two sets of random variables, a gCCA ...

  • Generalized chi-squared distribution
  • Generalized Dirichlet distribution
    Generalized Dirichlet distribution
    In statistics, the generalized Dirichlet distribution is a generalization of the Dirichlet distribution with a more general covariance structure and twice the number of parameters...

  • Generalized entropy index
    Generalized entropy index
    The generalized entropy index is a general formula for measuring redundancy in data. The redundancy can be viewed as inequality, lack of diversity, non-randomness, compressibility, or segregation in the data. The primary use is for income inequality...

  • Generalized estimating equation
  • Generalized expected utility
    Generalized expected utility
    The expected utility model developed by John von Neumann and Oskar Morgenstern dominated decision theory from its formulation in 1944 until the late 1970s, not only as a prescriptive, but also as a descriptive model, despite powerful criticism from Maurice Allais and Daniel Ellsberg who showed...

  • Generalized extreme value distribution
  • Generalized gamma distribution
    Generalized gamma distribution
    The generalized gamma distribution is a continuous probability distribution with three parameters. It is a generalization of the two-parameter gamma distribution...

  • Generalized Gaussian distribution
  • Generalised hyperbolic distribution
  • Generalized inverse Gaussian distribution
  • Generalized least squares
    Generalized least squares
    In statistics, generalized least squares is a technique for estimating the unknown parameters in a linear regression model. The GLS is applied when the variances of the observations are unequal, or when there is a certain degree of correlation between the observations...

  • Generalized linear array model
    Generalized linear array model
    In statistics, the generalized linear array model is used for analyzing data sets with array structures. It is based on the generalized linear model with the design matrix written as a Kronecker product...

  • Generalized linear mixed model
    Generalized linear mixed model
    In statistics, a generalized linear mixed model is a particular type of mixed model. It is an extension to the generalized linear model in which the linear predictor contains random effects in addition to the usual fixed effects...

  • Generalized linear model
    Generalized linear model
    In statistics, the generalized linear model is a flexible generalization of ordinary linear regression. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to...

  • Generalized logistic distribution
    Generalized logistic distribution
    The term generalized logistic distribution is used as the name for several different families of probability distributions. For example, Johnson et al. list four forms, which are listed below. One family described here has also been called the skew-logistic distribution...

  • Generalized method of moments
    Generalized method of moments
    In econometrics, generalized method of moments is a generic method for estimating parameters in statistical models. Usually it is applied in the context of semiparametric models, where the parameter of interest is finite-dimensional, whereas the full shape of the distribution function of the data...

  • Generalized multidimensional scaling
  • Generalized normal distribution
  • Generalized p-value
    Generalized p-value
    In statistics, a generalized p-value is an extended version of the classical p-value, which, except in a limited number of applications, provides only approximate solutions...

  • Generalized Pareto distribution
  • Generalized Procrustes analysis
    Generalized Procrustes analysis
    Generalized Procrustes analysis is a method of statistical analysis that can be used to compare the shapes of objects, or the results of surveys, interviews, or panels. It was developed for analysing the results of free-choice profiling, a survey technique which allows respondents to describe a...

  • Generalized randomized block design
    Generalized randomized block design
    In randomized statistical experiments, generalized randomized block designs are used to study the interaction between blocks and treatments...

  • Generalized Tobit
    Generalized Tobit
    A generalized Tobit is a generalization of the econometric Tobit model after James Tobin. It is also called Heckit after James Heckman. Another name is "type 2 Tobit model". Tobit models assume that a variable is truncated...

  • Generalized Wiener process
    Generalized Wiener process
    In statistics, a generalized Wiener process is a continuous time random walk with drift and random jumps at every point in time...

  • Generative model
    Generative model
    In probability and statistics, a generative model is a model for randomly generating observable data, typically given some hidden parameters. It specifies a joint probability distribution over observation and label sequences...

  • Genetic epidemiology
    Genetic epidemiology
    Genetic epidemiology is the study of the role of genetic factors in determining health and disease in families and in populations, and the interplay of such genetic factors with environmental factors...

  • GenStat
    GenStat
    GenStat is a general statistical package. Early versions were developed for large mainframe computers. Up until version 5, there was a Unix binary available, and this continues to be used by many universities and research institutions...

     – software
  • Geo-imputation
    Geo-imputation
    In data analysis involving geographical locations, geo-imputation or geographical imputation methods are steps taken to replace missing values for exact locations with approximate locations derived from associated data...

  • Geodemographic segmentation
    Geodemographic Segmentation
    In marketing, Geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into different groups by making quantitative comparisons of multiple characteristics with the assumption that the differences within any...

  • Geometric Brownian motion
    Geometric Brownian motion
    A geometric Brownian motion is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion, also called a Wiener process...
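
    A minimal simulation sketch, assuming NumPy: because the logarithm of the process follows a Brownian motion with drift, each step multiplies the current value by exp((μ − σ²/2)·dt + σ·√dt·Z). The parameter values and function name are illustrative:

        import numpy as np

        def simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, T=1.0, steps=252, seed=0):
            rng = np.random.default_rng(seed)
            dt = T / steps
            z = rng.standard_normal(steps)
            # Log-increments are normal with mean (mu - sigma^2/2) * dt and sd sigma * sqrt(dt).
            log_increments = (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
            return s0 * np.exp(np.cumsum(log_increments))

        print(simulate_gbm()[-1])   # terminal value of one simulated path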

  • Geometric data analysis
    Geometric data analysis
    Geometric data analysis can refer to geometric aspects of image analysis, pattern analysis and shape analysis or the approach of multivariate statistics that treats arbitrary data sets as clouds of points in n-dimensional space...

  • Geometric distribution
  • Geometric median
    Geometric median
    The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points. This generalizes the median, which has the property of minimizing the sum of distances for one-dimensional data, and provides a central tendency in higher...

  • Geometric standard deviation
    Geometric standard deviation
    In probability theory and statistics, the geometric standard deviation describes the spread of a set of numbers whose preferred average is the geometric mean...

  • Geometric stable distribution
  • Geospatial predictive modeling
    Geospatial predictive modeling
    Geospatial predictive modeling is conceptually rooted in the principle that the occurrences of events being modeled are limited in distribution...

  • Geostatistics
    Geostatistics
    Geostatistics is a branch of statistics focusing on spatial or spatiotemporal datasets. Developed originally to predict probability distributions of ore grades for mining operations, it is currently applied in diverse disciplines including petroleum geology, hydrogeology, hydrology, meteorology,...

  • German tank problem
    German tank problem
    In the statistical theory of estimation, estimating the maximum of a uniform distribution is a common illustration of differences between estimation methods...
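
    To illustrate the "differences between estimation methods" mentioned above, the sketch below (assuming NumPy, with illustrative values) compares the naive sample maximum with the classical frequentist estimator m + m/k − 1 for the maximum of a discrete uniform distribution:

        import numpy as np

        rng = np.random.default_rng(42)
        N_true, k = 300, 5                       # true maximum and sample size (illustrative)
        sample = rng.choice(np.arange(1, N_true + 1), size=k, replace=False)

        m = sample.max()
        naive = m                                # sample maximum alone: biased low
        adjusted = m + m / k - 1                 # classical "German tank" estimator
        print(N_true, naive, adjusted)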

  • Gerschenkron effect
    Gerschenkron effect
    The Gerschenkron effect was developed by Alexander Gerschenkron, and claims that changing the base year for an index determines the growth rate of the index. This description is from the OECD website:...

  • Gibbs sampling
    Gibbs sampling
    In statistics and in statistical physics, Gibbs sampling or a Gibbs sampler is an algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables...
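
    A minimal sketch of the algorithm, assuming NumPy, for a standard bivariate normal with correlation ρ, where each full conditional is itself a normal distribution; the parameter values and function name are illustrative:

        import numpy as np

        def gibbs_bivariate_normal(rho=0.8, n_samples=5000, seed=0):
            rng = np.random.default_rng(seed)
            samples = np.empty((n_samples, 2))
            x, y = 0.0, 0.0
            cond_sd = np.sqrt(1 - rho ** 2)       # sd of each full conditional
            for i in range(n_samples):
                x = rng.normal(rho * y, cond_sd)  # draw x given the current y
                y = rng.normal(rho * x, cond_sd)  # draw y given the new x
                samples[i] = (x, y)
            return samples

        draws = gibbs_bivariate_normal()
        print(np.corrcoef(draws.T)[0, 1])         # should be close to rho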

  • Gillespie algorithm
    Gillespie algorithm
    In probability theory, the Gillespie algorithm generates a statistically correct trajectory of a stochastic equation. It was created by Joseph L...

  • Gini coefficient
    Gini coefficient
    The Gini coefficient is a measure of statistical dispersion developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper "Variability and Mutability" ....
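
    A minimal sketch, assuming NumPy, that computes the Gini coefficient of a sample from its sorted cumulative shares; the income figures are illustrative:

        import numpy as np

        def gini(values):
            x = np.sort(np.asarray(values, dtype=float))
            n = len(x)
            cum = np.cumsum(x)
            # Equivalent to (mean absolute difference) / (2 * mean) for the sample.
            return (n + 1 - 2 * (cum / cum[-1]).sum()) / n

        incomes = np.array([20_000, 30_000, 30_000, 45_000, 200_000])
        print(gini(incomes))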

  • Girsanov theorem
    Girsanov theorem
    In probability theory, the Girsanov theorem describes how the dynamics of stochastic processes change when the original measure is changed to an equivalent probability measure...

  • Gittins index
    Gittins index
    The Gittins index is a measure of the reward that can be achieved by a process evolving from its present state onwards with the probability that it will be terminated in future...

  • GLIM (software)
    GLIM (software)
    GLIM is a statistical software program for fitting generalized linear models. It was developed by the Royal Statistical Society's Working Party on Statistical Computing...

     – software
  • Glivenko–Cantelli theorem
  • GLUE (uncertainty assessment)
    GLUE (uncertainty assessment)
    In hydrology, Generalized Likelihood Uncertainty Estimation is a statistical method for quantifying the uncertainty of model predictions. The method has been introduced by Beven and Binley...

  • Goldfeld–Quandt test
    Goldfeld–Quandt test
    In statistics, the Goldfeld–Quandt test checks for homoscedasticity in regression analyses. It does this by dividing a dataset into two parts or groups, and hence the test is sometimes called a two-group test. The Goldfeld–Quandt test is one of two tests proposed in a 1965 paper by Stephen...

  • Gompertz function
  • Gompertz–Makeham law of mortality
  • Good–Turing frequency estimation
  • Goodhart's law
    Goodhart's law
    Goodhart's law, although it can be expressed in many ways, states that once a social or economic indicator or other surrogate measure is made a target for the purpose of conducting social or economic policy, then it will lose the information content that would qualify it to play that role...

  • Goodman and Kruskal's lambda
    Goodman and Kruskal's lambda
    In probability theory and statistics, Goodman & Kruskal's lambda is a measure of proportional reduction in error in cross tabulation analysis...

  • Goodness of fit
    Goodness of fit
    The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g...

  • Gordon–Newell network
  • Gordon–Newell theorem
    Gordon–Newell theorem
    In queueing theory, a discipline within the mathematical theory of probability, the Gordon–Newell theorem is an extension of Jackson's theorem from open queueing networks to closed queueing networks of exponential servers. We cannot apply Jackson's theorem to closed networks because the queue...

  • Graeco-Latin square
    Graeco-Latin square
    In mathematics, a Graeco-Latin square or Euler square or orthogonal Latin squares of order n over two sets S and T, each consisting of n symbols, is an n×n arrangement of cells, each cell containing an ordered pair (s, t), where s is in S and t is in T, such that every row and every column contains...

  • Grand mean
    Grand mean
    The grand mean is the mean of the means of several subsamples. For example, consider several lots, each containing several items. The items from each lot are sampled for a measure of some variable and the means of the measurements from each lot are computed. The mean of the measures from each lot...

  • Granger causality
    Granger causality
    The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another. Ordinarily, regressions reflect "mere" correlations, but Clive Granger, who won a Nobel Prize in Economics, argued that there is an interpretation of a set of tests...

  • Graph cuts in computer vision
    Graph cuts in computer vision
    As applied in the field of computer vision, graph cuts can be employed to efficiently solve a wide variety of low-level computer vision problems, such as image smoothing and the stereo correspondence problem, and many other problems that can be formulated in terms of energy minimization...

     – a potential application of Bayesian analysis
  • Graphical model
    Graphical model
    A graphical model is a probabilistic model for which a graph denotes the conditional independence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning....

  • Graphical models for protein structure
    Graphical models for protein structure
    Graphical models have become powerful frameworks for protein structure prediction, protein–protein interaction and free energy calculations for protein structures...

  • GraphPad InStat
    GraphPad InStat
    GraphPad InStat is a commercial scientific statistics software published by GraphPad Software, Inc., a privately owned California corporation. InStat is available for both Windows and Macintosh computers...

     – software
  • GraphPad Prism
    GraphPad Prism
    GraphPad Prism is a commercial scientific 2D graphing and statistics software published by GraphPad Software, Inc., a privately-held California corporation...

     – software
  • Gravity model of trade
    Gravity model of trade
    The gravity model of trade in international economics, similar to other gravity models in social science, predicts bilateral trade flows based on the economic sizes of and distance between two units. The model was first used by Tinbergen in 1962...

  • Greenwood statistic
  • Gretl
    Gretl
    gretl is an open-source statistical package, mainly for econometrics. The name is an acronym for Gnu Regression, Econometrics and Time-series Library. It has a graphical user interface and can be used together with X-12-ARIMA, TRAMO/SEATS, R, Octave, and Ox. It is written in C, uses GTK as widget...

  • Group family
    Group family
    In probability theory, especially as that field is used in statistics, a group family of probability distributions is a family obtained by subjecting a random variable with a fixed distribution to a suitable family of transformations such as a location-scale family, or otherwise a family of...

  • Group method of data handling
    Group method of data handling
    Group method of data handling is a family of inductive algorithms for computer-based mathematical modeling of multi-parametric datasets that features fully automatic structural and parametric optimization of models....

  • Group size measures
    Group size measures
    Many animals, including humans, tend to live in groups, herds, flocks, bands, packs, shoals, or colonies of conspecific individuals. The size of these groups, as expressed by the number of participant individuals, is an important aspect of their social environment...

  • Grouped data
    Grouped data
    Grouped data is a statistical term used in data analysis. A raw dataset can be organized by constructing a table showing the frequency distribution of the variable...

  • Grubbs' test for outliers
    Grubbs' test for outliers
    Grubbs' test, also known as the maximum normed residual test, is a statistical test used to detect outliers in a univariate data set assumed to come from a normally distributed population...

  • Guess value
    Guess value
    A guess value is more commonly called a starting value or initial value. These are necessary for most optimization problems which use search algorithms, because those algorithms are mainly deterministic and iterative, and they need to start somewhere...

  • Guesstimate
    Guesstimate
    Guesstimate is an informal English contraction of guess and estimate, first used by American statisticians in 1934 or 1935. It is defined as an estimate made without using adequate or complete information, or, more strongly, as an estimate arrived at by guesswork or conjecture...

  • Gumbel distribution
  • Guttman scale
    Guttman scale
    In statistical surveys conducted by means of structured interviews or questionnaires, a subset of the survey items having binary answers forms a Guttman scale if they can be ranked in some order so that, for a rational respondent, the response pattern can be captured by a single index on that...

  • Gy's sampling theory
    Gy's sampling theory
    Gy's sampling theory is a theory about the sampling of materials, developed by Pierre Gy from the 1950s to the beginning of the 2000s in articles and books including Sampling nomogram and Sampling of particulate materials; theory and practice...

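    Illustrative sketch (Gibbs sampling): the Gibbs sampling entry above describes generating samples from a joint distribution by repeatedly drawing each variable from its conditional distribution given the current values of the others. The Python sketch below is purely illustrative and is not taken from any of the articles indexed here; it samples a standard bivariate normal with correlation rho, whose full conditionals are themselves normal, and the function name and defaults are chosen only for this example.

        import math
        import random

        def gibbs_bivariate_normal(rho, n_samples=10000, burn_in=1000, seed=1):
            """Gibbs sampler for a standard bivariate normal with correlation rho.

            Each full conditional is N(rho * other, 1 - rho**2), so the sampler
            alternates drawing x | y and y | x, discarding an initial burn-in.
            """
            rng = random.Random(seed)
            x, y = 0.0, 0.0
            sd = math.sqrt(1.0 - rho * rho)
            samples = []
            for i in range(n_samples + burn_in):
                x = rng.gauss(rho * y, sd)   # draw x from p(x | y)
                y = rng.gauss(rho * x, sd)   # draw y from p(y | x)
                if i >= burn_in:
                    samples.append((x, y))
            return samples

        draws = gibbs_bivariate_normal(0.8)
        print(sum(x * y for x, y in draws) / len(draws))   # roughly 0.8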

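    Illustrative sketch (Goldfeld–Quandt test): the Goldfeld–Quandt entry above notes that the test checks for homoscedasticity by splitting the data into two groups and comparing their residual variances. The numpy sketch below is a simplified illustration of that idea, not a full implementation of the published test; the function names, the default fraction of central observations dropped, and the simple-regression setup are all assumptions made for this example.

        import numpy as np

        def goldfeld_quandt_statistic(x, y, drop_frac=0.2):
            """Ratio of residual variances from separate fits on low-x and high-x groups.

            Observations are sorted by x, a central fraction is dropped, a straight
            line is fitted to each remaining group by least squares, and the ratio
            of the two residual variances is returned (compared with an F
            distribution in a full treatment of the test).
            """
            order = np.argsort(x)
            x, y = np.asarray(x, float)[order], np.asarray(y, float)[order]
            n = len(x)
            drop = int(n * drop_frac)
            half = (n - drop) // 2
            lo, hi = slice(0, half), slice(n - half, n)

            def residual_variance(xs, ys):
                X = np.column_stack([np.ones_like(xs), xs])      # intercept + slope
                beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
                resid = ys - X @ beta
                return resid @ resid / (len(ys) - X.shape[1])    # unbiased estimate

            return residual_variance(x[hi], y[hi]) / residual_variance(x[lo], y[lo])
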
H

  • h-index
    H-index
    The h-index is an index that attempts to measure both the productivity and impact of the published work of a scientist or scholar. The index is based on the set of the scientist's most cited papers and the number of citations that they have received in other publications...

  • Hájek–Le Cam convolution theorem
    Hájek–Le Cam convolution theorem
    In statistics, the Hájek–Le Cam convolution theorem states that any regular estimator in a parametric model is asymptotically equivalent to a sum of two independent random variables, one of which is normal with asymptotic variance equal to the inverse of Fisher information, and the other having...

  • Half circle distribution
  • Half-logistic distribution
  • Half-normal distribution
    Half-normal distribution
    The half-normal distribution is the probability distribution of the absolute value of a random variable that is normally distributed with expected value 0 and variance σ². I.e...

  • Halton sequence
  • Hamburger moment problem
  • Hannan–Quinn information criterion
  • Harris chain
  • Hardy–Weinberg principle – statistical genetics
  • Hartley's test
    Hartley's test
    In statistics, Hartley's test, also known as the Fmax test or Hartley's Fmax, is used in the analysis of variance to verify that different groups have a similar variance, an assumption needed for other statistical tests. It was developed by H. O...

  • Hat matrix
    Hat matrix
    In statistics, the hat matrix, H, maps the vector of observed values to the vector of fitted values. It describes the influence each observed value has on each fitted value...

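     – a small numerical sketch appears at the end of this section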
  • Hammersley–Clifford theorem
    Hammersley–Clifford theorem
    The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics, that gives necessary and sufficient conditions under which a positive probability distribution can be represented as a Markov network...

  • Hausdorff moment problem
  • Hausman specification test — redirects to Hausman test
    Hausman test
    The Hausman test or Hausman specification test is a statistical test in econometrics named after Jerry A. Hausman. The test evaluates the significance of an estimator versus an alternative estimator...

  • Haybittle–Peto boundary
    Haybittle–Peto boundary
    The Haybittle–Peto boundary is a rule for deciding when to stop a clinical trial prematurely. The typical clinical trial compares two groups of patients. One group is given a placebo or conventional treatment, while the other group is given the treatment that is being tested...

  • Hazard function — redirects to Failure rate
    Failure rate
    Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is often denoted by the Greek letter λ and is important in reliability engineering....

  • Hazard ratio
    Hazard ratio
    In survival analysis, the hazard ratio is the ratio of the hazard rates corresponding to the conditions described by two sets of explanatory variables. For example, in a drug study, the treated population may die at twice the rate per unit time as the control population. The hazard ratio would be...

  • Heaps' law
    Heaps' law
    In linguistics, Heaps' law is an empirical law which describes the portion of a vocabulary which is represented by an instance document consisting of words chosen from the vocabulary. This can be formulated as V_R = K·n^β...

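     – a short code sketch of the formula appears at the end of this section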
  • Health care analytics
    Health care analytics
    Health care analytics is a rapidly evolving field of health care business solutions that makes extensive use of data, statistical and qualitative analysis, explanatory and predictive modeling...

  • Heart rate variability
    Heart rate variability
    Heart rate variability is a physiological phenomenon where the time interval between heart beats varies. It is measured by the variation in the beat-to-beat interval....

  • Heavy-tailed distribution
    Heavy-tailed distribution
    In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution...

  • Heckman correction
    Heckman correction
    The Heckman correction is any of a number of related statistical methods developed by James Heckman from 1976 through 1979 which allow the researcher to correct for selection bias...

  • Hedonic regression
    Hedonic regression
    In economics, hedonic regression or hedonic demand theory is a revealed preference method of estimating demand or value. It decomposes the item being researched into its constituent characteristics, and obtains estimates of the contributory value of each characteristic...

  • Hellin's law
    Hellin's Law
    Hellin's Law is the principle that twins occur in about one in 89 births, triplets in about one in 89² births, and quadruplets in about one in 89³ births...

  • Hellinger distance
    Hellinger distance
    In probability and statistics, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f-divergence...

  • Helmert–Wolf blocking
  • Herfindahl index
    Herfindahl index
    The Herfindahl index is a measure of the size of firms in relation to the industry and an indicator of the amount of competition among them. Named after economists Orris C. Herfindahl and Albert O. Hirschman, it is an economic concept widely applied in competition law, antitrust and also...

  • Heston model
    Heston model
    In finance, the Heston model, named after Steven Heston, is a mathematical model describing the evolution of the volatility of an underlying asset...

  • Heteroscedasticity
  • Heteroscedasticity-consistent standard errors
    Heteroscedasticity-consistent standard errors
    The topic of heteroscedasticity-consistent standard errors arises in statistics and econometrics in the context of linear regression and also time series analysis...

  • Heteroskedasticity — redirects to Heteroscedasticity
  • Hidden Markov model
    Hidden Markov model
    A hidden Markov model is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved states. An HMM can be considered as the simplest dynamic Bayesian network. The mathematics behind the HMM was developed by L. E...

  • Hidden Markov random field
    Hidden Markov random field
    A hidden Markov random field is a generalization of a hidden Markov model. Instead of having an underlying Markov chain, hidden Markov random fields have an underlying Markov random field. Suppose that we observe a random variable Y_i, where i ∈ S...

  • Hidden semi-Markov model
    Hidden semi-Markov model
    A hidden semi-Markov model is a statistical model with the same structure as a hidden Markov model except that the unobservable process is semi-Markov rather than Markov. This means that the probability of there being a change in the hidden state depends on the amount of time that has elapsed...

  • Hierarchical Bayes model
    Hierarchical Bayes model
    The hierarchical Bayes model is a method in modern Bayesian statistical inference. It is a framework for describing statistical models that can capture dependencies more realistically than non-hierarchical models....

  • Hierarchical clustering
    Hierarchical clustering
    In statistics, hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:...

  • Hierarchical hidden Markov model
    Hierarchical hidden Markov model
    The hierarchical hidden Markov model is a statistical model derived from the hidden Markov model. In an HHMM each state is considered to be a self-contained probabilistic model. More precisely each state of the HHMM is itself an HHMM...

  • Hierarchical linear modeling
    Hierarchical linear modeling
    In statistics, hierarchical linear modeling , a form of multi-level analysis, is a more advanced form of simple linear regression and multiple linear regression. Multilevel analysis allows variance in outcome variables to be analysed at multiple hierarchical levels, whereas in simple linear and...

  • High-dimensional statistics
    High-dimensional statistics
    In statistical theory, the field of high-dimensional statistics studies data whose dimension is larger than dimensions considered in classical multivariate analysis. High-dimensional statistics relies on the theory of random vectors...

  • Higher-order factor analysis
    Higher-order factor analysis
    Higher-order factor analysis is a statistical method consisting of repeated steps: factor analysis, oblique rotation, then factor analysis of the rotated factors. Its merit is that it enables the researcher to see the hierarchical structure of the studied phenomena...

  • Higher-order statistics
    Higher-order statistics
    Higher-order statistics are descriptive measures of, among other things, qualities of probability distributions and sample distributions, and are, themselves, extensions of first- and second-order measures to higher orders. Skewness and kurtosis are examples of this...

  • Hirschman uncertainty
    Hirschman uncertainty
    In quantum mechanics, information theory, and Fourier analysis, the Hirschman uncertainty is defined as the sum of the temporal and spectral Shannon entropies. It turns out that Heisenberg's uncertainty principle can be expressed as a lower bound on the sum of these entropies...

  • Histogram
    Histogram
    In statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...

  • Historiometry
    Historiometry
    Historiometry is the historical study of human progress or individual personal characteristics, using statistics to analyze references to geniuses, their statements, behavior and discoveries in relatively neutral texts...

  • History of randomness
    History of randomness
    In ancient history, the concepts of chance and randomness were intertwined with that of fate. Many ancient peoples threw dice to determine fate, and this later evolved into games of chance...

  • History of statistics
    History of statistics
    The history of statistics can be said to start around 1749 although, over time, there have been changes to the interpretation of what the word statistics means. In early times, the meaning was restricted to information about states...

  • Hitting time
    Hitting time
    In the study of stochastic processes in mathematics, a hitting time is a particular instance of a stopping time, the first time at which a given process "hits" a given subset of the state space...

  • Hodges’ estimator
    Hodges’ estimator
    In statistics, Hodges’ estimator is a famous counterexample: an estimator which is "superefficient", i.e. it attains smaller asymptotic variance than regular efficient estimators...

  • Hodges–Lehmann estimator
  • Hoeffding's independence test
  • Hoeffding's lemma
    Hoeffding's lemma
    In probability theory, Hoeffding's lemma is an inequality that bounds the moment-generating function of any bounded random variable. It is named after the Finnish–American mathematical statistician Wassily Hoeffding....

  • Hoeffding's inequality
    Hoeffding's inequality
    In probability theory, Hoeffding's inequality provides an upper bound on the probability for the sum of random variables to deviate from its expected value. Hoeffding's inequality was proved by Wassily Hoeffding. Let X_1, …, X_n...

  • Holm–Bonferroni method
  • Holtsmark distribution
    Holtsmark distribution
    The Holtsmark distribution is a continuous probability distribution. The Holtsmark distribution is a special case of a stable distribution with the index of stability or shape parameter α equal to 3/2 and skewness parameter β of zero. Since β equals zero, the distribution is...

  • Homogeneity (statistics)
    Homogeneity (statistics)
    In statistics, homogeneity and its opposite, heterogeneity, arise in describing the properties of a dataset, or several datasets. They relate to the validity of the often convenient assumption that the statistical properties of any one part of an overall dataset are the same as any other part...

  • Homoscedasticity
    Homoscedasticity
    In statistics, a sequence or a vector of random variables is homoscedastic if all random variables in the sequence or vector have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity...

  • Hoover index
  • Horvitz–Thompson estimator
    Horvitz–Thompson estimator
    In statistics, the Horvitz–Thompson estimator, named after Daniel G. Horvitz and Donovan J. Thompson, is a method for estimating the mean of a superpopulation in a stratified sample. Inverse probability weighting is applied to account for different proportions of observations within strata...

  • Hosmer–Lemeshow test
    Hosmer–Lemeshow test
    The Hosmer–Lemeshow test is a statistical test for goodness of fit for logistic regression models. It is used frequently in risk prediction models. The test assesses whether or not the observed event rates match expected event rates in subgroups of the model population. The Hosmer–Lemeshow...

  • Hotelling's T-squared distribution
  • How to Lie with Statistics
    How to Lie with Statistics
    How to Lie with Statistics is a book written by Darrell Huff in 1954 presenting an introduction to statistics for the general reader. Huff was a journalist who wrote many "how to" articles as a freelancer, but was not a statistician....

     (book)
  • Howland will forgery trial
    Howland will forgery trial
    The Howland will forgery trial was a U.S. court case in 1868 to decide Henrietta Howland Robinson's contest of the will of Sylvia Ann Howland. It is famous for the forensic use of mathematics by Benjamin Peirce as an expert witness...

  • Hubbert curve
    Hubbert curve
    The Hubbert curve is an approximation of the production rate of a resource over time. It is a symmetric logistic distribution curve, often confused with the "normal" Gaussian function. It first appeared in "Nuclear Energy and the Fossil Fuels" by geophysicist M...

  • Huber–White standard error — redirects to Heteroscedasticity-consistent standard errors
    Heteroscedasticity-consistent standard errors
    The topic of heteroscedasticity-consistent standard errors arises in statistics and econometrics in the context of linear regression and also time series analysis...

  • Huber loss function
    Huber Loss Function
    In statistical theory, the Huber loss function is a function used in robust estimation that allows construction of an estimate in which the effect of outliers is reduced, while non-outliers are treated in a more standard way...

  • Human subject research
  • Hurst exponent
    Hurst exponent
    The Hurst exponent is used as a measure of the long term memory of time series. It relates to the autocorrelations of the time series and the rate at which these decrease as the lag between pairs of values increases....

  • Hyper-exponential distribution
  • Hyper-Graeco-Latin square design
    Hyper-Graeco-Latin square design
    In the design of experiments, hyper-Graeco-Latin squares are efficient designs to study the effect of one primary factor in the presence of 4 blocking factors. They are restricted, however, to the case in which all the factors have the same number of levels.Designs for 4- and 5-level factors are...

  • Hyperbolic distribution
    Hyperbolic distribution
    The hyperbolic distribution is a continuous probability distribution that is characterized by the fact that the logarithm of the probability density function is a hyperbola. Thus the distribution decreases exponentially, more slowly than the normal distribution...

  • Hyperbolic secant distribution
  • Hypergeometric distribution
  • Hyperparameter
    Hyperparameter
    In Bayesian statistics, a hyperparameter is a parameter of a prior distribution; the term is used to distinguish them from parameters of the model for the underlying system under analysis...

  • Hyperprior
    Hyperprior
    In Bayesian statistics, a hyperprior is a prior distribution on a hyperparameter, that is, on a parameter of a prior distribution.As with the term hyperparameter, the use of hyper is to distinguish it from a prior distribution of a parameter of the model for the underlying system...

  • Hypoexponential distribution
    Hypoexponential distribution
    In probability theory the hypoexponential distribution or the generalized Erlang distribution is a continuous distribution, that has found use in the same fields as the Erlang distribution, such as queueing theory, teletraffic engineering and more generally in stochastic processes...

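    Illustrative sketch (hat matrix): the hat matrix entry above says that H maps the vector of observed values to the vector of fitted values; for ordinary least squares this is H = X (XᵀX)⁻¹ Xᵀ. The numpy sketch below simply evaluates that formula on random data; the data and the function name are illustrative only.

        import numpy as np

        def hat_matrix(X):
            """Return the OLS hat matrix H = X (X'X)^(-1) X' for a design matrix X."""
            X = np.asarray(X, dtype=float)
            return X @ np.linalg.solve(X.T @ X, X.T)   # avoids forming an explicit inverse

        rng = np.random.default_rng(0)
        X = np.column_stack([np.ones(5), rng.normal(size=5)])   # intercept + one regressor
        y = rng.normal(size=5)
        H = hat_matrix(X)
        print(np.allclose(H @ H, H))   # H is idempotent
        print(H @ y)                   # fitted values y_hat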

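    Illustrative sketch (Heaps' law): the Heaps' law entry above gives the relation V_R = K·n^β between text length n and vocabulary size. The snippet below just evaluates that formula; the values of K and β are arbitrary placeholders, not estimates from any corpus.

        def heaps_vocabulary_size(n_tokens, K=10.0, beta=0.5):
            """Expected number of distinct words under Heaps' law, V = K * n**beta."""
            return K * n_tokens ** beta

        for n in (1_000, 10_000, 100_000):
            print(n, round(heaps_vocabulary_size(n)))
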
I

  • Idealised population
    Idealised population
    (Main article: effective population size.) In population genetics an idealised population, also sometimes called a Fisher-Wright population after R.A. Fisher and Sewall Wright, is a population whose members can mate and reproduce with any other member of the other gender, has a sex ratio of 1 and no...

  • Idempotent matrix
    Idempotent matrix
    In algebra, an idempotent matrix is a matrix which, when multiplied by itself, yields itself. That is, the matrix M is idempotent if and only if MM = M...

  • Identifiability
    Identifiability
    In statistics, identifiability is a property which a model must satisfy in order for inference to be possible. We say that the model is identifiable if it is theoretically possible to learn the true value of this model’s underlying parameter after obtaining an infinite number of observations from it...

  • Ignorability
    Ignorability
    In statistics, ignorability refers to an experimental design in which the method of data collection does not depend on the missing data...

  • Illustration of the central limit theorem
    Illustration of the central limit theorem
    This article gives two concrete illustrations of the central limit theorem. Both involve the sum of independent and identically-distributed random variables and show how the probability distribution of the sum approaches the normal distribution as the number of terms in the sum increases.The first...

  • Image denoising
    Image denoising
    Image denoising refers to the recovery of a digital image that has been contaminated by additive white Gaussian noise...

  • Importance sampling
    Importance sampling
    In statistics, importance sampling is a general technique for estimating properties of a particular distribution, while only having samples generated from a different distribution rather than the distribution of interest. It is related to umbrella sampling in computational physics...

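     – a minimal code sketch appears at the end of this section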
  • Imprecise probability
    Imprecise probability
    Imprecise probability generalizes probability theory to allow for partial probability specifications, and is applicable when information is scarce, vague, or conflicting, in which case a unique probability distribution may be hard to identify...

  • Imputation (statistics)
    Imputation (statistics)
    In statistics, imputation is the substitution of some value for a missing data point or a missing component of a data point. Once all missing values have been imputed, the dataset can then be analysed using standard techniques for complete data...

  • Incidence (epidemiology)
    Incidence (epidemiology)
    Incidence is a measure of the risk of developing some new condition within a specified period of time. Although sometimes loosely expressed simply as the number of new cases during some time period, it is better expressed as a proportion or a rate with a denominator.Incidence proportion is the...

  • Inclusion probability
    Inclusion probability
    In statistics, in the theory relating to sampling from finite populations, the inclusion probability of an element or member of the population is its probability of becoming part of the sample during the drawing of a single sample....

  • Increasing process
  • Indecomposable distribution
    Indecomposable distribution
    In probability theory, an indecomposable distribution is a probability distribution that cannot be represented as the distribution of the sum of two or more non-constant independent random variables: Z ≠ X + Y. If it can be so expressed, it is decomposable:...

  • Independence of irrelevant alternatives
    Independence of irrelevant alternatives
    Independence of irrelevant alternatives is an axiom of decision theory and various social sciences. The term is used with different meanings in different contexts...

  • Independent component analysis
    Independent component analysis
    Independent component analysis is a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals...

  • Independent and identically distributed random variables
    Independent and identically distributed random variables
    In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent....

  • Index number
  • Index of coincidence
    Index of coincidence
    In cryptography, coincidence counting is the technique of putting two texts side-by-side and counting the number of times that identical letters appear in the same position in both texts...

  • Index of dispersion
  • Indicators of spatial association
    Indicators of spatial association
    Indicators of spatial association are statistics that evaluate the existence of clusters in the spatial arrangement of a given variable. For instance, if we are studying cancer rates among census tracts in a given city, local clusters in the rates mean that there are areas that have higher or lower...

  • Indirect least squares
  • Inductive inference
    Inductive inference
    Around 1960, Ray Solomonoff founded the theory of universal inductive inference, the theory of prediction based on observations; for example, predicting the next symbol based upon a given series of symbols...

  • An inequality on location and scale parameters — redirects to Chebyshev's inequality
    Chebyshev's inequality
    In probability theory, Chebyshev’s inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean: the precise statement being that no more than 1/k² of the distribution’s values can be more than k standard deviations away from the mean...

  • Inference
    Inference
    Inference is the act or process of deriving logical conclusions from premises known or assumed to be true. The conclusion drawn is also called an inference. The laws of valid inference are studied in the field of logic...

  • Inferential statistics — redirects to Statistical inference
    Statistical inference
    In statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation...

  • Infinite divisibility (probability)
    Infinite divisibility (probability)
    The concepts of infinite divisibility and the decomposition of distributions arise in probability and statistics in relation to seeking families of probability distributions that might be a natural choice in certain applications, in the same way that the normal distribution is...

  • Infinite monkey theorem
    Infinite monkey theorem
    The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare....

  • Influence diagram
    Influence diagram
    An influence diagram is a compact graphical and mathematical representation of a decision situation...

  • Info-gap decision theory
    Info-gap decision theory
    Info-gap decision theory is a non-probabilistic decision theory that seeks to optimize robustness to failure – or opportuneness for windfall – under severe uncertainty, in particular applying sensitivity analysis of the stability radius type to perturbations in the value of a given estimate of the...

  • Information bottleneck method
    Information bottleneck method
    The information bottleneck method is a technique introduced by Naftali Tishby et al. for finding the best tradeoff between accuracy and complexity when summarizing a random variable X, given a joint probability distribution between X and an observed relevant variable Y...

  • Information geometry
    Information geometry
    Information geometry is a branch of mathematics that applies the techniques of differential geometry to the field of probability theory. It derives its name from the fact that the Fisher information is used as the Riemannian metric when considering the geometry of probability distribution families...

  • Information gain ratio
    Information gain ratio
    Information gain calculation: let Attr be the set of all attributes and Ex the set of all training examples; value(x, a), with x ∈ Ex, defines the value of a specific example x for attribute a ∈ Attr, and H specifies the entropy...

  • Information ratio
    Information ratio
    The Information ratio is a measure of the risk-adjusted return of a financial security . It is also known as Appraisal ratio and is defined as expected active return divided by tracking error, where active return is the difference between the return of the security and the return of a selected...

     – finance
  • Information source (mathematics)
    Information source (mathematics)
    In mathematics, an information source is a sequence of random variables ranging over a finite alphabet Γ, having a stationary distribution.The uncertainty, or entropy rate, of an information source is defined as...

  • Information theory
    Information theory
    Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and...

  • Inherent bias
    Inherent bias
    The term "inherent bias" refers to the effect of underlying factors or assumptions that skew viewpoints on a subject under discussion. There are multiple formal definitions of "inherent bias" which depend on the particular field of study...

  • Inherent zero
    Inherent zero
    In statistics, an inherent zero is a reference point used to describe data sets which are indicative of magnitude of an absolute or relative nature. Inherent zeros are used on ratio scales....

  • Injury prevention
    Injury prevention
    Injury prevention refers to efforts to prevent or reduce the severity of bodily injuries caused by external mechanisms, such as accidents, before they occur. Injury prevention is a component of safety and public health, and its goal is to improve the health of the population by preventing injuries and...

     – application
  • Innovation (signal processing)
    Innovation (signal processing)
    In time series analysis — as conducted in statistics, signal processing, and many other fields — the innovation is the difference between the observed value of a variable at time t and the optimal forecast of that value based on information available prior to time t...

  • Innovations vector
    Innovations vector
    The innovations vector or residual vector is the difference between the measurement vector and the predicted measurement vector. Each difference represents the deviation of the observed random variable from the predicted response. The innovation vector is often used to check the validity of a...

  • Institutional review board
    Institutional review board
    An institutional review board , also known as an independent ethics committee or ethical review board , is a committee that has been formally designated to approve, monitor, and review biomedical and behavioral research involving humans with the aim to protect the rights and welfare of the...

  • Instrumental variable
    Instrumental variable
    In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables is used to estimate causal relationships when controlled experiments are not feasible....

  • Intention to treat analysis
    Intention to treat analysis
    In epidemiology, an intention to treat analysis is an analysis based on the initial treatment intent, not on the treatment eventually administered. ITT analysis is intended to avoid various misleading artifacts that can arise in intervention research...

  • Interaction (statistics)
    Interaction (statistics)
    In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive...

  • Interaction variable – see Interaction (statistics)
    Interaction (statistics)
    In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive...

  • Interclass correlation
    Interclass correlation
    In statistics, the interclass correlation measures a bivariate relation among variables.The Pearson correlation coefficient is the most commonly used interclass correlation....

  • Interdecile range
    Interdecile range
    In statistics, the interdecile range is the difference between the first and the ninth deciles . The interdecile range is a measure of statistical dispersion of the values in a set of data, similar to the range and the interquartile range....

  • Interim analysis
    Interim analysis
    Clinical trials are unique in that enrollment of patients is a continual process staggered in time. This means that if a treatment is particularly beneficial or harmful compared to the concurrent placebo group while the study is on-going, the investigators are ethically obliged to assess that...

  • Internal consistency
    Internal consistency
    In statistics and research, internal consistency is typically a measure based on the correlations between different items on the same test . It measures whether several items that propose to measure the same general construct produce similar scores...

  • Internal validity
    Internal validity
    Internal validity is the validity of inferences in scientific studies, usually based on experiments, as experimental validity...

  • Interquartile mean
    Interquartile mean
    The interquartile mean is a statistical measure of central tendency, much like the mean, the median, and the mode...

  • Interquartile range
    Interquartile range
    In descriptive statistics, the interquartile range , also called the midspread or middle fifty, is a measure of statistical dispersion, being equal to the difference between the upper and lower quartiles...

  • Inter-rater reliability
    Inter-rater reliability
    In statistics, inter-rater reliability, inter-rater agreement, or concordance is the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges. It is useful in refining the tools given to human judges, for example by...

  • Interval estimation
    Interval estimation
    In statistics, interval estimation is the use of sample data to calculate an interval of possible values of an unknown population parameter, in contrast to point estimation, which is a single number. Neyman identified interval estimation as distinct from point estimation...

  • Intervening variable
    Intervening variable
    An intervening variable is a hypothetical internal state that is used to explain relationships between observed variables, such as independent and dependent variables, in empirical research...

  • Intra-rater reliability
    Intra-rater reliability
    In statistics, intra-rater reliability is the degree of agreement among multiple repetitions of a diagnostic test performed by a single rater. See also: inter-rater reliability, reliability, repeatability, test-retest reliability...

  • Intraclass correlation
    Intraclass correlation
    In statistics, the intraclass correlation is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other...

  • Invariant estimator
    Invariant estimator
    In statistics, the concept of being an invariant estimator is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities...

  • Invariant extended Kalman filter
    Invariant extended Kalman filter
    The invariant extended Kalman filter is a new version of the extended Kalman filter for nonlinear systems possessing symmetries . It combines the advantages of both the EKF and the recently introduced symmetry-preserving filters...

  • Inverse distance weighting
    Inverse distance weighting
    Inverse distance weighting is a method for multivariate interpolation, a process of assigning values to unknown points by using values from a usually scattered set of known points...

  • Inverse Gaussian distribution
    Inverse Gaussian distribution
    In probability theory, the inverse Gaussian distribution (also known as the Wald distribution) is a two-parameter family of continuous probability distributions with support on the positive real line...

  • Inverse Mills ratio
    Inverse Mills ratio
    In statistics, the inverse Mills ratio, named after John P. Mills, is the ratio of the probability density function to the cumulative distribution function of a distribution....

  • Inverse probability
    Inverse probability
    In probability theory, inverse probability is an obsolete term for the probability distribution of an unobserved variable.Today, the problem of determining an unobserved variable is called inferential statistics, the method of inverse probability is called Bayesian probability, the "distribution"...

  • Inverse relationship
    Inverse relationship
    An inverse or negative relationship is a mathematical relationship in which one variable, say y, decreases as another, say x, increases. For a linear relation, this can be expressed as y = a-bx, where -b is a constant value less than zero and a is a constant...

  • Inverse-chi-squared distribution
  • Inverse-gamma distribution
  • Inverse transform sampling
  • Inverse-variance weighting
    Inverse-variance weighting
    In statistics, inverse-variance weighting is a method of aggregating two or more random variables to minimize the variance of the sum. Each random variable in the sum is weighted in inverse proportion to its variance....

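     – a short code sketch appears at the end of this section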
  • Inverse-Wishart distribution
  • Iris flower data set
    Iris flower data set
    The Iris flower data set or Fisher's Iris data set is a multivariate data set introduced by Sir Ronald Aylmer Fisher as an example of discriminant analysis...

  • Irwin–Hall distribution
  • Isomap
    Isomap
    In statistics, Isomap is one of several widely used low-dimensional embedding methods, where geodesic distances on a weighted graph are incorporated with the classical scaling . Isomap is used for computing a quasi-isometric, low-dimensional embedding of a set of high-dimensional data points...

  • Isotonic regression
  • Item response theory
    Item response theory
    In psychometrics, item response theory also known as latent trait theory, strong true score theory, or modern mental test theory, is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is based...

  • Item-total correlation
    Item-total correlation
    The item-total correlation test arises in psychometrics in contexts where a number of tests or questions are given to an individual and where the problem is to construct a useful single quantity for each individual that can be used to compare that individual with others in a given population...

  • Item tree analysis
    Item tree analysis
    Item tree analysis is a data analytical method which allows constructing a hierarchical structure on the items of a questionnaire or test from observed response patterns. Assume that we have a questionnaire with m items and that subjects can...

  • Iterative proportional fitting
    Iterative proportional fitting
    The iterative proportional fitting procedure is an iterative algorithm for estimating cell values of a contingency table such that the marginal totals remain fixed and the estimated table decomposes into an outer...

  • Iteratively reweighted least squares
  • Itō calculus
    Ito calculus
    Itō calculus, named after Kiyoshi Itō, extends the methods of calculus to stochastic processes such as Brownian motion . It has important applications in mathematical finance and stochastic differential equations....

  • Itō isometry
    Ito isometry
    In mathematics, the Itō isometry, named after Kiyoshi Itō, is a crucial fact about Itō stochastic integrals. One of its main applications is to enable the computation of variances for stochastic processes....

  • Itō's lemma
    Ito's lemma
    In mathematics, Itō's lemma is used in Itō stochastic calculus to find the differential of a function of a particular type of stochastic process. It is named after its discoverer, Kiyoshi Itō...

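    Illustrative sketch (importance sampling): the importance sampling entry above describes estimating properties of one distribution using draws from a different one, with each draw reweighted by the ratio of the two densities. The Python sketch below estimates E[X²] = 1 for a standard normal target using a wider normal proposal; the proposal width, sample size, and function names are arbitrary choices made only for this example.

        import math
        import random

        def normal_pdf(x, mu=0.0, sigma=1.0):
            return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

        def estimate_mean_square(n=100_000, proposal_sigma=2.0, seed=1):
            """Importance-sampling estimate of E_p[X^2] for p = N(0,1), using draws from q = N(0, proposal_sigma^2)."""
            rng = random.Random(seed)
            total = 0.0
            for _ in range(n):
                x = rng.gauss(0.0, proposal_sigma)                      # draw from the proposal q
                w = normal_pdf(x) / normal_pdf(x, 0.0, proposal_sigma)  # importance weight p(x)/q(x)
                total += w * x * x
            return total / n

        print(estimate_mean_square())   # close to 1.0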

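    Illustrative sketch (inverse-variance weighting): the inverse-variance weighting entry above says each estimate is weighted in inverse proportion to its variance; the pooled estimate and its variance then take the standard form evaluated below. The numbers are made up purely to show the calculation.

        def inverse_variance_weighted_mean(estimates, variances):
            """Pool independent estimates with weights 1/variance.

            Returns the pooled estimate and its variance, which is the
            reciprocal of the summed weights.
            """
            weights = [1.0 / v for v in variances]
            pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
            return pooled, 1.0 / sum(weights)

        print(inverse_variance_weighted_mean([1.2, 0.8, 1.0], [0.04, 0.09, 0.01]))
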
J

  • Jaccard index
    Jaccard index
    The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for comparing the similarity and diversity of sample sets...

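     – a short code sketch appears at the end of this section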
  • Jackknife (statistics) — redirects to Resampling (statistics)
    Resampling (statistics)
    In statistics, resampling is any of a variety of methods for doing one of the following: (1) estimating the precision of sample statistics by using subsets of available data or drawing randomly with replacement from a set of data points; (2) exchanging labels on data points when performing significance...

  • Jackson network
  • Jackson's theorem (queueing theory)
  • Jadad scale
    Jadad scale
    The Jadad scale, sometimes known as Jadad scoring or the Oxford quality scoring system, is a procedure to independently assess the methodological quality of a clinical trial...

  • James–Stein estimator
  • Jarque–Bera test
    Jarque–Bera test
    In statistics, the Jarque–Bera test is a goodness-of-fit test of whether sample data have the skewness and kurtosis matching a normal distribution. The test is named after Carlos Jarque and Anil K. Bera...

  • Jeffreys prior
    Jeffreys prior
    In Bayesian probability, the Jeffreys prior, named after Harold Jeffreys, is a non-informative prior distribution on parameter space that is proportional to the square root of the determinant of the Fisher information:...

  • Jensen's inequality
    Jensen's inequality
    In mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906. Given its generality, the inequality appears in many forms depending on the context,...

  • Jensen–Shannon divergence
    Jensen–Shannon divergence
    In probability theory and statistics, the Jensen–Shannon divergence is a popular method of measuring the similarity between two probability distributions. It is also known as information radius or total divergence to the average. It is based on the Kullback–Leibler divergence, with the notable ...

  • JMulTi
    JMulTi
    JMulTi is an open-source interactive software for econometric analysis, specialised in univariate and multivariate time series analysis. It has a Java graphical user interface....

     – software
  • Johansen test
    Johansen test
    In statistics, the Johansen test, named after Søren Johansen, is a procedure for testing cointegration of several I(1) time series. This test permits more than one cointegrating relationship so is more generally applicable than the Engle–Granger test which is based on the Dickey–Fuller test for...

  • Joint probability distribution
  • JMP (statistical software)
    JMP (statistical software)
    JMP is a computer program that was first developed by John Sall and others to perform simple and complex statistical analyses. It dynamically links statistics with graphics to interactively explore, understand, and visualize data...

  • Jump process
    Jump process
    A jump process is a type of stochastic process that has discrete movements, called jumps, rather than small continuous movements.In physics, jump processes result in diffusion...

  • Jump-diffusion model
  • Junction tree algorithm

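    Illustrative sketch (Jaccard index): the Jaccard index entry above describes a statistic for comparing the similarity of sample sets; it is the size of the intersection divided by the size of the union, |A ∩ B| / |A ∪ B|, as in this small sketch (the convention of returning 1.0 for two empty sets is a choice made here, not something asserted by the article).

        def jaccard_index(a, b):
            """Jaccard similarity |A ∩ B| / |A ∪ B| of two finite sets."""
            a, b = set(a), set(b)
            if not a and not b:
                return 1.0   # both empty: treat as identical (a convention)
            return len(a & b) / len(a | b)

        print(jaccard_index({"cat", "dog", "bird"}, {"dog", "bird", "fish"}))   # 0.5
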
K

  • K-distribution
    K-distribution
    The K-distribution is a probability distribution that arises as the consequence of a statistical or probabilistic model used in Synthetic Aperture Radar imagery...

  • K-means algorithm
    K-means algorithm
    In statistics and data mining, k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean...

     redirects to k-means clustering
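     – a compact code sketch appears at the end of this section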
  • K-means++
    K-means++
    In applied statistics, k-means++ is an algorithm for choosing the initial values for the k-means clustering algorithm. It was proposed in 2007 by David Arthur and Sergei Vassilvitskii, as an approximation algorithm for the NP-hard k-means problem—a way of avoiding the sometimes poor...

  • K-medians clustering
    K-medians clustering
    In statistics and machine learning, k-medians clustering is a variation of k-means clustering where instead of calculating the mean for each cluster to determine its centroid, one instead calculates the median...

  • K-medoids
    K-medoids
    The k-medoids algorithm is a clustering algorithm related to the k-means algorithm and the medoidshift algorithm. Both the k-means and k-medoids algorithms are partitional and both attempt to minimize squared error, the distance between points labeled to be in a cluster and a point designated as the...

  • Kalman filter
    Kalman filter
    In statistics, the Kalman filter is a mathematical method named after Rudolf E. Kálmán. Its purpose is to use measurements observed over time, containing noise and other inaccuracies, and produce values that tend to be closer to the true values of the measurements and their associated calculated...

  • Kaplan–Meier estimator
  • Kappa coefficient
  • Kappa statistic
  • Karhunen–Loève theorem
  • Kendall tau distance
    Kendall tau distance
    The Kendall tau distance is a metric that counts the number of pairwise disagreements between two lists. The larger the distance, the more dissimilar the two lists are. Kendall tau distance is also called bubble-sort distance since it is equivalent to the number of swaps that the bubble sort...

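     – a simple code sketch appears at the end of this section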
  • Kendall tau rank correlation coefficient
    Kendall tau rank correlation coefficient
    In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's tau coefficient, is a statistic used to measure the association between two measured quantities...

  • Kendall's notation
    Kendall's notation
    In queueing theory, Kendall's notation is the standard system used to describe and classify the queueing model that a queueing system corresponds to. First suggested by D. G...

  • Kendall's W
    Kendall's W
    Kendall's W is a non-parametric statistic. It is a normalization of the statistic of the Friedman test, and can be used for assessing agreement among raters...

     – Kendall's coefficient of concordance
  • Kent distribution
  • Kernel density estimation
    Kernel density estimation
    In statistics, kernel density estimation is a non-parametric way of estimating the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample...

  • Kernel methods
    Kernel methods
    In computer science, kernel methods are a class of algorithms for pattern analysis, whose best known element is the support vector machine...

  • Kernel principal component analysis
  • Kernel regression
    Kernel regression
    Kernel regression is a non-parametric technique in statistics to estimate the conditional expectation of a random variable. The objective is to find a non-linear relation between a pair of random variables X and Y...

  • Kernel smoother
    Kernel smoother
    A kernel smoother is a statistical technique for estimating a real-valued function f by using its noisy observations, when no parametric model for this function is known...

  • Kernel (statistics)
    Kernel (statistics)
    A kernel is a weighting function used in non-parametric estimation techniques. Kernels are used in kernel density estimation to estimate random variables' density functions, or in kernel regression to estimate the conditional expectation of a random variable. Kernels are also used in time-series,...

  • Khmaladze transformation
    Khmaladze transformation
    The Khmaladze transformation is a statistical tool. Consider the sequence of empirical distribution functions F_n based on a sequence of i.i.d. random variables X_1, …, X_n, as n increases. Suppose F is the hypothetical distribution function of...

     (probability theory)
  • Killed process
    Killed process
    In probability theory — specifically, in stochastic analysis — a killed process is a stochastic process that is forced to assume an undefined or "killed" state at some time.-Definition:...

  • Khintchine inequality
    Khintchine inequality
    In mathematics, the Khintchine inequality, named after Aleksandr Khinchin and spelled in multiple ways in the Roman alphabet, is a theorem from probability, and is also frequently used in analysis...

  • Kingman's formula
  • Kirkwood approximation
    Kirkwood approximation
    The Kirkwood superposition approximation was introduced by Matsuda as a means of representing a discrete probability distribution. The name apparently refers to a 1942 paper by John G. Kirkwood...

  • Kish grid
    Kish grid
    The Kish grid is a method for selecting members within a household to be interviewed. In telephone surveys, the next-birthday method is sometimes preferred to the Kish grid...

  • Kitchen sink regression
    Kitchen sink regression
    A kitchen sink regression is an informal and usually pejorative term for a regression analysis which uses a long list of possible independent variables to attempt to explain variance in a dependent variable. In economics, psychology, and other social sciences, regression analysis is typically used...

  • Knightian uncertainty
    Knightian uncertainty
    In economics, Knightian uncertainty is risk that is immeasurable, not possible to calculate. It is named after University of Chicago economist Frank Knight, who distinguished risk and uncertainty in his work Risk, Uncertainty, and Profit...

  • Kolmogorov backward equation
    Kolmogorov backward equation
    The Kolmogorov backward equation and its adjoint sometimes known as the Kolmogorov forward equation are partial differential equations that arise in the theory of continuous-time continuous-state Markov processes. Both were published by Andrey Kolmogorov in 1931...

  • Kolmogorov continuity theorem
    Kolmogorov continuity theorem
    In mathematics, the Kolmogorov continuity theorem is a theorem that guarantees that a stochastic process that satisfies certain constraints on the moments of its increments will be continuous...

  • Kolmogorov extension theorem
    Kolmogorov extension theorem
    In mathematics, the Kolmogorov extension theorem is a theorem that guarantees that a suitably "consistent" collection of finite-dimensional distributions will define a stochastic process...

  • Kolmogorov’s criterion
    Kolmogorov’s criterion
    In probability theory, Kolmogorov's criterion, named after Andrey Kolmogorov, is a theorem in Markov processes concerning stationary Markov chains...

  • Kolmogorov’s generalized criterion
  • Kolmogorov's inequality
    Kolmogorov's inequality
    In probability theory, Kolmogorov's inequality is a so-called "maximal inequality" that gives a bound on the probability that the partial sums of a finite collection of independent random variables exceed some specified bound...

  • Kolmogorov's zero-one law
    Kolmogorov's zero-one law
    In probability theory, Kolmogorov's zero-one law, named in honor of Andrey Nikolaevich Kolmogorov, specifies that a certain type of event, called a tail event, will either almost surely happen or almost surely not happen; that is, the probability of such an event occurring is zero or one. Tail...

  • Kolmogorov–Smirnov test
  • KPSS test
    KPSS test
    In econometrics, Kwiatkowski–Phillips–Schmidt–Shin tests are used for testing a null hypothesis that an observable time series is stationary around a deterministic trend. Such models were proposed in 1982 by Alok Bhargava in his Ph.D. thesis where several John von Neumann or Durbin–Watson type...

  • Kriging
    Kriging
    Kriging is a group of geostatistical techniques to interpolate the value of a random field at an unobserved location from observations of its value at nearby locations....

  • Kruskal–Wallis one-way analysis of variance
  • Kuder-Richardson Formula 20
    Kuder-Richardson Formula 20
    In statistics, the Kuder-Richardson Formula 20 first published in 1937 is a measure of internal consistency reliability for measures with dichotomous choices. It is analogous to Cronbach's α, except Cronbach's α is also used for non-dichotomous measures...

  • Kuiper's test
    Kuiper's test
    Kuiper's test is used in statistics to test whether a given distribution, or family of distributions, is contradicted by evidence from a sample of data. It is named after Dutch mathematician Nicolaas Kuiper.

  • Kullback's inequality
    Kullback's inequality
    In information theory and statistics, Kullback's inequality is a lower bound on the Kullback–Leibler divergence expressed in terms of the large deviations rate function. If P and Q are probability distributions on the real line, such that P is absolutely continuous with respect to Q, i.e...

  • Kullback–Leibler divergence
    Kullback–Leibler divergence
    In probability theory and information theory, the Kullback–Leibler divergence is a non-symmetric measure of the difference between two probability distributions P and Q...

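    As an illustrative aside (not part of the original list entry), a minimal pure-Python sketch of the discrete Kullback–Leibler divergence D(P||Q) = Σ p_i log(p_i / q_i); the example distributions are made up for illustration:

      import math

      def kl_divergence(p, q):
          """Discrete KL divergence D(P||Q); assumes p[i] > 0 implies q[i] > 0."""
          return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

      p = [0.5, 0.3, 0.2]
      q = [0.4, 0.4, 0.2]
      print(kl_divergence(p, q))   # positive, and not equal to kl_divergence(q, p): the measure is non-symmetric
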
  • Kumaraswamy distribution
    Kumaraswamy distribution
    In probability and statistics, the Kumaraswamy's double bounded distribution is a family of continuous probability distributions defined on the interval [0,1] differing in the values of their two non-negative shape parameters, a and b....

  • Kurtosis
    Kurtosis
    In probability theory and statistics, kurtosis is any measure of the "peakedness" of the probability distribution of a real-valued random variable...

  • Kushner equation
    Kushner equation
    In filtering theory the Kushner equation is an equation for the conditional probability density of the state of a stochastic non-linear dynamical system, given noisy measurements of the state. It therefore provides the solution of the nonlinear filtering problem in estimation theory...


L

  • L-estimator
  • L-moment
    L-moment
    In statistics, L-moments are statistics used to summarize the shape of a probability distribution. They are analogous to conventional moments in that they can be used to calculate quantities analogous to standard deviation, skewness and kurtosis, termed the L-scale, L-skewness and L-kurtosis...

  • Labour Force Survey
    Labour Force Survey
    Labour Force Surveys are statistical surveys conducted in a number of countries designed to capture data about the labour market. All European Union member states are required to conduct a Labour Force Survey annually. Labour Force Surveys are also carried out in some non-EU countries. They are...

  • Lack-of-fit sum of squares
    Lack-of-fit sum of squares
    In statistics, a sum of squares due to lack of fit, or more tersely a lack-of-fit sum of squares, is one of the components of a partition of the sum of squares in an analysis of variance, used in the numerator in an F-test of the null hypothesis that says that a proposed model fits well...

  • Lady tasting tea
    Lady tasting tea
    In the design of experiments in statistics, the lady tasting tea is a famous randomized experiment devised by Ronald A. Fisher and reported in his book Statistical methods for research workers . The lady in question was Dr...

  • Lag operator
  • Lag windowing
    Lag windowing
    Lag windowing is a technique that consists of windowing the auto-correlation coefficients prior to estimating linear prediction coefficients. The windowing in the auto-correlation domain has the same effect as a convolution in the power spectral domain and helps stabilize the result of the...

  • Lambda distribution — disambiguation
  • Landau distribution
  • Lander–Green algorithm
    Lander–Green algorithm
    The Lander–Green algorithm is an algorithm, due to Eric Lander and Philip Green for computing the likelihood of observed genotype data given a pedigree. It is appropriate for relatively small pedigrees and a large number of markers. It is used in the analysis of genetic linkage....

  • Language model
    Language model
    A statistical language model assigns a probability P(w1, ..., wm) to a sequence of m words by means of a probability distribution. Language modeling is used in many natural language processing applications such as speech recognition, machine translation, part-of-speech tagging, parsing and information...

  • Laplace distribution
  • Laplace principle (large deviations theory)
    Laplace principle (large deviations theory)
    In mathematics, Laplace's principle is a basic theorem in large deviations theory, similar to Varadhan's lemma. It gives an asymptotic expression for the Lebesgue integral of exp over a fixed set A as θ becomes large...

  • Large deviations theory
    Large deviations theory
    In probability theory, the theory of large deviations concerns the asymptotic behaviour of remote tails of sequences of probability distributions. Some basic ideas of the theory can be tracked back to Laplace and Cramér, although a clear unified formal definition was introduced in 1966 by Varadhan...

  • Large deviations of Gaussian random functions
    Large deviations of Gaussian random functions
    A random function – of either one variable, or two or more variables – is called Gaussian if every finite-dimensional distribution is a multivariate normal distribution. Gaussian random fields on the sphere are useful when analysing the anomalies in the cosmic microwave background...

  • LARS — see least-angle regression
    Least-angle regression
    In statistics, least-angle regression is a regression algorithm for high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani....

  • Latent variable
    Latent variable
    In statistics, latent variables are variables that are not directly observed but are rather inferred from other variables that are observed. Mathematical models that aim to explain observed variables in terms of latent variables are called latent variable models...

  • Latent variable model
    Latent variable model
    A latent variable model is a statistical model that relates a set of variables to a set of latent variables.It is assumed that 1) the responses on the indicators or manifest variables are the result of...

  • Latent class model
    Latent class model
    In statistics, a latent class model relates a set of observed discrete multivariate variables to a set of latent variables. It is a type of latent variable model. It is called a latent class model because the latent variable is discrete...

  • Latent Dirichlet allocation
    Latent Dirichlet allocation
    In statistics, latent Dirichlet allocation is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar...

  • Latent growth modeling
    Latent growth modeling
    Latent growth modeling is a statistical technique used in the structural equation modeling framework to estimate growth trajectory. It is a longitudinal analysis technique to estimate growth over a period of time. It is widely used in the field of behavioral science, education and social science. ...

  • Latent semantic analysis
    Latent semantic analysis
    Latent semantic analysis is a technique in natural language processing, in particular in vectorial semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close...

  • Latin rectangle
    Latin rectangle
    In combinatorial mathematics, a Latin rectangle is an r × n matrix that has the numbers 1, 2, 3, ..., n as its entries with no number occurring more than once in any row or column where r ≤ n. An n × n Latin rectangle is called a...

  • Latin square
    Latin square
    In combinatorics and in experimental design, a Latin square is an n × n array filled with n different symbols, each occurring exactly once in each row and exactly once in each column...

  • Latin hypercube sampling
    Latin hypercube sampling
    Latin hypercube sampling is a statistical method for generating a distribution of plausible collections of parameter values from a multidimensional distribution. The sampling method is often applied in uncertainty analysis....

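    As an illustrative aside, a minimal sketch of Latin hypercube sampling on the unit hypercube: each dimension is split into n equal strata, one point is drawn in each stratum, and the strata are shuffled independently per dimension. The function name and defaults are this sketch's own, not a library API:

      import random

      def latin_hypercube(n, d, rng=random.Random(0)):
          """Return n points in [0,1]^d with exactly one point in each of n equal strata per dimension."""
          columns = []
          for _ in range(d):
              # one uniform draw inside each of the n strata, then shuffle the strata order
              column = [(i + rng.random()) / n for i in range(n)]
              rng.shuffle(column)
              columns.append(column)
          # transpose: a list of n points, each of dimension d
          return list(zip(*columns))

      for point in latin_hypercube(5, 2):
          print(point)
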
  • Law (stochastic processes)
    Law (stochastic processes)
    In mathematics, the law of a stochastic process is the measure that the process induces on the collection of functions from the index set into the state space...

  • Law of averages
    Law of averages
    The law of averages is a lay term used to express a belief that outcomes of a random event will "even out" within a small sample. As invoked in everyday life, the "law" usually reflects bad statistics or wishful thinking rather than any mathematical principle...

  • Law of comparative judgment
    Law of comparative judgment
    The law of comparative judgment was conceived by L. L. Thurstone. In modern day terminology, it is more aptly described as a model that is used to obtain measurements from any process of pairwise comparison...

  • Law of large numbers
    Law of large numbers
    In probability theory, the law of large numbers is a theorem that describes the result of performing the same experiment a large number of times...

  • Law of the iterated logarithm
    Law of the iterated logarithm
    In probability theory, the law of the iterated logarithm describes the magnitude of the fluctuations of a random walk. The original statement of the law of the iterated logarithm is due to A. Y. Khinchin . Another statement was given by A.N...

  • Law of the unconscious statistician
    Law of the unconscious statistician
    In probability theory and statistics, the law of the unconscious statistician is a theorem used to calculate the expected value of a function g of a random variable X when one knows the probability distribution of X but one does not explicitly know the distribution of g(X). The form of the law can...

  • Law of total covariance
    Law of total covariance
    In probability theory, the law of total covariance or covariance decomposition formula states that if X, Y, and Z are random variables on the same probability space, and the covariance of X and Y is finite, then...

  • Law of total cumulance
    Law of total cumulance
    In probability theory and mathematical statistics, the law of total cumulance is a generalization to cumulants of the law of total probability, the law of total expectation, and the law of total variance. It has applications in the analysis of time series...

  • Law of total expectation
    Law of total expectation
    The proposition in probability theory known as the law of total expectation, the law of iterated expectations, the tower rule, or the smoothing theorem, among other names, states that if X is an integrable random variable and Y is any random variable on the same probability space, then E(X) = E(E(X | Y))...

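    As an illustrative aside, a small numerical check of the tower rule E(X) = E(E(X | Y)) on a made-up discrete joint distribution:

      # Joint pmf of (X, Y) over a small finite set; the numbers are chosen only for illustration.
      joint = {(0, 'a'): 0.1, (1, 'a'): 0.3, (0, 'b'): 0.2, (1, 'b'): 0.4}

      ex_direct = sum(x * p for (x, _), p in joint.items())

      # E[E[X | Y]]: condition on Y, average X within each slice, then weight by P(Y = y).
      ex_iterated = 0.0
      for y0 in {y for (_, y) in joint}:
          p_y = sum(p for (_, y), p in joint.items() if y == y0)
          e_x_given_y = sum(x * p for (x, y), p in joint.items() if y == y0) / p_y
          ex_iterated += e_x_given_y * p_y

      print(ex_direct, ex_iterated)   # both equal 0.7
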
  • Law of total probability
    Law of total probability
    In probability theory, the law of total probability is a fundamental rule relating marginal probabilities to conditional probabilities. It is the proposition that if {Bn : n = 1, 2, 3, ...} is a finite or countably infinite partition of a sample space, then for any event A, P(A) = Σn P(A | Bn) P(Bn).

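    As an illustrative aside, a tiny worked example of the rule P(A) = Σn P(A | Bn) P(Bn), using made-up numbers for a three-set partition:

      # Partition B1, B2, B3 of the sample space, with P(Bn) and P(A | Bn) given (illustrative numbers).
      p_b = [0.5, 0.3, 0.2]
      p_a_given_b = [0.1, 0.4, 0.9]

      p_a = sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))
      print(p_a)   # 0.05 + 0.12 + 0.18 = 0.35
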
  • Law of total variance
    Law of total variance
    In probability theory, the law of total variance or variance decomposition formula states that if X and Y are random variables on the same probability space, and the variance of Y is finite, then...

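    As an illustrative aside, a numerical check of the decomposition Var(Y) = E[Var(Y | X)] + Var(E[Y | X]) on a made-up discrete joint distribution:

      # Small discrete joint distribution of (X, Y); the values are illustrative only.
      joint = {(0, 1): 0.2, (0, 3): 0.3, (1, 2): 0.1, (1, 6): 0.4}

      def e(f):
          return sum(f(x, y) * p for (x, y), p in joint.items())

      var_y = e(lambda x, y: y ** 2) - e(lambda x, y: y) ** 2

      within, between_terms = 0.0, []
      for x0 in {x for (x, _) in joint}:
          p_x = sum(p for (x, _), p in joint.items() if x == x0)
          ey = sum(y * p for (x, y), p in joint.items() if x == x0) / p_x
          ey2 = sum(y ** 2 * p for (x, y), p in joint.items() if x == x0) / p_x
          within += (ey2 - ey ** 2) * p_x          # contributes to E[Var(Y | X)]
          between_terms.append((ey, p_x))
      mean_ey = sum(ey * p for ey, p in between_terms)
      between = sum((ey - mean_ey) ** 2 * p for ey, p in between_terms)   # Var(E[Y | X])

      print(var_y, within + between)   # equal up to floating-point rounding (both 4.01)
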
  • Law of Truly Large Numbers
    Law of Truly Large Numbers
    The law of truly large numbers, attributed to Persi Diaconis and Frederick Mosteller, states that with a sample size large enough, any outrageous thing is likely to happen. Because we never find it notable when likely events occur, we highlight unlikely events and notice them more...

  • Layered hidden Markov model
    Layered hidden Markov model
    The layered hidden Markov model is a statistical model derived from the hidden Markov model. A layered hidden Markov model consists of N levels of HMMs, where the HMMs on level i + 1 correspond to observation symbols or probability generators at level i. Every level i of the LHMM...

  • Le Cam's theorem
    Le Cam's theorem
    In probability theory, Le Cam's theorem, named after Lucien Le Cam, is as follows. Suppose X1, ..., Xn are independent random variables, each with a Bernoulli distribution (not necessarily identically distributed), with Pr(Xi = 1) = pi for i = 1, 2, ..., n; let λn = p1 + ... + pn and Sn = X1 + ...

  • Lead time bias
    Lead time bias
    Lead time is the length of time between the detection of a disease and its usual clinical presentation and diagnosis ....

  • Least absolute deviations
    Least absolute deviations
    Least absolute deviations , also known as Least Absolute Errors , Least Absolute Value , or the L1 norm problem, is a mathematical optimization technique similar to the popular least squares technique that attempts to find a function which closely approximates a set of data...

  • Least-angle regression
    Least-angle regression
    In statistics, least-angle regression is a regression algorithm for high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani....

  • Least squares
    Least squares
    The method of least squares is a standard approach to the approximate solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns. "Least squares" means that the overall solution minimizes the sum of the squares of the errors made in solving every...

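    As an illustrative aside, a minimal sketch of solving an overdetermined linear system in the least-squares sense; it assumes NumPy is available and uses np.linalg.lstsq, which minimises the sum of squared residuals. The data are made up for illustration:

      import numpy as np

      # Overdetermined system: more equations (rows) than unknowns (columns).
      A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])   # intercept-plus-slope design
      b = np.array([1.1, 1.9, 3.2, 3.9])

      # Find x minimising ||A x - b||^2.
      x, residual_ss, rank, _ = np.linalg.lstsq(A, b, rcond=None)
      print(x)             # fitted intercept and slope
      print(residual_ss)   # sum of squared residuals
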
  • Least-squares spectral analysis
    Least-squares spectral analysis
    Least-squares spectral analysis is a method of estimating a frequency spectrum, based on a least squares fit of sinusoids to data samples, similar to Fourier analysis...

  • Least squares support vector machine
    Least squares support vector machine
    Least squares support vector machines are least squares versions of support vector machines , which are a set of related supervised learning methods that analyze data and recognize patterns, and which are used for classification and regression analysis...

  • Least trimmed squares
    Least Trimmed Squares
    Least trimmed squares , or least trimmed sum of squares, is a robust statistical method that attempts to fit a function to a set of data whilst not being unduly affected by the presence of outliers...

  • Learning theory (statistics)
  • Leftover hash-lemma
    Leftover hash-lemma
    The leftover hash lemma is a lemma in cryptography first stated by Russell Impagliazzo, Leonid Levin, and Michael Luby.Imagine that you have a secret key X that has n uniform random bits, and you would like to use this secret key to encrypt a message. Unfortunately, you were a bit careless with the...

  • Lehmann–Scheffé theorem
    Lehmann–Scheffé theorem
    In statistics, the Lehmann–Scheffé theorem is prominent in mathematical statistics, tying together the ideas of completeness, sufficiency, uniqueness, and best unbiased estimation...

  • Length time bias
    Length time bias
    Length time bias is a form of selection bias, a statistical distortion of results which can lead to incorrect conclusions about the data. Length time bias can occur when the lengths of intervals are analysed by selecting intervals that occupy randomly chosen points in time or space...

  • Levene's test
    Levene's test
    In statistics, Levene's test is an inferential statistic used to assess the equality of variances in different samples. Some common statistical procedures assume that variances of the populations from which different samples are drawn are equal. Levene's test assesses this assumption. It tests the...

  • Level of measurement
    Level of measurement
    The "levels of measurement", or scales of measure are expressions that typically refer to the theory of scale types developed by the psychologist Stanley Smith Stevens. Stevens proposed his theory in a 1946 Science article titled "On the theory of scales of measurement"...

  • Levenberg–Marquardt algorithm
  • Leverage (statistics)
    Leverage (statistics)
    In statistics, leverage is a term used in connection with regression analysis and, in particular, in analyses aimed at identifying those observations that are far away from corresponding average predictor values...

  • Levey–Jennings chart — redirects to Laboratory quality control
    Laboratory quality control
    Laboratory quality control is designed to detect, reduce, and correct deficiencies in a laboratory's internal analytical process prior to the release of patient results and improve the quality of the results reported by the laboratory. Quality control is a measure of precision or how well the...

  • Lévy's convergence theorem
  • Lévy's continuity theorem
    Lévy's continuity theorem
    In probability theory, the Lévy’s continuity theorem, named after the French mathematician Paul Lévy, connects convergence in distribution of the sequence of random variables with pointwise convergence of their characteristic functions...

  • Lévy arcsine law
    Lévy arcsine law
    In probability theory, the Lévy arcsine law, found by Paul Lévy, states that the probability distribution of the proportion of the time that a Wiener process is positive is a random variable whose probability distribution is the arcsine distribution...

  • Lévy distribution
  • Lévy flight
    Lévy flight
    A Lévy flight is a random walk in which the step-lengths have a probability distribution that is heavy-tailed. When defined as a walk in a space of dimension greater than one, the steps made are in isotropic random directions...

  • Lévy process
    Lévy process
    In probability theory, a Lévy process, named after the French mathematician Paul Lévy, is any continuous-time stochastic process that starts at 0, admits càdlàg modification and has "stationary independent increments" — this phrase will be explained below...

  • Lewontin's Fallacy
    Lewontin's Fallacy
    Human genetic diversity: Lewontin's fallacy is a 2003 paper by A. W. F. Edwards that refers to an argument first made by Richard Lewontin in his 1972 article The apportionment of human diversity, which argued that race for humans is not a valid taxonomic construct. Edwards' paper criticized and...

  • Lexis diagram
    Lexis diagram
    In demography a Lexis diagram is a two dimensional diagram that is used to represent events that occur to individuals belonging to different cohorts...

  • Lexis ratio
    Lexis ratio
    The Lexis ratio is used in statistics as a measure which seeks to evaluate differences between the statistical properties of random mechanisms where the outcome is two-valued — for example "success" or "failure", "win" or "lose"...

  • Lies, damned lies, and statistics
    Lies, damned lies, and statistics
    "Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments...

  • Life expectancy
    Life expectancy
    Life expectancy is the expected number of years of life remaining at a given age. It is denoted by ex, which means the average number of subsequent years of life for someone now aged x, according to a particular mortality experience...

  • Life table
    Life table
    In actuarial science, a life table is a table which shows, for each age, what the probability is that a person of that age will die before his or her next birthday...

  • Lift (data mining)
    Lift (data mining)
    In data mining, lift is a measure of the performance of a model at predicting or classifying cases, measuring against a random choice model.For example, suppose a population has a predicted response rate of 5%, but a certain model has identified a segment with a predicted response rate of 20%...

  • Likelihood function
    Likelihood function
    In statistics, a likelihood function is a function of the parameters of a statistical model, defined as follows: the likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values...

  • Likelihood principle
    Likelihood principle
    In statistics, the likelihood principle is a controversial principle of statistical inference which asserts that all of the information in a sample is contained in the likelihood function...

  • Likelihood-ratio test
    Likelihood-ratio test
    In statistics, a likelihood ratio test is a statistical test used to compare the fit of two models, one of which is a special case of the other . The test is based on the likelihood ratio, which expresses how many times more likely the data are under one model than the other...

  • Likelihood ratios in diagnostic testing
    Likelihood ratios in diagnostic testing
    In evidence-based medicine, likelihood ratios are used for assessing the value of performing a diagnostic test. They use the sensitivity and specificity of the test to determine whether a test result usefully changes the probability that a condition exists.-Calculation:Two versions of the...

  • Likert scale
    Likert scale
    A Likert scale is a psychometric scale commonly involved in research that employs questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, or more accurately the Likert-type scale, even though...

  • Lilliefors test
    Lilliefors test
    In statistics, the Lilliefors test, named after Hubert Lilliefors, professor of statistics at George Washington University, is an adaptation of the Kolmogorov–Smirnov test...

  • Limited dependent variable
    Limited dependent variable
    A limited dependent variable is a variable whose range of possible values is "restricted in some important way." In econometrics, the term is often used when estimation of the relationship between the limited dependent variable...

  • Limiting density of discrete points
    Limiting density of discrete points
    In information theory, the limiting density of discrete points is an adjustment to the formula of Claude Elwood Shannon for differential entropy. It was formulated by Edwin Thompson Jaynes to address defects in the initial definition of differential entropy...

  • Lincoln index
    Lincoln Index
    The Lincoln index is a statistical measure used in several fields to estimate the number of cases that have not yet been observed, based on two independent sets of observed cases. It is also sometimes known as the Lincoln–Petersen method...

  • Lindeberg's condition
    Lindeberg's condition
    In probability theory, Lindeberg's condition is a sufficient condition for the central limit theorem to hold for a sequence of independent random variables...

  • Lindley equation
    Lindley equation
    In probability theory, the Lindley equation, Lindley recursion or Lindley process is a discrete-time stochastic process A_n, where n takes integer values, and...

  • Lindley's paradox
    Lindley's paradox
    Lindley's paradox is a counterintuitive situation in statistics in which the Bayesian and frequentist approaches to a hypothesis testing problem give opposite results for certain choices of the prior distribution...

  • Line chart
    Line chart
    A line chart or line graph is a type of graph, which displays information as a series of data points connected by straight line segments. It is a basic type of chart common in many fields. It is an extension of a scatter graph, and is created by connecting a series of points that represent...

  • Line-intercept sampling
    Line-intercept sampling
    In statistics, line-intercept sampling is a method of sampling elements in a region whereby an element is sampled if a chosen line segment, called a “transect”, intersects the element ....

  • Linear classifier
    Linear classifier
    In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics...

  • Linear discriminant analysis
    Linear discriminant analysis
    Linear discriminant analysis and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events...

  • Linear least squares — disambiguation
  • Linear least squares (mathematics)
  • Linear model
    Linear model
    In statistics, the term linear model is used in different ways according to the context. The most common occurrence is in connection with regression models and the term is often taken as synonymous with linear regression model. However the term is also used in time series analysis with a different...

  • Linear prediction
    Linear prediction
    Linear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples....

  • Linear probability model
  • Linear regression
    Linear regression
    In statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more explanatory variables denoted X. The case of one explanatory variable is called simple regression...

  • Linguistic demography
  • LISREL
    LISREL
    LISREL, an acronym for linear structural relations, is a statistical software package used in structural equation modeling. LISREL was developed in the 1970s by Karl Jöreskog, then a scientist at Educational Testing Service in Princeton, NJ, and Dag Sörbom, later both professors of Uppsala University,...

     — proprietary statistical software package
  • List of basic statistics topics — redirects to Outline of statistics
  • List of convolutions of probability distributions
  • List of graphical methods
  • List of information graphics software
  • List of probability topics
  • List of random number generators
  • List of scientific journals in statistics
  • List of statistical packages
  • List of statisticians
  • Listwise deletion
  • Little's law
    Little's law
    In the mathematical theory of queues, Little's result, theorem, lemma, law or formula says that the long-term average number of customers in a stable system, L, equals the long-term average arrival rate, λ, multiplied by the average time a customer spends in the system, W; that is, L = λW. It is a restatement of the Erlang formula, based on the work of Danish mathematician Agner Krarup Erlang...

  • Littlewood's law
    Littlewood's law
    Littlewood's Law states that individuals can expect a "miracle" to happen to them at the rate of about one per month. The law was framed by Cambridge University Professor J. E...

  • Ljung–Box test
    Ljung–Box test
    The Ljung–Box test is a type of statistical test of whether any of a group of autocorrelations of a time series are different from zero...

  • Local convex hull
  • Local independence
    Local independence
    Local independence is the underlying assumption of latent variable models. The observed items are conditionally independent of each other given an individual score on the latent variable. This means that the latent variable explains why the observed items are related to one another...

  • Local martingale
    Local martingale
    In mathematics, a local martingale is a type of stochastic process, satisfying the localized version of the martingale property. Every martingale is a local martingale; every bounded local martingale is a martingale; however, in general a local martingale is not a martingale, because its...

  • Local regression
    Local regression
    LOESS, or LOWESS , is one of many "modern" modeling methods that build on "classical" methods, such as linear and nonlinear least squares regression. Modern regression methods are designed to address situations in which the classical procedures do not perform well or cannot be effectively applied...

  • Location estimation redirects to Location parameter
    Location parameter
    In statistics, a location family is a class of probability distributions that is parametrized by a scalar- or vector-valued parameter μ, which determines the "location" or shift of the distribution...

  • Location estimation in sensor networks
    Location estimation in sensor networks
    Location estimation in wireless sensor networks is the problem of estimating the location of an object from a set of noisy measurements, when the measurements are acquired in a distributed manner by a set of sensors...

  • Location parameter
    Location parameter
    In statistics, a location family is a class of probability distributions that is parametrized by a scalar- or vector-valued parameter μ, which determines the "location" or shift of the distribution...

  • Location test
    Location test
    A location test is a statistical hypothesis test that compares the location parameter of a statistical population to a given constant, or that compares the location parameters of two statistical populations to each other...

  • Location-scale family
    Location-scale family
    In probability theory, especially as that field is used in statistics, a location-scale family is a family of univariate probability distributions parametrized by a location parameter and a non-negative scale parameter; if X is any random variable whose probability distribution belongs to such a...

  • Local asymptotic normality
    Local asymptotic normality
    In statistics, local asymptotic normality is a property of a sequence of statistical models, which allows this sequence to be asymptotically approximated by a normal location model, after a rescaling of the parameter...

  • Locality (statistics)
  • Loess curve redirects to Local regression
    Local regression
    LOESS, or LOWESS , is one of many "modern" modeling methods that build on "classical" methods, such as linear and nonlinear least squares regression. Modern regression methods are designed to address situations in which the classical procedures do not perform well or cannot be effectively applied...

  • Log-Cauchy distribution
    Log-Cauchy distribution
    In probability theory, a log-Cauchy distribution is a probability distribution of a random variable whose logarithm is distributed in accordance with a Cauchy distribution...

  • Log-Laplace distribution
    Log-Laplace distribution
    In probability theory and statistics, the log-Laplace distribution is the probability distribution of a random variable whose logarithm has a Laplace distribution. If X has a Laplace distribution with parameters μ and b, then Y = eX has a log-Laplace distribution...

  • Log-normal distribution
  • Log-linear model
  • Log-linear modeling
  • Log-log graph
    Log-log graph
    In science and engineering, a log-log graph or log-log plot is a two-dimensional graph of numerical data that uses logarithmic scales on both the horizontal and vertical axes...

  • Log-logistic distribution
    Log-logistic distribution
    In probability and statistics, the log-logistic distribution is a continuous probability distribution for a non-negative random variable. It is used in survival analysis as a parametric model for events whose rate increases initially and decreases later, for example mortality from cancer following...

  • Logarithmic distribution
  • Logarithmic mean
    Logarithmic mean
    In mathematics, the logarithmic mean is a function of two non-negative numbers which is equal to their difference divided by the logarithm of their quotient...

  • Logistic distribution
  • Logistic function
    Logistic function
    A logistic function or logistic curve is a common sigmoid curve, given its name in 1844 or 1845 by Pierre François Verhulst who studied it in relation to population growth. It can model the "S-shaped" curve of growth of some population P...

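    As an illustrative aside, a minimal sketch of the logistic curve L / (1 + exp(-k (x - x0))); the parameter defaults give the standard sigmoid and are chosen only for illustration:

      import math

      def logistic(x, L=1.0, k=1.0, x0=0.0):
          """General logistic curve L / (1 + exp(-k (x - x0))); defaults give the standard sigmoid."""
          return L / (1.0 + math.exp(-k * (x - x0)))

      for x in (-4, -2, 0, 2, 4):
          print(x, round(logistic(x), 4))   # S-shaped: values rise from near 0 to near 1
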
  • Logistic regression
    Logistic regression
    In statistics, logistic regression is used for prediction of the probability of occurrence of an event by fitting data to a logistic curve. It is a generalized linear model used for binomial regression...

  • Logit
    Logit
    The logit function is the inverse of the sigmoidal "logistic" function used in mathematics, especially in statistics. Log-odds and logit are synonyms. The logit of a number p between 0 and 1 is given by the formula logit(p) = log(p / (1 - p))...

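    As an illustrative aside, a minimal sketch of the log-odds formula logit(p) = log(p / (1 - p)) and of the fact that it inverts the logistic function:

      import math

      def logit(p):
          """Log-odds of a probability p in (0, 1)."""
          return math.log(p / (1.0 - p))

      def logistic(x):
          return 1.0 / (1.0 + math.exp(-x))

      p = 0.8
      print(logit(p))             # log(4) ≈ 1.386
      print(logistic(logit(p)))   # recovers 0.8, so logit is the inverse of the logistic function
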
  • Logit analysis in marketing
    Logit analysis in marketing
    Logit analysis is a statistical technique used by marketers to assess the scope of customer acceptance of a product, particularly a new product. It attempts to determine the intensity or magnitude of customers' purchase intentions and translates that into a measure of actual buying behaviour...

  • Logit-normal distribution
  • Lognormal distribution
  • Logrank test
    Logrank test
    In statistics, the logrank test is a hypothesis test to compare the survival distributions of two samples. It is a nonparametric test and appropriate to use when the data are right skewed and censored...

  • Lomax distribution
  • Long-range dependency
    Long-range dependency
    Long-range dependency is a phenomenon that may arise in the analysis of spatial or time series data. It relates to the rate of decay of statistical dependence, with the implication that this decays more slowly than an exponential decay, typically a power-like decay...

  • Long Tail
    Long tail
    Long tail may refer to:*The Long Tail, a consumer demographic in business*Power law's long tail, a statistics term describing certain kinds of distribution*Long-tail boat, a type of watercraft native to Southeast Asia...

  • Long-tail traffic
    Long-tail traffic
    Long-tail traffic analysis covers a range of tools from different disciplines that may be used in the important science of determining the probability of rare events...

  • Longitudinal study
    Longitudinal study
    A longitudinal study is a correlational research study that involves repeated observations of the same variables over long periods of time — often many decades. It is a type of observational study. Longitudinal studies are often used in psychology to study developmental trends across the...

  • Lorenz curve
    Lorenz curve
    In economics, the Lorenz curve is a graphical representation of the cumulative distribution function of the empirical probability distribution of wealth; it is a graph showing the proportion of the distribution assumed by the bottom y% of the values...

  • Loss function
    Loss function
    In statistics and decision theory a loss function is a function that maps an event onto a real number intuitively representing some "cost" associated with the event. Typically it is used for parameter estimation, and the event in question is some function of the difference between estimated and...

  • Lot quality assurance sampling
    Lot Quality Assurance Sampling
    Lot quality assurance sampling is a simple, low-cost random sampling methodology developed in the 1920s to control the quality of output in industrial production processes....

  • Lotka's law
  • Low birth weight paradox
    Low birth weight paradox
    The low birth weight paradox is an apparently paradoxical observation relating to the birth weights and mortality of children born to tobacco smoking mothers. Low birth weight children born to smoking mothers have a lower infant mortality rate than the low birth weight children of non-smokers...

  • Lucia de Berk
    Lucia de Berk
    Lucia de Berk, often called Lucia de B. or Lucy de B, is a Dutch licensed paediatric nurse who was the subject of a miscarriage of justice. She was sentenced to life imprisonment in 2003 for four murders and three attempted murders of patients in her care...

     – prob/stats related court case
  • Lukacs's proportion-sum independence theorem
    Lukacs's proportion-sum independence theorem
    In statistics, Lukacs's proportion-sum independence theorem is a result that is used when studying proportions, in particular the Dirichlet distribution...

  • Lumpability
    Lumpability
    In probability theory, lumpability is a method for reducing the size of the state space of some continuous-time Markov chains, first published by Kemeny and Snell...

  • Lusser's law
    Lusser's Law
    Lusser's law, named after Robert Lusser, is a prediction of reliability. It is also called the "probability product law of series components". It states that the reliability of a series system is equal to the product of the reliability of its component subsystems, if their...

  • Lyapunov's central limit theorem

M

  • M/G/1 model
  • M/M/1 model
    M/M/1 model
    In queueing theory, a discipline within the mathematical theory of probability, an M/M/1 queue represents the queue length in a system having a single server, where arrivals are determined by a Poisson process and job service times have an exponential distribution. The model name is written in...

  • M/M/c model
    M/M/c model
    In the mathematical theory of random processes, the M/M/c queue is a multi-server queue model. It is a generalisation of the M/M/1 queue.Following Kendall's notation it indicates a system where:*Arrivals are a Poisson process...

  • M-estimator
    M-estimator
    In statistics, M-estimators are a broad class of estimators, which are obtained as the minima of sums of functions of the data. Least-squares estimators and many maximum-likelihood estimators are M-estimators. The definition of M-estimators was motivated by robust statistics, which contributed new...

    • Redescending M-estimator
      Redescending M-estimator
      In statistics, Redescending M-estimators are Ψ-type M-estimators which have Ψ functions that are non-decreasing near the origin, but decreasing toward 0 far from the origin...

  • M-separation
    M-separation
    In statistics, m-separation is a measure of disconnectedness in ancestral graphs and a generalization of d-separation for directed acyclic graphs. It is the opposite of m-connectedness....

  • Machine learning
    Machine learning
    Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases...

  • Mahalanobis distance
    Mahalanobis distance
    In statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on correlations between variables by which different patterns can be identified and analyzed. It gauges similarity of an unknown sample set to a known one. It differs from Euclidean...

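    As an illustrative aside, a minimal sketch of the Mahalanobis distance sqrt((x - mu)^T S^-1 (x - mu)); it assumes NumPy is available, and the data points are made up for illustration:

      import numpy as np

      def mahalanobis(x, mean, cov):
          """Mahalanobis distance of point x from a distribution with the given mean and covariance."""
          diff = np.asarray(x) - np.asarray(mean)
          return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

      data = np.array([[1.0, 2.0], [2.0, 2.5], [3.0, 4.0], [4.0, 4.5], [5.0, 6.0]])
      mean = data.mean(axis=0)
      cov = np.cov(data, rowvar=False)   # sample covariance, columns as variables
      print(mahalanobis([4.0, 3.0], mean, cov))
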
  • Main effect
    Main effect
    In the design of experiments and analysis of variance, a main effect is the effect of an independent variable on a dependent variable averaging across the levels of any other independent variables...

  • Mallows' Cp
    Mallows' Cp
    In statistics, Mallows' Cp, named for Colin L. Mallows, is used to assess the fit of a regression model that has been estimated using ordinary least squares. It is applied in the context of model selection, where a number of predictor variables are available for predicting some outcome, and the...

  • MANCOVA
    MANCOVA
    Multivariate analysis of covariance is an extension of analysis of covariance methods to cover cases where there is more than one dependent variable and where the dependent variables cannot simply be combined....

  • Manhattan plot
    Manhattan plot
    A Manhattan plot is a type of scatter plot, usually used to display data with a large number of data-points - many of non-zero amplitude, and with a distribution of higher-magnitude values, for instance in genome-wide association studies...

  • Mann–Whitney U
  • MANOVA
    MANOVA
    Multivariate analysis of variance is a generalized form of univariate analysis of variance . It is used when there are two or more dependent variables. It helps to answer : 1. do changes in the independent variable have significant effects on the dependent variables; 2. what are the interactions...

  • Mantel test
    Mantel test
    The Mantel test, named after Nathan Mantel, is a statistical test of the correlation between two matrices. The matrices must be of the same rank; in most applications, they are matrices of interrelations between the same vectors of objects...

  • MAP estimator — redirects to Maximum a posteriori estimation
  • Marchenko–Pastur distribution
    Marchenko–Pastur distribution
    In random matrix theory, the Marchenko–Pastur distribution, or Marchenko–Pastur law, describes the asymptotic behavior of singular values of large rectangular random matrices...

  • Marcinkiewicz–Zygmund inequality
    Marcinkiewicz–Zygmund inequality
    In mathematics, the Marcinkiewicz–Zygmund inequality, named after Józef Marcinkiewicz and Antoni Zygmund, gives relations between moments of a collection of independent random variables...

  • Marcum Q-function
  • Margin of error
    Margin of error
    The margin of error is a statistic expressing the amount of random sampling error in a survey's results. The larger the margin of error, the less faith one should have that the poll's reported results are close to the "true" figures; that is, the figures for the whole population...

  • Marginal distribution
    Marginal distribution
    In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. The term marginal variable is used to refer to those variables in the subset of variables being retained...

  • Marginal likelihood
    Marginal likelihood
    In statistics, a marginal likelihood function, or integrated likelihood, is a likelihood function in which some parameter variables have been marginalised...

  • Marginal model
    Marginal model
    In statistics, marginal models are a technique for obtaining regression estimates in multilevel modeling, also called hierarchical linear models....

  • Marginal variable — redirects to Marginal distribution
    Marginal distribution
    In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. The term marginal variable is used to refer to those variables in the subset of variables being retained...

  • Mark and recapture
    Mark and recapture
    Mark and recapture is a method commonly used in ecology to estimate population size. This method is most valuable when a researcher fails to detect all individuals present within a population of interest every time that researcher visits the study area...

  • Markov additive process
  • Markov blanket
    Markov blanket
    In machine learning, the Markov blanket for a node A in a Bayesian network is the set of nodes ∂A composed of A's parents, its children, and its children's other parents. In a Markov network, the Markov blanket of a node is its set of neighbouring nodes...

  • Markov chain
    Markov chain
    A Markov chain, named after Andrey Markov, is a mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process characterized as memoryless: the next state depends only on the current state and not on the...

    • Markov chain geostatistics
      Markov chain geostatistics
      Markov chain geostatistics refer to the Markov chain models, simulation algorithms and associated spatial correlation measures based on the Markov chain random field theory, which extends a single Markov chain into a multi-dimensional field for geostatistical modeling. A Markov chain random field...

    • Markov chain mixing time
      Markov chain mixing time
      In probability theory, the mixing time of a Markov chain is the time until the Markov chain is "close" to its steady state distribution.More precisely, a fundamental result about Markov chains is that a finite state irreducible aperiodic chain has a unique stationary distribution π and,...

  • Markov chain Monte Carlo
    Markov chain Monte Carlo
    Markov chain Monte Carlo methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample of the...

  • Markov decision process
    Markov decision process
    Markov decision processes , named after Andrey Markov, provide a mathematical framework for modeling decision-making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying a wide range of optimization problems solved via...

  • Markov information source
    Markov information source
    In mathematics, a Markov information source, or simply, a Markov source, is an information source whose underlying dynamics are given by a stationary finite Markov chain...

  • Markov kernel
    Markov kernel
    In probability theory, a Markov kernel is a map that plays the role, in the general theory of Markov processes, that the transition matrix does in the theory of Markov processes with a finite state space...

  • Markov logic network
    Markov logic network
    A Markov logic network is a probabilistic logic which applies the ideas of a Markov network to first-order logic, enabling uncertain inference...

  • Markov model
    Markov model
    In probability theory, a Markov model is a stochastic model that assumes the Markov property. Generally, this assumption enables reasoning and computation with the model that would otherwise be intractable...

  • Markov network
    Markov network
    A Markov random field, Markov network or undirected graphical model is a set of variables having a Markov property described by an undirected graph. A Markov random field is similar to a Bayesian network in its representation of dependencies...

  • Markov process
    Markov process
    In probability theory and statistics, a Markov process, named after the Russian mathematician Andrey Markov, is a time-varying random phenomenon for which a specific property holds...

  • Markov property
    Markov property
    In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process. It was named after the Russian mathematician Andrey Markov....

  • Markov random field
  • Markov's inequality
    Markov's inequality
    In probability theory, Markov's inequality gives an upper bound for the probability that a non-negative function of a random variable is greater than or equal to some positive constant...

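    As an illustrative aside, a small simulation check of the bound P(X ≥ a) ≤ E(X)/a for a non-negative random variable; the exponential sample and the choice a = 3 are illustrative only:

      import random

      rng = random.Random(0)
      n, a = 100_000, 3.0
      draws = [rng.expovariate(1.0) for _ in range(n)]   # non-negative draws with mean 1

      empirical = sum(x >= a for x in draws) / n
      bound = (sum(draws) / n) / a                       # E(X)/a, estimated from the same sample
      print(empirical, bound)                            # the empirical tail probability sits below the bound
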
  • Markovian arrival processes
    Markovian arrival processes
    In queueing theory, Markovian arrival processes are used to model the arrival of customers to a queue. Some of the most common include the Poisson process, Markov arrival process and the batch Markov arrival process...

  • Marsaglia polar method
    Marsaglia polar method
    The polar method is a pseudo-random number sampling method for generating a pair of independent standard normal random variables...

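    As an illustrative aside, a minimal sketch of the polar (rejection) method for generating a pair of independent standard normal variates; the function name is this sketch's own:

      import math
      import random

      def marsaglia_polar(rng=random.Random(0)):
          """Return one pair of independent standard normal variates via the polar method."""
          while True:
              u = 2.0 * rng.random() - 1.0
              v = 2.0 * rng.random() - 1.0
              s = u * u + v * v
              if 0.0 < s < 1.0:   # accept points inside the unit circle, excluding the origin
                  factor = math.sqrt(-2.0 * math.log(s) / s)
                  return u * factor, v * factor

      print(marsaglia_polar())
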
  • Martingale (probability theory)
    Martingale (probability theory)
    In probability theory, a martingale is a model of a fair game where no knowledge of past events can help to predict future winnings. In particular, a martingale is a sequence of random variables for which, at a particular time in the realized sequence, the expectation of the next value in the...

  • Martingale difference sequence
    Martingale difference sequence
    In probability theory, a martingale difference sequence is related to the concept of the martingale. A stochastic series Y is an MDS if its expectation with respect to past values of another stochastic series X is zero...

  • Martingale representation theorem
    Martingale representation theorem
    In probability theory, the martingale representation theorem states that a random variable which is measurable with respect to the filtration generated by a Brownian motion can be written in terms of an Itô integral with respect to this Brownian motion....

  • Master equation
    Master equation
    In physics and chemistry and related fields, master equations are used to describe the time-evolution of a system that can be modelled as being in exactly one of a countable number of states at any given time, and where switching between states is treated probabilistically...

  • Matched filter
    Matched filter
    In telecommunications, a matched filter is obtained by correlating a known signal, or template, with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template...

  • Matching pursuit
    Matching pursuit
    Matching pursuit is a type of numerical technique which involves finding the "best matching" projections of multidimensional data onto an over-complete dictionary D...

  • Matching (statistics)
    Matching (statistics)
    Matching is a statistical technique which is used to evaluate the effect of a treatment by comparing the treated and the non-treated in a non-experimental design. People use this technique with observational data...

  • Matérn covariance function
    Matérn covariance function
    In statistics, the Matérn covariance is a covariance function used in spatial statistics, geostatistics, machine learning, image analysis, and other applications of multivariate statistical analysis on metric spaces...

  • Mathematica
    Mathematica
    Mathematica is a computational software program used in scientific, engineering, and mathematical fields and other areas of technical computing...

     – software
  • Mathematical biology
    Mathematical biology
    Mathematical and theoretical biology is an interdisciplinary scientific research field with a range of applications in biology, medicine and biotechnology...

  • Mathematical modelling in epidemiology
    Mathematical modelling in epidemiology
    It is possible to mathematically model the progress of most infectious diseases to discover the likely outcome of an epidemic or to help manage them by vaccination...

  • Mathematical modelling of infectious disease
  • Mathematical statistics
    Mathematical statistics
    Mathematical statistics is the study of statistics from a mathematical standpoint, using probability theory as well as other branches of mathematics such as linear algebra and analysis...

  • Matthews correlation coefficient
    Matthews Correlation Coefficient
    The Matthews correlation coefficient is used in machine learning as a measure of the quality of binary classifications. It takes into account true and false positives and negatives and is generally regarded as a balanced measure which can be used even if the classes are of very different sizes...

  • Matrix normal distribution
    Matrix normal distribution
    The matrix normal distribution is a probability distribution that is a generalization of the normal distribution to matrix-valued random variables...

  • Matrix population models
    Matrix population models
    Population models are used in population ecology to model the dynamics of wildlife or human populations. Matrix population models are a specific type of population model that uses matrix algebra...

  • Mauchly's sphericity test
    Mauchly's sphericity test
    Mauchly's sphericity test is a statistical test used to validate repeated measures factor ANOVAs. The test was introduced by ENIAC co-inventor John Mauchly in 1940...

  • Maximal ergodic theorem
  • Maximum a posteriori estimation
  • Maximum entropy classifier redirects to Logistic regression
    Logistic regression
    In statistics, logistic regression is used for prediction of the probability of occurrence of an event by fitting data to a logistic curve. It is a generalized linear model used for binomial regression...

  • Maximum entropy Markov model
    Maximum entropy Markov model
    In machine learning, a maximum-entropy Markov model , or conditional Markov model , is a graphical model for sequence labeling that combines features of hidden Markov models and maximum entropy models...

  • Maximum entropy method redirects to Principle of maximum entropy
    Principle of maximum entropy
    In Bayesian probability, the principle of maximum entropy is a postulate which states that, subject to known constraints, the probability distribution which best represents the current state of knowledge is the one with the largest entropy. Let some testable information about a probability distribution...

  • Maximum entropy probability distribution
    Maximum entropy probability distribution
    In statistics and information theory, a maximum entropy probability distribution is a probability distribution whose entropy is at least as great as that of all other members of a specified class of distributions....

  • Maximum entropy spectral estimation
    Maximum entropy spectral estimation
    The maximum entropy method applied to spectral density estimation. The overall idea is that the maximum entropy rate stochastic process that satisfies the given constant autocorrelation and variance constraints, is a linear Gauss-Markov process with i.i.d...

  • Maximum likelihood
    Maximum likelihood
    In statistics, maximum-likelihood estimation is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters....

  • Maximum likelihood sequence estimation
    Maximum Likelihood Sequence Estimation
    Maximum likelihood sequence estimation is a mathematical algorithm to extract useful data out of a noisy data stream. For an optimized detector for digital signals, the priority is not to reconstruct the transmitter signal, but to produce a best estimate of the transmitted data with the...

  • Maximum parsimony
    Maximum parsimony
    Parsimony is a non-parametric statistical method commonly used in computational phylogenetics for estimating phylogenies. Under parsimony, the preferred phylogenetic tree is the tree that requires the least evolutionary change to explain some observed data....

  • Maximum spacing estimation
    Maximum spacing estimation
    In statistics, maximum spacing estimation , or maximum product of spacing estimation , is a method for estimating the parameters of a univariate statistical model...

  • Maxwell speed distribution
    Maxwell Speed Distribution
    Classically, an ideal gas' molecules bounce around with somewhat arbitrary velocities, never interacting with each other. In reality, however, an ideal gas is subjected to intermolecular forces. It is to be noted that the aforementioned classical treatment of an ideal gas is only useful when...

  • Maxwell–Boltzmann distribution
  • Maxwell’s theorem
  • MCAR
    MCAR
    In statistical analysis, data-values in a data set are missing completely at random if the events that lead to any particular data-item being missing are independent both of observable variables and of unobservable parameters of interest....

     (missing completely at random)
  • McCullagh's parametrization of the Cauchy distributions
  • McDiarmid's inequality
  • McDonald–Kreitman test — statistical genetics
  • McNemar's test
    McNemar's test
    In statistics, McNemar's test is a non-parametric method used on nominal data. It is applied to 2 × 2 contingency tables with a dichotomous trait, with matched pairs of subjects, to determine whether the row and column marginal frequencies are equal...

  • Meadow's law
    Meadow's law
    Meadow's Law was a precept much in use until recently in the field of child protection, specifically by those investigating cases of multiple cot or crib death — SIDS — within a single family.-History:...

  • Mean
    Mean
    In statistics, mean has two related meanings:* the arithmetic mean .* the expected value of a random variable, which is also called the population mean....

  • Mean – see also expected value
    Expected value
    In probability theory, the expected value of a random variable is the weighted average of all possible values that this random variable can take on...

  • Mean absolute error
    Mean absolute error
    In statistics, the mean absolute error is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error is given by...
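
    A minimal illustrative computation in Python (the data values are hypothetical):

        import numpy as np

        def mean_absolute_error(y_true, y_pred):
            # Average absolute difference between outcomes and forecasts.
            y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
            return np.abs(y_true - y_pred).mean()

        print(mean_absolute_error([3, -0.5, 2, 7], [2.5, 0.0, 2, 8]))  # 0.5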

  • Mean absolute percentage error
    Mean Absolute Percentage Error
    Mean absolute percentage error is a measure of the accuracy of a fitted time series in statistics, specifically in trend estimation. It usually expresses accuracy as a percentage and is defined by the formula:...
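
    A minimal illustrative computation in Python (hypothetical values; note that the measure is undefined when an actual value is zero):

        import numpy as np

        def mape(actual, forecast):
            # Mean absolute percentage error, expressed in percent.
            actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
            return 100.0 * np.abs((actual - forecast) / actual).mean()

        print(mape([100, 200], [90, 220]))  # 10.0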

  • Mean absolute scaled error
    Mean absolute scaled error
    In statistics, the mean absolute scaled error is a measure of the accuracy of forecasts . It was proposed in 2006 by Australian statistician Rob Hyndman, who described it as a "generally applicable measurement of forecast accuracy without the problems seen in the other measurements."The mean...
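
    A hedged sketch in Python of the non-seasonal form (names and data are illustrative): forecast errors are scaled by the in-sample mean absolute error of the one-step naive forecast.

        import numpy as np

        def mase(y_train, y_true, y_pred):
            y_train = np.asarray(y_train, dtype=float)
            scale = np.abs(np.diff(y_train)).mean()  # naive-forecast MAE in sample
            errors = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
            return errors.mean() / scale

        print(mase([10, 12, 11, 13, 12], [14, 15], [13, 16]))  # about 0.67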

  • Mean and predicted response
    Mean and predicted response
    In linear regression mean response and predicted response are values of the dependent variable calculated from the regression parameters and a given value of the independent variable...

  • Mean deviation
  • Mean difference
    Mean difference
    The mean difference is a measure of statistical dispersion equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean difference, which is the mean difference divided by the arithmetic mean...

  • Mean integrated squared error
  • Mean of circular quantities
    Mean of circular quantities
    In mathematics, a mean of circular quantities is a mean which is suited for quantities like angles, daytimes, and fractional parts of real numbers. This is necessary since most of the usual means fail on circular quantities...
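
    A small sketch in Python (the angles, in degrees, are hypothetical): the circular mean averages the sine and cosine components and takes the arctangent, which handles the wrap-around at 0°/360° that the ordinary mean does not.

        import math

        def circular_mean_deg(angles):
            s = sum(math.sin(math.radians(a)) for a in angles) / len(angles)
            c = sum(math.cos(math.radians(a)) for a in angles) / len(angles)
            return math.degrees(math.atan2(s, c))

        # 350 and 10 straddle 0 degrees: the arithmetic mean gives 180,
        # the circular mean gives a value at (or numerically very near) 0.
        print(circular_mean_deg([350, 10]))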

  • Mean percentage error
    Mean Percentage Error
    In statistics, the mean percentage error is the computed average of percentage errors by which estimated forecasts differ from actual values of the quantity being forecast. The formula for the mean percentage error is:...

  • Mean preserving spread
  • Mean reciprocal rank
    Mean reciprocal rank
    Mean reciprocal rank is a statistic for evaluating any process that produces a list of possible responses to a query, ordered by probability of correctness. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer...
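
    A minimal illustration in Python (the query ranks are hypothetical; None marks a query with no correct answer returned):

        def mean_reciprocal_rank(first_correct_ranks):
            recip = [0.0 if r is None else 1.0 / r for r in first_correct_ranks]
            return sum(recip) / len(recip)

        # First correct answers at ranks 3, 2 and 1 for three queries:
        print(mean_reciprocal_rank([3, 2, 1]))  # (1/3 + 1/2 + 1) / 3 ≈ 0.61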

  • Mean signed difference
  • Mean square quantization error
    Mean square quantization error
    Mean square quantization error is a figure of merit for the process of analog to digital conversion. As the input is varied, the input's value is recorded when the digital output changes. For each digital output, the input's difference from ideal is normalized to the value of the least significant...

  • Mean square weighted deviation
    Mean square weighted deviation
    Mean square weighted deviation is used extensively in geochronology, the science of obtaining information about the time of formation of, for example, rocks, minerals, bones, corals, or charcoal, or the time at which particular processes took place in a rock mass, for example recrystallization and...

  • Mean squared error
    Mean squared error
    In statistics, the mean squared error of an estimator is one of many ways to quantify the difference between the values implied by an estimator and the true values of the quantity being estimated. MSE is a risk function, corresponding to the expected value of the squared error loss or...
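
    A minimal illustrative computation in Python (the values are hypothetical):

        import numpy as np

        def mean_squared_error(y_true, y_est):
            # Average of the squared estimation errors.
            y_true, y_est = np.asarray(y_true, float), np.asarray(y_est, float)
            return ((y_true - y_est) ** 2).mean()

        print(mean_squared_error([1, 2, 3], [1, 2, 5]))  # 4/3 ≈ 1.33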

  • Mean squared prediction error
  • Mean time between failures
  • Mean-reverting process — redirects to Ornstein–Uhlenbeck process
  • Mean value analysis
    Mean value analysis
    In queueing theory, a specialty within the mathematical theory of probability, mean value analysis is a technique for computing expected queue lengths in equilibrium for a closed separable system of queues...

  • Measurement, level of — see level of measurement
    Level of measurement
    The "levels of measurement", or scales of measure are expressions that typically refer to the theory of scale types developed by the psychologist Stanley Smith Stevens. Stevens proposed his theory in a 1946 Science article titled "On the theory of scales of measurement"...

  • MedCalc
    MedCalc
    MedCalc is a statistical software package designed for the biomedical sciences. It has an integrated spreadsheet for data input and can import files in several formats...

     – software
  • Median
    Median
    In probability theory and statistics, a median is described as the numerical value separating the higher half of a sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to...

  • Median absolute deviation
    Median absolute deviation
    In statistics, the median absolute deviation is a robust measure of the variability of a univariate sample of quantitative data. It can also refer to the population parameter that is estimated by the MAD calculated from a sample....
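
    A minimal sketch in Python (the data are hypothetical) showing why the MAD is robust: a single large outlier barely moves it.

        import numpy as np

        def median_absolute_deviation(x):
            x = np.asarray(x, dtype=float)
            return np.median(np.abs(x - np.median(x)))

        print(median_absolute_deviation([1, 1, 2, 2, 4, 6, 100]))  # 1.0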

  • Median polish
    Median polish
    The median polish is an exploratory data analysis procedure proposed by the statistician John Tukey. It finds an additively-fit model for data in a two-way layout table of the form row effect + column effect + overall median....

  • Median test
    Median test
    In statistics, Mood's median test is a special case of Pearson's chi-squared test. It is a nonparametric test that tests the null hypothesis that the medians of the populations from which two samples are drawn are identical...

  • Mediation (statistics)
    Mediation (Statistics)
    In statistics, a mediation model is one that seeks to identify and explicate the mechanism that underlies an observed relationship between an independent variable and a dependent variable via the inclusion of a third explanatory variable, known as a mediator variable...

  • Medical statistics
    Medical statistics
    Medical statistics deals with applications of statistics to medicine and the health sciences, including epidemiology, public health, forensic medicine, and clinical research...

  • Medoid
    Medoid
    Medoids are representative objects of a data set or a cluster within a data set whose average dissimilarity to all the objects in the cluster is minimal. Medoids are similar in concept to means or centroids, but medoids are always members of the data set...

  • Memorylessness
    Memorylessness
    In probability and statistics, memorylessness is a property of certain probability distributions: the exponential distributions of non-negative real numbers and the geometric distributions of non-negative integers....

  • Mendelian randomization
    Mendelian randomization
    In epidemiology, Mendelian randomization is a method of using measured variation in genes of known function to examine the causal effect of a modifiable exposure on disease in non-experimental studies...

  • Mentor (statistics)
    Mentor (statistics)
    Mentor is a flexible and sophisticated statistical analysis system produced by CfMC. It specializes in the tabulation and graphical display of market and opinion research data, and is integrated with their Survent data collection software....

     – software
  • Meta-analysis
    Meta-analysis
    In statistics, a meta-analysis combines the results of several studies that address a set of related research hypotheses. In its simplest form, this is normally by identification of a common measure of effect size, for which a weighted average might be the output of a meta-analysis. Here the...

  • Meta-analytic thinking
  • Method of moments (statistics)
  • Method of simulated moments
    Method of simulated moments
    In econometrics, the method of simulated moments is a structural estimation technique introduced by Daniel McFadden. It extends the generalized method of moments to cases where theoretical moment functions cannot be evaluated directly, such as when moment functions involve high-dimensional...

  • Method of support
    Method of support
    In statistics, the method of support is a technique that is used to make inferences from datasets.According to A. W. F. Edwards, the method of support aims to make inferences about unknown parameters in terms of the relative support, or log likelihood, induced by a set of data for a particular...

  • Metropolis–Hastings algorithm
  • Mexican paradox
    Mexican paradox
    The Mexican paradox is the observation that the Mexican people exhibit a surprisingly low incidence of low birth mass, contrary to what would be expected from their socioeconomic status...

  • Microdata (statistics)
    Microdata (statistics)
    In the study of survey and census data, microdata is information at the level of individual respondents. For instance, a national census might collect age, home address, educational level, employment status, and many other variables, recorded separately for every person who responds; this is...

  • Midhinge
    Midhinge
    In statistics, the midhinge is the average of the first and third quartiles and is thus a measure of location.Equivalently, it is the 25% trimmed mid-range; it is an L-estimator....
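
    A minimal sketch in Python (the data are hypothetical; quartile conventions differ, and this uses NumPy's default interpolation):

        import numpy as np

        def midhinge(x):
            q1, q3 = np.percentile(np.asarray(x, dtype=float), [25, 75])
            return (q1 + q3) / 2.0

        print(midhinge([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 5.0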

  • Mid-range
  • MinHash
    MinHash
    In computer science, MinHash is a technique for quickly estimating how similar two sets are...
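
    A hedged toy sketch in Python (the salted built-in hash stands in for a family of random hash functions; the word sets are made up): the fraction of positions where two signatures agree estimates the Jaccard similarity of the underlying sets.

        import random

        def signature(items, seeds):
            # One minimum per salted hash function.
            return [min(hash((seed, it)) for it in items) for seed in seeds]

        def estimated_jaccard(sig_a, sig_b):
            return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

        random.seed(0)
        seeds = [random.random() for _ in range(256)]
        a = set("the quick brown fox jumps over the lazy dog".split())
        b = set("the quick brown fox naps beside the lazy dog".split())
        print(len(a & b) / len(a | b))                 # exact Jaccard: 0.6
        print(estimated_jaccard(signature(a, seeds),
                                signature(b, seeds)))  # typically near 0.6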

  • Minimax
    Minimax
    Minimax is a decision rule used in decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case scenario. Alternatively, it can be thought of as maximizing the minimum gain...

  • Minimax estimator
  • Minimisation (clinical trials)
  • Minimum distance estimation
    Minimum distance estimation
    Minimum distance estimation is a statistical method for fitting a mathematical model to data, usually the empirical distribution....

  • Minimum mean square error
  • Minimum-variance unbiased estimator
    Minimum-variance unbiased estimator
    In statistics a uniformly minimum-variance unbiased estimator or minimum-variance unbiased estimator is an unbiased estimator that has lower variance than any other unbiased estimator for all possible values of the parameter.The question of determining the UMVUE, if one exists, for a particular...

  • Minimum viable population
    Minimum Viable Population
    Minimum viable population is a lower bound on the population of a species, such that it can survive in the wild. This term is used in the fields of biology, ecology, and conservation biology...

  • Minitab
    Minitab
    Minitab is a statistics package. It was developed at the Pennsylvania State University by researchers Barbara F. Ryan, Thomas A. Ryan, Jr., and Brian L. Joiner in 1972...

  • MINQUE
    Minque
    In statistics, the theory of minimum norm quadratic unbiased estimation was developed by C.R. Rao. Its application was originally to the estimation of variance components in random effects models.The theory involves three stages:...

     – minimum norm quadratic unbiased estimation
  • Missing completely at random
  • Missing data
  • Missing values — redirects to Missing data
  • Mittag–Leffler distribution
  • Mixed logit
    Mixed logit
    Mixed logit is a fully general statistical model for examining discrete choices. The motivation for the mixed logit model arises from the limitations of the standard logit model...

  • Misuse of statistics
    Misuse of statistics
    A misuse of statistics occurs when a statistical argument asserts a falsehood. In some cases, the misuse may be accidental. In others, it is purposeful and for the gain of the perpetrator. When the statistical reason involved is false or misapplied, this constitutes a statistical fallacy.The false...

  • Mixed data sampling
    Mixed data sampling
    Mixed data sampling is an econometric regression or filtering method developed by Ghysels et al. A simple regression example has the regressor appearing at a higher frequency than the regressand:...

  • Mixed-design analysis of variance
    Mixed-design analysis of variance
    In statistics, a mixed-design analysis of variance model is used to test for differences between two or more independent groups whilst subjecting participants to repeated measures...

  • Mixed model
    Mixed model
    A mixed model is a statistical model containing both fixed effects and random effects, that is mixed effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences....

  • Mixing (mathematics)
    Mixing (mathematics)
    In mathematics, mixing is an abstract concept originating from physics: the attempt to describe the irreversible thermodynamic process of mixing in the everyday world: mixing paint, mixing drinks, etc....

  • Mixture distribution
  • Mixture model
    Mixture model
    In statistics, a mixture model is a probabilistic model for representing the presence of sub-populations within an overall population, without requiring that an observed data-set should identify the sub-population to which an individual observation belongs...

  • Mixture (probability)
    Mixture (probability)
    In probability theory and statistics, a mixture is a combination of two or more probability distributions. The concept arises in two contexts:* A mixture defining a new probability distribution from some existing ones, as in a mixture density...

  • MLwiN
    MLwiN
    MLwiN is a statistical software package for fitting multilevel models. It uses both maximum likelihood estimation and Markov Chain Monte Carlo methods...

  • Mode (statistics)
    Mode (statistics)
    In statistics, the mode is the value that occurs most frequently in a data set or a probability distribution. In some fields, notably education, sample data are often called scores, and the sample mode is known as the modal score....

  • Model output statistics
    Model output statistics
    Model Output Statistics is a widely used statistical technique that forms the backbone of modern weather forecasting. The technique, pioneered in the 1960s and early 1970s, is used to post-process output from numerical weather forecast models...

  • Model selection
    Model selection
    Model selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered...

  • Moderator variable redirects to Moderation (statistics)
    Moderation (statistics)
    In statistics, moderation occurs when the relationship between two variables depends on a third variable. The third variable is referred to as the moderator variable or simply the moderator...

  • Modifiable areal unit problem
    Modifiable Areal Unit Problem
    The modifiable areal unit problem is a source of statistical bias that can radically affect the results of statistical hypothesis tests. It affects results when point-based measures of spatial phenomena are aggregated into districts. The resulting summary values are influenced by the choice of...

  • Moffat distribution
    Moffat distribution
    The Moffat distribution, named after the physicist Anthony Moffat, is a continuous probability distribution based upon the Lorentzian distribution...

  • Moment (mathematics)
    Moment (mathematics)
    In mathematics, a moment is, loosely speaking, a quantitative measure of the shape of a set of points. The "second moment", for example, is widely used and measures the "width" of a set of points in one dimension or in higher dimensions measures the shape of a cloud of points as it could be fit by...

  • Moment-generating function
    Moment-generating function
    In probability theory and statistics, the moment-generating function of any random variable is an alternative definition of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or...

  • Moments, method of — see method of moments (statistics)
  • Moment problem
  • Monotone likelihood ratio
  • Monte Carlo integration
  • Monte Carlo method
    Monte Carlo method
    Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results. Monte Carlo methods are often used in computer simulations of physical and mathematical systems...
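
    A classic toy example in Python (entirely illustrative): estimating pi from the fraction of uniformly random points in the unit square that land inside the quarter circle.

        import random

        def estimate_pi(n, seed=0):
            rng = random.Random(seed)
            inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
                         for _ in range(n))
            return 4.0 * inside / n

        print(estimate_pi(100_000))  # roughly 3.14; the error shrinks as n grows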

  • Monte Carlo method for photon transport
    Monte Carlo method for photon transport
    Modeling photon propagation with Monte Carlo methods is a flexible yet rigorous approach to simulate photon transport. In the method, local rules of photon transport are expressed as probability distributions which describe the step size of photon movement between sites of photon-tissue interaction...

  • Monte Carlo methods for option pricing
  • Monte Carlo methods in finance
    Monte Carlo methods in finance
    Monte Carlo methods are used in finance and mathematical finance to value and analyze instruments, portfolios and investments by simulating the various sources of uncertainty affecting their value, and then determining their average value over the range of resultant outcomes. This is usually done...

  • Monte Carlo molecular modeling
    Monte Carlo molecular modeling
    Monte Carlo molecular modeling is the application of Monte Carlo methods to molecular problems. These problems can also be modeled by the molecular dynamics method. The difference is that this approach relies on statistical mechanics rather than molecular dynamics. Instead of trying to reproduce...

  • Moral graph
    Moral graph
    A moral graph is a concept in graph theory, used to find the equivalent undirected form of a directed acyclic graph. It is a key step of the junction tree algorithm, used in belief propagation on graphical models....

  • Moran process
    Moran process
    A Moran process, named after Patrick Moran, is a stochastic process used in biology to describe finite populations. It can be used to model variety-increasing processes such as mutation as well as variety-reducing effects such as genetic drift and natural selection...

  • Moran's I
    Moran's I
    In statistics, Moran's I is a measure of spatial autocorrelation developed by Patrick A.P. Moran. Spatial autocorrelation is characterized by a correlation in a signal among nearby locations in space. Spatial autocorrelation is more complex than one-dimensional autocorrelation because spatial...

  • Morisita's overlap index
  • Morris method
    Morris method
    In applied statistics, the Morris method for global sensitivity analysis is a so-called one-step-at-a-time method, meaning that in each run only one input parameter is given a new value. It facilitates a global sensitivity analysis by making a number r of local changes at different points x of the...

  • Mortality rate
    Mortality rate
    Mortality rate is a measure of the number of deaths in a population, scaled to the size of that population, per unit time...

  • Most probable number
    Most probable number
    The most probable number method, otherwise known as the method of Poisson zeroes, is a method of getting quantitative data on concentrations of discrete items from positive/negative data....

  • Moving average
  • Moving average model
    Moving average model
    In time series analysis, the moving-average model is a common approach for modeling univariate time series models. The notation MA refers to the moving average model of order q:...

  • Moving average representation — redirects to Wold's theorem
  • Moving least squares
    Moving least squares
    Moving least squares is a method of reconstructing continuous functions from a set of unorganized point samples via the calculation of a weighted least squares measure biased towards the region around the point at which the reconstructed value is requested....

  • Multi-armed bandit
    Multi-armed bandit
    In statistics, particularly in the design of sequential experiments, a multi-armed bandit takes its name from a traditional slot machine. Multiple levers are considered in the motivating applications in statistics. When pulled, each lever provides a reward drawn from a distribution associated...

  • Multi-vari chart
    Multi-vari chart
    In quality control, multi-vari charts are a visual way of presenting variability through a series of charts. The content and format of the charts has evolved over time.-Original concept:...

  • Multiclass classification
    Multiclass classification
    In machine learning, multiclass or multinomial classification is the problem of classifying instances into more than two classes.While some classification algorithms naturally permit the use of more than two classes, others are by nature binary algorithms; these can, however, be turned into...

  • Multiclass LDA (Linear discriminant analysis) — redirects to Linear discriminant analysis
    Linear discriminant analysis
    Linear discriminant analysis and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events...

  • Multicollinearity
    Multicollinearity
    Multicollinearity is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated. In this situation the coefficient estimates may change erratically in response to small changes in the model or the data...

  • Multidimensional analysis
    Multidimensional analysis
    In statistics, econometrics, and related fields, multidimensional analysis is a data analysis process that groups data into two or more categories: data dimensions and measurements. For example, a data set consisting of the number of wins for a single football team at each of several years is a...

  • Multidimensional Chebyshev's inequality
    Multidimensional Chebyshev's inequality
    In probability theory, the multidimensional Chebyshev's inequality is a generalization of Chebyshev's inequality, which puts a bound on the probability of the event that a random variable differs from its expected value by more than a specified amount....

  • Multidimensional panel data
    Multidimensional panel data
    In econometrics, panel data is data observed over two dimensions. A panel data set is termed "multidimensional" when the phenomenon is observed over three or more dimensions...

  • Multidimensional scaling
    Multidimensional scaling
    Multidimensional scaling is a set of related statistical techniques often used in information visualization for exploring similarities or dissimilarities in data. MDS is a special case of ordination. An MDS algorithm starts with a matrix of item–item similarities, then assigns a location to each...

  • Multifactor design of experiments software
    Multifactor design of experiments software
    Software that is used for designing factorial experiments plays an important role in scientific experiments generally and represents a route to the implementation of design of experiments procedures that derive from statistical and combinatoric theory...

  • Multifactor dimensionality reduction
    Multifactor dimensionality reduction
    Multifactor dimensionality reduction is a data mining approach for detecting and characterizing combinations of attributes or independent variables that interact to influence a dependent or class variable...

  • Multilevel model
    Multilevel model
    Multilevel models are statistical models of parameters that vary at more than one level...

  • Multinomial distribution
  • Multinomial logit
    Multinomial logit
    In statistics, economics, and genetics, a multinomial logit model, also known as multinomial logistic regression, is a regression model which generalizes logistic regression by allowing more than two discrete outcomes...

  • Multinomial probit
    Multinomial probit
    In econometrics and statistics, the multinomial probit model, a popular alternative to the multinomial logit model, is a generalization of the probit model that allows more than two discrete, unordered outcomes. It is not to be confused with the multivariate probit model, which is used to model...

  • Multinomial test
    Multinomial test
    In statistics, the multinomial test is the test of the null hypothesis that the parameters of a multinomial distribution equal specified values. It is used for categorical data; see Read and Cressie....

  • Multiple baseline design
    Multiple Baseline Design
    A multiple baseline design is a style of research involving the careful measurement of multiple persons, traits or settings both before and after a treatment. This design is used in medical, psychological and biological research to name a few areas. It has several advantages over AB designs which...

  • Multiple comparisons
    Multiple comparisons
    In statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly...

  • Multiple correlation
    Multiple correlation
    In statistics, multiple correlation is a linear relationship among more than two variables. It is measured by the coefficient of multiple determination, denoted as R2, which is a measure of the fit of a linear regression...

  • Multiple correspondence analysis
    Multiple correspondence analysis
    In statistics, multiple correspondence analysis is a data analysis technique for nominal categorical data, used to detect and represent underlying structures in a data set. It does this by representing data as points in a low-dimensional Euclidean space. The procedure thus appears to be the...

  • Multiple discriminant analysis
    Multiple discriminant analysis
    Multiple Discriminant Analysis is a method for compressing a multivariate signal to yield a lower dimensional signal amenable to classification....

  • Multiple-indicator kriging
    Multiple-indicator kriging
    Multiple-indicator kriging is a recent advance on other techniques for mineral deposit modeling and resource block model estimation, such as ordinary kriging....

  • Multiple Indicator Cluster Survey
    Multiple Indicator Cluster Survey
    The Multiple Indicator Cluster Surveys are a survey program developed by the United Nations Children's Fund to provide internationally comparable, statistically rigorous data on the situation of children and women. The first round of surveys was carried out in over 60 countries in 1995 in...

  • Multiple of the median
    Multiple of the median
    A multiple of the median is a measure of how far an individual test result deviates from the median. MoM is commonly used to report the results of medical screening tests, particularly where the results of the individual tests are highly variable....

  • Multiple testing correction redirects to Multiple comparisons
    Multiple comparisons
    In statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly...

  • Multiple-try Metropolis
    Multiple-try Metropolis
    In Markov chain Monte Carlo, the Metropolis–Hastings algorithm can be used to sample from a probability distribution which is difficult to sample from directly. However, the MH algorithm requires the user to supply a proposal distribution, which can be relatively arbitrary...

  • Multiresolution analysis
    Multiresolution analysis
    A multiresolution analysis or multiscale approximation is the design method of most of the practically relevant discrete wavelet transforms and the justification for the algorithm of the fast wavelet transform...

  • Multiscale decision making
    Multiscale decision making
    Multiscale decision making, also referred to as multiscale decision theory, is a recently developed approach in operations research that fuses game theory, multi-agent influence diagrams, in particular dependency graphs, and Markov decision processes to solve multiscale challenges across...

  • Multiscale geometric analysis
    Multiscale geometric analysis
    Multiscale geometric analysis or geometric multiscale analysis is an emerging area of high-dimensional signal processing and data analysis....

  • Multistage testing
    Multistage testing
    Multistage testing is an algorithm-based approach to administering tests. It is very similar to computer-adaptive testing in that items are interactively selected for each examinee by the algorithm, but rather than selecting individual items, groups of items are selected, building the test in stages...

  • Multitrait-multimethod matrix
    Multitrait-multimethod matrix
    The multitrait-multimethod matrix is an approach to examining Construct Validity developed by Campbell and Fiske. There are six major considerations when examining a construct's validity through the MTMM matrix, which are as follows:...

  • Multivariate adaptive regression splines
    Multivariate adaptive regression splines
    Multivariate adaptive regression splines is a form of regression analysis introduced by Jerome Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that...

  • Multivariate analysis
    Multivariate analysis
    Multivariate analysis is based on the statistical principle of multivariate statistics, which involves observation and analysis of more than one statistical variable at a time...

  • Multivariate analysis of variance
  • Multivariate distribution – redirects to Joint probability distribution
  • Multivariate kernel density estimation
    Multivariate kernel density estimation
    Kernel density estimation is a nonparametric technique for density estimation i.e., estimation of probability density functions, which is one of the fundamental questions in statistics. It can be viewed as a generalisation of histogram density estimation with improved statistical properties...

  • Multivariate normal distribution
  • Multivariate Pólya distribution
    Multivariate Polya distribution
    The multivariate Pólya distribution, named after George Pólya, also called the Dirichlet compound multinomial distribution, is a compound probability distribution, where a probability vector p is drawn from a Dirichlet distribution with parameter vector α, and a set of discrete samples is...

  • Multivariate probit
    Multivariate probit
    In statistics and econometrics, the multivariate probit model is a generalization of the probit model used to estimate several correlated binary outcomes jointly...

  • Multivariate random variable
    Multivariate random variable
    In mathematics, probability, and statistics, a multivariate random variable or random vector is a list of mathematical variables each of whose values is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value.More formally, a multivariate random...

  • Multivariate stable distribution
  • Multivariate statistics
    Multivariate statistics
    Multivariate statistics is a form of statistics encompassing the simultaneous observation and analysis of more than one statistical variable. The application of multivariate statistics is multivariate analysis...

  • Multivariate Student distribution

N

  • n = 1 fallacy
  • Naive Bayes classifier
    Naive Bayes classifier
    A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions...

  • Nakagami distribution
  • National and international statistical services
  • Nash–Sutcliffe model efficiency coefficient
  • National Health Interview Survey
    National Health Interview Survey
    The National Health Interview Survey is an annual, cross-sectional survey intended to provide nationally-representative estimates on a wide range of health status and utilization measures among the nonmilitary, noninstitutionalized population of the United States...

  • Natural experiment
    Natural experiment
    A natural experiment is an observational study in which the assignment of treatments to subjects has been haphazard: That is, the assignment of treatments has been made "by nature", but not by experimenters. Thus, a natural experiment is not a controlled experiment...

  • Natural exponential family
    Natural exponential family
    In probability and statistics, the natural exponential family is a class of probability distributions that is a special case of an exponential family...

  • Natural process variation
  • NCSS (statistical software)
  • Negative binomial distribution
    Negative binomial distribution
    In probability theory and statistics, the negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified number of failures occur...

  • Negative multinomial distribution
  • Negative predictive value
    Negative predictive value
    In statistics and diagnostic testing, the negative predictive value is a summary statistic used to describe the performance of a diagnostic testing procedure. It is defined as the proportion of subjects with a negative test result who are correctly diagnosed. A high NPV means that when the test...

  • Negative relationship
    Negative relationship
    In statistics, a relationship between two variables is negative if the slope in a corresponding graph is negative, or—what is in some contexts equivalent—if the correlation between them is negative...

  • Negentropy
    Negentropy
    The negentropy, also negative entropy or syntropy, of a living system is the entropy that it exports to keep its own entropy low; it lies at the intersection of entropy and life...

  • Neighbourhood components analysis
    Neighbourhood components analysis
    Neighbourhood components analysis is a supervised learning method for clustering multivariate data into distinct classes according to a given distance metric over the data...

  • Nelson rules
    Nelson rules
    Nelson rules are a method in process control of determining if some measured variable is out of control. Rules for detecting "out-of-control" or non-random conditions were first postulated by Walter A. Shewhart in the 1920s...

  • Nelson–Aalen estimator
    Nelson–Aalen estimator
    The Nelson–Aalen estimator is a non-parametric estimator of the cumulative hazard rate function in case of censored data or incomplete data. It is used in survival theory, reliability engineering and life insurance to estimate the cumulative number of expected events. An event can be a failure of a...

  • Nested case-control study
    Nested case-control study
    A nested case control study is a variation of a case-cohort study in which only a subset of controls from the cohort are compared to the incident cases. In a case-cohort study, all incident cases in the cohort are compared to a random subset of participants who do not develop the disease of interest...

  • Nested sampling algorithm
    Nested sampling algorithm
    The nested sampling algorithm is a computational approach to the problem of comparing models in Bayesian statistics, developed in 2004 by physicist John Skilling....

  • Network probability matrix
    Network Probability Matrix
    The network probability matrix describes the probability structure of a network based on the historical presence or absence of edges in a network. For example, individuals in a social network are not connected to other individuals with uniform random probability. The probability structure is much...

  • Neural network
    Neural network
    The term neural network was traditionally used to refer to a network or circuit of biological neurons. The modern usage of the term often refers to artificial neural networks, which are composed of artificial neurons or nodes...

  • Neutral vector
    Neutral vector
    In statistics, and specifically in the study of the Dirichlet distribution, a neutral vector of random variables is one that exhibits a particular type of statistical independence amongst its elements...

  • Newcastle–Ottawa scale
    Newcastle–Ottawa scale
    In statistics, the Newcastle–Ottawa scale is a method for assessing the quality of nonrandomised studies in meta-analyses. The scales allocate stars, maximum of nine, for quality of selection, comparability, exposure and outcome of study participants. The method was developed as a collaboration...

  • Newey–West estimator
    Newey–West estimator
    A Newey–West estimator is used in statistics and econometrics to provide an estimate of the covariance matrix of the parameters of a regression-type model when this model is applied in situations where the standard assumptions of regression analysis do not apply. It was devised by Whitney K. Newey...

  • Newman–Keuls method
    Newman–Keuls method
    In statistics, the Newman–Keuls method is a post-hoc test used for comparisons after the performed F-test is found to be significant...

  • Neyer d-optimal test
    Neyer d-optimal test
    The Neyer D-Optimal Test is one way of analyzing a sensitivity test of explosives as described by Barry T. Neyer in 1994. This method has replaced the earlier Bruceton analysis or "Up and Down Test" that was devised by Dixon and Mood in 1948 to allow computation with pencil and paper. Samples are...

  • Neyman construction
    Neyman construction
    Neyman construction is a frequentist method to construct an interval at a confidence level C such that, if we repeat the experiment many times, the interval will contain the true value a fraction C of the time. The probability that the interval contains the true value is called the coverage....

  • Neyman–Pearson lemma
  • Nicholson–Bailey model
  • Nominal category
    Nominal category
    A nominal category or a nominal group is a group of objects or ideas that can be collectively grouped on the basis of a shared, arbitrary characteristic....

  • Noncentral beta distribution
    Noncentral beta distribution
    In probability theory and statistics, the noncentral beta distribution is a continuous probability distribution that is a generalization of the beta distribution.- Probability density function :...

  • Noncentral chi distribution
  • Noncentral chi-squared distribution
  • Noncentral F-distribution
    Noncentral F-distribution
    In probability theory and statistics, the noncentral F-distribution is a continuous probability distribution that is a generalization of the F-distribution...

  • Noncentral hypergeometric distributions
    Noncentral hypergeometric distributions
    In statistics, the hypergeometric distribution is the discrete probability distribution generated by picking colored balls at random from an urn without replacement....

  • Noncentral t-distribution
    Noncentral t-distribution
    In probability and statistics, the noncentral t-distribution generalizes Student's t-distribution using a noncentrality parameter. Like the central t-distribution, the noncentral t-distribution is primarily used in statistical inference, although it may also be used in robust modeling for data...

  • Noncentrality parameter
    Noncentrality parameter
    Noncentrality parameters are parameters of families of probability distributions which are related to other "central" families of distributions. If the noncentrality parameter of a distribution is zero, the distribution is identical to a distribution in the central family...

  • Nonlinear autoregressive exogenous model
    Nonlinear autoregressive exogenous model
    In time series modeling, a nonlinear autoregressive exogenous model is a nonlinear autoregressive model which has exogenous inputs. This means that the model relates the current value of a time series which one would like to explain or predict to both:...

  • Nonlinear dimensionality reduction
    Nonlinear dimensionality reduction
    High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lies on an embedded non-linear manifold within the higher-dimensional space...

  • Non-linear iterative partial least squares
    Non-linear iterative partial least squares
    In statistics, non-linear iterative partial least squares is an algorithm for computing the first few components in a principal component or partial least squares analysis. For very high-dimensional datasets, such as those generated in the 'omics sciences, it is usually only necessary to compute...

  • Nonlinear regression
    Nonlinear regression
    In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables...

  • Non-homogeneous Poisson process
    Non-homogeneous Poisson process
    In probability theory, a non-homogeneous Poisson process is a Poisson process with rate parameter λ such that the rate parameter of the process is a function of time...

  • Non-linear least squares
    Non-linear least squares
    Non-linear least squares is the form of least squares analysis which is used to fit a set of m observations with a model that is non-linear in n unknown parameters. It is used in some forms of non-linear regression. The basis of the method is to approximate the model by a linear one and to...

  • Non-negative matrix factorization
  • Non-parametric statistics
    Non-parametric statistics
    In statistics, the term non-parametric statistics has at least two different meanings:The first meaning of non-parametric covers techniques that do not rely on data belonging to any particular distribution. These include, among others:...

  • Non-response bias
    Non-response bias
    Non-response bias occurs in statistical surveys if the answers of respondents differ from the potential answers of those who did not answer.- Example :...

  • Non-sampling error
    Non-sampling error
    In statistics, non-sampling error is a catch-all term for the deviations from the true value that are not a function of the sample chosen, including various systematic errors and any random errors that are not due to sampling. Non-sampling errors are much harder to quantify than sampling errors....

  • Nonparametric regression
    Nonparametric regression
    Nonparametric regression is a form of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data...

  • Nonprobability sampling
    Nonprobability sampling
    Sampling is the use of a subset of the population to represent the whole population. Probability sampling, or random sampling, is a sampling technique in which the probability of getting any particular sample may be calculated. Nonprobability sampling does not meet this criterion and should be...

  • Normal curve equivalent
    Normal curve equivalent
    In educational statistics, a normal curve equivalent (NCE), developed for the United States Department of Education by the RMC Research Corporation, is a way of standardizing scores received on a test. It is...

  • Normal distribution
  • Normal probability plot
    Normal probability plot
    The normal probability plot is a graphical technique for normality testing: assessing whether or not a data set is approximately normally distributed....

     – see also rankit
    Rankit
    In statistics, rankits of a set of data are the expected values of the order statistics of a sample from the standard normal distribution the same size as the data. They are primarily used in the normal probability plot, a graphical technique for normality testing.-Example:This is perhaps most...

  • Normal score
    Normal score
    The term normal score is used with two different meanings in statistics. One of them relates to creating a single value which can be treated as if it had arisen from a standard normal distribution...

     – see also rankit
    Rankit
    In statistics, rankits of a set of data are the expected values of the order statistics of a sample from the standard normal distribution the same size as the data. They are primarily used in the normal probability plot, a graphical technique for normality testing.-Example:This is perhaps most...

     and Z score
  • Normal variance-mean mixture
  • Normal-exponential-gamma distribution
    Normal-exponential-gamma distribution
    In probability theory and statistics, the normal-exponential-gamma distribution is a three-parameter family of continuous probability distributions...

  • Normal-gamma distribution
  • Normal-inverse Gaussian distribution
  • Normal-scaled inverse gamma distribution
  • Normality test
    Normality test
    In statistics, normality tests are used to determine whether a data set is well-modeled by a normal distribution or not, or to compute how likely an underlying random variable is to be normally distributed....

  • Normalization (statistics)
    Normalization (statistics)
    In one usage in statistics, normalization is the process of isolating statistical error in repeated measured data. A normalization is sometimes based on a property...

  • Normally distributed and uncorrelated does not imply independent
    Normally distributed and uncorrelated does not imply independent
    In probability theory, two random variables being uncorrelated does not imply their independence. In some contexts, uncorrelatedness implies at least pairwise independence....

  • Notation in probability and statistics
  • Novikov's condition
    Novikov's condition
    In probability theory, Novikov's condition is the sufficient condition for a stochastic process which takes the form of the Radon-Nikodym derivative in Girsanov's theorem to be a martingale...

  • np-chart
  • Null distribution
    Null distribution
    In statistical hypothesis testing, the null distribution is the probability distribution of the test statistic when the null hypothesis is true.In an F-test, the null distribution is an F-distribution....

  • Null hypothesis
    Null hypothesis
    The practice of science involves formulating and testing hypotheses, assertions that are capable of being proven false using a test of observed data. The null hypothesis typically corresponds to a general or default position...

  • Null result
    Null result
    In science, a null result is a result without the expected content: that is, the proposed result is absent. It is an experimental outcome which does not show an otherwise expected effect. This does not imply a result of zero or nothing, simply a result that does not support the hypothesis...

  • Nuisance parameter
  • Nuisance variable
    Nuisance variable
    In statistics, a nuisance parameter is any parameter which is not of immediate interest but which must be accounted for in the analysis of those parameters which are of interest...

  • Numerical data
    Numerical data
    Numerical data is data measured or identified on a numerical scale. Numerical data can be analyzed using statistical methods, and results can be displayed using tables, charts, histograms and graphs. For example, a researcher may ask a participant questions that include words such as how often, how...

  • Numerical methods for linear least squares
  • Numerical parameter
  • Numerical smoothing and differentiation
    Numerical smoothing and differentiation
    An experimental datum value can be conceptually described as the sum of a signal and some noise, but in practice the two contributions cannot be separated. The purpose of smoothing is to increase the Signal-to-noise ratio without greatly distorting the signal...

  • NumXL
    NumXL
    NumXL is an econometrics/time series analysis add-in for Microsoft Excel. Developed by Spider Financial, NumXL provides a wide variety of statistical and time series analysis techniques, including linear and nonlinear time series modeling, statistical tests and others...

     — software (Excel addin)
  • Nuremberg Code
    Nuremberg Code
    The Nuremberg Code is a set of research ethics principles for human experimentation set as a result of the Subsequent Nuremberg Trials at the end of the Second World War.-Background:...


O

  • Observable variable
    Observable variable
    In statistics, observable variables or manifest variables, as opposed to latent variables, are those variables that can be observed and directly measured....

  • Observational equivalence
    Observational equivalence
    In econometrics, two parameter values are considered observationally equivalent if they both result in the same probability distribution of observable data...

  • Observational error
    Observational error
    Observational error is the difference between a measured value of a quantity and its true value. In statistics, an error is not a "mistake". Variability is an inherent part of things being measured and of the measurement process....

  • Observational study
    Observational study
    In epidemiology and statistics, an observational study draws inferences about the possible effect of a treatment on subjects, where the assignment of subjects into a treated group versus a control group is outside the control of the investigator...

  • Observed information
    Observed information
    In statistics, the observed information, or observed Fisher information, is the negative of the second derivative of the "log-likelihood"...

  • Occupancy frequency distribution
    Occupancy frequency distribution
    In macroecology and community ecology, an occupancy frequency distribution is the distribution of the numbers of species occupying different numbers of areas. It was first reported in 1918 by the Danish botanist Christen C. Raunkiær in his study on plant communities...

  • Odds
    Odds
    The odds in favor of an event or a proposition are expressed as the ratio of a pair of integers, which is the ratio of the probability that an event will happen to the probability that it will not happen...

  • Odds algorithm
    Odds algorithm
    The odds-algorithm is a mathematical method for computing optimal strategies for a class of problems that belong to the domain of optimal stopping problems. Their solution follows from the odds-strategy, and the importance of the...

  • Odds ratio
    Odds ratio
    The odds ratio is a measure of effect size, describing the strength of association or non-independence between two binary data values. It is used as a descriptive statistic, and plays an important role in logistic regression...
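
    A minimal worked example in Python (the 2 x 2 counts are hypothetical):

        def odds_ratio(a, b, c, d):
            # Rows: exposed (a events, b non-events), unexposed (c events, d non-events).
            # OR = (a/b) / (c/d) = (a*d) / (b*c).
            return (a * d) / (b * c)

        print(odds_ratio(20, 80, 10, 90))  # 2.25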

  • Official statistics
    Official statistics
    Official statistics are statistics published by government agencies or other public bodies such as international organizations. They provide quantitative or qualitative information on all major areas of citizens' lives, such as economic and social development, living conditions, health, education,...

  • Ogden tables
    Ogden tables
    Ogden tables are a set of statistical tables and other information for use in court cases in the UK.Their purpose is to make it easier to calculate future losses in personal injury and fatal accident cases. The tables take into account life expectancy and provide a range of discount rates from...

  • Ogive
    Ogive
    An ogive is the roundly tapered end of a two-dimensional or three-dimensional object.-Applied physical science and engineering:In ballistics or aerodynamics, an ogive is a pointed, curved surface mainly used to form the approximately streamlined nose of a bullet or other projectile.The traditional...

  • Omitted-variable bias
    Omitted-variable bias
    In statistics, omitted-variable bias occurs when a model is created which incorrectly leaves out one or more important causal factors. The 'bias' is created when the model compensates for the missing factor by over- or under-estimating one of the other factors.More specifically, OVB is the bias...

  • Omnibus test
    Omnibus test
    Omnibus tests are a kind of statistical test. They test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall. One example is the F-test in the analysis of variance. There can be legitimate significant effects within a model even if the...

  • One-class classification
    One-class classification
    One-class classification tries to distinguish one class of objects from all other possible objects, by learning from a training set containing only the objects of that class. This is different from and more difficult than the traditional classification problem, which tries to distinguish between...

  • One-factor-at-a-time method
    One-factor-at-a-time method
    The one-factor-at-a-time method is a method of designing experiments involving the testing of factors, or causes, one at a time instead of all simultaneously. Prominent text books and academic papers currently favor factorial experimental designs, a method pioneered by Sir Ronald A. Fisher, where...

  • One-tailed test — redirects to two-tailed test
    Two-tailed test
    The two-tailed test is a statistical test used in inference, in which a given statistical hypothesis, H0 , will be rejected when the value of the test statistic is either sufficiently small or sufficiently large...

  • One-way ANOVA
    One-way ANOVA
    In statistics, one-way analysis of variance is a technique used to compare means of two or more samples. This technique can be used only for numerical data....
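
    A short hedged example in Python using SciPy's implementation (the three groups of measurements are made up):

        from scipy.stats import f_oneway

        group_a = [6.1, 5.8, 6.4, 6.0]
        group_b = [7.2, 7.0, 6.9, 7.4]
        group_c = [5.5, 5.9, 5.7, 5.6]

        f_statistic, p_value = f_oneway(group_a, group_b, group_c)
        print(f_statistic, p_value)  # a large F and a small p suggest unequal means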

  • Online NMF
    Online NMF
    Online NMF is a recently developed method for real time data analysis in an online context. Non-negative matrix factorization in the past has been used for static data analysis and pattern recognition...

     Online Non-negative Matrix Factorisation
  • Open-label trial
    Open-label trial
    An open-label trial or open trial is a type of clinical trial in which both the researchers and participants know which treatment is being administered....

  • OpenEpi
    OpenEpi
    OpenEpi is a free, web-based, open source, operating system-independent series of programs for use in epidemiology, biostatistics, public health, and medicine, providing a number of epidemiologic and statistical tools for summary data. OpenEpi was developed in JavaScript and HTML, and can be run in...

     – software
  • OpenBUGS
    OpenBUGS
    OpenBUGS is computer software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo methods. OpenBUGS is the open source variant of WinBUGS. It runs under Windows and Linux, as well as from inside the R statistical package...

      – software
  • Operational confound
  • Operational sex ratio
    Operational sex ratio
    In the evolutionary biology of sexual reproduction, the operational sex ratio is the ratio of sexually competing males that are ready to mate to sexually competing females that are ready to mate...

  • Operations research
    Operations research
    Operations research is an interdisciplinary mathematical science that focuses on the effective use of technology by organizations...

  • Opinion poll
    Opinion poll
    An opinion poll, sometimes simply referred to as a poll is a survey of public opinion from a particular sample. Opinion polls are usually designed to represent the opinions of a population by conducting a series of questions and then extrapolating generalities in ratio or within confidence...

  • Optimal decision
    Optimal decision
    An optimal decision is a decision such that no other available decision options will lead to a better outcome. It is an important concept in decision theory. In order to compare the different decision outcomes, one commonly assigns a relative utility to each of them...

  • Optimal design
    Optimal design
    Optimal designs are a class of experimental designs that are optimal with respect to some statistical criterion. In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated without bias and with minimum variance...

  • Optimal discriminant analysis
    Optimal discriminant analysis
    Optimal discriminant analysis and the related classification tree analysis are statistical methods that maximize predictive accuracy...

  • Optimal matching
    Optimal matching
    Optimal matching is a sequence analysis method used in social science, to assess the dissimilarity of ordered arrays of tokens that usually represent a time-ordered sequence of socio-economic states two individuals have experienced. Once such distances have been calculated for a set of observations...

  • Optimal stopping
    Optimal stopping
    In mathematics, the theory of optimal stopping is concerned with the problem of choosing a time to take a particular action, in order to maximise an expected reward or minimise an expected cost. Optimal stopping problems can be found in areas of statistics, economics, and mathematical finance...

  • Optimality criterion
    Optimality criterion
    In statistics, an optimality criterion provides a measure of the fit of the data to a given hypothesis. The selection process is determined by the solution that optimizes the criteria used to evaluate the alternative hypotheses...

  • Optional stopping theorem
    Optional stopping theorem
    In probability theory, the optional stopping theorem says that, under certain conditions, the expected value of a martingale at a stopping time is equal to its initial value...

  • Order of a kernel
    Order of a kernel
    The order of a kernel is the order of its first non-zero moment....

  • Order of integration
    Order of integration
    Order of integration, denoted I(d), is a summary statistic for a time series. It reports the minimum number of differences required to obtain a stationary series...
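
    A rough sketch of how the order might be estimated in practice: difference the series until an augmented Dickey–Fuller test (here statsmodels' adfuller, with an arbitrary 0.05 cutoff) no longer indicates a unit root. The random-walk series is synthetic:

      # Estimate the order of integration d by repeated differencing.
      import numpy as np
      from statsmodels.tsa.stattools import adfuller

      rng = np.random.default_rng(0)
      series = np.cumsum(rng.normal(size=500))   # a random walk, so d should come out as 1

      d, x = 0, series
      while adfuller(x)[1] > 0.05 and d < 3:     # element 1 of the result is the p-value
          x = np.diff(x)
          d += 1
      print("estimated order of integration:", d)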

  • Order statistic
    Order statistic
    In statistics, the kth order statistic of a statistical sample is equal to its kth-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference....

  • Ordered logit
    Ordered logit
    In statistics, the ordered logit model , is a regression model for ordinal dependent variables...

  • Ordered probit
    Ordered probit
    In statistics, ordered probit is a generalization of the popular probit analysis to the case of more than two outcomes of an ordinal dependent variable. Similarly, the popular logit method also has a counterpart ordered logit....

  • Ordered subset expectation maximization
  • Ordinary least squares
    Ordinary least squares
    In statistics, ordinary least squares or linear least squares is a method for estimating the unknown parameters in a linear regression model. This method minimizes the sum of squared vertical distances between the observed responses in the dataset and the responses predicted by the linear...
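
    A minimal sketch of the estimator, choosing the coefficients that minimize the sum of squared residuals ||y - Xb||^2; the data are synthetic and generated only to illustrate the call:

      # Ordinary least squares with an intercept, via numpy's least-squares solver.
      import numpy as np

      rng = np.random.default_rng(1)
      n = 200
      x = rng.normal(size=n)
      y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)

      X = np.column_stack([np.ones(n), x])             # design matrix with intercept column
      beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
      print("estimated intercept and slope:", beta_hat)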

  • Ordination (statistics)
    Ordination (statistics)
    In multivariate analysis, ordination is a method complementary to data clustering, and used mainly in exploratory data analysis . Ordination orders objects that are characterized by values on multiple variables so that similar objects are near each other and dissimilar objects are farther from...

  • Ornstein–Uhlenbeck process
  • Orthogonal array testing
  • Orthogonality
    Orthogonality
    Orthogonality occurs when two things can vary independently, they are uncorrelated, or they are perpendicular. In mathematics, two vectors are orthogonal if they are perpendicular, i.e., they form a right angle...

  • Orthogonality principle
    Orthogonality principle
    In statistics and signal processing, the orthogonality principle is a necessary and sufficient condition for the optimality of a Bayesian estimator. Loosely stated, the orthogonality principle says that the error vector of the optimal estimator is orthogonal to any possible estimator...

  • Outlier
    Outlier
    In statistics, an outlier is an observation that is numerically distant from the rest of the data. Grubbs defined an outlier as: An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs....

  • Outliers in statistics – redirects to Robust statistics
    Robust statistics
    Robust statistics provides an alternative approach to classical statistical methods. The motivation is to produce estimators that are not unduly affected by small departures from model assumptions....

     (section)
  • Outliers ratio
    Outliers Ratio
    In objective video quality assessment, the outliers ratio is a measure of the performance of an objective video quality metric. It is the ratio of "false" scores given by the objective metric to the total number of scores. The "false" scores are those that lie outside an interval defined in terms of the MOS...

  • Outline of probability
    Outline of probability
    Probability is the likelihood or chance that something is the case or will happen. Probability theory is used extensively in statistics, mathematics, science and philosophy to draw conclusions about the likelihood of potential events and the underlying mechanics of complex systems.The following...

  • Outline of regression analysis
    Outline of regression analysis
    In statistics, regression analysis includes any technique for learning about the relationship between one or more dependent variables Y and one or more independent variables X....

  • Outline of statistics
  • Overdispersion
    Overdispersion
    In statistics, overdispersion is the presence of greater variability in a data set than would be expected based on a given simple statistical model....

  • Overfitting
    Overfitting
    In statistics, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations...

  • Owen's T function
  • OxMetrics
    OxMetrics
    OxMetrics is an econometric software including the Ox programming language for econometrics and statistics, developed by Jurgen Doornik and David Hendry...

     – software

P

  • p-chart
  • p-rep
    P-rep
    In statistical hypothesis testing, p-rep or prep has been proposed as a statistical alternative to the classic p-value. Whereas a p-value is the probability of obtaining a result under the null hypothesis, p-rep computes the probability of replicating an effect...

  • P-value
    P-value
    In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. One often "rejects the null hypothesis" when the p-value is less than the significance level α ,...
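
    A small illustration of that definition for a two-sided test whose statistic has a standard normal null distribution; the observed value of 2.1 is made up:

      # Two-sided p-value for an observed z statistic under a standard normal null.
      from scipy.stats import norm

      z_observed = 2.1
      p_value = 2 * norm.sf(abs(z_observed))       # sf is the upper-tail probability, 1 - cdf
      print(f"two-sided p-value: {p_value:.4f}")   # roughly 0.036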

  • P-P plot
  • Page's trend test
    Page's trend test
    In statistics, the Page test for multiple comparisons between ordered correlated variables is the counterpart of Spearman's rank correlation coefficient which summarizes the association of continuous variables. It is also known as Page's trend test or Page's L test...

  • Paid survey
    Paid survey
    A paid or incentivized survey is a type of statistical survey where the participants/members are rewarded through an incentive program, generally entry into a sweepstakes program or a small cash reward, for completing one or more surveys....

  • Paired comparison analysis
    Paired comparison analysis
    In paired-comparison analysis, also known as paired-choice analysis, a range of options are compared and the results are tallied to find an overall winner. A range of plausible options is listed. Each option is compared against each of the other options, determining the preferred option in each case...

  • Paired difference test
    Paired difference test
    In statistics, a paired difference test is a type of location test that is used when comparing two sets of measurements to assess whether their population means differ...

  • Pairwise comparison
    Pairwise comparison
    Pairwise comparison generally refers to any process of comparing entities in pairs to judge which entity in each pair is preferred, or has a greater amount of some quantitative property. The method of pairwise comparison is used in the scientific study of preferences, attitudes, voting systems, social...

  • Pairwise independence
    Pairwise independence
    In probability theory, a pairwise independent collection of random variables is a set of random variables any two of which are independent. Any collection of mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent...

  • Panel analysis
    Panel analysis
    Panel analysis is a statistical method, widely used in social science, epidemiology, and econometrics, which deals with two-dimensional panel data. The data are usually collected over time and over the same individuals, and then a regression is run over these two dimensions...

  • Panel data
    Panel data
    In statistics and econometrics, the term panel data refers to multi-dimensional data. Panel data contains observations on multiple phenomena observed over multiple time periods for the same firms or individuals....

  • Panjer recursion
    Panjer recursion
    The Panjer recursion is an algorithm to compute the probability distribution of a compound random variable S = X_1 + X_2 + ... + X_N, where both N and the X_i are random variables of special types. In more general cases the distribution of S is a compound distribution. The recursion for the special cases considered was...

     – a class of discrete compound distributions
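
    A short sketch of the recursion for one special case, a Poisson claim count with probabilities f[0], f[1], ... for integer-valued claim sizes; the severity distribution below is made up for illustration:

      # Panjer recursion for a compound Poisson sum S = X_1 + ... + X_N:
      # g[0] = exp(lam * (f[0] - 1)) and, for s >= 1,
      # g[s] = (lam / s) * sum_{j=1..s} j * f[j] * g[s - j].
      import math

      def compound_poisson_pmf(lam, f, s_max):
          g = [math.exp(lam * (f[0] - 1.0))]
          for s in range(1, s_max + 1):
              top = min(s, len(f) - 1)
              g.append(lam / s * sum(j * f[j] * g[s - j] for j in range(1, top + 1)))
          return g

      severity = [0.1, 0.5, 0.3, 0.1]              # illustrative pmf of each X_i on {0,1,2,3}
      print(compound_poisson_pmf(lam=2.0, f=severity, s_max=10))
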
  • Paleostatistics
    Paleostatistics
    Paleontology often faces phenomena so vast and complex they can be described only through statistics. First applied to the study of a population in 1662, statistics is today a basic tool for natural sciences practitioners, and a solid acquaintance with its methods and applications is essential for...

  • Paley–Zygmund inequality
    Paley–Zygmund inequality
    In mathematics, the Paley–Zygmund inequality bounds the probability that a positive random variable is small, in terms of its mean and variance...
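
    For reference, the standard statement (in LaTeX notation): if Z >= 0 has finite variance and 0 <= theta <= 1, then

      \Pr\left(Z > \theta\,\mathrm{E}[Z]\right) \;\ge\; (1-\theta)^2 \,\frac{\mathrm{E}[Z]^2}{\mathrm{E}[Z^2]}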

  • Parabolic fractal distribution
    Parabolic fractal distribution
    In probability and statistics, the parabolic fractal distribution is a type of discrete probability distribution in which the logarithm of the frequency or size of entities in a population is a quadratic polynomial of the logarithm of the rank...

  • PARAFAC (parallel factor analysis)
  • Parallel factor analysis redirects to PARAFAC
  • Paradigm (experimental)
    Paradigm (experimental)
    In the behavioural sciences, e.g. Psychology, Biology, Neurosciences, an experimental paradigm is an experimental setup that is defined by certain fine-tuned standards and often has a theoretical background...

  • Parameter identification problem
    Parameter identification problem
    The parameter identification problem is a problem which can occur in the estimation of multiple-equation econometric models where the equations have variables in common....

  • Parameter space
    Parameter space
    In science, a parameter space is the set of values of parameters encountered in a particular mathematical model. Often the parameters are inputs of a function, in which case the technical term for the parameter space is domain of a function....

  • Parametric family
    Parametric family
    In mathematics and its applications, a parametric family or a parameterized family is a family of objects whose definitions depend on a set of parameters....

  • Parametric model
    Parametric model
    In statistics, a parametric model or parametric family or finite-dimensional model is a family of distributions that can be described using a finite number of parameters...

  • Parametric statistics
    Parametric statistics
    Parametric statistics is a branch of statistics that assumes that the data has come from a type of probability distribution and makes inferences about the parameters of the distribution. Most well-known elementary statistical methods are parametric....

  • Pareto analysis
    Pareto analysis
    Pareto analysis is a statistical technique in decision making that is used for selection of a limited number of tasks that produce significant overall effect. It uses the Pareto principle – the idea that by doing 20% of work, 80% of the advantage of doing the entire job can be generated...

  • Pareto chart
  • Pareto distribution
  • Pareto index
    Pareto index
    In economics the Pareto index, named after the Italian economist and sociologist Vilfredo Pareto, is a measure of the breadth of income or wealth distribution. It is one of the parameters specifying a Pareto distribution and embodies the Pareto principle...

  • Pareto interpolation
    Pareto interpolation
    Pareto interpolation is a method of estimating the median and other properties of a population that follows a Pareto distribution. It is used in economics when analysing the distribution of incomes in a population, when one must base estimates on a relatively small random sample taken from the...

  • Pareto principle
    Pareto principle
    The Pareto principle states that, for many events, roughly 80% of the effects come from 20% of the causes.Business-management consultant Joseph M...

  • Partial autocorrelation — redirects to Partial autocorrelation function
    Partial autocorrelation function
    In time series analysis, the partial autocorrelation function plays an important role in data analyses aimed at identifying the extent of the lag in an autoregressive model...

  • Partial autocorrelation function
    Partial autocorrelation function
    In time series analysis, the partial autocorrelation function plays an important role in data analyses aimed at identifying the extent of the lag in an autoregressive model...

  • Partial correlation
    Partial correlation
    In probability theory and statistics, partial correlation measures the degree of association between two random variables, with the effect of a set of controlling random variables removed....

  • Partial least squares
  • Partial least squares regression
    Partial least squares regression
    Partial least squares regression is a statistical method that bears some relation to principal components regression; instead of finding hyperplanes of maximum variance between the response and independent variables, it finds a linear regression model by projecting the predicted variables and the...

  • Partial leverage
  • Partial regression plot
    Partial regression plot
    In applied statistics, a partial regression plot attempts to show the effect of adding an additional variable to the model...

  • Partial residual plot
    Partial residual plot
    In applied statistics, a partial residual plot is a graphical technique that attempts to show the relationship between a given independent variable and the response variable given that other independent variables are also in the model....

  • Particle filter
    Particle filter
    In statistics, particle filters, also known as Sequential Monte Carlo methods, are sophisticated model estimation techniques based on simulation...

  • Partition of sums of squares
  • Parzen window
  • Path analysis (statistics)
  • Path coefficient
  • Path space
    Path space
    In mathematics, the term path space refers to any topological space of paths from one specified set into another. In particular, it may refer to the classical Wiener space of continuous paths, or the Skorokhod space of càdlàg paths....

  • Pattern recognition
    Pattern recognition
    In machine learning, pattern recognition is the assignment of some sort of output value to a given input value , according to some specific algorithm. An example of pattern recognition is classification, which attempts to assign each input value to one of a given set of classes...

  • Pearson's chi-squared test
    Pearson's chi-squared test
    Pearson's chi-squared test is the best-known of several chi-squared tests – statistical procedures whose results are evaluated by reference to the chi-squared distribution. Its properties were first investigated by Karl Pearson in 1900...

     (one of various chi-squared tests)
  • Pearson distribution
    Pearson distribution
    The Pearson distribution is a family of continuous probability distributions. It was first published by Karl Pearson in 1895 and subsequently extended by him in 1901 and 1916 in a series of articles on biostatistics....

  • Pearson product-moment correlation coefficient
    Pearson product-moment correlation coefficient
    In statistics, the Pearson product-moment correlation coefficient is a measure of the correlation between two variables X and Y, giving a value between +1 and −1 inclusive...
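
    For reference, the usual sample formula (standard notation, not quoted from the entry):

      r_{xy} \;=\; \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\;\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}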

  • People v. Collins
    People v. Collins
    The People of the State of California v. Collins was a 1968 jury trial in California, USA that made notorious forensic use of mathematics and probability....

     (prob/stats related court case)
  • Per capita
    Per capita
    Per capita is a Latin prepositional phrase, formed from per and capita. The phrase thus means "by heads" or "for each head", i.e. per individual or per person...

  • Per-comparison error rate
  • Per-protocol analysis
    Per-protocol analysis
    In epidemiology, per-protocol analysis is a strategy of analysis in which only patients who complete the entire clinical trial are counted towards the final results. Intention to treat analysis uses data from all patients, including those who did not complete the study....

  • Percentile
    Percentile
    In statistics, a percentile is the value of a variable below which a certain percent of observations fall. For example, the 20th percentile is the value below which 20 percent of the observations may be found...

  • Percentile rank
    Percentile rank
    The percentile rank of a score is the percentage of scores in its frequency distribution that are the same or lower than it. For example, a test score that is greater than 75% of the scores of people taking the test is said to be at the 75th percentile....
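
    A tiny sketch of that computation ("the same or lower"), with made-up scores; some texts instead use a strictly-below or midpoint convention:

      # Percentile rank: percentage of scores at or below a given value.
      def percentile_rank(scores, value):
          at_or_below = sum(1 for s in scores if s <= value)
          return 100.0 * at_or_below / len(scores)

      scores = [55, 60, 62, 70, 71, 75, 80, 85, 90, 95]
      print(percentile_rank(scores, 75))   # 60.0, since 6 of the 10 scores are <= 75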

  • Periodic variation — redirects to Seasonality
    Seasonality
    In statistics, many time series exhibit cyclic variation known as seasonality, periodic variation, or periodic fluctuations. This variation can be either regular or semi regular....

  • Periodogram
    Periodogram
    The periodogram is an estimate of the spectral density of a signal. The term was coined by Arthur Schuster in 1898 as in the following quote:...

  • Peirce's criterion
    Peirce's criterion
    In robust statistics, Peirce's criterion is a rule for eliminating outliers from data sets, which was devised by Benjamin Peirce....

  • Pensim2
    Pensim2
    Pensim2 is a dynamic microsimulation model to simulate the income of pensioners, owned by the British Department for Work and Pensions.Pensim2 is the second version of Pensim which was developed in the 1990s. The time horizon of the model is 100 years, by which time today's school leavers will...

     — an econometric model
  • Percentage point
    Percentage point
    Percentage points are the unit for the arithmetic difference of two percentages. Consider the following hypothetical example: in 1980, 40 percent of the population smoked, and in 1990 only 30 percent smoked...

  • Permutation test — redirects to Resampling (statistics)
    Resampling (statistics)
    In statistics, resampling is any of a variety of methods for doing one of the following: (1) estimating the precision of sample statistics by using subsets of available data or drawing randomly with replacement from a set of data points; (2) exchanging labels on data points when performing significance...

  • Pharmaceutical statistics
    Pharmaceutical Statistics
    Pharmaceutical Statistics is a peer-reviewed scientific journal that publishes papers related to pharmaceutical statistics. It is the official journal of Statisticians in the Pharmaceutical Industry and is published by John Wiley & Sons....

  • Phase dispersion minimization
    Phase dispersion minimization
    Phase dispersion minimization is a data analysis technique that searches for periodic components of a time series data set. It is useful for data sets with gaps, non-sinusoidal variations, poor time coverage or other problems that would make Fourier techniques unusable...

  • Phase-type distribution
    Phase-type distribution
    A phase-type distribution is a probability distribution that results from a system of one or more inter-related Poisson processes occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stochastic process. The distribution can be represented by a random...

  • Phi coefficient
    Phi coefficient
    In statistics, the phi coefficient is a measure of association for two binary variables introduced by Karl Pearson. This measure is similar to the Pearson correlation coefficient in its interpretation...

  • Phillips–Perron test
  • Philosophy of probability
  • Philosophy of statistics
    Philosophy of statistics
    The philosophy of statistics involves the meaning, justification, utility, use and abuse of statistics and its methodology, and ethical and epistemological issues involved in the consideration of choice and interpretation of data and methods of Statistics....

  • Pie chart
    Pie chart
    A pie chart is a circular chart divided into sectors, illustrating proportion. In a pie chart, the arc length of each sector is proportional to the quantity it represents. When angles are measured in turns, a percentage corresponds to the same number of centiturns...

  • Pignistic probability
    Pignistic probability
    Pignistic probability, in decision theory, is a probability that a rational person will assign to an option when required to make a decision.A person may have, at one level certain beliefs or a lack of knowledge, or uncertainty, about the options and their actual likelihoods...

  • Pinsker's inequality
    Pinsker's inequality
    In information theory, Pinsker's inequality, named after its inventor Mark Semenovich Pinsker, is an inequality that relates Kullback-Leibler divergence and the total variation distance...
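
    One standard form of the inequality (in LaTeX notation), with the Kullback–Leibler divergence measured in nats and \delta(P,Q) the total variation distance:

      \delta(P,Q) \;\le\; \sqrt{\tfrac{1}{2}\,D_{\mathrm{KL}}(P\,\|\,Q)}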

  • Pitman–Koopman–Darmois theorem
  • Pitman–Yor process
  • Pivotal quantity
    Pivotal quantity
    In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters whose probability distribution does not depend on unknown parameters....

  • Placebo-controlled study
  • Plackett–Burman design
  • Plate notation
    Plate notation
    Plate notation is a method of representing variables that repeat in a graphical model. Instead of drawing each repeated variable individually, a plate or rectangle is used to group variables into a subgraph that repeat together, and a number is drawn on the plate to represent the number of...

  • Player wins
    Player wins
    Player wins is a statistic, developed by Dean Oliver, the first full-time statistical analyst in the NBA, used to estimate the number of games a player won for his team. The formula used to calculate player wins is Player Games * Player Winning Percentage....

  • Plot (graphics)
    Plot (graphics)
    A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables. The plot can be drawn by hand or by a mechanical or electronic plotter. Graphs are a visual representation of the relationship between variables, very useful for...

  • Pocock boundary
    Pocock boundary
    The Pocock boundary is a method for determining whether to stop a clinical trial prematurely. The typical clinical trial compares two groups of patients. One group is given a placebo or conventional treatment, while the other group of patients is given the treatment that is being tested...

  • Poincaré plot
    Poincaré plot
    A Poincaré plot, named after Henri Poincaré, is used to quantify self-similarity in processes, usually periodic functions. It is also known as a return map.Given a time series of the form...

  • Point-biserial correlation coefficient
    Point-biserial correlation coefficient
    The point biserial correlation coefficient is a correlation coefficient used when one variable is dichotomous; Y can either be "naturally" dichotomous, like gender, or an artificially dichotomized variable. In most situations it is not advisable to artificially dichotomize variables...

  • Point estimation
    Point estimation
    In statistics, point estimation involves the use of sample data to calculate a single value which is to serve as a "best guess" or "best estimate" of an unknown population parameter....

  • Point pattern analysis
  • Point process
    Point process
    In statistics and probability theory, a point process is a type of random process for which any one realisation consists of a set of isolated points either in time or geographical space, or in even more general spaces...

  • Poisson binomial distribution
  • Poisson distribution
    Poisson distribution
    In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since...
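
    For reference, the probability mass function, where \lambda is the expected number of events in the interval:

      \Pr(X = k) \;=\; \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots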

  • Poisson hidden Markov model
    Poisson hidden Markov model
    In statistics, Poisson hidden Markov models are a special case of hidden Markov models where a Poisson process has a rate which varies in association with changes between the different states of a Markov model...

  • Poisson limit theorem
    Poisson limit theorem
    The Poisson limit theorem gives a Poisson approximation to the binomial distribution, under certain conditions. The theorem was named after Siméon-Denis Poisson....

  • Poisson process
    Poisson process
    A Poisson process, named after the French mathematician Siméon-Denis Poisson, is a stochastic process in which events occur continuously and independently of one another...

  • Poisson regression
    Poisson regression
    In statistics, Poisson regression is a form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown...
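
    The model described above, written out in the usual notation (x_1, ..., x_p the explanatory variables, \beta the unknown coefficients):

      \log \mathrm{E}[Y \mid x] \;=\; \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p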

  • Poisson random numbers — redirects to section of Poisson distribution
    Poisson distribution
    In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since...

  • Poisson sampling
    Poisson sampling
    In the theory of finite population sampling, Poisson sampling is a sampling process where each element of the population that is sampled is subjected to an independent Bernoulli trial which determines whether the element becomes part of the sample during the drawing of a single sample.Each element...

  • Polar distribution — redirects to Circular distribution
  • Policy capturing
    Policy capturing
    Policy capturing or "the PC technique" is a statistical method used in social psychology to quantify the relationship between a person's judgement and the information that was used to make that judgement. Policy capturing assessments rely upon regression analysis models...

  • Political forecasting
    Political forecasting
    Political forecasting aims at predicting the outcome of elections, and most models draw on opinion polls, which are an integral part of political forecasting. However, incorporating poll results into political forecasting models can cause problems in predicting the outcome of elections...

  • Pollaczek–Khinchine formula
  • Pollyanna Creep
    Pollyanna creep
    Pollyanna Creep is a phrase that originated with John Williams, a California-based economic analyst and statistician. It describes the way the U.S. government has modified the way important economic measures are calculated with the purpose of giving a better impression of economic development. This...

  • Poly-Weibull distribution
    Poly-Weibull distribution
    In probability theory and statistics, the poly-Weibull distribution is a continuous probability distribution. It is defined as the distribution of the smallest of a number of statistically independent random variables having non-identical Weibull...

  • Polychoric correlation
    Polychoric correlation
    In statistics, polychoric correlation is a technique for estimating the correlation between two theorised normally distributed continuous latent variables, from two observed ordinal variables. Tetrachoric correlation is a special case of the polychoric correlation applicable when both observed...

  • Polynomial and rational function modeling
    Polynomial and rational function modeling
    In statistical modeling, polynomial functions and rational functions are sometimes used as an empirical technique for curve fitting. A polynomial function is one that has the form...

  • Polynomial chaos
    Polynomial chaos
    Polynomial chaos, also called "Wiener chaos expansion", is a non-sampling-based method to determine the evolution of uncertainty in a dynamical system when there is probabilistic uncertainty in the system parameters....

  • Polynomial regression
    Polynomial regression
    In statistics, polynomial regression is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth order polynomial...

  • Polytree
    Polytree
    In graph theory, a polytree is a directed graph with at most one undirected path between any two vertices. In other words, a polytree is a directed acyclic graph for which there are no undirected cycles either...

      (Bayesian networks)
  • Pooled standard deviation redirects to Pooled variance
    Pooled variance
    In statistics, data are often collected for a dependent variable, y, over a range of values for the independent variable, x. For example, the observation of fuel consumption might be studied as a function of engine speed while the engine load is held constant...
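
    The textbook pooled estimate for k groups with sizes n_i and sample variances s_i^2, added here for reference:

      s_p^2 \;=\; \frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^2}{\sum_{i=1}^{k} (n_i - 1)}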

  • Pooling design
    Pooling design
    A pooling design is an algorithm to intelligently classify items by testing them in groups or pools rather than individually. The result from the pools is usually binary — either positive or negative. A negative result can imply that all the items tested in that pool were failures, if the...

  • Popoviciu's inequality on variances
    Popoviciu's inequality on variances
    In probability theory, Popoviciu's inequality, named after Tiberiu Popoviciu, is an upper bound on the variance of any bounded probability distribution. Let M and m be upper and lower bounds on the values of any random variable with a particular probability distribution...
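
    The inequality itself, for a random variable X taking values in [m, M]:

      \operatorname{Var}(X) \;\le\; \tfrac{1}{4}(M - m)^2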

  • Population
    Statistical population
    A statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. For example, if we were interested in generalizations about crows, then we would describe the set of crows that is of interest...

  • Population dynamics
    Population dynamics
    Population dynamics is the branch of life sciences that studies short-term and long-term changes in the size and age composition of populations, and the biological and environmental processes influencing those changes...

  • Population ecology
    Population ecology
    Population ecology is a sub-field of ecology that deals with the dynamics of species populations and how these populations interact with the environment. It is the study of how the population sizes of species living together in groups change over time and space....

      – application
  • Population modeling
    Population modeling
    A population model is a type of mathematical model that is applied to the study of population dynamics.Models allow a better understanding of how complex interactions and processes work. Modeling of dynamic interactions in nature can provide a manageable way of understanding how numbers change over...

  • Population process
    Population process
    In applied probability, a population process is a Markov chain in which the state of the chain is analogous to the number of individuals in a population , and changes to the state are analogous to the addition or removal of individuals from the population.Although named by analogy to biological...

  • Population pyramid
    Population pyramid
    A population pyramid, also called an age structure diagram, is a graphical illustration that shows the distribution of various age groups in a population , which forms the shape of a pyramid when the population is growing...

  • Population statistics
    Population statistics
    Population statistics is the use of statistics to analyze characteristics or changes to a population. It is related to social demography and demography.Population statistics can analyze anything from global demographic changes to local small scale changes...

  • Population variance
  • Population viability analysis
    Population viability analysis
    Population viability analysis is a species-specific method of risk assessment frequently used in conservation biology. It is traditionally defined as the process that determines the probability that a population will go extinct within a given number of years. More recently, PVA has been described...

  • Portmanteau test
    Portmanteau test
    A portmanteau test is a type of statistical hypothesis test in which the null hypothesis is well specified, but the alternative hypothesis is more loosely specified. Tests constructed in this context can have the property of being at least moderately powerful against a wide range of departures from...

  • Positive predictive value
    Positive predictive value
    In statistics and diagnostic testing, the positive predictive value, or precision rate is the proportion of subjects with positive test results who are correctly diagnosed. It is a critical measure of the performance of a diagnostic method, as it reflects the probability that a positive test...

  • Post-hoc analysis
    Post-hoc analysis
    Post-hoc analysis , in the context of design and analysis of experiments, refers to looking at the data—after the experiment has concluded—for patterns that were not specified a priori. It is sometimes called by critics data dredging to evoke the sense that the more one looks the more likely...

  • Posterior probability
    Posterior probability
    In Bayesian statistics, the posterior probability of a random event or an uncertain proposition is the conditional probability that is assigned after the relevant evidence is taken into account...

  • Power law
    Power law
    A power law is a special kind of mathematical relationship between two quantities. When the frequency of an event varies as a power of some attribute of that event , the frequency is said to follow a power law. For instance, the number of cities having a certain population size is found to vary...

  • Power transform
    Power transform
    In statistics, the power transform is from a family of functions that are applied to create a rank-preserving transformation of data using power functions. This is a useful data processing technique used to stabilize variance, make the data more normal distribution-like, improve the correlation...

  • Prais–Winsten estimation
  • Pre- and post-test probability
    Pre- and post-test probability
    Pre-test probability and post-test probability are the subjective probabilities of the presence of a condition before and after a diagnostic test, respectively...

  • Precision (statistics)
    Precision (statistics)
    In statistics, the term precision can mean a specifically defined quantity, most commonly the reciprocal of the variance. This is in addition to its more general meaning in the contexts of accuracy and precision and of precision and recall....

  • Precision and recall
    Precision and recall
    In pattern recognition and information retrieval, precision is the fraction of retrieved instances that are relevant, while recall is the fraction of relevant instances that are retrieved. Both precision and recall are therefore based on an understanding and measure of relevance...

  • Prediction interval
    Prediction interval
    In statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed...

  • Predictive analytics
    Predictive analytics
    Predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, data mining and game theory that analyze current and historical facts to make predictions about future events....

  • Predictive inference
    Predictive inference
    Predictive inference is an approach to statistical inference that emphasizes the prediction of future observations based on past observations.Initially, predictive inference was based on observable parameters and it was the main purpose of studying probability, but it fell out of favor in the 20th...

  • Predictive informatics
    Predictive informatics
    Predictive informatics is the combination of predictive modeling and informatics applied to healthcare, pharmaceutical, life sciences and business industries....

  • Predictive modelling
    Predictive modelling
    Predictive modelling is the process by which a model is created or chosen to try to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data, for example given an...

  • Predictive validity
    Predictive validity
    In psychometrics, predictive validity is the extent to which a score on a scale or test predicts scores on some criterion measure.For example, the validity of a cognitive test for job performance is the correlation between test scores and, for example, supervisor performance ratings...

  • Preference regression (in marketing)
    Preference regression (in marketing)
    Preference regression is a statistical technique used by marketers to determine consumers’ preferred core benefits. It usually supplements product positioning techniques like multidimensional scaling or factor analysis and is used to create ideal vectors on perceptual maps....

  • Preferential attachment process — redirects to Preferential attachment
    Preferential attachment
    A preferential attachment process is any of a class of processes in which some quantity, typically some form of wealth or credit, is distributed among a number of individuals or objects according to how much they already have, so that those who are already wealthy receive more than those who are not...

  • Prevalence
    Prevalence
    In epidemiology, the prevalence of a health-related state (for example, a risk factor) in a statistical population is defined either as the total number of cases in the population at a given time, or as that number divided by the number of individuals in the population...

  • Principal component analysis
    • Multilinear principal-component analysis
  • Principal component regression
    Principal component regression
    In statistics, principal component regression is a regression analysis that uses principal component analysis when estimating regression coefficients...

  • Principal geodesic analysis
    Principal geodesic analysis
    In geometric data analysis and statistical shape analysis, principal geodesic analysis is a generalization of principal component analysis to a non-Euclidean, non-linear setting of manifolds suitable for use with shape descriptors such as medial representations....

  • Principal stratification
    Principal stratification
    Principal stratification is a statistical technique used in causal inference....

  • Principle of indifference
    Principle of indifference
    The principle of indifference is a rule for assigning epistemic probabilities.Suppose that there are n > 1 mutually exclusive and collectively exhaustive possibilities....

  • Principle of marginality
    Principle of marginality
    In statistics, the principle of marginality refers to the fact that the average effects of variables in an analysis are marginal to their interaction effect...

  • Principle of maximum entropy
    Principle of maximum entropy
    In Bayesian probability, the principle of maximum entropy is a postulate which states that, subject to known constraints, the probability distribution which best represents the current state of knowledge is the one with the largest entropy. Let some testable information about a probability distribution...

  • Prior knowledge for pattern recognition
    Prior knowledge for pattern recognition
    Pattern recognition is a very active field of research intimately bound to machine learning. Also known as classification or statistical classification, pattern recognition aims at building a classifier that can determine the class of an input pattern...

  • Prior probability
    Prior probability
    In Bayesian statistical inference, a prior probability distribution, often called simply the prior, of an uncertain quantity p is the probability distribution that would express one's uncertainty about p before the "data"...

  • Prior probability distribution redirects to Prior probability
    Prior probability
    In Bayesian statistical inference, a prior probability distribution, often called simply the prior, of an uncertain quantity p is the probability distribution that would express one's uncertainty about p before the "data"...

  • Probabilistic causation
    Probabilistic causation
    Probabilistic causation designates a group of philosophical theories that aim to characterize the relationship between cause and effect using the tools of probability theory...

  • Probabilistic design
    Probabilistic design
    Probabilistic design is a discipline within engineering design. It deals primarily with the consideration of the effects of random variability upon the performance of an engineering system during the design phase. Typically, these effects are related to quality and reliability...

  • Probabilistic forecasting
    Probabilistic forecasting
    Probabilistic forecasting summarises what is known about, or opinions about, future events. In contrast to a single-valued forecast, probabilistic forecasts assign a probability to each of a number of different outcomes,...

  • Probabilistic latent semantic analysis
    Probabilistic latent semantic analysis
    Probabilistic latent semantic analysis , also known as probabilistic latent semantic indexing is a statistical technique for the analysis of two-mode and co-occurrence data. PLSA evolved from latent semantic analysis, adding a sounder probabilistic model...

  • Probabilistic metric space
    Probabilistic metric space
    A probabilistic metric space is a generalization of metric spaces where the distance is no longer defined on positive real numbers, but on distribution functions....

  • Probabilistic proposition
    Probabilistic proposition
    A probabilistic proposition is a proposition with a measured probability of being true for an arbitrary person at an arbitrary time. These are some examples of probabilistic propositions collected by the Mindpixel project: "You are not human" 0.17...

  • Probabilistic relational model
    Probabilistic relational model
    A probabilistic relational model is the counterpart of a Bayesian network in statistical relational learning....

  • Probability
    Probability
    Probability is ordinarily used to describe an attitude of mind towards some proposition of whose truth we are not certain. The proposition of interest is usually of the form "Will a specific event occur?" The attitude of mind is of the form "How certain are we that the event will occur?" The...

  • Probability and statistics
    Probability and statistics
    See the separate articles on probability or the article on statistics. Statistical analysis often uses probability distributions, and the two topics are often studied together. However, probability theory contains much that is of mostly mathematical interest and not directly relevant to statistics...

  • Probability density function
    Probability density function
    In probability theory, a probability density function , or density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point. The probability for the random variable to fall within a particular region is given by the...

  • Probability distribution
    Probability distribution
    In probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values....

  • Probability distribution function
    Probability distribution function
    Depending upon which text is consulted, a probability distribution function is any of: a probability distribution, a cumulative distribution function, a probability mass function, or a probability density function....

     (disambiguation)
  • Probability integral transform
    Probability integral transform
    In statistics, the probability integral transform or transformation relates to the result that data values that are modelled as being random variables from any given continuous distribution can be converted to random variables having a uniform distribution...

  • Probability interpretations
    Probability interpretations
    The word probability has been used in a variety of ways since it was first coined in relation to games of chance. Does probability measure the real, physical tendency of something to occur, or is it just a measure of how strongly one believes it will occur? In answering such questions, we...

  • Probability mass function
    Probability mass function
    In probability theory and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value...

  • Probability matching
    Probability matching
    Probability matching is a suboptimal decision strategy in which predictions of class membership are proportional to the class base rates. Thus, if in the training set positive examples are observed 60% of the time, and negative examples are observed 40% of the time, the observer using a...

  • Probability metric
  • Probability of error
    Probability of error
    In statistics, the term "error" arises in two ways. Firstly, it arises in the context of decision making, where the probability of error may be considered as being the probability of making a wrong decision and which would have a different value for each type of error...

  • Probability of precipitation
    Probability of Precipitation
    A probability of precipitation is a formal measure of the likelihood of precipitation that is often published from weather forecasting models. Its definition varies.-U.S. usage:...

  • Probability plot
    Probability plot
    In statistics, a P-P plot is a probability plot for assessing how closely two data sets agree, which plots the two cumulative distribution functions against each other....

  • Probability plot correlation coefficient — redirects to Q-Q plot
    Q-Q plot
    In statistics, a Q-Q plot is a probability plot, which is a graphical method for comparing two probability distributions by plotting their quantiles against each other. First, the set of intervals for the quantiles are chosen...

  • Probability plot correlation coefficient plot
    Probability plot correlation coefficient plot
    Many statistical analyses are based on distributional assumptions about the population from which the data have been obtained. However, distributional families can have radically different shapes depending on the value of the shape parameter. Therefore, finding a reasonable choice for the shape...

  • Probability space
    Probability space
    In probability theory, a probability space or a probability triple is a mathematical construct that models a real-world process consisting of states that occur randomly. A probability space is constructed with a specific kind of situation or experiment in mind...

  • Probability theory
    Probability theory
    Probability theory is the branch of mathematics concerned with analysis of random phenomena. The central objects of probability theory are random variables, stochastic processes, and events: mathematical abstractions of non-deterministic events or measured quantities that may either be single...

  • Probability-generating function
    Probability-generating function
    In probability theory, the probability-generating function of a discrete random variable is a power series representation of the probability mass function of the random variable...

  • Probable error
    Probable error
    In statistics, the probable error of a quantity is a value describing the probability distribution of that quantity. It defines the half-range of an interval about a central point for the distribution, such that half of the values from the distribution will lie within the interval and...

  • Probit
    Probit
    In probability theory and statistics, the probit function is the inverse cumulative distribution function , or quantile function associated with the standard normal distribution...

  • Probit model
    Probit model
    In statistics, a probit model is a type of regression where the dependent variable can only take two values, for example married or not married....

  • Procedural confound
  • Process Window Index
    Process Window Index
    Process Window Index is a statistical measure that quantifies the robustness of a manufacturing process which involves heating and cooling, known as a thermal process...

  • Procrustes analysis
    Procrustes analysis
    In statistics, Procrustes analysis is a form of statistical shape analysis used to analyse the distribution of a set of shapes. The name Procrustes refers to a bandit from Greek mythology who made his victims fit his bed either by stretching their limbs or cutting them off.To compare the shape of...

  • Proebsting's paradox
    Proebsting's paradox
    In probability theory, Proebsting's paradox is an argument that appears to show that the Kelly criterion can lead to ruin. Although it can be resolved mathematically, it raises some interesting issues about the practical application of Kelly, especially in investing. It was named and first...

  • Product distribution
    Product distribution
    A product distribution is a probability distribution constructed as the distribution of the product of random variables having two other known distributions...

  • Product form solution
    Product form solution
    In probability theory, a product form solution is a particularly efficient form of solution for determining some metric of a system with distinct sub-components, where the metric for the collection of components can be written as a product of the metric across the different components...

  • Profile likelihood redirects to Likelihood function
    Likelihood function
    In statistics, a likelihood function is a function of the parameters of a statistical model, defined as follows: the likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values...

  • Progressively measurable process
    Progressively measurable process
    In mathematics, progressive measurability is a property of stochastic processes. A progressively measurable process is one for which events defined in terms of values of the process across a range of times can be assigned probabilities . Being progressively measurable is a strictly stronger...

  • Prognostics
    Prognostics
    Prognostics is an engineering discipline focused on predicting the time at which a system or a component will no longer perform its intended function . This lack of performance is most often a failure beyond which the system can no longer be used to meet desired performance...

  • Projection pursuit
    Projection pursuit
    Projection pursuit is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a Normal distribution are considered to be more interesting...

  • Projection pursuit regression
    Projection pursuit regression
    In statistics, projection pursuit regression is a statistical model developed by Jerome H. Friedman and Werner Stuetzle which is an extension of additive models...

  • Proof of Stein's example
    Proof of Stein's example
    Stein's example is an important result in decision theory. The following is an outline of its proof; the reader is referred to the main article for more information....

  • Propagation of uncertainty
    Propagation of uncertainty
    In statistics, propagation of error is the effect of variables' uncertainties on the uncertainty of a function based on them...

  • Propensity probability
    Propensity probability
    The propensity theory of probability is one interpretation of the concept of probability. Theorists who adopt this interpretation think of probability as a physical propensity, or disposition, or tendency of a given type of physical situation to yield an outcome of a certain kind, or to yield a...

  • Propensity score
    Propensity score
    In the design of experiments, a propensity score is the probability of a unit being assigned to a particular condition in a study given a set of known covariates...

  • Propensity score matching
    Propensity score matching
    In the statistical analysis of observational data, propensity score matching is a methodology attempting to provide unbiased estimation of treatment-effects...

  • Proper linear model
    Proper linear model
    In statistics, a proper linear model is a linear regression model in which the weights given to the predictor variables are chosen in such a way as to optimize the relationship between the prediction and the criterion. Simple regression analysis is the most common example of a proper linear model...

  • Proportional hazards models
    Proportional hazards models
    Proportional hazards models are a class of survival models in statistics. Survival models relate the time that passes before some event occurs to one or more covariates that may be associated with that quantity. In a proportional hazards model, the unique effect of a unit increase in a covariate...

  • Proportional reduction in loss
    Proportional reduction in loss
    Proportional reduction in loss refers to a general framework for developing and evaluating measures of the reliability of particular ways of making observations which are possibly subject to errors of all types...

  • Prosecutor's fallacy
    Prosecutor's fallacy
    The prosecutor's fallacy is a fallacy of statistical reasoning made in law where the context in which the accused has been brought to court is falsely assumed to be irrelevant to judging how confident a jury can be in evidence against them with a statistical measure of doubt...

  • Proxy (statistics)
    Proxy (statistics)
    In statistics, a proxy variable is something that is probably not in itself of any great interest, but from which a variable of interest can be obtained...

  • Psephology
    Psephology
    Psephology is that branch of political science which deals with the study and scientific analysis of elections. Psephology uses historical precinct voting data, public opinion polls, campaign finance information and similar statistical data. The term was coined in the United Kingdom in 1952 by...

  • Pseudo-determinant
    Pseudo-determinant
    In linear algebra and statistics, the pseudo-determinant is the product of all non-zero eigenvalues of a square matrix. It coincides with the regular determinant when the matrix is non-singular.- Definition :...

  • Pseudocount
    Pseudocount
    A pseudocount is an amount added to the number of observed cases in order to change the expected probability in a model of those data, when not known to be zero. Depending on the prior knowledge, which is sometimes a subjective value, a pseudocount may have any non-negative finite value...
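
    A minimal Python sketch of the idea, assuming a Laplace-style pseudocount alpha added to every category (function and variable names are illustrative):

      from collections import Counter

      def smoothed_probabilities(observations, vocabulary, alpha=1.0):
          # Add a pseudocount alpha to every category so that categories
          # never observed still receive a non-zero probability.
          counts = Counter(observations)
          total = len(observations) + alpha * len(vocabulary)
          return {c: (counts[c] + alpha) / total for c in vocabulary}

      print(smoothed_probabilities(["a", "a", "b"], vocabulary=["a", "b", "c"]))
      # {'a': 0.5, 'b': 0.333..., 'c': 0.166...}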

  • Pseudolikelihood
    Pseudolikelihood
    In statistical theory, a pseudolikelihood is an approximation to the joint probability distribution of a collection of random variables. The practical use of this is that it can provide an approximation to the likelihood function of a set of observed data which may either provide a computationally...

  • Pseudomedian
    Pseudomedian
    In statistics, the pseudomedian is defined as the median of all possible midpoints of pairs of observations. It is the Hodges–Lehmann one-sample estimate of the central location for a probability distribution.-References:...

  • Pseudoreplication
    Pseudoreplication
    Hurlbert defined pseudoreplication as the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated or replicates are not statistically independent....

  • PSPP
    PSPP
    PSPP is a free software application for analysis of sampled data. It has a graphical user interface and conventional command line interface. It is written in C, uses GNU Scientific Library for its mathematical routines, and plotutils for generating graphs....

     (free software)
  • Psychological statistics
    Psychological statistics
    Psychological statistics is the application of statistics to psychology. Some of the more common applications include: psychometrics, learning theory, perception, human development, abnormal psychology, personality tests, and psychological tests...

  • Psychometrics
    Psychometrics
    Psychometrics is the field of study concerned with the theory and technique of psychological measurement, which includes the measurement of knowledge, abilities, attitudes, personality traits, and educational measurement...

  • Pythagorean expectation
    Pythagorean expectation
    Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won based on the number of runs they scored and allowed. Comparing a team's actual and Pythagorean winning percentage can be used to evaluate how lucky that team was...
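
    A small Python sketch of the classic formula with exponent 2 (the run totals below are invented for illustration):

      def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2):
          # Estimated winning percentage from runs scored and runs allowed.
          rs, ra = runs_scored**exponent, runs_allowed**exponent
          return rs / (rs + ra)

      # A hypothetical team that scored 800 runs and allowed 700:
      print(round(pythagorean_win_pct(800, 700), 3))   # about 0.566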


Q

  • Q test
    Q test
    In statistics, Dixon's Q test, or simply the Q test, is used for identification and rejection of outliers. Per Dean and Dixon, and others, this test should be used sparingly and never more than once in a data set...
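
    A rough Python sketch of the Q statistic in its simplest form (gap to the nearest neighbour divided by the range); the data are invented, and the result must be compared with a tabulated critical value for the chosen sample size and confidence level:

      def dixon_q(sorted_values, suspect="high"):
          # Q = (gap between the suspect value and its nearest neighbour) / range
          if suspect == "high":
              gap = sorted_values[-1] - sorted_values[-2]
          else:
              gap = sorted_values[1] - sorted_values[0]
          return gap / (sorted_values[-1] - sorted_values[0])

      data = sorted([0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181])
      print(round(dixon_q(data, suspect="low"), 3))   # about 0.636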

  • Q research software
    Q research software
    Q research software is computer software for the analysis of market research data. Launched in 2007, Q is developed by Numbers International Pty Ltd.- Interactive data analysis :...

  • Q-exponential distribution
  • Q-function
    Q-function
    In statistics, the Q-function is the tail probability of the standard normal distribution. In other words, Q(x) is the probability that a standard normal random variable takes a value larger than x...
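
    A one-line Python sketch using only the standard library (Q(x) equals half the complementary error function of x/sqrt(2)):

      import math

      def q_function(x):
          # Tail probability P(Z > x) for a standard normal variable Z.
          return 0.5 * math.erfc(x / math.sqrt(2.0))

      print(q_function(0.0))    # 0.5
      print(q_function(1.96))   # roughly 0.025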

  • Q-Gaussian distribution
    Q-Gaussian distribution
    In q-analog theory, the q-Gaussian is a probability distribution arising from the maximization of the Tsallis entropy under appropriate constraints. It is one example of a Tsallis distribution. The q-Gaussian is a generalization of the Gaussian in the same way that Tsallis entropy is a...

  • Q-Q plot
    Q-Q plot
    In statistics, a Q-Q plot is a probability plot, a graphical method for comparing two probability distributions by plotting their quantiles against each other. First, a set of intervals for the quantiles is chosen...
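
    A minimal Python sketch of the empirical version for two samples of equal length (sample values are invented): sort each sample and pair the order statistics; plotting the pairs gives the Q-Q plot.

      sample_a = sorted([2.1, 3.4, 1.8, 2.9, 3.1, 2.5])
      sample_b = sorted([2.0, 3.9, 1.5, 3.0, 3.3, 2.4])

      for qa, qb in zip(sample_a, sample_b):
          # Points lying near the line y = x suggest similar distributions.
          print(qa, qb)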

  • Q-statistic
    Q-statistic
    The Q-statistic is a test statistic output by either the Box-Pierce test or, in a modified version which provides better small sample properties, by the Ljung-Box test. It follows the chi-squared distribution...

  • Quadrat
    Quadrat
    A quadrat is a square used in ecology and geography to isolate a sample, usually about 1 m² or 0.25 m². The quadrat is suitable for sampling plants, slow-moving animals, and some aquatic organisms. When an ecologist wants to know how many organisms there are in a particular habitat, it would not be...

  • Quadratic classifier
    Quadratic classifier
    A quadratic classifier is used in machine learning and statistical classification to separate measurements of two or more classes of objects or events by a quadric surface...

  • Quadratic form (statistics)
    Quadratic form (statistics)
    If ε is a vector of n random variables and Λ is an n × n symmetric matrix, then the scalar quantity εᵀΛε is known as a quadratic form in ε. Its expectation can be written in closed form...
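
    The closed-form expectation alluded to above is the standard result (stated from general knowledge), with \mu the mean vector and \Sigma the covariance matrix of \epsilon:

      \operatorname{E}\!\left[\epsilon^{T}\Lambda\,\epsilon\right] = \operatorname{tr}(\Lambda\Sigma) + \mu^{T}\Lambda\,\mu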

  • Quadratic variation
    Quadratic variation
    In mathematics, quadratic variation is used in the analysis of stochastic processes such as Brownian motion and martingales. Quadratic variation is just one kind of variation of a process.- Definition :...

  • Qualitative comparative analysis
    Qualitative comparative analysis
    Qualitative Comparative Analysis is a technique, developed by Charles Ragin in 1987, for solving the problems that are caused by making causal inferences on the basis of only a small number of cases...

  • Qualitative data
  • Qualitative variation
    Qualitative variation
    An index of qualitative variation is a measure of statistical dispersion in nominal distributions. There are a variety of these, but they have been relatively little-studied in the statistics literature...

  • Quality control
    Quality control
    Quality control, or QC for short, is a process by which entities review the quality of all factors involved in production. This approach places an emphasis on three aspects:...

  • Quantile
    Quantile
    Quantiles are points taken at regular intervals from the cumulative distribution function of a random variable. Dividing ordered data into q essentially equal-sized data subsets is the motivation for q-quantiles; the quantiles are the data values marking the boundaries between consecutive subsets...

  • Quantile function
    Quantile function
    In probability and statistics, the quantile function of the probability distribution of a random variable specifies, for a given probability, the value which the random variable will be at, or below, with that probability...

  • Quantile normalization
    Quantile normalization
    In statistics, quantile normalization is a technique for making two distributions identical in statistical properties. To quantile-normalize a test distribution to a reference distribution of the same length, sort the test distribution and sort the reference distribution...
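
    A minimal Python sketch of the procedure just described (sample values are invented): the i-th smallest test value is replaced by the i-th smallest reference value.

      def quantile_normalize(test, reference):
          order = sorted(range(len(test)), key=lambda i: test[i])  # indices of test, smallest first
          sorted_reference = sorted(reference)
          normalized = [0.0] * len(test)
          for rank, index in enumerate(order):
              normalized[index] = sorted_reference[rank]
          return normalized

      print(quantile_normalize([5, 2, 3, 4], [10, 20, 30, 40]))   # [40, 10, 20, 30]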

  • Quantile regression
    Quantile regression
    Quantile regression is a type of regression analysis used in statistics. Whereas the method of least squares results in estimates that approximate the conditional mean of the response variable given certain values of the predictor variables, quantile regression results in estimates approximating...

  • Quantitative marketing research
    Quantitative marketing research
    Quantitative marketing research is the application of quantitative research techniques to the field of marketing. It has roots in both the positivist view of the world, and the modern marketing viewpoint that marketing is an interactive process in which both the buyer and seller reach a satisfying...

  • Quantitative parasitology
    Quantitative parasitology
    Quantifying parasites in a sample of hosts or comparing measures of infection across two or more samples can be challenging. The parasitic infection of a sample of hosts inherently exhibits a complex pattern that cannot be adequately quantified by a single statistical measure...

  • Quantitative psychological research
    Quantitative psychological research
    Quantitative psychological research is defined as psychological research which performs mathematical modeling and statistical estimation or statistical inference. This definition distinguishes it from so-called qualitative psychological research; however, many psychologists do not acknowledge any...

  • Quantitative research
    Quantitative research
    In the social sciences, quantitative research refers to the systematic empirical investigation of social phenomena via statistical, mathematical or computational techniques. The objective of quantitative research is to develop and employ mathematical models, theories and/or hypotheses pertaining to...

  • Quantum (Statistical programming language)
  • Quartile
    Quartile
    In descriptive statistics, the quartiles of a set of values are the three points that divide the data set into four equal groups, each representing a fourth of the population being sampled...

  • Quartile coefficient of dispersion
    Quartile coefficient of dispersion
    In statistics, the quartile coefficient of dispersion is a descriptive statistic which measures dispersion and which is used to make comparisons within and between data sets....
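
    A short Python sketch using the standard library's quartile routine (the data, and the quartile convention used by statistics.quantiles, are illustrative):

      import statistics

      data = [2, 4, 6, 8, 10, 12, 14, 16]
      q1, _, q3 = statistics.quantiles(data, n=4)   # lower quartile, median, upper quartile
      print((q3 - q1) / (q3 + q1))                  # quartile coefficient of dispersion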

  • Quasi-birth–death process
  • Quasi-experiment
    Quasi-experiment
    A quasi-experiment is an empirical study used to estimate the causal impact of an intervention on its target population. Quasi-experimental research designs share many similarities with the traditional experimental design or randomized controlled trial, but they specifically lack the element of...

  • Quasi-experimental design — redirects to Design of quasi-experiments
  • Quasi-likelihood
    Quasi-likelihood
    In statistics, quasi-likelihood estimation is one way of allowing for overdispersion, that is, greater variability in the data than would be expected from the statistical model used. It is most often used with models for count data or grouped binary data, i.e...

  • Quasi-maximum likelihood
    Quasi-maximum likelihood
    A quasi-maximum likelihood estimate is an estimate of a parameter θ in a statistical model that is formed by maximizing a function that is related to the logarithm of the likelihood function, but is not equal to it...

  • Quasireversibility
    Quasireversibility
    In probability theory, specifically queueing theory, quasireversibility is a property of some queues. The concept was first identified by Richard R. Muntz and further developed by Frank Kelly. Quasireversibility differs from reversibility in that a stronger condition is imposed on arrival rates...

  • Queueing model
    Queueing model
    In queueing theory, a queueing model is used to approximate a real queueing situation or system, so the queueing behaviour can be analysed mathematically...

  • Queueing theory
    Queueing theory
    Queueing theory is the mathematical study of waiting lines, or queues. The theory enables mathematical analysis of several related processes, including arriving at the queue, waiting in the queue , and being served at the front of the queue...

  • Queuing delay
    Queuing delay
    In telecommunication and computer engineering, the queuing delay is the time a job waits in a queue until it can be executed. It is a key component of network delay....

  • Queuing theory in teletraffic engineering
  • Quota sampling
    Quota sampling
    Quota sampling is a method for selecting survey participants. In quota sampling, a population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. Then judgment is used to select the subjects or units from each segment based on a specified proportion. For example,...


R

  • R programming language — redirects to R (programming language)
    R (programming language)
    R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians for developing statistical software and for data analysis....

  • R v Adams
    R v Adams
    R v Adams [1996] 2 Cr App R 467, [1996] Crim LR 898, CA and R v Adams [1998] 1 Cr App R 377, The Times, 3 November 1997, CA, are rulings that ousted explicit Bayesian statistics from the reasoning admissible before a jury in DNA cases.-Facts:...

     (prob/stats related court case)
  • Radar chart
    Radar chart
    A radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point...

  • Rademacher distribution
  • Radial basis function network
    Radial basis function network
    A radial basis function network is an artificial neural network that uses radial basis functions as activation functions. It is a linear combination of radial basis functions...

  • Raikov's theorem
    Raikov's theorem
    In probability theory, Raikov’s theorem, named after Dmitry Raikov, states that if the sum of two independent random variables X and Y has a Poisson distribution, then both X and Y themselves must have the Poisson distribution. It says the same thing about the Poisson distribution that Cramér's...

  • Raised cosine distribution
  • Ramsey RESET test
    Ramsey reset test
    The Ramsey Regression Equation Specification Error Test (RESET) is a general specification test for the linear regression model. More specifically, it tests whether non-linear combinations of the estimated values help explain the endogenous variable...

     — the Ramsey Regression Equation Specification Error Test
  • Rand index
    Rand index
    The Rand index or Rand measure in statistics, and in particular in data clustering, is a measure of the similarity between two data clusterings...

  • Random assignment
    Random assignment
    Random assignment or random placement is an experimental technique for assigning subjects to different treatments. The thinking behind random assignment is that by randomizing treatment assignment, the group attributes for the different treatments will be roughly equivalent and therefore any...

  • Random compact set
    Random compact set
    In mathematics, a random compact set is essentially a compact set-valued random variable. Random compact sets are useful in the study of attractors for random dynamical systems.-Definition:...

  • Random data — see randomness
    Randomness
    Randomness has somewhat differing meanings as used in various fields. It also has common meanings which are connected to the notion of predictability of events....

  • Random effects estimation — redirects to Random effects model
  • Random effects model
  • Random element
    Random element
    In probability theory, random element is a generalization of the concept of random variable to more complicated spaces than the simple real line...

  • Random field
    Random field
    A random field is a generalization of a stochastic process such that the underlying parameter need no longer be a simple real or integer valued "time", but can instead take values that are multidimensional vectors, or points on some manifold....

  • Random graph
    Random graph
    In mathematics, a random graph is a graph that is generated by some random process. The theory of random graphs lies at the intersection between graph theory and probability theory, and studies the properties of typical random graphs.-Random graph models:...

  • Random matrix
    Random matrix
    In probability theory and mathematical physics, a random matrix is a matrix-valued random variable. Many important properties of physical systems can be represented mathematically as matrix problems...

  • Random measure
  • Random multinomial logit
    Random multinomial logit
    In statistics and machine learning, random multinomial logit is a technique for statistical classification using repeated multinomial logit analyses via Leo Breiman's random forests.-Rationale for the new method:...

  • Random naive Bayes
    Random naive Bayes
    Random naive Bayes extends the Naive Bayes classifier by adopting the random forest principles: random input selection, bagging, and random feature selection...

  • Random permutation statistics
    Random permutation statistics
    The statistics of random permutations, such as the cycle structure of a random permutation are of fundamental importance in the analysis of algorithms, especially of sorting algorithms, which operate on random permutations. Suppose, for example, that we are using quickselect to select a random...

  • Random regular graph
  • Random sample
    Random sample
    In statistics, a sample is a subject chosen from a population for investigation; a random sample is one chosen by a method involving an unpredictable component...

  • Random sampling
  • Random sequence
    Random sequence
    The concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables and many statistical discussions begin with the words "let X1,...,Xn be independent random variables...". Yet as D. H. Lehmer stated in...

  • Random variable
    Random variable
    In probability and statistics, a random variable or stochastic variable is, roughly speaking, a variable whose value results from a measurement on some type of random process. Formally, it is a function from a probability space, typically to the real numbers, which is measurable...

  • Random variate
    Random variate
    A random variate is a particular outcome of a random variable: the random variates which are other outcomes of the same random variable would have different values. Random variates are used when simulating processes driven by random influences...

  • Random walk
    Random walk
    A random walk, sometimes denoted RW, is a mathematical formalisation of a trajectory that consists of taking successive random steps. For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the...

  • Random walk hypothesis
    Random walk hypothesis
    The random walk hypothesis is a financial theory stating that stock market prices evolve according to a random walk and thus the prices of the stock market cannot be predicted. It is consistent with the efficient-market hypothesis....

  • Randomization
    Randomization
    Randomization is the process of making something random; this means generating a random permutation of a sequence, or selecting a random sample of a population...

  • Randomized block design
    Randomized block design
    In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups that are similar to one another. Typically, a blocking factor is a source of variability that is not of primary interest to the experimenter...

  • Randomized controlled trial
    Randomized controlled trial
    A randomized controlled trial (RCT) is a type of scientific experiment - a form of clinical trial - most commonly used in testing the safety and efficacy or effectiveness of healthcare services or health technologies...

  • Randomized experiment
    Randomized experiment
    In science, randomized experiments are the experiments that allow the greatest reliability and validity of statistical estimates of treatment effects...

  • Randomized response
    Randomized response
    Randomized response is a research method used in structured survey interview. It was first proposed by S. L. Warner in 1965 and later modified by B. G. Greenberg in 1969. It allows respondents to respond to sensitive issues while maintaining confidentiality...

  • Randomness
    Randomness
    Randomness has somewhat differing meanings as used in various fields. It also has common meanings which are connected to the notion of predictability of events....

  • Randomness tests
    Randomness tests
    The issue of randomness is an important philosophical and theoretical question.Many random number generators in use today generate what are called "random sequences" but they are actually the result of prescribed algorithms and so they are called pseudo-random number generators.These generators do...

  • Range (statistics)
    Range (statistics)
    In descriptive statistics, the range is the length of the smallest interval which contains all the data. It is calculated by subtracting the smallest observation from the greatest and provides an indication of statistical dispersion. It is measured in the same units as the data...

  • Rank abundance curve
    Rank abundance curve
    A rank abundance curve or "Whittaker plot" is a chart used by ecologists to display relative species abundance, a component of biodiversity. It can also be used to visualize species richness and species evenness...

  • Rank correlation
    Rank correlation
    In statistics, a rank correlation is the relationship between different rankings of the same set of items. A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to assess its significance....

     mainly links to the two following entries
    • Spearman's rank correlation coefficient
      Spearman's rank correlation coefficient
      In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter \rho or as r_s, is a non-parametric measure of statistical dependence between two variables. It assesses how well the relationship between two variables can...

    • Kendall tau rank correlation coefficient
      Kendall tau rank correlation coefficient
      In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's tau coefficient, is a statistic used to measure the association between two measured quantities...

  • Rank product
    Rank product
    The rank product is a biologically motivated test for the detection of differentially expressed genes in replicated microarray experiments. It is a simple non-parametric statistical method based on ranks of fold changes...

  • Rank-size distribution
    Rank-size distribution
    Rank-size distribution or the rank-size rule describes the remarkable regularity in many phenomena including the distribution of city sizes around the world, sizes of businesses, particle sizes , lengths of rivers, frequencies of word usage, wealth among individuals, etc...

  • Ranking
    Ranking
    A ranking is a relationship between a set of items such that, for any two items, the first is either 'ranked higher than', 'ranked lower than' or 'ranked equal to' the second....

  • Rankit
    Rankit
    In statistics, rankits of a set of data are the expected values of the order statistics of a sample from the standard normal distribution the same size as the data. They are primarily used in the normal probability plot, a graphical technique for normality testing.-Example:This is perhaps most...

  • Ranklet
    Ranklet
    In statistics, a ranklet is an orientation-selective non-parametric feature which is based on the computation of Mann–Whitney–Wilcoxon rank-sum test statistics...

  • RANSAC
    RANSAC
    RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers. It is a non-deterministic algorithm in the sense that it produces a reasonable result only with a certain...

  • Rational quadratic covariance function
    Rational quadratic covariance function
    In statistics, the rational quadratic covariance function is used in spatial statistics, geostatistics, machine learning, image analysis, and other fields where multivariate statistical analysis is conducted on metric spaces. It is commonly used to define the statistical covariance between...

  • Rao–Blackwell theorem
    Rao–Blackwell theorem
    In statistics, the Rao–Blackwell theorem, sometimes referred to as the Rao–Blackwell–Kolmogorov theorem, is a result which characterizes the transformation of an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar...

  • Rao-Blackwellisation — redirects to Rao–Blackwell theorem
    Rao–Blackwell theorem
    In statistics, the Rao–Blackwell theorem, sometimes referred to as the Rao–Blackwell–Kolmogorov theorem, is a result which characterizes the transformation of an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar...

  • Rasch model
    Rasch model
    Rasch models are used for analysing data from assessments to measure variables such as abilities, attitudes, and personality traits. For example, they may be used to estimate a student's reading ability from answers to questions on a reading assessment, or the extremity of a person's attitude to...

    • Polytomous Rasch model
      Polytomous Rasch model
      The polytomous Rasch model is a generalization of the dichotomous Rasch model. It is a measurement model that has potential application in any context in which the objective is to measure a trait or ability through a process in which responses to items are scored with successive integers...

  • Rasch model estimation
    Rasch model estimation
    Estimation of a Rasch model is used to estimate the parameters of the Rasch model. Various techniques are employed to estimate the parameters from matrices of response data. The most common approaches are types of maximum likelihood estimation, such as joint and conditional maximum likelihood...

  • Ratio distribution
    Ratio distribution
    A ratio distribution is a probability distribution constructed as the distribution of the ratio of two random variables whose distributions are known....

  • Rayleigh distribution
  • Raw score
    Raw score
    In statistics and data analysis, a raw score is an original datum that has not been transformed. This may include, for example, the original result obtained by a student on a test as opposed to that score after transformation to a standard score or percentile rank or the like.Often the conversion...

  • Realization (probability)
    Realization (probability)
    In probability and statistics, a realization, or observed value, of a random variable is the value that is actually observed. The random variable itself should be thought of as the process by which the observation comes about...

  • Recall bias
    Recall bias
    In psychology, recall bias is a type of systematic bias which occurs when the way a survey respondent answers a question is affected not just by the correct answer, but also by the respondent's memory. This can affect the results of the survey. As a hypothetical example, suppose that a survey in...

  • Receiver operating characteristic
    Receiver operating characteristic
    In signal detection theory, a receiver operating characteristic, or simply ROC curve, is a graphical plot of the sensitivity, or true positive rate, vs. false positive rate, for a binary classifier system as its discrimination threshold is varied...

  • Rectified Gaussian distribution
    Rectified Gaussian Distribution
    In probability theory, the rectified Gaussian distribution is a modification of the Gaussian distribution when its negative elements are reset to 0...

  • Recurrence period density entropy
    Recurrence period density entropy
    Recurrence period density entropy is a method, in the fields of dynamical systems, stochastic processes, and time series analysis, for determining the periodicity, or repetitiveness of a signal.- Overview :...

  • Recurrence plot
    Recurrence plot
    In descriptive statistics and chaos theory, a recurrence plot is a plot showing, for a given moment in time, the times at which a phase space trajectory visits roughly the same area in the phase space...

  • Recurrence quantification analysis
    Recurrence quantification analysis
    Recurrence quantification analysis is a method of nonlinear data analysis for the investigation of dynamical systems. It quantifies the number and duration of recurrences of a dynamical system presented by its phase space trajectory....

  • Recursive Bayesian estimation
    Recursive Bayesian estimation
    Recursive Bayesian estimation, also known as a Bayes filter, is a general probabilistic approach for estimating an unknown probability density function recursively over time using incoming measurements and a mathematical process model.-In robotics:...

  • Recursive least squares
  • Recursive partitioning
    Recursive partitioning
    Recursive partitioning is a statistical method for multivariable analysis. Recursive partitioning creates a decision tree that strives to correctly classify members of the population based on several dichotomous independent variables....

  • Reduced form
    Reduced form
    In statistics, and particularly in econometrics, the reduced form of a system of equations is the result of solving the system for the endogenous variables. This gives the latter as a function of the exogenous variables, if any...

  • Reference class problem
    Reference class problem
    In statistics, the reference class problem is the problem of deciding what class to use when calculating the probability applicable to a particular case...

  • Regenerative process
    Regenerative process
    In applied probability, a regenerative process is a special type of stochastic process that is defined by having a property whereby certain portions of the process can be treated as being statistically independent of each other...

  • Regression analysis
    Regression analysis
    In statistics, regression analysis includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables...

     — see also linear regression
    Linear regression
    In statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more explanatory variables denoted X. The case of one explanatory variable is called simple regression...

  • Regression Analysis of Time Series — proprietary software
  • Regression control chart
    Regression control chart
    In statistical quality control, the regression control chart allows for monitoring a change in a process where two or more variables are correlated. The change in a dependent variable can be detected and compensatory change in the independent variable can be recommended...

  • Regression dilution
    Regression dilution
    Regression dilution is a statistical phenomenon also known as "attenuation".Consider fitting a straight line for the relationship of an outcome variable y to a predictor variable x, and estimating the gradient of the line...

  • Regression discontinuity design
  • Regression estimation
    Regression estimation
    Regression estimation is a technique used to replace missing values in data. The variable with missing data is treated as the dependent variable, while the remaining variables are treated as independent variables. A regression equation is then generated which can be used to predict missing values...

  • Regression fallacy
    Regression fallacy
    The regression fallacy is an informal fallacy. It ascribes cause where none exists. The flaw is failing to account for natural fluctuations. It is frequently a special kind of the post hoc fallacy.-Explanation:...

  • Regression model validation
  • Regression toward the mean
    Regression toward the mean
    In statistics, regression toward the mean is the phenomenon that if a variable is extreme on its first measurement, it will tend to be closer to the average on a second measurement, and—a fact that may superficially seem paradoxical—if it is extreme on a second measurement, will tend...

  • Regret (decision theory)
    Regret (decision theory)
    Regret is defined as the difference between the actual payoff and the payoff that would have been obtained if a different course of action had been chosen. This is also called difference regret...

  • Reification (statistics)
    Reification (statistics)
    In statistics, reification is the use of an idealized model of a statistical process. The model is then used to make inferences connecting model results, which imperfectly represent the actual process, with experimental observations....

  • Rejection sampling
    Rejection sampling
    In mathematics, rejection sampling is a basic pseudo-random number sampling technique used to generate observations from a distribution. It is also commonly called the acceptance-rejection method or "accept-reject algorithm"....
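
    A minimal Python sketch of the accept-reject idea, drawing from the target density f(x) = 2x on [0, 1] with a uniform proposal and envelope constant M = 2 (all of these choices are illustrative):

      import random

      def rejection_sample(n, M=2.0):
          samples = []
          while len(samples) < n:
              x = random.uniform(0.0, 1.0)   # draw from the uniform proposal
              u = random.uniform(0.0, 1.0)   # acceptance test
              if u <= (2.0 * x) / M:         # target density f(x) = 2x, proposal density 1
                  samples.append(x)
          return samples

      draws = rejection_sample(10000)
      print(sum(draws) / len(draws))         # should be close to E[X] = 2/3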

  • Relationships among probability distributions
    Relationships among probability distributions
    Many statistical distributions have close relationships. Some examples include the Bernoulli, binomial, and normal distributions, and the exponential and Poisson distributions....

  • Relative change and difference
    Relative change and difference
    The relative difference, percent difference, relative percent difference, or percentage difference between two quantities is the difference between them, expressed as a comparison to the size of one or both of them. Such measures are unitless numbers...

  • Relative efficiency — redirects to Efficiency (statistics)
    Efficiency (statistics)
    In statistics, an efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors...

  • Relative index of inequality
    Relative index of inequality
    The relative index of inequality is a regression-based index which summarizes the magnitude of socio-economic status as a source of inequalities in health. RII is useful because it takes into account the size of the population and the relative disadvantage experienced by different groups...

  • Relative risk
    Relative risk
    In statistics and mathematical epidemiology, relative risk is the risk of an event relative to exposure. Relative risk is a ratio of the probability of the event occurring in the exposed group versus a non-exposed group....

  • Relative risk reduction
    Relative risk reduction
    In epidemiology, the relative risk reduction is a measure calculated by dividing the absolute risk reduction by the control event rate. The relative risk reduction can be more useful than the absolute risk reduction in determining an appropriate treatment plan, because it accounts not only for the...
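
    A toy Python calculation tying together relative risk, absolute risk reduction and relative risk reduction; the event counts are invented for illustration:

      treated_events, treated_total = 10, 100   # experimental group
      control_events, control_total = 20, 100   # control group

      eer = treated_events / treated_total      # experimental event rate, 0.10
      cer = control_events / control_total      # control event rate, 0.20

      relative_risk = eer / cer                                  # 0.5
      absolute_risk_reduction = cer - eer                        # 0.10
      relative_risk_reduction = absolute_risk_reduction / cer    # 0.5

      print(relative_risk, absolute_risk_reduction, relative_risk_reduction)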

  • Relative standard deviation
    Relative standard deviation
    In probability theory and statistics, the relative standard deviation is the absolute value of the coefficient of variation. It is often expressed as a percentage. A similar term that is sometimes used is the relative variance which is the square of the coefficient of variation...

  • Relative standard error — redirects to Relative standard deviation
    Relative standard deviation
    In probability theory and statistics, the relative standard deviation is the absolute value of the coefficient of variation. It is often expressed as a percentage. A similar term that is sometimes used is the relative variance which is the square of the coefficient of variation...

  • Relative variance — redirects to Relative standard deviation
    Relative standard deviation
    In probability theory and statistics, the relative standard deviation is the absolute value of the coefficient of variation. It is often expressed as a percentage. A similar term that is sometimes used is the relative variance which is the square of the coefficient of variation...

  • Relative survival
    Relative survival
    When describing the survival experience of a group of people or patients typically the method of overall survival is used, and it presents estimates of the proportion of people or patients alive at a certain point in time...

  • Relativistic Breit–Wigner distribution
  • Relevance vector machine
    Relevance Vector Machine
    Relevance vector machine is a machine learning technique that uses Bayesian inference to obtain parsimonious solutions for regression and classification...

  • Reliability (statistics)
    Reliability (statistics)
    In statistics, reliability is the consistency of a set of measurements or of a measuring instrument, often used to describe a test. Reliability is inversely related to random error.-Types:There are several general classes of reliability estimates:...

  • Reliability block diagram
    Reliability block diagram
    A reliability block diagram is a diagrammatic method for showing how component reliability contributes to the success or failure of a complex system. RBD is also known as a dependence diagram ....

  • Reliability engineering
    Reliability engineering
    Reliability engineering is an engineering field that deals with the study, evaluation, and life-cycle management of reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time. It is often measured as a probability of...

  • Reliability theory
    Reliability theory
    Reliability theory describes the probability of a system completing its expected function during an interval of time. It is the basis of reliability engineering, which is an area of study focused on optimizing the reliability, or probability of successful functioning, of systems, such as airplanes,...

  • Reliability theory of aging and longevity
    Reliability theory of aging and longevity
    Reliability theory of aging and longevity is a scientific approach aimed to gain theoretical insights into mechanisms of biological aging and species survival patterns by applying a general theory of systems failure, known as reliability theory.-Overview:...

  • Rencontres numbers — a discrete distribution
  • Renewal theory
    Renewal theory
    Renewal theory is the branch of probability theory that generalizes Poisson processes for arbitrary holding times. Applications include calculating the expected time for a monkey who is randomly tapping at a keyboard to type the word Macbeth and comparing the long-term benefits of different...

  • Repeatability
    Repeatability
    Repeatability or test-retest reliability is the variation in measurements taken by a single person or instrument on the same item and under the same conditions. A less-than-perfect test-retest reliability causes test-retest variability. Such variability can be caused by, for...

  • Repeated measures design
    Repeated measures design
    The repeated measures design uses the same subjects with every condition of the research, including the control. For instance, repeated measures are collected in a longitudinal study in which change over time is assessed. Other studies compare the same measure under two or more different conditions...

  • Replication (statistics)
    Replication (statistics)
    In engineering, science, and statistics, replication is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated. ASTM, in standard E1847, defines replication as "the repetition of the set of all the treatment combinations to be compared in...

  • Representation validity
    Representation validity
    Representation validity is concerned with how well the constructs or abstractions translate into observable measures. There are two primary questions to be answered:...

  • Reproducibility
    Reproducibility
    Reproducibility is the ability of an experiment or study to be accurately reproduced, or replicated, by someone else working independently...

  • Resampling (statistics)
    Resampling (statistics)
    In statistics, resampling is any of a variety of methods for doing one of the following: (1) estimating the precision of sample statistics by using subsets of available data or drawing randomly with replacement from a set of data points; (2) exchanging labels on data points when performing significance...
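
    A minimal Python sketch of the first kind of resampling, a bootstrap estimate of the standard error of the sample mean (data and number of replicates are invented):

      import random
      import statistics

      data = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.2, 4.4]
      boot_means = []
      for _ in range(2000):
          resample = [random.choice(data) for _ in data]   # draw n values with replacement
          boot_means.append(statistics.mean(resample))

      print(statistics.stdev(boot_means))   # bootstrap standard error of the mean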

  • Rescaled range
    Rescaled range
    The rescaled range is a statistical measure of the variability of a time series introduced by the British hydrologist Harold Edwin Hurst...

  • Resentful demoralization
    Resentful demoralization
    Resentful demoralization is an issue in controlled experiments in which those in the control group become resentful of not receiving the experimental treatment. Alternatively, the experimental group could be resentful of the control group, if the experimental group perceive its treatment as...

     – experimental design
  • Residual — see errors and residuals in statistics
    Errors and residuals in statistics
    In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...

  • Residual sum of squares
    Residual sum of squares
    In statistics, the residual sum of squares is the sum of squares of residuals. It is also known as the sum of squared residuals or the sum of squared errors of prediction . It is a measure of the discrepancy between the data and an estimation model...

  • Response bias
    Response bias
    Response bias is a type of cognitive bias which can affect the results of a statistical survey if respondents answer questions in the way they think the questioner wants them to answer rather than according to their true beliefs...

  • Response rate
    Response rate
    Response rate in survey research refers to the number of people who answered the survey divided by the number of people in the sample...

  • Response surface methodology
    Response surface methodology
    In statistics, response surface methodology explores the relationships between several explanatory variables and one or more response variables. The method was introduced by G. E. P. Box and K. B. Wilson in 1951. The main idea of RSM is to use a sequence of designed experiments to obtain an...

  • Response variable
  • Restricted maximum likelihood
    Restricted maximum likelihood
    In statistics, the restricted maximum likelihood approach is a particular form of maximum likelihood estimation which does not base estimates on a maximum likelihood fit of all the information, but instead uses a likelihood function calculated from a transformed set of data, so that nuisance...

  • Restricted randomization
    Restricted randomization
    Many processes have more than one source of variation in them. In order to reduce variation in processes, these multiple sources must be understood, and that often leads to the concept of nested or hierarchical data structures. For example, in the semiconductor industry, a batch process may operate...

  • Reversible-jump Markov chain Monte Carlo
  • Reversible dynamics
    Reversible dynamics
    In mathematics, a dynamical system is invertible if the forward evolution is one-to-one, not many-to-one, so that for every state there exists a well-defined reverse-time evolution operator....

  • Rind et al. controversy – interpretations of a paper involving meta-analysis
  • Rice distribution
    Rice distribution
    In probability theory, the Rice distribution or Rician distribution is the probability distribution of the absolute value of a circular bivariate normal random variable with potentially non-zero mean. It was named after Stephen O...

  • Richardson–Lucy deconvolution
  • Ridge regression — redirects to Tikhonov regularization
    Tikhonov regularization
    Tikhonov regularization, named for Andrey Tikhonov, is the most commonly used method of regularization of ill-posed problems. In statistics, the method is known as ridge regression, and, with multiple independent discoveries, it is also variously known as the Tikhonov-Miller method, the...

  • Risk factor
    Risk factor
    In epidemiology, a risk factor is a variable associated with an increased risk of disease or infection. Sometimes, determinant is also used, being a variable associated with either increased or decreased risk.-Correlation vs causation:...

  • Risk function
    Risk function
    In decision theory and estimation theory, the risk function R of a decision rule, δ, is the expected value of a loss function L:...
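
    In standard decision-theoretic notation (written from general knowledge, with \theta the unknown parameter and X the observed data), the expression referred to is

      R(\theta, \delta) = \operatorname{E}_{\theta}\!\left[\,L\!\left(\theta, \delta(X)\right)\,\right]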

  • Risk perception
    Risk perception
    Risk perception is the subjective judgment that people make about the characteristics and severity of a risk. The phrase is most commonly used in reference to natural hazards and threats to the environment or health, such as nuclear power. Several theories have been proposed to explain why...

  • Risk theory
    Risk theory
    Risk theory connotes the study, usually by actuaries and insurers, of the financial impact on a carrier of a portfolio of insurance policies. For example, if the carrier has 100 policies that insure against a total loss of $1000, and if each policy's chance of loss is independent and has a...

  • Risk-benefit analysis
    Risk-benefit analysis
    Risk–benefit analysis is the comparison of the risk of a situation to its related benefits. Exposure to personal risk is recognized as a normal aspect of everyday life. We accept a certain level of risk in our lives as necessary to achieve certain benefits. In most of these risks we feel as though...

  • Robbins lemma
    Robbins lemma
    In statistics, the Robbins lemma, named after Herbert Robbins, states that if X is a random variable with a Poisson distribution, and f is any function for which the expected value E exists, then...

  • Robin Hood index
    Robin Hood index
    The Robin Hood index, also known as the Hoover index, is a measure of income inequality. It is equal to the portion of the total community income that would have to be redistributed for there to be perfect equality....

  • Robust confidence intervals
    Robust confidence intervals
    In statistics a robust confidence interval is a robust modification of confidence intervals, meaning that one modifies the non-robust calculations of the confidence interval so that they are not badly affected by outlying or aberrant observations in a data-set.- Example :In the process of weighing...

  • Robust regression
    Robust regression
    In robust statistics, robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. Regression analysis seeks to find the effect of one or more independent variables upon a dependent variable...

  • Robust statistics
    Robust statistics
    Robust statistics provides an alternative approach to classical statistical methods. The motivation is to produce estimators that are not unduly affected by small departures from model assumptions.- Introduction :...

  • Root mean square
    Root mean square
    In mathematics, the root mean square , also known as the quadratic mean, is a statistical measure of the magnitude of a varying quantity. It is especially useful when variates are positive and negative, e.g., sinusoids...

  • Root mean square deviation
    Root mean square deviation
    The root-mean-square deviation is the measure of the average distance between the atoms of superimposed proteins...

  • Root mean square deviation (bioinformatics)
  • Root mean square fluctuation
  • Robust measures of scale
    Robust measures of scale
    In statistics, a robust measure of scale is a robust statistic that quantifies the statistical dispersion in a set of quantitative data. Robust measures of scale are used to complement or replace conventional estimates of scale such as the sample variance or sample standard deviation...

  • Rossmo's formula
    Rossmo's formula
    Rossmo's formula is a geographic profiling formula to predict where a serial criminal lives. The formula was developed by criminologist Kim Rossmo.-Formula:...

  • Rothamsted Experimental Station
    Rothamsted Experimental Station
    The Rothamsted Experimental Station, one of the oldest agricultural research institutions in the world, is located at Harpenden in Hertfordshire, England. It is now known as Rothamsted Research...

  • Round robin test
    Round robin test
    In experimental methodology, a round robin test is an interlaboratory test performed independently several times. This can involve multiple independent scientists performing the test with the use of the same method in different equipment, or a variety of methods and equipment...

  • Rubin causal model
    Rubin Causal Model
    The Rubin Causal Model is an approach to the statistical analysis of cause and effect based on the framework of potential outcomes. RCM is named after Donald Rubin, Professor of Statistics at Harvard University...

  • Ruin theory
    Ruin theory
    Ruin theory, sometimes referred to as collective risk theory, is a branch of actuarial science that studies an insurer's vulnerability to insolvency based on mathematical modeling of the insurer's surplus....

  • Rule of succession
    Rule of succession
    In probability theory, the rule of succession is a formula introduced in the 18th century by Pierre-Simon Laplace in the course of treating the sunrise problem....

  • Rule of three (medicine)
    Rule of three (medicine)
    In the statistical analysis of clinical trials, the rule of three states that if no major adverse events occurred in a group of n people, there can be 95% confidence that the chance of a major adverse event is less than 3/n, that is, less than one in n/3...
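
    A tiny worked example in Python (the trial size is hypothetical):

      n = 300              # subjects observed with no major adverse events
      upper_bound = 3 / n  # approximate 95% upper bound on the event probability
      print(upper_bound)   # 0.01, i.e. about one in a hundred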

  • Run chart
    Run Chart
    A run chart, also known as a run-sequence plot is a graph that displays observed data in a time sequence. Often, the data displayed represent some aspect of the output or performance of a manufacturing or other business process.- Overview :...

  • RV coefficient
    RV coefficient
    In statistics, the RV coefficient is a multivariate generalization of the Pearson correlation coefficient. It measures the closeness of two sets of points that may each be represented in a matrix....


S

  • S (programming language)
  • S-PLUS
    S-PLUS
    S-PLUS is a commercial implementation of the S programming language sold by TIBCO Software Inc..It features object-oriented programming capabilities and advanced analytical algorithms.-Historical timeline:...

  • Safety in numbers
    Safety in numbers
    Safety in numbers is the hypothesis that, by being part of a large physical group or mass, an individual is proportionally less likely to be the victim of a mishap, accident, attack, or other bad event...

  • Sally Clark
    Sally Clark
    Sally Clark was a British solicitor who became the victim of an infamous miscarriage of justice when she was wrongly convicted of the murder of two of her sons in 1999...

     (prob/stats related court case)
  • Sammon projection
  • Sample mean and covariance — redirects to Sample mean and sample covariance
    Sample mean and sample covariance
    The sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables that is, each of whose elements is the average of the...

  • Sample mean and sample covariance
    Sample mean and sample covariance
    The sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables that is, each of whose elements is the average of the...

  • Sample maximum and minimum
    Sample maximum and minimum
    In statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample....

  • Sample size determination
  • Sample space
  • Sample standard deviation — disambiguation
  • Sample (statistics)
    Sample (statistics)
    In statistics, a sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size...

  • Sample-continuous process
  • Sampling (statistics)
    Sampling (statistics)
    In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population....

    • simple random sampling
    • Snowball sampling
      Snowball sampling
      In sociology and statistics research, snowball sampling is a non-probability sampling technique where existing study subjects recruit future subjects from among their acquaintances. Thus the sample group appears to grow like a rolling snowball...

    • systematic sampling
      Systematic sampling
      Systematic sampling is a statistical method involving the selection of elements from an ordered sampling frame. The most common form of systematic sampling is an equal-probability method, in which every kth element in the frame is selected, where k, the sampling interval, is calculated as k =...

    • stratified sampling
      Stratified sampling
      In statistics, stratified sampling is a method of sampling from a population.In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample each subpopulation independently. Stratification is the process of dividing members of the population into...

    • cluster sampling
      Cluster sampling
      Cluster Sampling is a sampling technique used when "natural" groupings are evident in a statistical population. It is often used in marketing research. In this technique, the total population is divided into these groups and a sample of the groups is selected. Then the required information is...

    • multistage sampling
      Multistage sampling
      Multistage sampling is a complex form of cluster sampling. Advantages include the cost and speed with which the survey can be done, the convenience of finding the survey sample, and normally greater accuracy than cluster sampling for the same sample size. Disadvantages...

    • nonprobability sampling
      Nonprobability sampling
      Sampling is the use of a subset of the population to represent the whole population. Probability sampling, or random sampling, is a sampling technique in which the probability of getting any particular sample may be calculated. Nonprobability sampling does not meet this criterion and should be...

    • slice sampling
      Slice sampling
      Slice sampling is a type of Markov chain Monte Carlo algorithm for pseudo-random number sampling, i.e. for drawing random samples from a statistical distribution...

  • Sampling bias
  • Sampling design
    Sampling design
    In the theory of finite population sampling, a sampling design specifies for every possible sample its probability of being drawn.Mathematically, a sampling design is denoted by the function P which gives the probability of drawing a sample S....

  • Sampling distribution
    Sampling distribution
    In statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given statistic based on a random sample. Sampling distributions are important in statistics because they provide a major simplification on the route to statistical inference...

  • Sampling error
    Sampling error
    In statistics, sampling error or estimation error is the error caused by observing a sample instead of the whole population. The sampling error can be found by subtracting the value of a parameter from the value of a statistic...

  • Sampling fraction
    Sampling fraction
    In sampling theory, sampling fraction is the ratio of sample size to population size or, in the context of stratified sampling, the ratio of the sample size to the size of the stratum....

  • Sampling frame
    Sampling frame
    In statistics, a sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions....

  • Sampling risk
    Sampling risk
    In auditing, sampling is an inevitable means of testing. However, sampling is always associated with sampling risks which auditors have to control....

  • Samuelson's inequality
    Samuelson's inequality
    In statistics, Samuelson's inequality, named after the economist Paul Samuelson, also called the Laguerre–Samuelson inequality, after the mathematician Edmond Laguerre, states that every one of any collection x1, ..., xn is within √(n − 1) standard deviations of their mean...

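    A quick numerical check of the bound above is straightforward; note that the inequality uses the standard deviation with divisor n (not n − 1), and the data vector below is an arbitrary example.

      import math

      x = [2.0, 3.5, 7.0, 1.2, 9.9, 4.4]                     # arbitrary example data
      n = len(x)
      mean = sum(x) / n
      sd_n = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # divisor n, not n - 1

      bound = math.sqrt(n - 1) * sd_n
      assert all(abs(v - mean) <= bound + 1e-12 for v in x)  # Samuelson's inequality
      print(max(abs(v - mean) for v in x), "<=", bound)
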
  • Sargan test
    Sargan test
    The Sargan test is a statistical test used to check for over-identifying restrictions in a statistical model. The Sargan test is based on the observation that the residuals should be uncorrelated with the set of exogenous variables if the instruments are truly exogenous...

  • SAS (software)
  • SAS language
    SAS language
    The SAS language is a data processing and statistical analysis language. See more on the origins of the SAS language at SAS System and at Barr Systems. The SAS language basically divides data processing and analysis into two kinds of steps....

  • SAS System — redirects to SAS (software)
  • Savitzky–Golay smoothing filter
    Savitzky–Golay smoothing filter
    The Savitzky–Golay smoothing filter is a type of filter first described in 1964 by Abraham Savitzky and Marcel J. E. Golay.The Savitzky–Golay method essentially performs a local polynomial regression on a series of values to determine the smoothed value for each point...

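    A minimal sketch of Savitzky–Golay smoothing with SciPy's savgol_filter; the window length and polynomial order are arbitrary choices for a noisy sine example.

      import numpy as np
      from scipy.signal import savgol_filter

      t = np.linspace(0, 2 * np.pi, 200)
      noisy = np.sin(t) + np.random.normal(scale=0.2, size=t.size)

      # Local polynomial regression over a sliding 21-point window, degree 3.
      smoothed = savgol_filter(noisy, window_length=21, polyorder=3)
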
  • Sazonov's theorem
    Sazonov's theorem
    In mathematics, Sazonov's theorem, named after Vyacheslav Vasilievich Sazonov , is a theorem in functional analysis.It states that a bounded linear operator between two Hilbert spaces is γ-radonifying if it is Hilbert–Schmidt...

  • Saturated array
    Saturated array
    In experiments in which additional factors are not likely to interact with any of the other factors, a saturated array can be used. In a saturated array, a controllable factor is substituted for the interaction of two or more by-products. Using a saturated array, a two-factor test matrix could be...

  • Scale analysis (statistics)
    Scale analysis (statistics)
    In statistics, scale analysis is a set of methods to analyse survey data, in which responses to questions are combined to measure a latent variable. These items can be dichotomous or polytomous...

  • Scale parameter
    Scale parameter
    In probability theory and statistics, a scale parameter is a special kind of numerical parameter of a parametric family of probability distributions...

  • Scaled-inverse-chi-squared distribution
  • Scaling pattern of occupancy
    Scaling pattern of occupancy
    In spatial ecology and macroecology, the scaling pattern of occupancy, also known as the area-of-occupancy, is the way in which species distribution changes across spatial scales. In physical geography and image analysis, it is similar to the modifiable areal unit problem. Simon A...

  • Scatter matrix
    Scatter matrix
    In multivariate statistics and probability theory, the scatter matrix is a statistic that is used to make estimates of the covariance matrix of the multivariate normal distribution...

  • Scatter plot
  • Scatterplot smoothing
  • Scheffé's method
    Scheffé's method
    In statistics, Scheffé's method, named after Henry Scheffé, is a method for adjusting significance levels in a linear regression analysis to account for multiple comparisons...

  • Schilder's theorem
    Schilder's theorem
    In mathematics, Schilder's theorem is a result in the large deviations theory of stochastic processes. Roughly speaking, Schilder's theorem gives an estimate for the probability that a sample path of Brownian motion will stray far from the mean path . This statement is made precise using rate...

  • Schramm–Loewner evolution
  • Schuette–Nesbitt formula
    Schuette–Nesbitt formula
    In probability theory, the Schuette–Nesbitt formula is a generalization of the probabilistic version of the inclusion-exclusion principle. It is named after Donald R. Schuette and Cecil J...

  • Schwarz criterion
    Schwarz criterion
    In statistics, the Bayesian information criterion or Schwarz criterion is a criterion for model selection among a finite set of models...

  • Score (statistics)
    Score (statistics)
    In statistics, the score, score function, efficient score or informant plays an important role in several aspects of inference...

  • Score test
    Score test
    A score test is a statistical test of a simple null hypothesis that a parameter of interest \theta is equal to some particular value \theta_0. It is the most powerful test when the true value of \theta is close to \theta_0. The main advantage of the score test is that it does not require an...

  • Scoring algorithm
    Scoring algorithm
    In statistics, Fisher's scoring algorithm is a form of Newton's method used to solve maximum likelihood equations numerically...

  • Scoring rule
    Scoring rule
    In decision theory a score function, or scoring rule, is a measure of the performance of an entity, be it person or machine, that repeatedly makes decisions under uncertainty. For example, every evening a TV weather forecaster may give the probability of rain on the next day, in a type of...

  • SCORUS
    SCORUS
    An acronym for "Standing Committee of Regional and Urban Statistics", SCORUS is a sub-committee of the International Association for Official Statistics which is a section of the International Statistical Institute. The sub-committee has specific responsibility for regional and urban statistics...

  • Scott's Pi
    Scott's Pi
    Scott's pi is a statistic for measuring inter-rater reliability for nominal data in communication studies. Textual entities are annotated with categories by different annotators, and various measures are used to assess the extent of agreement between the annotators, one of which is Scott's pi...

  • SDMX
    SDMX
    SDMX is an initiative to foster standards for the exchange of statistical information. It started in 2001 and aims at fostering standards for Statistical Data and Metadata eXchange...

     – a standard for exchanging statistical data
  • Seasonal adjustment
    Seasonal adjustment
    Seasonal adjustment is a statistical method for removing the seasonal component of a time series that is used when analyzing non-seasonal trends. It is normal to report un-adjusted data for current unemployment rates, as these reflect the actual current situation...

  • Seasonality
    Seasonality
    In statistics, many time series exhibit cyclic variation known as seasonality, periodic variation, or periodic fluctuations. This variation can be either regular or semi regular....

  • Seasonal subseries plot
    Seasonal subseries plot
    Seasonal subseries plots are a tool for detecting seasonality in a time series. This plot allows one to detect both between-group and within-group patterns. This plot is only useful if the period of the seasonality is already known. In many cases, this will in fact be known. For example, monthly...

  • Seasonal variation
  • Seasonally adjusted annual rate
    Seasonally adjusted annual rate
    The Seasonally Adjusted Annual Rate refers to the rate adjustment employed when drawing comparisons between various sets of statistical data. As the name suggests, it takes into account fluctuations of values in such data which might occur due to seasonality...

  • Second moment method
    Second moment method
    In mathematics, the second moment method is a technique used in probability theory and analysis to show that a random variable has positive probability of being positive...

  • Secretary problem
    Secretary problem
    The secretary problem is one of many names for a famous problem of optimal stopping theory. The problem has been studied extensively in the fields of applied probability, statistics, and decision theory...

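    The classical solution observes roughly the first n/e candidates without choosing, then accepts the first candidate better than all of those. The short simulation below (sample size and trial count are arbitrary) estimates how often this rule picks the single best candidate; the proportion should approach 1/e ≈ 0.37.

      import math
      import random

      def best_picked(n: int) -> bool:
          ranks = list(range(n))          # 0 is the best candidate
          random.shuffle(ranks)
          r = int(n / math.e)             # observe-only phase
          threshold = min(ranks[:r]) if r else n
          for rank in ranks[r:]:
              if rank < threshold:        # first candidate better than all observed
                  return rank == 0
          return ranks[-1] == 0           # otherwise forced to take the last one

      n, trials = 100, 20000
      print(sum(best_picked(n) for _ in range(trials)) / trials)  # roughly 0.37
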
  • Secular trend
  • Secular variation
    Secular variation
    The secular variation of a time series is its long-term non-periodic variation. Whether something is perceived as a secular variation or not depends on the available timescale: a secular variation over a time scale of centuries may be part of a periodic variation over a time scale of millions of...

  • Seemingly unrelated regressions
  • Seismic to simulation
    Seismic to simulation
    Seismic to Simulation is the process and associated techniques used to develop highly accurate static and dynamic 3D models of hydrocarbon reservoirs for use in predicting future production, placing additional wells, and evaluating alternative reservoir management scenarios...

  • Selection bias
    Selection bias
    Selection bias is a statistical bias in which there is an error in choosing the individuals or groups to take part in a scientific study. It is sometimes referred to as the selection effect. The term "selection bias" most often refers to the distortion of a statistical analysis, resulting from the...

  • Selective recruitment
    Selective recruitment
    Selective recruitment is an observed effect in traffic safety. When safety belt laws are passed, belt wearing rates increase, but casualties decline by smaller percentages than estimated in a simple calculation. This is because those converted from non-use to use are not “recruited” random...

  • Self-selection bias
  • Self-similar process
    Self-similar process
    Self-similar processes are types of stochastic processes that exhibit the phenomenon of self-similarity. A self-similar phenomenon behaves the same when viewed at different degrees of magnification, or different scales on a dimension. Self-similar processes can sometimes be described using...

  • Segmented regression
    Segmented regression
    Segmented regression is a method in regression analysis in which the independent variable is partitioned into intervals and a separate line segment is fit to each interval. Segmented or piecewise regression analysis can also be performed on multivariate data by partitioning the various independent...

  • Seismic inversion
    Seismic inversion
    Seismic inversion, in geophysics, is the process of transforming seismic reflection data into a quantitative rock-property description of a reservoir...

  • Self-similarity matrix
    Self-similarity matrix
    In data analysis, the self-similarity matrix is a graphical representation of similar sequences in a data series. Similarity can be explained by different measures, like spatial distance , correlation, or comparison of local histograms or spectral properties...

  • Semantic mapping (statistics)
    Semantic mapping (statistics)
    The semantic mapping is a dimensionality reduction method that extracts new features by clustering the original features in semantic clusters and combining features mapped in the same cluster to generate an extracted feature...

  • Semantic relatedness
  • Semantic similarity
    Semantic similarity
    Semantic similarity or semantic relatedness is a concept whereby a set of documents or terms within term lists are assigned a metric based on the likeness of their meaning / semantic content....

  • Semi-Markov process
    Semi-Markov process
    A continuous-time stochastic process is called a semi-Markov process or 'Markov renewal process' if the embedded jump chain is a Markov chain, and where the holding times are random variables with any distribution, whose distribution function may depend on the two states between which the move is...

  • Semi-log graph
  • Semidefinite embedding
    Semidefinite embedding
    Semidefinite embedding or maximum variance unfolding is an algorithm in computer science, that uses semidefinite programming to perform non-linear dimensionality reduction of high-dimensional vectorial input data....

  • Semimartingale
    Semimartingale
    In probability theory, a real valued process X is called a semimartingale if it can be decomposed as the sum of a local martingale and an adapted finite-variation process....

  • Semiparametric model
  • Semiparametric regression
    Semiparametric regression
    In statistics, semiparametric regression includes regression models that combine parametric and nonparametric models. They are often used in situations where the fully nonparametric model may not perform well or when the researcher wants to use a parametric model but the functional form with...

  • Semivariance
  • Sensitivity (tests)
  • Sensitivity analysis
    Sensitivity analysis
    Sensitivity analysis is the study of how the variation in the output of a statistical model can be attributed to different variations in the inputs of the model. Put another way, it is a technique for systematically changing variables in a model to determine the effects of such changes.In any...

  • Sensitivity and specificity
    Sensitivity and specificity
    Sensitivity and specificity are statistical measures of the performance of a binary classification test, also known in statistics as a classification function. Sensitivity measures the proportion of actual positives which are correctly identified as such...

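    A minimal sketch computing both measures from the four cells of a confusion matrix; the counts are invented for illustration.

      # Invented counts: true positives, false negatives, true negatives, false positives
      tp, fn, tn, fp = 90, 10, 160, 40

      sensitivity = tp / (tp + fn)     # proportion of actual positives correctly identified
      specificity = tn / (tn + fp)     # proportion of actual negatives correctly identified
      print(sensitivity, specificity)  # 0.9, 0.8
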
  • Separation test
    Separation test
    A separation test is a statistical procedure for early-phase research, to decide whether or not to pursue further research. It is designed to avoid the prevalent situation in early-phase research, when a statistically underpowered test gives a negative result....

  • Sequential analysis
    Sequential analysis
    In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead data are evaluated as they are collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results...

  • Sequential estimation
    Sequential estimation
    In statistics, sequential estimation refers to estimation methods in sequential analysis where the sample size is not fixed in advance. Instead, data is evaluated as it is collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results are...

  • Sequential Monte Carlo methods redirects to Particle filter
    Particle filter
    In statistics, particle filters, also known as Sequential Monte Carlo methods , are sophisticated model estimation techniques based on simulation...

  • Sequential probability ratio test
    Sequential probability ratio test
    The sequential probability ratio test is a specific sequential hypothesis test, developed by Abraham Wald. Neyman and Pearson's 1933 result inspired Wald to reformulate it as a sequential analysis problem...

  • Serial dependence
    Serial dependence
    In statistics and signal processing, random variables in a time series have serial dependence if the value at some time t in the series is statistically dependent on the value at another time s...

  • Seriation (archaeology)
    Seriation (archaeology)
    In archaeology, seriation is a relative dating method in which assemblages or artifacts from numerous sites, in the same culture, are placed in chronological order. Where absolute dating methods, such as carbon dating, cannot be applied, archaeologists have to use relative dating methods to date...

  • SETAR (model)
    SETAR (model)
    In statistics, Self-Exciting Threshold AutoRegressive models are typically applied to time series data as an extension of autoregressive models, in order to allow for a higher degree of flexibility in model parameters through a regime-switching behaviour. Given a time series of data xt, the SETAR...

     — a time series model
  • Sethi model
    Sethi model
    The Sethi model was developed by Suresh P. Sethi and describes the process of how sales evolve over time in response to advertising. The rate of change in sales depends on three effects: response to advertising that acts positively on the unsold portion of the market, the loss due to forgetting or...

  • Seven-number summary
    Seven-number summary
    In descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is a modification or extension of the five-number summary...

  • Sexual dimorphism measures
    Sexual dimorphism measures
    Although the subject of sexual dimorphism is not in itself controversial, the measures by which it is assessed differ widely. Most of the measures are used on the assumption that a random variable is considered so that probability distributions should be taken into account...

  • Shannon–Hartley theorem
    Shannon–Hartley theorem
    In information theory, the Shannon–Hartley theorem tells the maximum rate at which information can be transmitted over a communications channel of a specified bandwidth in the presence of noise. It is an application of the noisy channel coding theorem to the archetypal case of a continuous-time...

  • Shape of the distribution
    Shape of the distribution
    In statistics, the concept of the shape of the distribution refers to the shape of a probability distribution and it most often arises in questions of finding an appropriate distribution to use to model the statistical properties of a population, given a sample from that population...

  • Shape parameter
    Shape parameter
    In probability theory and statistics, a shape parameter is a kind of numerical parameter of a parametric family of probability distributions...

  • Shapiro–Wilk test
  • Sharpe ratio
    Sharpe ratio
    The Sharpe ratio or Sharpe index or Sharpe measure or reward-to-variability ratio is a measure of the excess return per unit of deviation in an investment asset or a trading strategy, typically referred to as risk, named after William Forsyth Sharpe...

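    A sketch of the usual ex-post calculation: the mean excess return over a benchmark divided by the standard deviation of the excess returns. The return series and risk-free rate are made up, and the annualisation factor assumes monthly observations.

      import numpy as np

      returns = np.array([0.02, -0.01, 0.03, 0.015, -0.005, 0.01])  # made-up monthly returns
      risk_free = 0.001                                             # made-up monthly risk-free rate

      excess = returns - risk_free
      sharpe = excess.mean() / excess.std(ddof=1)   # per-period Sharpe ratio
      annualised = sharpe * np.sqrt(12)             # assuming monthly data
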
  • SHAZAM (software)
    SHAZAM (software)
    SHAZAM is a comprehensive econometrics and statistics package for estimating, testing, simulating and forecasting many types of econometrics and statistical models...

  • Shewhart individuals control chart
  • Shifted Gompertz distribution
  • Shifted log-logistic distribution
  • Shifting baseline
  • Shrinkage (statistics)
    Shrinkage (statistics)
    In statistics, shrinkage has two meanings. The first relates to the general observation that, in regression analysis, a fitted relationship appears to perform less well on a new data set than on the data set used for fitting; in particular, the value of the coefficient of determination 'shrinks'...

  • Shrinkage estimator
    Shrinkage estimator
    In statistics, a shrinkage estimator is an estimator that, either explicitly or implicitly, incorporates the effects of shrinkage. In loose terms this means that a naïve or raw estimate is improved by combining it with other information. The term relates to the notion that the improved estimate is...

  • Sichel distribution
  • Siegel–Tukey test
  • Sieve estimator
    Sieve estimator
    In statistics, sieve estimators are a class of nonparametric estimator which use progressively more complex models to estimate an unknown high-dimensional function as more data becomes available, with the aim of asymptotically reducing error towards zero as the amount of data increases. This method...

  • Sigma-algebra
    Sigma-algebra
    In mathematics, a σ-algebra is a technical concept for a collection of sets satisfying certain properties. The main use of σ-algebras is in the definition of measures; specifically, the collection of sets over which a measure is defined is a σ-algebra...

  • SigmaStat
    SigmaStat
    SigmaStat is a statistical software package, which was originally developed by Jandel Scientific Software in the 1980s. As of October 1996, Systat Software is now based in San Jose, California. SigmaStat users have the ability to compare effects among groups. This includes before and after or...

     – software
  • Sign test
    Sign test
    In statistics, the sign test can be used to test the hypothesis that there is "no difference in medians" between the continuous distributions of two random variables X and Y, in the situation when we can draw paired samples from X and Y...

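    Under the null hypothesis of no difference in medians, the number of positive paired differences is Binomial(n, 1/2). A minimal sketch (paired data invented, tied pairs discarded) computes the exact two-sided p-value with SciPy's binomtest, which requires SciPy 1.7 or later.

      from scipy.stats import binomtest

      x = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.3, 6.1]     # invented paired measurements
      y = [4.9, 4.9, 5.2, 5.0, 5.4, 4.8, 5.1, 5.7]

      diffs = [a - b for a, b in zip(x, y) if a != b]  # ties carry no sign information
      positives = sum(d > 0 for d in diffs)
      result = binomtest(positives, n=len(diffs), p=0.5)
      print(result.pvalue)
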
  • Signal-to-noise ratio
    Signal-to-noise ratio
    Signal-to-noise ratio is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. It is defined as the ratio of signal power to the noise power. A ratio higher than 1:1 indicates more signal than noise...

  • Signal-to-noise statistic
  • Signed differential mapping
    Signed differential mapping
    Signed differential mapping or SDM is a statistical technique for meta-analyzing studies on differences in brain activity or structure which used neuroimaging techniques such as fMRI, VBM, DTI or PET...

  • Significance analysis of microarrays
    Significance Analysis of Microarrays
    Significance analysis of microarrays is a statistical technique, established in 2001 by Tusher, Tibshirani and Chu, for determining whether changes in gene expression are statistically significant. With the advent of DNA microarrays it is now possible to measure the expression of thousands of...

  • Silhouette (clustering)
    Silhouette (clustering)
    Silhouette refers to a method of interpretation and validation of clusters of data. The technique provides a succinct graphical representation of how well each object lies within its cluster. It was first described by Peter J. Rousseeuw in 1986...

  • Simfit
    Simfit
    Simfit is a free Open Source Windows package for simulation, curve fitting, statistics, and plotting, using a library of models or user-defined equations. Simfit has been in continuous development for many years by Dr Bill Bardsley of the University of Manchester...

      – software
  • Similarity matrix
    Similarity matrix
    A similarity matrix is a matrix of scores which express the similarity between two data points. Similarity matrices are strongly related to their counterparts, distance matrices and substitution matrices...

  • Simon model
    Simon model
    Aiming to account for the wide range of empirical distributions following a power law, Herbert Simon proposed a class of stochastic models that results in a power-law distribution function. It models the dynamics of a system...

  • Simple linear regression
    Simple linear regression
    In statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model as...

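    A minimal least-squares sketch: with a single explanatory variable, the slope is the ratio of the sample covariance of x and y to the sample variance of x, and the intercept follows from the two means (the data are invented).

      import numpy as np

      x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
      y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])          # invented responses

      slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
      intercept = y.mean() - slope * x.mean()
      residuals = y - (intercept + slope * x)
      print(slope, intercept, (residuals ** 2).sum())  # the fitted line minimises this sum
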
  • Simple moving average crossover
    Simple moving average crossover
    In the statistics of time series, and in particular the analysis of financial time series for stock trading purposes, a moving-average crossover occurs when, on plotting two moving averages each based on different degrees of smoothing, the traces of these moving averages cross...

  • Simple random sample
    Simple random sample
    In statistics, a simple random sample is a subset of individuals chosen from a larger set. Each individual is chosen randomly and entirely by chance, such that each individual has the same probability of being chosen at any stage during the sampling process, and each subset of k individuals has...

  • Simpson's paradox
    Simpson's paradox
    In probability and statistics, Simpson's paradox is a paradox in which a correlation present in different groups is reversed when the groups are combined. This result is often encountered in social-science and medical-science statistics, and it occurs when frequency data are hastily given causal...

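    The reversal is easiest to see numerically. The sketch below uses the widely cited kidney-stone treatment counts, in which treatment A has the higher success rate within each stone-size group yet the lower rate when the groups are pooled.

      # (successes, trials) per treatment and group -- widely cited illustrative counts
      data = {
          "A": {"small": (81, 87),   "large": (192, 263)},
          "B": {"small": (234, 270), "large": (55, 80)},
      }

      for treatment, groups in data.items():
          for group, (s, n) in groups.items():
              print(treatment, group, round(s / n, 3))          # A wins in both groups
          s_tot = sum(s for s, _ in groups.values())
          n_tot = sum(n for _, n in groups.values())
          print(treatment, "overall", round(s_tot / n_tot, 3))  # yet B wins overall
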
  • Simulated annealing
    Simulated annealing
    Simulated annealing is a generic probabilistic metaheuristic for the global optimization problem of locating a good approximation to the global optimum of a given function in a large search space. It is often used when the search space is discrete...

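    A bare-bones sketch of the idea for one-dimensional minimisation: propose a random perturbation, always accept improvements, accept worsenings with probability exp(-delta/T), and slowly cool the temperature. The objective function, step size and cooling schedule are arbitrary choices.

      import math
      import random

      def objective(x: float) -> float:          # arbitrary multimodal test function
          return x * x + 10 * math.sin(x)

      x = random.uniform(-10, 10)
      best = x
      temperature = 10.0
      while temperature > 1e-3:
          candidate = x + random.gauss(0, 1)     # random neighbour
          delta = objective(candidate) - objective(x)
          if delta < 0 or random.random() < math.exp(-delta / temperature):
              x = candidate                      # always accept better, sometimes accept worse
          if objective(x) < objective(best):
              best = x
          temperature *= 0.995                   # geometric cooling schedule
      print(best, objective(best))
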
  • Simultaneous equation methods (econometrics)
  • Simultaneous equations model
    Simultaneous equations model
    Simultaneous equation models are a form of statistical model in the form of a set of linear simultaneous equations. They are often used in econometrics...

  • Single equation methods (econometrics)
    Single equation methods (econometrics)
    A variety of methods are used in econometrics to estimate models consisting of a single equation. The oldest and still the most commonly used is the ordinary least squares method used to estimate linear regressions....

  • Singular distribution
    Singular distribution
    In probability, a singular distribution is a probability distribution concentrated on a set of Lebesgue measure zero, where the probability of each point in that set is zero. These distributions are sometimes called singular continuous distributions...

  • Singular spectrum analysis
    Singular Spectrum Analysis
    Singular spectrum analysis combines elements of classical time series analysis, multivariate statistics, multivariate geometry, dynamical systems and signal processing...

  • Sinusoidal model
    Sinusoidal model
    In statistics, signal processing, and time series analysis, a sinusoidal model to approximate a sequence Yi is: Y_i = C + \alpha \sin(\omega T_i + \phi) + E_i...

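    A sketch of fitting the model Y_i = C + α·sin(ω·T_i + φ) + E_i by non-linear least squares with scipy.optimize.curve_fit; the simulated data and the starting values are assumptions for the example.

      import numpy as np
      from scipy.optimize import curve_fit

      def model(t, c, alpha, omega, phi):
          return c + alpha * np.sin(omega * t + phi)

      t = np.linspace(0, 10, 200)
      y = model(t, 1.0, 2.0, 3.0, 0.5) + np.random.normal(scale=0.3, size=t.size)

      params, _ = curve_fit(model, t, y, p0=[0.0, 1.0, 2.5, 0.0])  # rough starting values
      print(params)  # estimates of C, alpha, omega, phi
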
  • Sinkov statistic
    Sinkov statistic
    Sinkov statistics, also known as log-weight statistics, is a specialized field of statistics that was developed by Abraham Sinkov, while working for the small Signal Intelligence Service organization, the primary mission of which was to compile codes and ciphers for use by the U.S. Army...

  • Skellam distribution
  • Skew normal distribution
  • Skewness
    Skewness
    In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. The skewness value can be positive or negative, or even undefined...

  • Skorokhod's representation theorem
    Skorokhod's representation theorem
    In mathematics and statistics, Skorokhod's representation theorem is a result that shows that a weakly convergent sequence of probability measures whose limit measure is sufficiently well-behaved can be represented as the distribution/law of a pointwise convergent sequence of random variables...

  • Slash distribution
    Slash distribution
    In probability theory, the slash distribution is the probability distribution of a standard normal variate divided by an independent standard uniform variate...

  • Slice sampling
    Slice sampling
    Slice sampling is a type of Markov chain Monte Carlo algorithm for pseudo-random number sampling, i.e. for drawing random samples from a statistical distribution...

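    A bare-bones one-dimensional slice sampler for an unnormalised density, using stepping-out and shrinkage to locate the slice; the target density and the width parameter are assumptions for the sketch.

      import math
      import random

      def density(x: float) -> float:            # unnormalised standard normal (assumption)
          return math.exp(-0.5 * x * x)

      def slice_sample(x: float, w: float = 1.0) -> float:
          y = random.uniform(0, density(x))      # vertical level defining the slice
          left = x - w * random.random()         # randomly positioned initial interval
          right = left + w
          while density(left) > y:               # step out until both ends leave the slice
              left -= w
          while density(right) > y:
              right += w
          while True:                            # shrink until a proposal lands in the slice
              candidate = random.uniform(left, right)
              if density(candidate) > y:
                  return candidate
              if candidate < x:
                  left = candidate
              else:
                  right = candidate

      x, draws = 0.0, []
      for _ in range(5000):
          x = slice_sample(x)
          draws.append(x)
      print(sum(draws) / len(draws))             # should be near 0 for this target
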
  • Sliced inverse regression
  • Slutsky's theorem
    Slutsky's theorem
    In probability theory, Slutsky’s theorem extends some properties of algebraic operations on convergent sequences of real numbers to sequences of random variables.The theorem was named after Eugen Slutsky. Slutsky’s theorem is also attributed to Harald Cramér....

  • Small area estimation
    Small area estimation
    Small area estimation is any of several statistical techniques involving the estimation of parameters for small sub-populations, generally used when the sub-population of interest is included in a larger survey....

  • Smearing retransformation
    Smearing retransformation
    The Smearing retransformation is used in regression analysis, after estimating the logarithm of a variable. Estimating the logarithm of a variable instead of the variable itself is a common technique to more closely approximate normality...

  • Smoothing
    Smoothing
    In statistics and image processing, to smooth a data set is to create an approximating function that attempts to capture important patterns in the data, while leaving out noise or other fine-scale structures/rapid phenomena. Many different algorithms are used in smoothing...

  • Smoothing spline
    Smoothing spline
    The smoothing spline is a method of smoothing using a spline function...

  • Smoothness (probability theory)
    Smoothness (probability theory)
    In probability theory and statistics, smoothness of a density function is a measure which determines how many times the density function can be differentiated, or equivalently the limiting behavior of distribution’s characteristic function....

  • Snowball sampling
    Snowball sampling
    In sociology and statistics research, snowball sampling is a non-probability sampling technique where existing study subjects recruit future subjects from among their acquaintances. Thus the sample group appears to grow like a rolling snowball...

  • Social network change detection
    Social network change detection
    Social network change detection is a process of monitoring social networks to determine when significant changes to their organizational structure occur and what caused them. This scientific approach combines analytical techniques from social network analysis with those from statistical process...

  • Social statistics
    Social statistics
    Social statistics is the use of statistical measurement systems to study human behavior in a social environment. This can be accomplished through polling a particular group of people, evaluating a particular subset of data obtained about a group of people, or by observation and statistical...

  • SOFA Statistics
    SOFA Statistics
    SOFA Statistics is an open-source statistical package, with an emphasis on ease of use, learn as you go, and beautiful output. The name stands for Statistics Open For All. It has a graphical user interface and can connect directly to MySQL, PostgreSQL, SQLite, MS Access, and Microsoft SQL Server...

     – software
  • Soliton distribution
    Soliton distribution
    A soliton distribution is a type of discrete probability distribution that arises in the theory of erasure correcting codes. A paper by Luby introduced two forms of such distributions, the ideal soliton distribution and the robust soliton distribution...

     – redirects to Luby transform code
  • Sørensen similarity index
    Sørensen similarity index
    The Sørensen index, also known as Sørensen’s similarity coefficient, is a statistic used for comparing the similarity of two samples. It was developed by the botanist Thorvald Sørensen and published in 1948....

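    For two sets A and B the index is 2|A ∩ B| / (|A| + |B|); a minimal sketch with two invented species lists.

      def sorensen(a: set, b: set) -> float:
          return 2 * len(a & b) / (len(a) + len(b))

      site1 = {"oak", "birch", "pine", "alder"}   # invented species lists
      site2 = {"oak", "pine", "spruce"}
      print(sorensen(site1, site2))               # 2*2 / (4+3) ≈ 0.571
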
  • Spaghetti plot
  • Sparse binary polynomial hashing
    Sparse binary polynomial hashing
    Sparse binary polynomial hashing is a generalization of Bayesian filtering that can match mutating phrases as well as single words. SBPH is a way of generating a large number of features from an incoming text automatically, and then using statistics to determine the weights for each of those...

  • Sparse PCA
    Sparse PCA
    Sparse PCA is a specialised technique used in statistical analysis and, in particular, in the analysis of multivariate data sets....

     – sparse principal components analysis
  • Sparsity-of-effects principle
    Sparsity-of-effects principle
    The sparsity-of-effects principle states that a system is usually dominated by main effects and low-order interactions. Thus it is most likely that main effects and two-factor interactions are the most significant responses. In other words, higher order interactions such as three-factor...

  • Spatial analysis
    Spatial analysis
    Spatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties...

  • Spatial dependence
    Spatial dependence
    In applications of statistics, spatial dependence is the existence of statistical dependence in a collection of random variables or a collection of time series of random variables, each of which is associated with a different geographical location...

  • Spatial descriptive statistics
    Spatial descriptive statistics
    Spatial descriptive statistics are used for a variety of purposes in geography, particularly in quantitative data analyses involving Geographic Information Systems...

  • Spatial distribution
    Spatial distribution
    A spatial distribution is the arrangement of a phenomenon across the Earth's surface and a graphical display of such an arrangement is an important tool in geographical and environmental statistics. A graphical display of a spatial distribution may summarize raw data directly or may reflect the...

  • Spatial econometrics
    Spatial econometrics
    Spatial Econometrics is the field where spatial analysis and econometrics intersect. In general, econometrics differs from other branches of statistics in focusing on theoretical models, whose parameters are estimated using regression analysis...

  • Spatial statistics redirects to Spatial analysis
    Spatial analysis
    Spatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties...

  • Spatial variability
    Spatial variability
    Spatial variability occurs when a quantity that is measured at different spatial locations exhibits values that differ across the locations. Spatial variability can be assessed using spatial descriptive statistics such as the range...

  • SPC XL
    SPC XL
    SPC XL is a statistical add-in for Microsoft Excel. SPC XL is a replacement for SPC KISS, which was released in 1993, making it one of the oldest statistical add-ons for Excel...

     – software
  • Spearman's rank correlation coefficient
    Spearman's rank correlation coefficient
    In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter \rho or as r_s, is a non-parametric measure of statistical dependence between two variables. It assesses how well the relationship between two variables can...

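    Spearman's rho is the Pearson correlation applied to the ranks of the two variables; the sketch below computes it that way and, as a cross-check, with scipy.stats.spearmanr (the data are invented).

      import numpy as np
      from scipy.stats import rankdata, spearmanr

      x = np.array([10.0, 20.0, 30.0, 40.0, 55.0])   # invented paired observations
      y = np.array([1.2, 0.9, 2.5, 2.4, 3.1])

      rx, ry = rankdata(x), rankdata(y)
      rho = np.corrcoef(rx, ry)[0, 1]                # Pearson correlation of the ranks
      print(rho, spearmanr(x, y).correlation)        # the two values agree
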
  • Spearman–Brown prediction formula
  • Species discovery curve
    Species discovery curve
    In ecology, the species discovery curve is a graph recording the cumulative number of species of living things recorded in a particular environment as a function of the cumulative effort expended searching for them...

  • Specification (regression)
    Specification (regression)
    In regression analysis and related fields such as econometrics, specification is the process of converting a theory into a regression model. This process consists of selecting an appropriate functional form for the model and choosing which variables to include. Model specification is one of the...

  • Specificity (tests)
  • Spectral density estimation
    Spectral density estimation
    In statistical signal processing, the goal of spectral density estimation is to estimate the spectral density of a random signal from a sequence of time samples of the signal. Intuitively speaking, the spectral density characterizes the frequency content of the signal...

  • Spectrum bias
    Spectrum bias
    Initially identified in 1978, spectrum bias refers to the phenomenon that the performance of a diagnostic test may change between different clinical settings owing to changes in the patient case-mix thereby affecting the transferability of study results in clinical practice...

  • Spectrum continuation analysis
    Spectrum continuation analysis
    Spectrum continuation analysis is a generalization of the concept of Fourier series to non-periodic functions of which only a fragment has been sampled in the time domain....

  • Speed prior
    Speed prior
    Jürgen Schmidhuber's speed prior is a complexity measure similar to Kolmogorov complexity, except that it is based on computation speed as well as program length. The speed prior complexity of a program is its...

  • Spherical design
    Spherical design
    A spherical design, part of combinatorial design theory in mathematics, is a finite set of N points on the d-dimensional unit hypersphere Sd such that the average value of any polynomial f of degree t or less on the set equals the average value of f on the whole sphere...

  • Split normal distribution
    Split normal distribution
    In probability theory and statistics, the split normal distribution also known as the two-piece normal distribution results from joining at the mode the corresponding halves of two normal distributions with the same mode but different variances...

  • SPRT — redirects to Sequential probability ratio test
    Sequential probability ratio test
    The sequential probability ratio test is a specific sequential hypothesis test, developed by Abraham Wald. Neyman and Pearson's 1933 result inspired Wald to reformulate it as a sequential analysis problem...

  • SPSS
    SPSS
    SPSS is a computer program used for survey authoring and deployment, data mining, text analytics, statistical analysis, and collaboration and deployment...

     – software
  • SPSS Clementine
    SPSS Clementine
    SPSS Modeler is a data mining software tool by SPSS Inc., an IBM company. It was originally named SPSS Clementine by SPSS, after which it was renamed PASW Modeler in 2009 by SPSS. It has since been acquired by IBM in its acquisition of SPSS Inc...

     – software (data mining)
  • Spurious relationship
    Spurious relationship
    In statistics, a spurious relationship is a mathematical relationship in which two events or variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a certain third, unseen factor...

  • Square root biased sampling
    Square root biased sampling
    Square root biased sampling is a sampling method proposed by William H. Press, a professor in the fields of computer sciences and computational biology, for use in airport screenings as a mathematically efficient compromise between simple random sampling and strong profiling.Using this method, if a...

  • Squared deviations
    Squared deviations
    In probability theory and statistics, the definition of variance is either the expected value, or average value, of squared deviations from the mean. Computations for analysis of variance involve the partitioning of a sum of squared deviations...

  • St. Petersburg paradox
    St. Petersburg paradox
    In economics, the St. Petersburg paradox is a paradox related to probability theory and decision theory. It is based on a particular lottery game that leads to a random variable with infinite expected value, i.e., infinite expected payoff, but would nevertheless be considered to be worth only a...

  • Stability (probability)
    Stability (probability)
    In probability theory, the stability of a random variable is the property that a linear combination of two independent copies of the variable has the same distribution, up to location and scale parameters. The distributions of random variables having this property are said to be "stable...

  • Stable distribution
  • Stable and tempered stable distributions with volatility clustering – financial applications
  • Standard deviation
    Standard deviation
    Standard deviation is a widely used measure of variability or diversity used in statistics and probability theory. It shows how much variation or "dispersion" there is from the average...

  • Standard error (statistics)
    Standard error (statistics)
    The standard error is the standard deviation of the sampling distribution of a statistic. The term may also be used to refer to an estimate of that standard deviation, derived from a particular sample used to compute the estimate....

  • Standard normal deviate
    Standard normal deviate
    A standard normal deviate is a normally distributed random variable with expected value 0 and variance 1. A fuller term is standard normal random variable...

  • Standard normal table
    Standard normal table
    A standard normal table also called the "Unit Normal Table" is a mathematical table for the values of Φ, the cumulative distribution function of the normal distribution....

  • Standard probability space
    Standard probability space
    In probability theory, a standard probability space is a probability space satisfying certain assumptions introduced by Vladimir Rokhlin in 1940...

  • Standard score
    Standard score
    In statistics, a standard score indicates how many standard deviations an observation or datum is above or below the mean. It is a dimensionless quantity derived by subtracting the population mean from an individual raw score and then dividing the difference by the population standard deviation...

  • Standardized coefficient
    Standardized coefficient
    In statistics, standardized coefficients or beta coefficients are the estimates resulting from an analysis carried out on variables that have been standardized so that their variances are 1. Therefore, standardized coefficients refer to how many standard deviations a dependent variable will change,...

  • Standardized moment
  • Standardised mortality rate
    Standardised mortality rate
    The standardised mortality rate tells how many persons, per thousand of the population, will die in a given year and what the causes of death will be...

  • Standardized mortality ratio
    Standardized mortality ratio
    The standardized mortality ratio or SMR in epidemiology is the ratio of observed deaths to expected deaths, where expected deaths are calculated for a typical area with the same age and gender mix by looking at the death rates for different ages and genders in the larger population.The SMR may be...

  • Standardized rate
    Standardized rate
    Standardized rates are a statistical measure of any rates in a population. The most common are birth, death and unemployment rates.The formula for standardized rates is as follows:...

  • Stanine
    Stanine
    Stanine is a method of scaling test scores on a nine-point standard scale with a mean of five and a standard deviation of two.Some web sources attribute stanines to the U.S. Army Air Forces during World War II...

  • STAR model
    STAR model
    In statistics, Smooth Transition Autoregressive models are typically applied to time series data as an extension of autoregressive models, in order to allow for a higher degree of flexibility in model parameters through a smooth transition. Given a time series of data xt, the STAR model is a tool for...

     — a time series model
  • Star plot — redirects to Radar chart
    Radar chart
    A radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point...

  • Stata
    Stata
    Stata is a general-purpose statistical software package created in 1985 by StataCorp. It is used by many businesses and academic institutions around the world...

  • Statgraphics
    Statgraphics
    Statgraphics is a statistics package that performs and explains basic and advanced statistical functions. The software was created in 1980 by Dr. Neil Polhemus...

     – software
  • Static analysis
    Static analysis
    Static analysis, static projection, and static scoring are terms for simplified analysis wherein the effect of an immediate change to a system is calculated without respect to the longer term response of the system to that change...

  • Stationary distribution
    Stationary distribution
    Stationary distribution may refer to the limiting distribution in a Markov chain, the marginal distribution of a stationary process or stationary time series, or the set of joint probability distributions of a stationary process or stationary time series...

  • Stationary ergodic process
    Stationary ergodic process
    In probability theory, a stationary ergodic process is a stochastic process which exhibits both stationarity and ergodicity. In essence this implies that the random process will not change its statistical properties with time and that its statistical properties can be deduced from a single,...

  • Stationary process
    Stationary process
    In the mathematical sciences, a stationary process is a stochastic process whose joint probability distribution does not change when shifted in time or space...

  • Stationary sequence
    Stationary sequence
    In probability theory – specifically in the theory of stochastic processes, a stationary sequence is a random sequence whose joint probability distribution is invariant over time...

  • Stationary subspace analysis
    Stationary subspace analysis
    Stationary Subspace Analysis is a blind source separation algorithm which factorizes a multivariate time series into stationary and non-stationary components...

  • Statistic
    Statistic
    A statistic is a single measure of some attribute of a sample. It is calculated by applying a function to the values of the items comprising the sample, which are known together as a set of data. More formally, statistical theory defines a statistic as a function of a sample where the function...

  • STATISTICA
    STATISTICA
    STATISTICA is a statistics and analytics software package developed by StatSoft. STATISTICA provides data analysis, data management, data mining, and data visualization procedures...

     – software
  • Statistical arbitrage
    Statistical arbitrage
    In the world of finance and investments, statistical arbitrage is used in two related but distinct ways. In academic literature, "statistical arbitrage" is opposed to arbitrage. In deterministic arbitrage, a sure profit can be obtained from being long some securities and short others...

  • Statistical assembly
    Statistical assembly
    In statistics, for example in statistical quality control, a statistical assembly is a collection of parts or components which makes up a statistical unit. Thus a statistical unit, which would be the prime item of concern, is made of discrete components like organs or machine parts...

  • Statistical assumption
  • Statistical benchmarking
    Statistical benchmarking
    In statistics, benchmarking is a method of using auxiliary information to adjust the sampling weights used in an estimation process, in order to yield more accurate estimates of totals....

  • Statistical classification
  • Statistical conclusion validity
    Statistical conclusion validity
    Statistical conclusion validity refers to the appropriate use of statistics to infer whether the presumed independent and dependent variables covary...

  • Statistical consultant
    Statistical consultant
    A statistical consultant provides statistical advice and guidance to clients interested in making decisions through the analysis or collection of data. Clients often need statistical advice to answer questions in business, medicine, biology, genetics, forestry, agriculture, fisheries, wildlife...

  • Statistical deviance—see deviance (statistics)
  • Statistical dispersion
    Statistical dispersion
    In statistics, statistical dispersion is variability or spread in a variable or a probability distribution...

  • Statistical distance
    Statistical distance
    In statistics, probability theory, and information theory, a statistical distance quantifies the distance between two statistical objects, which can be two samples, two random variables, or two probability distributions, for example...

  • Statistical efficiency
  • Statistical epidemiology
    Statistical epidemiology
    Statistical epidemiology is an emerging branch of the disciplines of epidemiology and biostatistics that aims to bring more statistical rigour to bear in the field of epidemiology...

  • Statistical estimation — redirects to Estimation theory
    Estimation theory
    Estimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the...

  • Statistical finance
    Statistical finance
    Statistical finance, sometimes called econophysics, is an empirical attempt to shift finance from its normative roots to a positivist framework using exemplars from statistical physics with an emphasis on emergent or collective properties of financial markets...

  • Statistical genetics — redirects to population genetics
    Population genetics
    Population genetics is the study of allele frequency distribution and change under the influence of the four main evolutionary processes: natural selection, genetic drift, mutation and gene flow. It also takes into account the factors of recombination, population subdivision and population...

  • Statistical geography
    Statistical geography
    Statistical geography is the study and practice of collecting, analysing and presenting data that has a geographic or areal dimension, such as census or demographics data. It uses techniques from spatial analysis, but also encompasses geographical activities such as the defining and naming of...

  • Statistical graphics
    Statistical graphics
    Statistical graphics, also known as graphical techniques, are information graphics in the field of statistics used to visualize quantitative data...

  • Statistical hypothesis testing
    Statistical hypothesis testing
    A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study . In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold...

  • Statistical independence
    Statistical independence
    In probability theory, to say that two events are independent intuitively means that the occurrence of one event makes it neither more nor less probable that the other occurs...

  • Statistical inference
    Statistical inference
    In statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation...

  • Statistical interference
    Statistical interference
    When two probability distributions overlap, statistical interference exists. Knowledge of the distributions can be used to determine the likelihood that one parameter exceeds another, and by how much....

  • Statistical Lab
    Statistical Lab
    The computer program Statistical Lab is an explorative and interactive toolbox for statistical analysis and visualization of data. It supports educational applications of statistics in business sciences, economics, social sciences and humanities. The program is developed and constantly advanced by...

     – software
  • Statistical learning theory
  • Statistical literacy
    Statistical literacy
    Statistical literacy is a term used to describe an individual's or group's ability to understand statistics. Statistical literacy is necessary for citizens to understand material presented in publications such as newspapers, television, and the Internet. Numeracy is a prerequisite to being...

  • Statistical model
    Statistical model
    A statistical model is a formalization of relationships between variables in the form of mathematical equations. A statistical model describes how one or more random variables are related to one or more other variables. The model is statistical as the variables are not deterministically but...

  • Statistical model validation — redirects to Regression model validation
  • Statistical noise
    Statistical noise
    Statistical noise is the colloquialism for recognized amounts of unexplained variation in a sample. See errors and residuals in statistics....

  • Statistical package
  • Statistical parameter
    Statistical parameter
    A statistical parameter is a parameter that indexes a family of probability distributions. It can be regarded as a numerical characteristic of a population or a model....

  • Statistical parametric mapping
    Statistical parametric mapping
    Statistical parametric mapping or SPM is a statistical technique created by Karl Friston for examining differences in brain activity recorded during functional neuroimaging experiments using neuroimaging technologies such as fMRI or PET...

  • Statistical parsing
    Statistical parsing
    Statistical parsing is a group of parsing methods within natural language processing. The methods have in common that they associate grammar rules with a probability. Grammar rules are traditionally viewed in computational linguistics as defining the valid sentences in a language...

  • Statistical population
    Statistical population
    A statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. For example, if we were interested in generalizations about crows, then we would describe the set of crows that is of interest...

  • Statistical power
    Statistical power
    The power of a statistical test is the probability that the test will reject the null hypothesis when the null hypothesis is actually false . The power is in general a function of the possible distributions, often determined by a parameter, under the alternative hypothesis...

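    A sketch of the power of a one-sided z-test for a mean under assumed values of the true effect, the (known) standard deviation, the sample size and the significance level; power is the probability that the test statistic exceeds the critical value when the alternative is true.

      from math import sqrt
      from scipy.stats import norm

      effect = 0.5      # assumed true difference in means
      sigma = 2.0       # assumed known standard deviation
      n = 50            # sample size
      alpha = 0.05      # one-sided significance level

      z_crit = norm.ppf(1 - alpha)
      power = 1 - norm.cdf(z_crit - effect / (sigma / sqrt(n)))
      print(power)      # probability of rejecting H0 when the true effect is 0.5
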
  • Statistical probability
  • Statistical process control
    Statistical process control
    Statistical process control is the application of statistical methods to the monitoring and control of a process to ensure that it operates at its full potential to produce conforming product. Under SPC, a process behaves predictably to produce as much conforming product as possible with the least...

  • Statistical process control software
    Statistical process control software
    There are a number of software programs designed to aid in statistical process control. Typically the software program undertakes two functions: data collection and data analysis...

  • Statistical proof
    Statistical proof
    Statistical proof is the rational demonstration of degree of certainty for a proposition, hypothesis or theory to convince others subsequent to a statistical test of the increased understanding of the facts. Statistical methods are used to demonstrate the validity and logic of inference with...

  • Statistical randomness
    Statistical randomness
    A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll, or the digits of π exhibit statistical randomness....

  • Statistical range – see range (statistics)
    Range (statistics)
    In descriptive statistics, the range is the length of the smallest interval which contains all the data. It is calculated by subtracting the smallest observation from the greatest and provides an indication of statistical dispersion. It is measured in the same units as the data...

  • Statistical regularity
    Statistical regularity
    Statistical regularity is a notion in statistics and probability theory that random events exhibit regularity when repeated enough times or that enough sufficiently similar random events exhibit regularity...

  • Statistical sample
  • Statistical semantics
    Statistical semantics
    Statistical semantics is the study of "how the statistical patterns of human word usage can be used to figure out what people mean, at least to a level sufficient for information access"...

  • Statistical shape analysis
    Statistical shape analysis
    Statistical shape analysis is the geometrical analysis of a set of shapes, in which statistics are computed to describe the geometrical properties of similar shapes or of different groups, for instance the difference between male and female gorilla skull shapes, or between normal and pathological bone shapes, etc...

  • Statistical signal processing
    Statistical signal processing
    Statistical signal processing is an area of Applied Mathematics and Signal Processing that treats signals as stochastic processes, dealing with their statistical properties...

  • Statistical significance
    Statistical significance
    In statistics, a result is called statistically significant if it is unlikely to have occurred by chance. The phrase test of significance was coined by Ronald Fisher....

  • Statistical survey
    Statistical survey
    Survey methodology is the field that studies surveys, that is, the sample of individuals from a population with a view towards making statistical inferences about the population using the sample. Polls about public opinion, such as political beliefs, are reported in the news media in democracies....

  • Statistical syllogism
    Statistical syllogism
    A statistical syllogism is a non-deductive syllogism. It argues from a generalization that is true for the most part to a particular case. Statistical syllogisms may use qualifying words like "most", "frequently", "almost never", "rarely",...

  • Statistical theory
    Statistical theory
    The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics. The theory covers approaches to statistical-decision problems and to statistical inference, and the actions and deductions that...

  • Statistical unit
    Statistical unit
    A unit in a statistical analysis refers to one member of a set of entities being studied. It is the material source for the mathematical abstraction of a "random variable"...

  • Statisticians' and engineers' cross-reference of statistical terms
    Statisticians' and engineers' cross-reference of statistical terms
    The following terms are used by electrical engineers in statistical signal processing studies instead of the typical statistician's terms...

  • Statistics
    Statistics
    Statistics is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments....

  • Statistics education
    Statistics education
    Statistics education is concerned with the teaching and learning of statistics.Statistics is both a formal science and a practical theory of scientific inquiry, and both aspects are considered in statistics education. Education in statistics has similar concerns as does education in other...

  • Statistics Online Computational Resource – training materials
  • StatPlus
    StatPlus
    StatPlus is a software product that includes basic and multivariate statistical analysis, including time series analysis, nonparametric statistics, survival analysis, and the ability to build different charts...

  • StatXact
    StatXact
    StatXact is a statistical software package for exact statistics. It calculates exact p-values and confidence intervals for contingency tables and non-parametric procedures. It is marketed by Cytel Inc....

     – software
  • Stein's example
    Stein's example
    Stein's example, in decision theory and estimation theory, is the phenomenon that when three or more parameters are estimated simultaneously, there exist combined estimators more accurate on average than any method that handles the parameters separately...

    • Proof of Stein's example
      Proof of Stein's example
    Stein's example is an important result in decision theory which can be stated as follows: when three or more parameters are estimated simultaneously, there exist combined estimators that are on average more accurate than any method that handles the parameters separately. The following is an outline of its proof. The reader is referred to the main article for more information...

  • Stein's lemma
    Stein's lemma
    Stein's lemma, named in honor of Charles Stein, is a theorem of probability theory that is of interest primarily because of its applications to statistical inference — in particular, to James–Stein estimation and empirical Bayes methods — and its applications to portfolio choice...

  • Stein's unbiased risk estimate
    Stein's unbiased risk estimate
    In statistics, Stein's unbiased risk estimate is an unbiased estimator of the mean-squared error of "a nearly arbitrary, nonlinear biased estimator." In other words, it provides an indication of the accuracy of a given estimator...

  • Steiner system
    Steiner system
    The Fano plane is an S(2,3,7) Steiner triple system. The blocks are the 7 lines, each containing 3 points. Every pair of points belongs to a unique line....

  • Stemplot
    Stemplot
    A stemplot, in statistics, is a device for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution. They evolved from Arthur Bowley's work in the early 1900s, and are useful tools in exploratory data analysis...

  • Stepwise regression
    Stepwise regression
    In statistics, stepwise regression includes regression models in which the choice of predictive variables is carried out by an automatic procedure...

  • Stetson–Harrison method
  • Stieltjes moment problem
  • Stimulus-response model
    Stimulus-response model
    The stimulus–response model is a characterization of a statistical unit as a black box model, predicting a quantitative response to a quantitative stimulus, for example one administered by a researcher...

  • Stochastic
    Stochastic
    Stochastic refers to systems whose behaviour is intrinsically non-deterministic. A stochastic process is one whose behavior is non-deterministic, in that a system's subsequent state is determined both by the process's predictable actions and by a random element. However, according to M. Kac and E...

  • Stochastic approximation
    Stochastic approximation
    Stochastic approximation methods are a family of iterative stochastic optimization algorithms that attempt to find zeroes or extrema of functions which cannot be computed directly, but only estimated via noisy observations....

  • Stochastic calculus
    Stochastic calculus
    Stochastic calculus is a branch of mathematics that operates on stochastic processes. It allows a consistent theory of integration to be defined for integrals of stochastic processes with respect to stochastic processes...

  • Stochastic convergence
  • Stochastic differential equation
    Stochastic differential equation
    A stochastic differential equation is a differential equation in which one or more of the terms is a stochastic process, thus resulting in a solution which is itself a stochastic process....

  • Stochastic dominance
    Stochastic dominance
    Stochastic dominance is a form of stochastic ordering. The term is used in decision theory and decision analysis to refer to situations where one gamble can be ranked as superior to another gamble. It is based on preferences regarding outcomes...

  • Stochastic drift
    Stochastic drift
    In probability theory, stochastic drift is the change of the average value of a stochastic process. A related term is the drift rate which is the rate at which the average changes. This is in contrast to the random fluctuations about this average value...

  • Stochastic gradient descent
    Stochastic gradient descent
    Stochastic gradient descent is an optimization method for minimizing an objective function that is written as a sum of differentiable functions...
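
    As a rough illustration (not from the source; the data, learning rate and loss are hypothetical), the following NumPy sketch applies stochastic gradient descent to a least-squares objective, updating the parameters from one randomly chosen observation at a time:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 3))                 # simulated features
        true_w = np.array([1.5, -2.0, 0.5])
        y = X @ true_w + 0.1 * rng.normal(size=1000)   # simulated responses

        w = np.zeros(3)      # parameters to be learned
        lr = 0.01            # step size (illustrative)
        for epoch in range(20):
            for i in rng.permutation(len(y)):          # visit observations in random order
                grad = 2 * (X[i] @ w - y[i]) * X[i]    # gradient of the single-term loss (x_i.w - y_i)^2
                w -= lr * grad
        print(w)             # should end up close to true_w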

  • Stochastic grammar
    Stochastic grammar
    A stochastic grammar is a grammar framework with a probabilistic notion of grammaticality; examples include stochastic context-free grammars, statistical parsing, data-oriented parsing, hidden Markov models, and estimation theory...

  • Stochastic kernel estimation
  • Stochastic matrix
    Stochastic matrix
    In mathematics, a stochastic matrix is a matrix used to describe the transitions of a Markov chain. It has found use in probability theory, statistics and linear algebra, as well as computer science...

  • Stochastic modelling (insurance)
  • Stochastic optimization
    Stochastic optimization
    Stochastic optimization methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, which may involve random objective functions or random constraints, for example. Stochastic...

  • Stochastic ordering
    Stochastic ordering
    In probability theory and statistics, a stochastic order quantifies the concept of one random variable being "bigger" than another. These are usually partial orders, so that one random variable A may be neither stochastically greater than, less than nor equal to another random variable B...

  • Stochastic process
    Stochastic process
    In probability theory, a stochastic process, or sometimes random process, is the counterpart to a deterministic process...

  • Stochastic rounding
  • Stochastic simulation
    Stochastic simulation
    Stochastic simulation algorithms and methods were initially developed to analyse chemical reactions involving large numbers of species with complex reaction kinetics. The first algorithm, the Gillespie algorithm was proposed by Dan Gillespie in 1977...

  • Stopped process
    Stopped process
    In mathematics, a stopped process is a stochastic process that is forced to assume the same value after a prescribed time. In the definition, let (Ω, F, P) be a probability space;...

  • Stopping time
  • Stratified sampling
    Stratified sampling
    In statistics, stratified sampling is a method of sampling from a population.In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample each subpopulation independently. Stratification is the process of dividing members of the population into...
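
    A minimal sketch of proportional allocation, in which each stratum is sampled in proportion to its share of the population (the strata and sizes below are purely illustrative):

        import numpy as np

        rng = np.random.default_rng(1)
        strata = {"urban": np.arange(0, 6000), "rural": np.arange(6000, 10000)}   # unit IDs per stratum
        n_total = 500
        pop_size = sum(len(ids) for ids in strata.values())

        sample = {
            name: rng.choice(ids, size=round(n_total * len(ids) / pop_size), replace=False)
            for name, ids in strata.items()
        }
        print({name: len(ids) for name, ids in sample.items()})   # {'urban': 300, 'rural': 200}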

  • Stratonovich integral
    Stratonovich integral
    In stochastic processes, the Stratonovich integral is a stochastic integral, the most common alternative to the Itō integral...

  • Stress majorization
    Stress majorization
    Stress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of n m-dimensional data items, a configuration X of n points in r-dimensional space is sought that minimizes a so-called stress function...

  • Strong Law of Small Numbers
    Strong Law of Small Numbers
    "The Strong Law of Small Numbers" is a humorous paper by mathematician Richard K. Guy and also the so-called law that it proclaims: "There aren't enough small numbers to meet the many demands made of them." In other words, any given small number appears in far more contexts than may seem...

  • Strong prior
    Strong prior
    A strong prior is a preceding assumption, theory, concept or idea upon which a current assumption, theory, concept or idea is founded. In Bayesian statistics, the term is used to contrast the case of a weak or uninformative prior probability...

  • Structural break
    Structural break
    A structural break is a concept in econometrics. A structural break appears when we see an unexpected shift in a time series. This can lead to huge forecasting errors and unreliability of the model in general...

  • Structural equation modeling
    Structural equation modeling
    Structural equation modeling is a statistical technique for testing and estimating causal relations using a combination of statistical data and qualitative causal assumptions...

  • Structural estimation
    Structural estimation
    Structural estimation is a technique for estimating deep "structural" parameters of theoretical economic models. In this sense, "structural estimation" is contrasted with "reduced-form estimation," which generally provides evidence about partial equilibrium relationships in a regression...

  • Structured data analysis (statistics)
    Structured data analysis (statistics)
    Structured data analysis is the statistical data analysis of structured data. This can arise either in the form of an a priori structure such as multiple-choice questionnaires or in situations with the need to search for structure that fits the given data, either exactly or approximately...

  • Studentized range
  • Studentized residual
    Studentized residual
    In statistics, a studentized residual is the quotient resulting from the division of a residual by an estimate of its standard deviation. Typically the standard deviations of residuals in a sample vary greatly from one data point to another even when the errors all have the same standard...

  • Student's t-distribution
  • Student's t-statistic
    Student's t-statistic
    In statistics, the t-statistic is the ratio of the departure of an estimated parameter from its notional value to its standard error. It is used in hypothesis testing, for example in the Student's t-test, in the augmented Dickey–Fuller test, and in bootstrapping...

  • Student's t-test
    Student's t-test
    A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is supported. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known...
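
    As a sketch of the underlying arithmetic (simulated data, equal-variance case assumed), the two-sample t statistic can be computed directly from its definition:

        import numpy as np

        rng = np.random.default_rng(2)
        a = rng.normal(loc=5.0, scale=1.0, size=30)
        b = rng.normal(loc=5.5, scale=1.0, size=30)

        na, nb = len(a), len(b)
        sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)  # pooled variance
        t = (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))
        print(t)   # compare against the t distribution with na + nb - 2 degrees of freedom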

  • Student’s t-test for Gaussian scale mixture distributions — redirects to Location testing for Gaussian scale mixture distributions
    Location testing for Gaussian scale mixture distributions
    In statistics, the topic of location testing for Gaussian scale mixture distributions arises in some particular types of situations where the more standard Student's t-test is inapplicable...

  • Studentization
    Studentization
    In statistics, Studentization, named after William Sealy Gosset, who wrote under the pseudonym Student, is the adjustment consisting of division of a first-degree statistic derived from a sample, by a sample-based estimate of a population standard deviation...

  • Study design
    Study design
    Clinical study design is the formulation of trials and experiments in medical and epidemiological research, sometimes known as clinical trials. Many of the considerations here are shared under the more general topic of design of experiments but there can be others, in particular related to patient...

  • Study heterogeneity
    Study heterogeneity
    In statistics, study heterogeneity is a problem that can arise when attempting to undertake a meta-analysis. Ideally, the studies whose results are being combined in the meta-analysis should all be undertaken in the same way and to the same experimental protocols: study heterogeneity is a term used...

  • Subcontrary mean — redirects to Harmonic mean
    Harmonic mean
    In mathematics, the harmonic mean is one of several kinds of average. Typically, it is appropriate for situations when the average of rates is desired....
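
    For instance, the average speed over two legs of equal distance driven at 60 and 40 is the harmonic mean of the two speeds; a minimal sketch:

        def harmonic_mean(values):
            return len(values) / sum(1.0 / v for v in values)

        print(harmonic_mean([60, 40]))   # 48.0, not the arithmetic mean 50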

  • Subgroup analysis
    Subgroup analysis
    Subgroup analysis, in the context of design and analysis of experiments, refers to looking for patterns in a subset of the subjects....

  • Subindependence
  • Substitution model
    Substitution model
    In biology, a substitution model describes the process by which a sequence of characters changes into another set of traits. For example, in cladistics, each position in the sequence might correspond to a property of a species which can either be present or absent. The alphabet could then consist...

  • SUDAAN
    SUDAAN
    SUDAAN is a statistical software package for the analysis of correlated data, including correlated data encountered in complex sample surveys. SUDAAN originated in 1972....

     – software
  • Sufficiency (statistics) — redirects to Sufficient statistic
  • Sufficient dimension reduction
    Sufficient dimension reduction
    In statistics, sufficient dimension reduction is a paradigm for analyzing data that combines the ideas of dimension reduction with the concept of sufficiency.Dimension reduction has long been a primary goal of regression analysis...

  • Sufficient statistic
  • Sum of normally distributed random variables
    Sum of normally distributed random variables
    In probability theory, calculation of the sum of normally distributed random variables is an instance of the arithmetic of random variables, which can be quite complex based on the probability distributions of the random variables involved and their relationships...

  • Sum of squares — general disambiguation
  • Sum of squares (statistics) — redirects to Partition of sums of squares
  • Summary statistic
  • Superstatistics
    Superstatistics
    Superstatistics is a branch of statistical mechanics or statistical physics devoted to the study of non-linear and non-equilibrium systems. It is characterized by using the superposition of multiple differing statistical models to achieve the desired non-linearity...

  • Support curve
    Support curve
    Support curve is a statistical term, coined by A. W. F. Edwards, to describe the graph of the natural logarithm of the likelihood function. The function being plotted is used in the computation of the score and Fisher information, and the graph has a direct interpretation in the context of maximum...

  • Support vector machine
    Support vector machine
    A support vector machine is a concept in statistics and computer science for a set of related supervised learning methods that analyze data and recognize patterns, used for classification and regression analysis...

  • Surrogate model
    Surrogate model
    Most engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as a function of design variables. For example, in order to find the optimal airfoil shape for an aircraft wing, an engineer simulates the air flow around the wing for...

  • Survey data collection
    Survey data collection
    The methods involved in survey data collection are any of a number of ways in which data can be collected for a statistical survey. These are methods that are used to collect information from a sample of individuals in a systematic way....

  • Survey sampling
    Survey sampling
    In statistics, survey sampling describes the process of selecting a sample of elements from a target population in order to conduct a survey.A survey may refer to many different types or techniques of observation, but in the context of survey sampling it most often involves a questionnaire used to...

  • Survey methodology
    Survey Methodology
    Survey Methodology is a peer-reviewed open access scientific journal that publishes papers related to the development and application of survey techniques...

  • Survival analysis
    Survival analysis
    Survival analysis is a branch of statistics which deals with death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, and duration analysis or duration modeling in economics or sociology...

  • Survival rate
    Survival rate
    In biostatistics, survival rate is a part of survival analysis, indicating the percentage of people in a study or treatment group who are alive for a given period of time after diagnosis...

  • Survival function
    Survival function
    The survival function, also known as a survivor function or reliability function, is a property of any random variable that maps a set of events, usually associated with mortality or failure of some system, onto time. It captures the probability that the system will survive beyond a specified time...

  • Survivorship bias
    Survivorship bias
    Survivorship bias is the logical error of concentrating on the people or things that "survived" some process and inadvertently overlooking those that didn't because of their lack of visibility. This can lead to false conclusions in several different ways...

  • Symmetric design
    Symmetric design
    In combinatorial mathematics, a symmetric design is a block design with equal numbers of points and blocks. Thus, it has the fewest possible blocks given the number of points. They are also known as projective designs....

  • Symmetric mean absolute percentage error
    Symmetric mean absolute percentage error
    Symmetric mean absolute percentage error is an accuracy measure based on percentage errors. It is usually defined as SMAPE = (100/n) Σ |F_t − A_t| / ((|A_t| + |F_t|)/2), where A_t is the actual value and F_t is the forecast value....
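
    A minimal sketch of that definition (the series below is invented for illustration):

        import numpy as np

        def smape(actual, forecast):
            actual = np.asarray(actual, dtype=float)
            forecast = np.asarray(forecast, dtype=float)
            return 100.0 * np.mean(np.abs(forecast - actual) / ((np.abs(actual) + np.abs(forecast)) / 2))

        print(smape([100, 200, 300], [110, 180, 330]))   # about 9.9 (percent)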

  • SYSTAT
    SYSTAT
    SYSTAT is a statistics and statistical graphics software package, developed by Leland Wilkinson in the late 1970s, who was at the time an assistant professor of psychology at the University of Illinois at Chicago...

     – software
  • System dynamics
    System dynamics
    System dynamics is an approach to understanding the behaviour of complex systems over time. It deals with internal feedback loops and time delays that affect the behaviour of the entire system. What makes using system dynamics different from other approaches to studying complex systems is the use...

  • System identification
    System identification
    In control engineering, the field of system identification uses statistical methods to build mathematical models of dynamical systems from measured data...

  • Systematic error
    Systematic error
    Systematic errors are biases in measurement which lead to the situation where the mean of many separate measurements differs significantly from the actual value of the measured attribute. All measurements are prone to systematic errors, often of several different types...

     (also see bias (statistics)
    Bias (statistics)
    A statistic is biased if it is calculated in such a way that it is systematically different from the population parameter of interest. The following lists some types of, or aspects of, bias which should not be considered mutually exclusive:...

     and errors and residuals in statistics
    Errors and residuals in statistics
    In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...

    )
  • Systematic review
    Systematic review
    A systematic review is a literature review focused on a research question that tries to identify, appraise, select and synthesize all high quality research evidence relevant to that question. Systematic reviews of high-quality randomized controlled trials are crucial to evidence-based medicine...


T

  • t-distribution; see Student's t-distribution (includes table)
  • T distribution — disambiguation
  • t-statistic
  • Tag cloud
    Tag cloud
    A tag cloud is a visual representation for text data, typically used to depict keyword metadata on websites, or to visualize free form text. 'Tags' are usually single words, and the importance of each tag is shown with font size or color...

     – graphical display of info
  • Taguchi loss function
    Taguchi loss function
    The Taguchi Loss Function is a graphical depiction of loss developed by the Japanese business statistician Genichi Taguchi to describe a phenomenon affecting the value of products produced by a company. Praised by Dr. W...

  • Taguchi methods
    Taguchi methods
    Taguchi methods are statistical methods developed by Genichi Taguchi to improve the quality of manufactured goods, and more recently also applied to engineering, biotechnology, marketing and advertising...

  • Tajima's D
    Tajima's D
    Tajima's D is a statistical test created by and named after the Japanese researcher Fumio Tajima. The purpose of the test is to distinguish between a DNA sequence evolving randomly and one evolving under a non-random process, including directional selection or balancing selection, demographic...

  • Taleb distribution
    Taleb Distribution
    In economics and finance, a Taleb distribution is a term coined by U.K. economists/journalists Martin Wolf and John Kay to describe a returns profile that appears at times deceptively low-risk with steady returns, but experiences periodically catastrophic drawdowns. It does not describe a...

  • Tampering (quality control)
    Tampering (quality control)
    Tampering in the context of a controlled process is when adjustments to the process are made based on outcomes which are within the expected range of variability. The net result is to re-align the process so that an increased proportion of the output is out of specification. The term was introduced...

  • Taylor expansions for the moments of functions of random variables
    Taylor expansions for the moments of functions of random variables
    In probability theory, it is possible to approximate the moments of a function f of a random variable X using Taylor expansions, provided that f is sufficiently differentiable and that the moments of X are finite...
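
    As a rough numerical check (assuming f(x) = exp(x) and a normal X; none of this is from the source), the first-order approximations E[f(X)] ≈ f(μ) and Var[f(X)] ≈ f′(μ)² σ² can be compared with simulation:

        import numpy as np

        mu, sigma = 0.5, 0.1
        rng = np.random.default_rng(3)
        x = rng.normal(mu, sigma, size=200_000)

        approx_mean = np.exp(mu)                       # f(mu)
        approx_var = np.exp(mu) ** 2 * sigma ** 2      # (f'(mu))^2 * sigma^2, since f' = exp here
        print(approx_mean, np.exp(x).mean())           # both near 1.65 for this small sigma
        print(approx_var, np.exp(x).var())             # both near 0.027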

  • Telegraph process
    Telegraph process
    In probability theory, the telegraph process is a memoryless continuous-time stochastic process that shows two distinct values.If these are called a and b, the process can be described by the following master equations:...

  • Test for structural change
    Test for structural change
    Test for structural change is an econometric test. It is used to verify the equality of coefficients in separate subsamples. See Chow test....

  • Test-retest (disambiguation)
  • Test score
    Test score
    A test score is a piece of information, usually a number, that conveys the performance of an examinee on a test. One formal definition is that it is "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being...

  • Test set
    Test set
    A test set is a set of data used in various areas of information science to assess the strength and utility of a predictive relationship. Test sets are used in artificial intelligence, machine learning, genetic programming, intelligent systems, and statistics...

  • Test statistic
    Test statistic
    In statistical hypothesis testing, a hypothesis test is typically specified in terms of a test statistic, which is a function of the sample; it is considered a numerical summary of a set of data that...

  • Testimator
    Testimator
    A testimator is an estimator whose value depends on the result of a test for statistical significance. In the simplest case the value of the final estimator is that of the basic estimator if the test result is significant, and otherwise the value is zero...

  • Testing hypotheses suggested by the data
    Testing hypotheses suggested by the data
    In statistics, hypotheses suggested by the data, if tested using the data set that suggested them, are likely to be accepted even when they are not true...

  • Text analytics
    Text analytics
    The term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation. The term is roughly synonymous with text mining;...

  • The Long Tail
    The Long Tail
    The Long Tail or long tail refers to the statistical property that a larger share of population rests within the tail of a probability distribution than observed under a 'normal' or Gaussian distribution...

     — possibly seminal magazine article
  • The Unscrambler
    The Unscrambler
    The Unscrambler is a commercial software product for multivariate data analysis, used primarily for calibration in the application of near infrared spectroscopy and development of predictive models for use in real-time spectroscopic analysis of materials. The software was originally developed in...

     — software
  • Theil index
    Theil index
    The Theil index is a statistic used to measure economic inequality. It has also been used to measure the lack of racial diversity. The basic Theil index T_T is the same as redundancy in information theory, which is the maximum possible entropy of the data minus the observed entropy. It is a special...

  • Theil–Sen estimator
    Theil–Sen estimator
    In non-parametric statistics, the Theil–Sen estimator, also known as Sen's slope estimator, slope selection, the single median method, or the Kendall robust line-fit method, is a method for robust linear regression that chooses the median slope among all lines through pairs of two-dimensional...
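
    A brute-force sketch of the median-of-pairwise-slopes idea (the small data set, with one deliberate outlier, is illustrative):

        import numpy as np
        from itertools import combinations

        x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
        y = np.array([2.1, 3.9, 6.2, 8.0, 9.9, 30.0])   # the last point is an outlier

        slopes = [(y[j] - y[i]) / (x[j] - x[i]) for i, j in combinations(range(len(x)), 2)]
        slope = np.median(slopes)                        # median slope over all pairs of points
        intercept = np.median(y - slope * x)
        print(slope, intercept)                          # slope stays near 2 despite the outlier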

  • Theory of conjoint measurement
    Theory of conjoint measurement
    The theory of conjoint measurement is a general, formal theory of continuous quantity. It was independently discovered by the French economist Gerard Debreu and by the American mathematical psychologist R...

  • Therapeutic effect
    Therapeutic effect
    A therapeutic effect is a consequence of a medical treatment of any kind, the results of which are judged to be desirable and beneficial. This is true whether the result was expected, unexpected, or even an unintended consequence of the treatment...

  • Three-point estimation
    Three-point estimation
    The three-point estimation technique is used in management and information systems applications for the construction of an approximate probability distribution representing the outcome of future events, based on very limited information...
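
    A minimal sketch of the common PERT-style weighting of optimistic (a), most likely (m) and pessimistic (b) estimates (the task durations are invented):

        def pert_estimate(a, m, b):
            mean = (a + 4 * m + b) / 6.0   # weighted mean of the three points
            std = (b - a) / 6.0            # conventional spread estimate
            return mean, std

        print(pert_estimate(2.0, 4.0, 12.0))   # (5.0, 1.666...)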

  • Three-stage least squares
  • Threshold model
    Threshold model
    In mathematical or statistical modelling a threshold model is any model where a threshold value, or set of threshold values, is used to distinguish ranges of values where the behaviour predicted by the model differs in some important way...

  • Thurstone scale
    Thurstone scale
    In psychology, the Thurstone scale was the first formal technique for measuring an attitude. It was developed by Louis Leon Thurstone in 1928, as a means of measuring attitudes towards religion. It is made up of statements about a particular issue, and each statement has a numerical value...

  • Time-frequency analysis
    Time-frequency analysis
    In signal processing, time–frequency analysis comprises those techniques that study a signal in both the time and frequency domains simultaneously, using various time–frequency representations...

  • Time–frequency representation
  • Time reversibility
    Time reversibility
    Time reversibility is an attribute of some stochastic processes and some deterministic processes.If a stochastic process is time reversible, then it is not possible to determine, given the states at a number of points in time after running the stochastic process, which state came first and which...

  • Time series
    Time series
    In statistics, signal processing, econometrics and mathematical finance, a time series is a sequence of data points, measured typically at successive times spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones index or the annual flow volume of the...

  • Time-series regression
  • Time use survey
    Time use survey
    A time use survey is a statistical survey which aims to report data on how, on average, people spend their time. The objective is to identify, classify and quantify the main types of activity that people engage in during a definitive time period, e.g...

  • Time-varying covariate
    Time-varying covariate
    A time-varying covariate is a term used in statistics, particularly in survival analyses. It reflects the phenomenon that a covariate is not necessarily constant through the whole study...

  • Timeline of probability and statistics
    Timeline of probability and statistics
    A timeline of probability and statistics. Before 1600: in the 9th century, Al-Kindi was the first to use statistics to decipher encrypted messages, and developed the first code-breaking algorithm, based on frequency analysis, in the House of Wisdom in Baghdad...

  • TinkerPlots
    TinkerPlots
    TinkerPlots is exploratory data analysis software designed for use by students in grades 4-8. It was designed by Clifford Konold and Craig Miller at the University of Massachusetts Amherst and is published by Key Curriculum Press. It has some similarities with Fathom, and runs on Windows XP or...

     — proprietary software for schools
  • Tobit model
    Tobit model
    The Tobit model is a statistical model proposed by James Tobin to describe the relationship between a non-negative dependent variable y_i and an independent variable x_i....

  • Tolerance interval
    Tolerance interval
    A tolerance interval is a statistical interval within which, with some confidence level, a specified proportion of a population falls.A tolerance interval can be seen as a statistical version of a probability interval. If we knew a population's exact parameters, we would be able to compute a range...

  • Top-coded
    Top-coded
    In econometrics and statistics, a top-coded dataset is one in which values above an upper bound are recorded only as exceeding that bound. This is often done to preserve the anonymity of people participating in the survey....

  • Topic model
    Topic model
    In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. An early topic model was probabilistic latent semantic indexing , created by Thomas Hofmann in 1999...

     (statistical natural language processing)
  • Topological data analysis
    Topological data analysis
    Topological data analysis is a new area of study with intended applications in areas such as data mining and computer vision. The main problems include how one infers high-dimensional structure from low-dimensional representations, and...

  • Tornqvist index
    Tornqvist index
    In economics the Törnqvist index is a price or quantity index. Using price and quantity data, a Tornqvist index is a discrete approximation to a continuous Divisia index. A Divisia index is a weighted sum of the growth rates of the various components, where the weights are the component's shares in...

  • Total correlation
    Total correlation
    In probability theory and in particular in information theory, total correlation is one of several generalizations of the mutual information. It is also known as the multivariate constraint or multiinformation...

  • Total least squares
  • Total sum of squares
    Total sum of squares
    In statistical data analysis the total sum of squares is a quantity that appears as part of a standard way of presenting results of such analyses...

  • Total variation distance — a statistical distance measure
  • TPL Tables
    TPL Tables
    TPL Tables is a cross tabulation system used to generate statistical tables for analysis or publication.- Background / History :TPL Tables has its roots in the Table Producing Language system, developed at the Bureau of Labor Statistics in the 1970s and early 1980s to run on IBM mainframes. It...

      – software
  • Tracy–Widom distribution
    Tracy–Widom distribution
    The Tracy–Widom distribution, introduced by Craig Tracy and Harold Widom, is the probability distribution of the largest eigenvalue of a random Hermitian matrix in the edge scaling limit. It also appears in the distribution of the length of the longest increasing subsequence of random permutations and in current fluctuations...

  • Traffic equations
    Traffic equations
    In queueing theory, a discipline within the mathematical theory of probability, traffic equations are equations that describe the mean arrival rate of traffic, allowing the arrival rates at individual nodes to be determined...

  • Training set
    Training set
    A training set is a set of data used in various areas of information science to discover potentially predictive relationships. Training sets are used in artificial intelligence, machine learning, genetic programming, intelligent systems, and statistics...

  • Transect
    Transect
    A transect is a path along which one records and counts occurrences of the phenomena of study. It requires an observer to move along a fixed path and to count occurrences along the path and, at the same time, obtain the distance of the object from the path...

  • Transferable belief model
    Transferable belief model
    The transferable belief model is an elaboration on the Dempster-Shafer theory of evidence.-Context:Consider the following classical problem of information fusion. A patient has an illness that can be caused by three different factors A, B and C...

  • Transiogram
    Transiogram
    A transiogram is the accompanying spatial correlation measure of Markov chain random fields and an important part of Markov chain geostatistics. It is defined as a transition probability function over the distance lag. Simply put, a transiogram is a transition probability diagram. Transiograms...

  • Transmission risks and rates
    Transmission risks and rates
    Transmission of an infection requires three conditions: an infectious individual, a susceptible individual, and an effective contact between them. An effective contact is defined as any kind of contact between two individuals such that, if one individual is infectious and the other susceptible, then the...

  • Treatment group
  • Trend analysis
    Trend analysis
    Trend analysis is the practice of collecting information and attempting to spot a pattern, or trend, in the information. In some fields of study, the term "trend analysis" has more formally defined meanings....

  • Trend estimation
    Trend estimation
    Trend estimation is a statistical technique to aid interpretation of data. When a series of measurements of a process are treated as a time series, trend estimation can be used to make and justify statements about tendencies in the data...

  • Trend stationary
  • Treynor ratio
    Treynor ratio
    The Treynor ratio, named after Jack L. Treynor, is a measurement of the returns earned in excess of those that could have been earned on an investment that has no diversifiable risk, per unit of market risk assumed. The Treynor ratio relates...

  • Triangular distribution
  • Trimean
    Trimean
    In statistics the trimean, or Tukey's trimean, is a measure of a probability distribution's location defined as a weighted average of the distribution's median and its two quartiles, TM = (Q1 + 2Q2 + Q3)/4. This is equivalent to the average of the median and the midhinge...
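
    A minimal sketch of that weighted average, computed from sample quartiles (the data are illustrative; quartile conventions differ between packages):

        import numpy as np

        data = np.array([1, 3, 4, 5, 6, 7, 9, 11, 15], dtype=float)
        q1, q2, q3 = np.percentile(data, [25, 50, 75])
        print((q1 + 2 * q2 + q3) / 4)   # trimean: weighted average of the median and the two quartiles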

  • Trimmed estimator
    Trimmed estimator
    Given an estimator, a trimmed estimator is obtained by excluding some of the extreme values. This is generally done to obtain a more robust statistic: the extreme values are considered outliers....

  • Trispectrum
    Trispectrum
    In mathematics, in the area of statistical analysis, the trispectrum is a statistic used to search for nonlinear interactions. The Fourier transform of the second-order cumulant, i.e., the autocorrelation function, is the traditional power spectrum...

  • True experiment
    True experiment
    A true experiment is a method of social research in which there are two kinds of variables. The independent variable is manipulated by the experimenter, and the dependent variable is measured...

  • True variance
  • Truncated distribution
    Truncated distribution
    In statistics, a truncated distribution is a conditional distribution that results from restricting the domain of some other probability distribution. Truncated distributions arise in practical statistics in cases where the ability to record, or even to know about, occurrences is limited to values...

  • Truncated mean
    Truncated mean
    A truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and median. It involves the calculation of the mean after discarding given parts of a probability distribution or sample at the high and low end, and typically discarding an equal amount of both.For...
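
    A minimal sketch of a 10%-per-tail trimmed mean using NumPy only (the data, with one extreme value, are illustrative):

        import numpy as np

        def trimmed_mean(values, proportion=0.1):
            values = np.sort(np.asarray(values, dtype=float))
            k = int(proportion * len(values))            # number of observations cut from each end
            return values[k:len(values) - k].mean()

        data = [2, 3, 3, 4, 4, 5, 5, 6, 7, 100]          # one extreme value
        print(np.mean(data), trimmed_mean(data, 0.1))    # 13.9 versus 4.625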

  • Truncated normal distribution
    Truncated normal distribution
    In probability and statistics, the truncated normal distribution is the probability distribution of a normally distributed random variable whose value is either bounded below or above. The truncated normal distribution has wide applications in statistics and econometrics...

  • Truncated regression model
    Truncated regression model
    Truncated regression models arise in many applications of statistics, for example in econometrics, in cases where observations with values in the outcome variable below or above certain thresholds are systematically excluded from the sample...

  • Truncation (statistics)
    Truncation (statistics)
    In statistics, truncation results in values that are limited above or below, resulting in a truncated sample. Truncation is similar to but distinct from the concept of statistical censoring. A truncated sample can be thought of as being equivalent to an underlying sample with all values outside the...

  • Tsallis distribution
    Tsallis distribution
    In q-analog theory and statistical mechanics, a Tsallis distribution is a probability distribution derived from the maximization of the Tsallis entropy under appropriate constraints. There are several different families of Tsallis distributions, yet different sources may reference an individual...

  • Tsallis statistics
    Tsallis statistics
    The term Tsallis statistics usually refers to the collection of q-analogs of mathematical functions and associated probability distributions that were originated by Constantino Tsallis. Using these tools, it is possible to derive Tsallis distributions from the optimization of the Tsallis entropic...

  • Tschuprow's T
    Tschuprow's T
    In statistics, Tschuprow's T is a measure of association between two nominal variables, giving a value between 0 and 1. It is closely related to Cramér's V, coinciding with it for square contingency tables....
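
    A minimal sketch computing Tschuprow's T from a contingency table via the chi-squared statistic (the counts are invented):

        import numpy as np

        table = np.array([[30.0, 10.0],
                          [20.0, 40.0]])                  # observed counts
        n = table.sum()
        expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
        chi2 = ((table - expected) ** 2 / expected).sum()
        r, c = table.shape
        T = np.sqrt(chi2 / (n * np.sqrt((r - 1) * (c - 1))))
        print(T)   # for a square table this equals Cramér's V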

  • Tucker decomposition
    Tucker decomposition
    In mathematics, Tucker decomposition decomposes a tensor into a set of matrices and one small core tensor. It is named after Ledyard R. Tucker, although it goes back to Hitchcock in 1927....

  • Tukey's range test — multiple comparisons
  • Tukey's test of additivity
    Tukey's test of additivity
    In statistics, Tukey's test of additivity, named for John Tukey, is an approach used in two-way ANOVA to assess whether the factor variables are additively related to the expected value of the response variable...

     — interaction in two-way ANOVA
  • Tukey–Kramer method
  • Tukey lambda distribution
  • Tweedie distributions
    Tweedie distributions
    In probability and statistics, the Tweedie distributions are a family of probability distributions which include continuous distributions such as the normal and gamma, the purely discrete scaled Poisson distribution, and the class of mixed compound Poisson-Gamma distributions which have positive...

  • Twisting properties
  • Two stage least squares — redirects to Instrumental variable
    Instrumental variable
    In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables is used to estimate causal relationships when controlled experiments are not feasible....

  • Two-tailed test
    Two-tailed test
    The two-tailed test is a statistical test used in inference, in which a given statistical hypothesis, H0, will be rejected when the value of the test statistic is either sufficiently small or sufficiently large...

  • Type I and type II errors
    Type I and type II errors
    In statistical test theory the notion of statistical error is an integral part of hypothesis testing. The test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or...

  • Type-1 Gumbel distribution
  • Type-2 Gumbel distribution
  • Tyranny of averages
    Tyranny of averages
    The tyranny of averages is a phrase used in applied statistics to describe the often overlooked fact that the mean does not provide any information about the distribution of a data set or skewness, and that decisions or analysis based on this value—as opposed to median and standard deviation—may be...


U

  • u-chart
  • U-quadratic distribution
    U-quadratic distribution
    In probability theory and statistics, the U-quadratic distribution is a continuous probability distribution defined by a unique quadratic function with lower limit a and upper limit b.-Parameter relations:...

  • U-statistic
    U-statistic
    In statistical theory, a U-statistic is a class of statistics that is especially important in estimation theory. In elementary statistics, U-statistics arise naturally in producing minimum-variance unbiased estimators...

  • U test
  • Umbrella sampling
    Umbrella sampling
    Umbrella sampling is a technique in computational physics and chemistry, used to improve sampling of a system where ergodicity is hindered by the form of the system's energy landscape. It was first suggested by Torrie and Valleau in 1977...

  • Unbiased estimator—see bias (statistics)
    Bias (statistics)
    A statistic is biased if it is calculated in such a way that it is systematically different from the population parameter of interest. The following lists some types of, or aspects of, bias which should not be considered mutually exclusive:...

  • Unbiased estimation of standard deviation
    Unbiased estimation of standard deviation
    The question of unbiased estimation of a standard deviation arises in statistics mainly as question in statistical theory. Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the...

  • Uncertainty
    Uncertainty
    Uncertainty is a term used in subtly different ways in a number of fields, including physics, philosophy, statistics, economics, finance, insurance, psychology, sociology, engineering, and information science...

  • Uncertainty coefficient
    Uncertainty coefficient
    In statistics, the uncertainty coefficient, also called entropy coefficient or Theil's U, is a measure of nominal association. It was first introduced by Henri Theil and is based on the concept of information entropy. Suppose we have samples of two random variables, i and j...

  • Uncertainty quantification
    Uncertainty quantification
    Uncertainty quantification is the science of quantitative characterization and reduction of uncertainties in applications. It tries to determine how likely certain outcomes are if some aspects of the system are not exactly known...

  • Uncomfortable science
    Uncomfortable science
    Uncomfortable science is the term coined by statistician John Tukey for cases in which there is a need to draw an inference from a limited sample of data, where further samples influenced by the same cause system will not be available...

  • Uncorrelated
    Uncorrelated
    In probability theory and statistics, two real-valued random variables are said to be uncorrelated if their covariance is zero. Uncorrelatedness is by definition pairwise; i.e...

  • Underdispersion — redirects to Overdispersion
    Overdispersion
    In statistics, overdispersion is the presence of greater variability in a data set than would be expected based on a given simple statistical model....

  • Unexplained variation — redirects to Explained variation
    Explained variation
    In statistics, explained variation or explained randomness measures the proportion to which a mathematical model accounts for the variation of a given data set...

  • Underprivileged area score
    Underprivileged area score
    The Underprivileged Area Score is an index to measure socio-economic variation across small geographical areas. The score is an outcome of the need, identified in the Acheson Committee Report, to create an index to identify 'underprivileged areas' where there were high numbers of patients and...

  • Uniform distribution (continuous)
    Uniform distribution (continuous)
    In probability theory and statistics, the continuous uniform distribution or rectangular distribution is a family of probability distributions such that for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by...

  • Uniform distribution (discrete)
  • Uniformly most powerful test
    Uniformly most powerful test
    In statistical hypothesis testing, a uniformly most powerful test is a hypothesis test which has the greatest power 1 − β among all possible tests of a given size α...

  • Unimodal distribution redirects to Unimodal function (has some stats context)
  • Unimodality
    Unimodality
    Unimodality is a term used in several contexts in mathematics. Originally, it relates to possessing a unique mode.- Unimodal probability distribution :...

  • Unistat
    Unistat
    The Unistat computer program is a statistical data analysis tool featuring two modes of operation: The stand-alone user interface is a complete workbench for data input, analysis and visualization while the Microsoft Excel add-in mode extends the features of the mainstream spreadsheet application...

     – software
  • Unit (statistics)
  • Unit of observation
    Unit of observation
    The unit of observation is the unit on which one collects data. For example, a study may have a unit of observation at the individual level but may have the unit of analysis at the neighborhood level, drawing conclusions on neighborhood characteristics from data collected from individuals....

  • Unit root
    Unit root
    In time series models in econometrics, a unit root is a feature of processes that evolve through time that can cause problems in statistical inference if it is not adequately dealt with....

  • Unit root test
    Unit root test
    In statistics, a unit root test tests whether a time series variable is non-stationary using an autoregressive model. A well-known test that is valid in large samples is the augmented Dickey–Fuller test. The optimal finite sample tests for a unit root in autoregressive models were developed by John...

  • Unit-weighted regression
    Unit-weighted regression
    In statistics, unit-weighted regression is perhaps the easiest form of multiple regression analysis, a method in which two or more variables are used to predict the value of an outcome....

  • Unitized risk
  • Univariate
    Univariate
    In mathematics, univariate refers to an expression, equation, function or polynomial of only one variable. Objects of any of these types but involving more than one variable may be called multivariate...

  • Univariate analysis
    Univariate analysis
    Univariate analysis is the simplest form of quantitative analysis. The analysis is carried out with the description of a single variable and its attributes of the applicable unit of analysis...

  • Univariate distribution
    Univariate distribution
    In statistics, a univariate distribution is a probability distribution of only one random variable. This is in contrast to a multivariate distribution, the probability distribution of a random vector.-Further reading:...

  • Unmatched count
    Unmatched count
    In psychology and social research, unmatched count, or item count, is a technique to improve through anonymity the number of true answers to possibly embarrassing or self-incriminating questions. It is very simple to use but yields only the number of people bearing the property of interest.- Method...

  • Unsolved problems in statistics
    Unsolved problems in statistics
    There are many longstanding unsolved problems in mathematics for which a solution has not yet been found. The unsolved problems in statistics are generally of a different flavor; according to John Tukey, "difficulties in identifying problems have delayed statistics far more than difficulties...

  • Upper and lower probabilities
    Upper and lower probabilities
    Upper and lower probabilities are representations of imprecise probability. Whereas probability theory uses a single number, the probability, to describe how likely an event is to occur, this method uses two numbers: the upper probability of the event and the lower probability of the event.Because...

  • Upside potential ratio
    Upside potential ratio
    The upside-potential ratio is a measure of the return of an investment asset relative to the minimal acceptable return. The measurement allows a firm or individual to choose investments which have had relatively good upside performance, per unit of downside risk....

     – finance
  • Urn problem
    Urn problem
    In probability and statistics, an urn problem is an idealized mental exercise in which some objects of real interest are represented as colored balls in an urn or other container....

  • Ursell function
    Ursell function
    In statistical mechanics, an Ursell function, or connected correlation function, is a cumulant of a random variable. It is called a connected correlation function because it can often be obtained by summing over...

  • Utility maximization problem
    Utility maximization problem
    In microeconomics, the utility maximization problem is the problem consumers face: "how should I spend my money in order to maximize my utility?" It is a type of optimal decision problem.-Basic setup:...

  • Utilization
    Utilization
    Utilization is a statistical concept as well as a primary business measure for the rental industry.-Queueing theory:In queueing theory, utilization is the proportion of the system's resources which is used by the traffic which arrives at it. It should be strictly less than one for the system to...

  • Utilization distribution
    Utilization distribution
    A utilization distribution is a probability distribution constructed from data providing the location of an individual in space at different points in time....


V

  • Validity (statistics)
    Validity (statistics)
    In science and statistics, validity has no single agreed definition but generally refers to the extent to which a concept, conclusion or measurement is well-founded and corresponds accurately to the real world. The word "valid" is derived from the Latin validus, meaning strong...

  • Van der Waerden test
    Van der Waerden test
    Named for the Dutch mathematician Bartel Leendert van der Waerden, the Van der Waerden test is a statistical test of the hypothesis that k population distribution functions are equal. The Van der Waerden test converts the ranks from a standard Kruskal-Wallis one-way analysis of variance to quantiles of the standard...

  • Van Houtum distribution
    Van Houtum distribution
    In probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after prof. Geert-Jan van Houtum. It can be characterized by saying that all values of a finite set of possible values are equally probable, except for the smallest and largest element of...

  • Vapnik–Chervonenkis theory
  • Varadhan's lemma
    Varadhan's lemma
    In mathematics, Varadhan's lemma is a result in large deviations theory named after S. R. Srinivasa Varadhan. The result gives information on the asymptotic distribution of a statistic φ of a family of random variables Zε as ε becomes small in terms of a rate function for the variables.-Statement...

  • Variable
    Variable (mathematics)
    In mathematics, a variable is a value that may change within the scope of a given problem or set of operations. In contrast, a constant is a value that remains unchanged, though often unknown or undetermined. The concepts of constants and variables are fundamental to many areas of mathematics and...

  • Variable kernel density estimation
    Variable kernel density estimation
    In statistics, adaptive or "variable-bandwidth" kernel density estimation is a form of kernel density estimation in which the size of the kernels used in the estimate is varied...

  • Variable-order Bayesian network
    Variable-order Bayesian network
    Variable-order Bayesian network models provide an important extension of both the Bayesian network models and the variable-order Markov models...

  • Variable-order Markov model
    Variable-order Markov model
    Variable-order Markov models are an important class of models that extend the well known Markov chain models. In contrast to the Markov chain models, where each random variable in a sequence with a Markov property depends on a fixed number of random variables, in VOM models this number of...

  • Variable rules analysis
  • Variance
    Variance
    In probability theory and statistics, the variance is a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean. In particular, the variance is one of the moments of a distribution...

  • Variance decomposition
    Variance decomposition
    Variance decomposition or forecast error variance decomposition indicates the amount of information each variable contributes to the other variables in a vector autoregression model...

  • Variance gamma process
    Variance gamma process
    In the theory of stochastic processes, a part of the mathematical theory of probability, the variance gamma process, also known as Laplace motion, is a Lévy process determined by a random time change. The process has finite moments, distinguishing it from many Lévy processes. There is no diffusion...

  • Variance inflation factor
    Variance inflation factor
    In statistics, the variance inflation factor quantifies the severity of multicollinearity in an ordinary least squares regression analysis...
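    A minimal sketch of the definition under illustrative assumptions (the helper name vif and the toy data are not from any particular library): regress each predictor on the remaining ones and report 1/(1 − R²).

      import numpy as np

      def vif(X, j):
          # Variance inflation factor of column j of a design matrix X with no constant
          # column: regress X[:, j] on the other columns (plus an intercept) and
          # return 1 / (1 - R^2).
          y = X[:, j]
          Z = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
          beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
          r2 = 1.0 - ((y - Z @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
          return 1.0 / (1.0 - r2)

      rng = np.random.default_rng(0)
      x1 = rng.normal(size=200)
      x2 = x1 + 0.1 * rng.normal(size=200)      # nearly collinear with x1
      x3 = rng.normal(size=200)
      X = np.column_stack([x1, x2, x3])
      print([round(vif(X, j), 1) for j in range(3)])   # large VIFs flag x1 and x2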

  • Variance-gamma distribution
  • Variance reduction
    Variance reduction
    In mathematics, more specifically in the theory of Monte Carlo methods, variance reduction is a procedure used to increase the precision of the estimates that can be obtained for a given number of iterations. Every output random variable from the simulation is associated with a variance which...

  • Variance-stabilizing transformation
    Variance-stabilizing transformation
    In applied statistics, a variance-stabilizing transformation is a data transformation that is specifically chosen either to simplify considerations in graphical exploratory data analysis or to allow the application of simple regression-based or analysis of variance techniques. The aim behind the...
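    As an illustrative sketch (not drawn from the article), the square-root transform applied to Poisson counts: the raw variance grows with the mean, while the variance of √X stays roughly constant, near 1/4 for large means.

      import numpy as np

      rng = np.random.default_rng(0)
      for lam in (5, 20, 100):
          x = rng.poisson(lam, size=100_000)
          # Raw variance tracks the mean; sqrt(x) has variance close to 0.25 throughout.
          print(lam, round(x.var(), 2), round(np.sqrt(x).var(), 3))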

  • Variance-to-mean ratio
  • Variation ratio
  • Variational Bayesian methods
  • Variational message passing
    Variational message passing
    Variational message passing is an approximate inference technique for continuous- or discrete-valued Bayesian networks, with conjugate-exponential parents, developed by John Winn...

  • Variogram
    Variogram
    In spatial statistics the theoretical variogram 2\gamma is a function describing the degree of spatial dependence of a spatial random field or stochastic process Z...
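    A sketch of the classical empirical (Matheron) estimator of the semivariogram γ(h), binned by pair separation; the function name, binning scheme, and simulated field below are illustrative assumptions.

      import numpy as np

      def empirical_semivariogram(coords, values, bins):
          # gamma(h) = 1/(2 N(h)) * sum of (z_i - z_j)^2 over point pairs whose
          # separation falls inside each distance bin.
          d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
          sq = (values[:, None] - values[None, :]) ** 2
          iu = np.triu_indices(len(values), k=1)          # count each pair once
          d, sq = d[iu], sq[iu]
          gamma = np.empty(len(bins) - 1)
          for k in range(len(bins) - 1):
              m = (d >= bins[k]) & (d < bins[k + 1])
              gamma[k] = 0.5 * sq[m].mean() if m.any() else np.nan
          return gamma

      rng = np.random.default_rng(0)
      pts = rng.uniform(0, 10, size=(200, 2))
      z = np.sin(pts[:, 0]) + 0.1 * rng.normal(size=200)   # spatially structured field
      print(empirical_semivariogram(pts, z, bins=np.linspace(0, 5, 6)))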

  • Varimax rotation
    Varimax rotation
    In statistics, a varimax rotation is a change of coordinates used in principal component analysis and factor analysis that maximizes the sum of the variances of the squared loadings...

  • Vasicek model
    Vasicek model
    In finance, the Vasicek model is a mathematical model describing the evolution of interest rates. It is a type of "one-factor model" as it describes interest rate movements as driven by only one source of market risk...

  • VC dimension
    VC dimension
    In statistical learning theory, or sometimes computational learning theory, the VC dimension is a measure of the capacity of a statistical classification algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter...

  • VC theory
  • Vector autoregression
    Vector autoregression
    Vector autoregression is a statistical model used to capture the linear interdependencies among multiple time series. VAR models generalize the univariate autoregression models. All the variables in a VAR are treated symmetrically; each variable has an equation explaining its evolution based on...
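    A hedged example of fitting a small VAR with statsmodels (assuming statsmodels is installed; the two weakly coupled simulated series are purely illustrative):

      import numpy as np
      from statsmodels.tsa.api import VAR

      rng = np.random.default_rng(0)
      y = np.zeros((200, 2))                    # two series stacked as columns
      for t in range(1, 200):
          y[t, 0] = 0.5 * y[t-1, 0] + 0.2 * y[t-1, 1] + rng.normal()
          y[t, 1] = 0.3 * y[t-1, 1] + rng.normal()

      results = VAR(y).fit(maxlags=2, ic="aic")  # lag order chosen by AIC
      print(results.summary())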

  • VEGAS algorithm
    VEGAS algorithm
    The VEGAS algorithm, due to G. P. Lepage, is a method for reducing error in Monte Carlo simulations by using a known or approximate probability distribution function to concentrate the search in those areas of the graph that make the greatest contribution to the final integral. The VEGAS algorithm...

  • Violin plot
    Violin plot
    Violin plots are a method of plotting numeric data. A violin plot is a combination of a box plot and a kernel density plot. Specifically, it starts with a box plot...

  • ViSta - Software — redirects to ViSta, The Visual Statistics system
    ViSta, The Visual Statistics system
    ViSta, the Visual Statistics system, is a freeware statistical system developed by Forrest W. Young of the University of North Carolina. The current version of ViSta is maintained by Pedro M. Valero-Mora of the University of Valencia and can be found at...

  • Voigt profile
  • Volatility (finance)
    Volatility (finance)
    In finance, volatility is a measure of the variation of the price of a financial instrument over time. Historic volatility is derived from time series of past market prices...

  • Volcano plot (statistics)
    Volcano plot (statistics)
    In statistics, a volcano plot is a type of scatter-plot that is used to quickly identify changes in large datasets composed of replicate data. It plots significance versus fold-change on the y- and x-axes, respectively...

  • Von Mises distribution
  • Von Mises–Fisher distribution
  • V-optimal histograms
  • V-statistic
    V-statistic
    V-statistics are a class of statistics named for Richard von Mises who developed their asymptotic distribution theory in a fundamental paper in 1947. V-statistics are closely related to U-statistics introduced by Wassily Hoeffding in 1948...

  • Vuong's closeness test
  • Vysochanskiï–Petunin inequality

W

  • Wald distribution redirects to Inverse Gaussian distribution
    Inverse Gaussian distribution
    Its cumulative distribution function is F(x;\mu,\lambda) = \Phi\left(\sqrt{\lambda/x}\,(x/\mu - 1)\right) + \exp(2\lambda/\mu)\,\Phi\left(-\sqrt{\lambda/x}\,(x/\mu + 1)\right), where \Phi is the standard normal cdf...

  • Wald test
    Wald test
    The Wald test is a parametric statistical test named after Abraham Wald with a great variety of uses. Whenever a relationship within or between data items can be expressed as a statistical model with parameters to be estimated from a sample, the Wald test can be used to test the true value of the...

  • Wald's decision theory
    Wald's decision theory
    Wald's decision theory was explicated in his last book, "Statistical decision functions"...

  • Wald–Wolfowitz runs test
  • Wallenius' noncentral hypergeometric distribution
    Wallenius' noncentral hypergeometric distribution
    In probability theory and statistics, Wallenius' noncentral hypergeometric distribution is a generalization of the hypergeometric distribution where items are sampled with bias....

  • Wang and Landau algorithm
    Wang and Landau algorithm
    The Wang and Landau algorithm proposed by Fugao Wang and David P. Landau is an extension of Metropolis Monte Carlo sampling. It is designed to calculate the density of states of a computer-simulated system, such as an Ising model of spin glasses, or model atoms in a molecular force field...

  • Watterson estimator
    Watterson estimator
    In population genetics, the Watterson estimator is a method for estimating the population mutation rate, \theta = 4N_e\mu, where N_e is the effective population size and \mu is the per-generation mutation rate of the population of interest...
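    A minimal sketch of the estimator itself: \theta_W = K / a_n, where K is the number of segregating (polymorphic) sites observed in a sample of n sequences and a_n = \sum_{i=1}^{n-1} 1/i; the example counts are made up.

      def watterson_theta(n_sequences, n_segregating_sites):
          # theta_W = K / a_n with a_n = sum_{i=1}^{n-1} 1/i
          a_n = sum(1.0 / i for i in range(1, n_sequences))
          return n_segregating_sites / a_n

      print(watterson_theta(n_sequences=10, n_segregating_sites=15))   # about 5.3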

  • Watts and Strogatz model
    Watts and Strogatz model
    The Watts and Strogatz model is a random graph generation model that produces graphs with small-world properties, including short average path lengths and high clustering. It was proposed by Duncan J. Watts and Steven Strogatz in their joint 1998 Nature paper...
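    A short illustration using networkx's built-in generator (assuming networkx is available; the parameter values are arbitrary):

      import networkx as nx

      # n nodes on a ring, each joined to its k nearest neighbours, with each edge
      # rewired to a random target with probability p; this variant retries until
      # the resulting graph is connected.
      G = nx.connected_watts_strogatz_graph(n=1000, k=10, p=0.05, seed=42)
      print(nx.average_shortest_path_length(G))   # short average paths (small-world)
      print(nx.average_clustering(G))             # clustering stays relatively high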

  • Weibull chart — presently redirects to Weibull distribution
  • Weibull distribution
  • Weibull modulus
    Weibull modulus
    The Weibull modulus is a dimensionless parameter of the Weibull distribution which is used to describe variability in measured material strength of brittle materials. For ceramics and other brittle materials, the maximum stress that a sample can be measured to withstand before failure may vary from...

  • Weight function
    Weight function
    A weight function is a mathematical device used when performing a sum, integral, or average in order to give some elements more "weight" or influence on the result than other elements in the same set. They occur frequently in statistics and analysis, and are closely related to the concept of a...

  • Weighted sample redirects to Sample mean and sample covariance
    Sample mean and sample covariance
    The sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables; that is, each of whose elements is the average of the...

  • Weighted covariance matrix redirects to Sample mean and sample covariance
    Sample mean and sample covariance
    The sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables; that is, each of whose elements is the average of the...

  • Weighted mean
    Weighted mean
    The weighted mean is similar to an arithmetic mean, where instead of each of the data points contributing equally to the final average, some data points contribute more than others...
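    A one-line illustration with NumPy, whose np.average accepts a weights argument (the grades and weights are invented):

      import numpy as np

      grades  = np.array([80.0, 90.0, 70.0])
      weights = np.array([0.2, 0.5, 0.3])          # each point's relative influence
      print(np.average(grades, weights=weights))   # 0.2*80 + 0.5*90 + 0.3*70 = 82.0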

  • Welch's t test
    Welch's t test
    In statistics, Welch's t test is an adaptation of Student's t-test intended for use with two samples having possibly unequal variances. As such, it is an approximate solution to the Behrens–Fisher problem....
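    A brief sketch, both via the textbook statistic and via SciPy's equal_var=False option (the sample data is invented):

      import numpy as np
      from scipy import stats

      a = np.array([27.5, 21.0, 19.0, 23.6, 17.0, 17.9, 16.9, 20.1, 21.9, 22.6])
      b = np.array([27.1, 22.0, 20.8, 23.4, 23.4, 23.5, 25.8, 22.0, 24.8, 20.2])

      # Welch statistic: t = (mean_a - mean_b) / sqrt(s_a^2/n_a + s_b^2/n_b)
      se2 = a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b)
      t_manual = (a.mean() - b.mean()) / np.sqrt(se2)

      t_scipy, p = stats.ttest_ind(a, b, equal_var=False)   # Welch's version
      print(round(t_manual, 3), round(t_scipy, 3), round(p, 3))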

  • Welch–Satterthwaite equation
  • Well-behaved statistic
    Well-behaved statistic
    A well-behaved statistic is a term sometimes used in the theory of statistics to describe part of a procedure. This usage is broadly similar to the use of well-behaved in more general mathematics...

  • Wick product
    Wick product
    In probability theory, the Wick product \langle X_1,\dots,X_k \rangle, named after physicist Gian-Carlo Wick, is a sort of product of the random variables X_1, ..., X_k, defined recursively as follows: \langle \rangle = 1...

  • Wilks' lambda distribution
    Wilks' lambda distribution
    In statistics, Wilks' lambda distribution is a probability distribution used in multivariate hypothesis testing, especially with regard to the likelihood-ratio test and multivariate analysis of variance...

  • Winsorized mean
    Winsorized mean
    A Winsorized mean is a Winsorized statistical measure of central tendency, much like the mean and median, and even more similar to the truncated mean...

  • Whipple's index
    Whipple's Index
    Survey or census respondents sometimes inaccurately report ages or dates of birth. Whipple's index, invented by the American demographer George Chandler Whipple, indicates the extent to which age data show systematic heaping on certain ages as a result of digit preference or rounding...

  • White test
    White test
    In statistics, the White test is a statistical test that establishes whether the residual variance of a variable in a regression model is constant; that is, it tests for homoscedasticity....
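    A hedged example using statsmodels' het_white helper, which runs the auxiliary regression of squared residuals on the regressors, their squares, and cross-products (assuming statsmodels is installed; the simulated heteroscedastic data is illustrative):

      import numpy as np
      import statsmodels.api as sm
      from statsmodels.stats.diagnostic import het_white

      rng = np.random.default_rng(0)
      x = rng.uniform(1, 10, size=300)
      y = 2 + 0.5 * x + rng.normal(scale=x)        # error variance grows with x
      X = sm.add_constant(x)
      resid = sm.OLS(y, X).fit().resid

      lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(resid, X)
      print(round(lm_pvalue, 4))                   # small p-value: heteroscedasticity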

  • White noise
    White noise
    White noise is a random signal with a flat power spectral density. In other words, the signal contains equal power within a fixed bandwidth at any center frequency...

  • Wide and narrow data
    Wide and narrow data
    Wide and narrow are terms used to describe two different presentations for tabular data. Wide, or unstacked, data is presented with each different data variable in a separate column. Narrow...

  • Wiener deconvolution
    Wiener deconvolution
    In mathematics, Wiener deconvolution is an application of the Wiener filter to the noise problems inherent in deconvolution. It works in the frequency domain, attempting to minimize the impact of deconvoluted noise at frequencies which have a poor signal-to-noise ratio.The Wiener deconvolution...

  • Wiener filter
    Wiener filter
    In signal processing, the Wiener filter is a filter proposed by Norbert Wiener during the 1940s and published in 1949. Its purpose is to reduce the amount of noise present in a signal by comparison with an estimation of the desired noiseless signal. The discrete-time equivalent of Wiener's work was...

  • Wiener process
    Wiener process
    In mathematics, the Wiener process is a continuous-time stochastic process named in honor of Norbert Wiener. It is often called standard Brownian motion, after Robert Brown...
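    A sketch of simulating a standard Wiener process on a time grid: independent Gaussian increments with variance equal to the time step, accumulated by a cumulative sum (grid size and seed are arbitrary choices).

      import numpy as np

      rng = np.random.default_rng(0)
      T, n = 1.0, 1000
      dt = T / n
      increments = rng.normal(0.0, np.sqrt(dt), size=n)    # W(t+dt) - W(t) ~ N(0, dt)
      W = np.concatenate([[0.0], np.cumsum(increments)])   # W(0) = 0
      t = np.linspace(0.0, T, n + 1)                       # time grid, e.g. for plotting
      print(W[-1])   # one draw of W(1); across many simulations W(1) ~ N(0, 1)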

  • Wigner quasi-probability distribution
    Wigner quasi-probability distribution
    The Wigner quasi-probability distribution is a quasi-probability distribution. It was introduced by Eugene Wigner in 1932 to study quantum corrections to classical statistical mechanics...

  • Wigner semicircle distribution
  • Wike's law of low odd primes
    Wike's law of low odd primes
    Wike's law of low odd primes is a methodological principle to help design sound experiments in psychology. It is: "If the number of experimental treatments is a low odd prime number, then the experimental design is unbalanced and partially confounded." Wike's law of low odd primes is a...

  • Wilcoxon signed-rank test
    Wilcoxon signed-rank test
    The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used when comparing two related samples or repeated measurements on a single sample to assess whether their population mean ranks differ....

  • Will Rogers phenomenon
    Will Rogers phenomenon
    The Will Rogers phenomenon is obtained when moving an element from one set to another set raises the average values of both sets. It is based on the following quote, attributed to comedian Will Rogers:...
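    A tiny numerical illustration (numbers invented): moving the value 5, which lies between the two group means, raises the mean of both groups.

      group_a = [5, 6, 7, 8, 9]     # mean 7.0
      group_b = [1, 2, 3, 4]        # mean 2.5
      group_a.remove(5)             # 5 is below A's mean but above B's mean
      group_b.append(5)
      print(sum(group_a) / len(group_a))   # 7.5 -- went up
      print(sum(group_b) / len(group_b))   # 3.0 -- also went up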

  • WinBUGS
    WinBUGS
    WinBUGS is statistical software for Bayesian analysis using Markov chain Monte Carlo methods. It is based on the BUGS project started in 1989...

     – software
  • Window function
    Window function
    In signal processing, a window function is a mathematical function that is zero-valued outside of some chosen interval. For instance, a function that is constant inside the interval and zero elsewhere is called a rectangular window, which describes the shape of its graphical representation...

  • Winpepi
    Winpepi
    WinPepi is a freeware package of statistical programs for epidemiologists, comprising seven programs with over 120 modules. WinPepi is not a complete compendium of statistical routines for epidemiologists but it provides a very wide range of procedures, including those most commonly used and many...

     – software
  • Winsorising
    Winsorising
    Winsorising or Winsorization is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. It is named after the engineer-turned-biostatistician Charles P. Winsor...
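    A brief example with scipy.stats.mstats.winsorize (assuming SciPy is available), clipping the lowest and highest 10% of values to the nearest retained values rather than discarding them:

      import numpy as np
      from scipy.stats.mstats import winsorize

      x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 250])    # 250 is a suspect outlier
      w = winsorize(x, limits=[0.1, 0.1])                # replace bottom/top 10%
      print(np.asarray(w))          # [2 2 3 4 5 6 7 8 9 9]
      print(x.mean(), w.mean())     # the Winsorized mean is far less outlier-driven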

  • Wishart distribution
  • Wold's theorem
  • Wombling
    Wombling
    In statistics, wombling is any of a number of techniques used for identifying zones of rapid change, typically in some quantity as it varies across some geographical or Euclidean space. It is named for statistician William H. Womble....

  • World Programming System
    World Programming System
    The World Programming System, also known as WPS, is a software product developed by a company called World Programming. WPS allows users to create, edit and run programs written in the language of SAS. The latest release of WPS closes a significant gap in its coverage: it now provides PROC REG and...

     – software
  • Wrapped Cauchy distribution
  • Wrapped distribution
    Wrapped distribution
    In probability theory and directional statistics, a wrapped probability distribution is a continuous probability distribution that describes data points that lie on a unit n-sphere...

  • Wrapped exponential distribution
  • Wrapped normal distribution
    Wrapped normal distribution
    In probability theory and directional statistics, a wrapped normal distribution is a wrapped probability distribution which results from the "wrapping" of the normal distribution around the unit circle. It finds application in the theory of Brownian motion and is a solution to the heat equation for...

  • Wrapped Lévy distribution
    Wrapped Lévy distribution
    In probability theory and directional statistics, a wrapped Lévy distribution is a wrapped probability distribution that results from the "wrapping" of the Lévy distribution around the unit circle. The pdf of the wrapped Lévy distribution is...

  • Writer invariant
    Writer invariant
    Writer invariant, also called authorial invariant or author's invariant, is a property of a text which is invariant of its author; that is, it will be similar in all texts of a given author and different in texts of different authors. It can be used to detect plagiarism or to discover the real author of...


X

  • X-12-ARIMA
    X-12-ARIMA
    X-12-ARIMA is the U.S. Census Bureau's software package for seasonal adjustment. It can be used together with gretl, which provides a graphical user interface for X-12-ARIMA. X-12-ARIMA is the successor to X-11-ARIMA...

  • x̄ chart
    X-bar chart
    In industrial statistics, the X-bar chart is a type of Shewhart control chart that is used to monitor the arithmetic means of successive samples of constant size, n. This type of control chart is used for characteristics that can be measured on a continuous scale, such as weight, temperature,...
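    A hedged sketch of the chart's center line and 3-sigma limits for subgroup means; the pooled within-subgroup sigma estimate used here is a simplification of the conventional constants-based (A2·R̄ or A3·s̄) limits, and the data is simulated.

      import numpy as np

      def xbar_chart_limits(samples, sigma=None):
          # samples: array of shape (m, n) -- m subgroups of constant size n.
          m, n = samples.shape
          center = samples.mean(axis=1).mean()                       # grand mean
          if sigma is None:
              sigma = np.sqrt(samples.var(axis=1, ddof=1).mean())    # pooled estimate
          half_width = 3 * sigma / np.sqrt(n)                        # 3-sigma limits
          return center - half_width, center, center + half_width

      rng = np.random.default_rng(0)
      data = rng.normal(loc=10.0, scale=0.5, size=(25, 5))   # 25 subgroups of size 5
      print(xbar_chart_limits(data))   # (LCL, center line, UCL)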

  • x̄ and R chart
  • x̄ and s chart
  • XLispStat
    XLispStat
    XLispStat is an open-source statistical scientific package based on the XLISP language. From the XLispStat startup message: XLISP-PLUS version 3.04, Portions Copyright 1988 by David Betz. Modified by Thomas Almy and others....

     – software
  • XLSTAT
    XLSTAT
    XLSTAT is a commercial statistical and multivariate analysis software. The software has been developed by Addinsoft and was introduced by Thierry Fahmy, the founder of Addinsoft, in 1993. It is a Microsoft Excel add-in...

     – software
  • XploRe
    XploRe
    XploRe is the name of a commercial statistics software, developed by the German software company MD*Tech. XploRe is not sold anymore. The last version, 4.8, is available for download at no cost. The user interacts with the software via the XploRe programming language, which is derived from the C...

     – software

Y

  • Yamartino method
    Yamartino method
    The Yamartino method is an algorithm for calculating an approximation to the standard deviation σθ of wind direction θ during a single pass through the incoming data...
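    A sketch of the single-pass computation as usually described (variable names are mine): average sin θ and cos θ over the pass, set ε = sqrt(1 − (s̄² + c̄²)), and take the Yamartino approximation σθ ≈ arcsin(ε)[1 + (2/√3 − 1)ε³].

      import math

      def yamartino_sigma_theta(angles_deg):
          n = len(angles_deg)
          s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
          c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
          eps = math.sqrt(max(0.0, 1.0 - (s * s + c * c)))
          sigma = math.asin(eps) * (1.0 + (2.0 / math.sqrt(3.0) - 1.0) * eps ** 3)
          return math.degrees(sigma)

      print(yamartino_sigma_theta([355, 5, 15, 350, 10]))   # spread around north, ~9 deg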

  • Yates analysis
    Yates Analysis
    Full- and fractional-factorial designs are common in designed experiments for engineering and scientific applications. In these designs, each factor is assigned two levels. These are typically called the low and high levels. For computational purposes, the factors are scaled so that the low level...

  • Yates's correction for continuity
  • Youden's J statistic
    Youden's J statistic
    Youden's J statistic is a single statistic that captures the performance of a diagnostic test. The use of such a single index is "not generally to be recommended". It is equal to the risk difference for a dichotomous test ....
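    As a quick illustration, J = sensitivity + specificity − 1, computed here from a made-up 2×2 confusion table:

      def youdens_j(tp, fn, tn, fp):
          sensitivity = tp / (tp + fn)      # true positive rate
          specificity = tn / (tn + fp)      # true negative rate
          return sensitivity + specificity - 1

      print(youdens_j(tp=90, fn=10, tn=80, fp=20))   # 0.9 + 0.8 - 1, roughly 0.7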

  • Yule–Simon distribution
  • Yxilon
    Yxilon
    Yxilon is a modular open-source statistical programming language, developed by Sigbert Klinke, Uwe Ziegenhagen, and Yuval Guri. It is a re-implementation of the XploRe language, with the intention of providing better performance by using compiled code instead of a language interpreter...

     – statistical programming language

Z

  • z-score
  • z-factor
    Z-factor
    The Z-factor is a measure of statistical effect size. It has been proposed for use in high-throughput screening to judge whether the response in a particular assay is large enough to warrant further attention....

  • z statistic
    Z statistic
    In statistics, the Vuong closeness test is a likelihood-ratio-based test for model selection using the Kullback–Leibler information criterion. This statistic makes probabilistic statements about two models. It tests the null hypothesis that the two models are equally close to the actual model against the...

  • Z-test
    Z-test
    A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Due to the central limit theorem, many test statistics are approximately normally distributed for large samples...
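    A one-sample sketch under a known population standard deviation (the numbers are invented): z = (x̄ − μ0)/(σ/√n), with a two-sided p-value from the standard normal CDF.

      import math

      def one_sample_z_test(xbar, mu0, sigma, n):
          z = (xbar - mu0) / (sigma / math.sqrt(n))
          # two-sided p-value via the standard normal CDF, Phi(z) = (1 + erf(z/sqrt(2)))/2
          p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
          return z, p

      print(one_sample_z_test(xbar=102.0, mu0=100.0, sigma=10.0, n=50))   # z ~ 1.41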

  • Zakai equation
    Zakai equation
    In filtering theory the Zakai equation is a linear recursive filtering equation for the un-normalized density of a hidden state. In contrast, the Kushner equation gives a non-linear recursive equation for the normalized density of the hidden state...

  • Zelen's design
  • Zero-one law (disambiguation)
  • Zeta distribution
  • Ziggurat algorithm
    Ziggurat algorithm
    The ziggurat algorithm is an algorithm for pseudo-random number sampling. Belonging to the class of rejection sampling algorithms, it relies on an underlying source of uniformly-distributed random numbers, typically from a pseudo-random number generator, as well as precomputed tables. The...

  • Zipf–Mandelbrot law — a discrete distribution
  • Zipf's law

See also

Supplementary lists
These lists include items which are related to statistics but are not included in this index:

Topic lists

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 