List of statistical topics
0–9
- 1.96 – the approximate value of the 97.5 percentile point of the standard normal distribution; 95% of the area under a normal curve lies within roughly 1.96 standard deviations of the mean, and via the central limit theorem the number is used in constructing approximate 95% confidence intervals (see the example after this list).
- 2SLS (two-stage least squares) – redirects to Instrumental variable; a method used in statistics, econometrics, epidemiology and related disciplines to estimate causal relationships when controlled experiments are not feasible.
- 3SLS — redirects to Three-stage least squares
- 68-95-99.7 rule – also the three-sigma rule or empirical rule; for a normal distribution, nearly all values lie within 3 standard deviations of the mean.
- 100-year flood – the level of flood water expected to be equaled or exceeded once every 100 years on average; more accurately described as the 1% annual exceedance probability flood.
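The 1.96 and 68-95-99.7 entries above both reduce to quantiles of the standard normal distribution. A minimal sketch, assuming SciPy is available, that reproduces both sets of figures:

```python
# Minimal sketch (assumes scipy is installed): check the 1.96 quantile
# and the 68-95-99.7 coverage figures for the standard normal distribution.
from scipy.stats import norm

# 97.5th percentile of the standard normal -- approximately 1.96
print(norm.ppf(0.975))            # 1.959963...

# Probability mass within k standard deviations of the mean
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} sigma: {coverage:.4f}")   # 0.6827, 0.9545, 0.9973
```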
A
- A posteriori probability (disambiguation)
- A priori probability – a probability derived purely by deductive reasoning, as distinguished from probabilities obtained in other ways.
- A priori (statistics) – prior knowledge about a population, rather than knowledge estimated from recent observation; in Bayesian inference it is common to condition on such knowledge, and integrating it is the central difference between the Bayesian and frequentist approaches.
- Abductive reasoning – a kind of logical inference described by Charles Sanders Peirce as "guessing": the process of arriving at an explanatory hypothesis for a surprising observation.
- Absolute deviation – the absolute difference between an element of a data set and a given point, D_i = |x_i − m|; the point m is typically a measure of central tendency, most often the median or sometimes the mean.
- Absolute risk reduction – in epidemiology, the decrease in risk of a given activity or treatment relative to a control; it is the inverse of the number needed to treat.
- ABX test – a method of comparing two kinds of sensory stimuli to identify detectable differences: a subject is presented with two known samples, A and B, and one unknown sample X randomly selected from A and B, and must identify X as either A or B.
- Accelerated failure time model – in survival analysis, a parametric model that provides an alternative to the commonly used proportional hazards models.
- Acceptable quality limit – the worst tolerable process average, as a percentage or ratio, that is still considered acceptable.
- Acceptance sampling – the use of statistical sampling to decide whether to accept or reject a production lot of material; a common quality-control technique in industry and military procurement, usually applied as products leave the factory.
- Accidental sampling – a type of nonprobability sampling in which the sample is drawn from the part of the population that is close at hand, i.e. readily available and convenient.
- Accuracy and precision – accuracy is the closeness of measurements of a quantity to that quantity's actual value; precision, also called reproducibility or repeatability, is the closeness of repeated measurements to each other.
- Accuracy paradox – the observation in predictive analytics that models with a given level of accuracy may have greater predictive power than models with higher accuracy.
- Acquiescence bias – a category of response bias in which survey respondents tend to agree with all questions or to indicate a positive connotation; sometimes called "yea-saying", the tendency to agree with a statement when in doubt.
- Actuarial science – the discipline that applies mathematical and statistical methods to assess risk in the insurance and finance industries.
- ADAPA – software; a predictive decisioning platform that combines predictive analytics and business rules to manage and design automated decisioning systems.
- Adapted process – in the study of stochastic processes, a process that cannot "see into the future": X is adapted if, for every realisation and every n, X_n is known at time n.
- Adaptive estimator – an estimator in a parametric or semiparametric model with nuisance parameters whose efficiency is unaffected by the presence of those nuisance parameters.
- Additive Markov chain – a Markov chain with an additive conditional probability function: a discrete-time chain of order m whose transition probability to the next state is a sum of functions, each depending on the next state and one past state.
- Additive model – a nonparametric regression method, suggested by Jerome H. Friedman and Werner Stuetzle and an essential part of the ACE algorithm, which uses one-dimensional smoothers to build a restricted class of nonparametric regression models.
- Additive smoothing – also called Laplace smoothing or Lidstone smoothing; a technique used to smooth categorical data.
- Additive white Gaussian noise – a channel model in which the only impairment to communication is the linear addition of wideband or white noise with constant spectral density and Gaussian amplitude distribution; it does not account for fading, frequency selectivity or interference.
- Adjusted Rand index – redirects to Rand index (subsection); the Rand index is a measure of the similarity between two data clusterings.
- ADMB – software; AD Model Builder, a free and open-source software suite for non-linear statistical modeling, created by David Fournier and now developed by the ADMB Project, a creation of the non-profit ADMB Foundation.
- Admissible decision rule – in statistical decision theory, a decision rule such that no other rule is always better than it, in a specific sense.
- Age adjustment – also called age standardisation; in epidemiology and demography, a technique that allows populations with quite different age profiles to be compared.
- Age-standardized mortality rate
- Age stratification – in critical sociology, the hierarchical ranking of people into age groups within a society; stratification based on an ascribed status is a major source of inequality and may lead to ageism.
- Aggregate data – data combined from several measurements; in economics, high-level data composed of a multitude or combination of other, more individual data.
- Aggregate pattern – a concept in both statistics and computer programming in which a large case is considered as composed of smaller, simpler pieces.
- Akaike information criterion – a measure of the relative goodness of fit of a statistical model, developed by Hirotsugu Akaike under the name "an information criterion" and first published in 1974.
- Algebra of random variables – an algebraic axiomatization of probability theory in which the primary concept is the random variable rather than the probability of an event; distributions are determined by assigning an expectation to each random variable.
- Algebraic statistics – the use of algebra to advance statistics, notably in experimental design, parameter estimation and hypothesis testing.
- Algorithmic inference – new developments in statistical inference methods made feasible by the powerful computing devices widely available to data analysts.
- Algorithms for calculating variance – a key problem in statistical computing: formulas for the variance involving sums of squares can lead to numerical instability and arithmetic overflow (see the sketch after this list).
- All-pairs testing – also pairwise testing; a combinatorial software testing method that, for each pair of input parameters to a system, tests all possible discrete combinations of those parameters.
- Allan variance – also known as two-sample variance; a measure of frequency stability in clocks, oscillators and amplifiers, named after David W. Allan and expressed mathematically as σ_y².
- Alignments of random points – apparent alignments that statistics shows can be found when a large number of random points are marked on a bounded flat surface; this can be used to argue that ley lines exist due to chance alone.
- Almost surely – in probability theory, an event happens almost surely if it happens with probability one; analogous to "almost everywhere" in measure theory.
- Alpha beta filter – a simplified form of observer for estimation, data smoothing and control applications, closely related to Kalman filters and to linear state observers used in control theory.
- Alternative hypothesis
- Analyse-it – software; a statistical analysis add-in for Microsoft Excel, the successor to Astute, which was developed in 1992 for Excel 4 as the first statistical analysis add-in for Excel.
- Analysis of categorical data – statistical procedures for the analysis of data on the nominal scale (categorical variables), such as the categorical distribution, stratified analysis and the chi-squared test.
- Analysis of covariance
- Analysis of molecular variance – a statistical model for the molecular variation within a single species, typically biological; the name and model are inspired by ANOVA, and the method was developed by Laurent Excoffier, Peter Smouse and Joseph Quattro at Rutgers University in 1992.
- Analysis of rhythmic variance – a method for detecting cyclic variations in biological time series and quantifying their probability, published by Peter Celec.
- Analysis of variance – a collection of statistical models and associated procedures in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation.
- Analytic and enumerative statistical studies – two types of scientific study, each ultimately aiming to provide a rational basis for action, which differ by where the action is taken.
- Ancestral graph – a graph with three types of edge (directed, bidirected and undirected) that can be decomposed into an undirected subgraph, a directed subgraph, and directed edges pointing from the undirected subgraph to the directed subgraph.
- Anchor test – in psychometrics, a common set of test items administered in combination with two or more alternative forms of a test, with the aim of establishing the equivalence of scores on the alternative forms.
- Ancillary statistic – a statistic whose sampling distribution does not depend on which of the candidate probability distributions is the distribution of the population from which the data were taken.
- ANCOVA – redirects to Analysis of covariance; a general linear model with a continuous outcome variable and two or more predictors, at least one continuous and at least one categorical, merging ANOVA and regression.
- Anderson–Darling test
- ANOVA
- ANOVA on ranks – analysis of variance applied to rank-transformed data, used when the F statistic's assumptions of independence of observations, homogeneous variances and population normality are in doubt.
- ANOVA-simultaneous component analysis – ASCA or ANOVA-SCA; a method that partitions variation and interprets the partitions by simultaneous component analysis, a method similar to PCA; a multivariate or even megavariate extension of ANOVA.
- Anomaly detection – also referred to as outlier detection; detecting patterns in a given data set that do not conform to an established normal behaviour.
- Anomaly time series – in atmospheric sciences and some other applications, the time series of deviations of a quantity from some mean; a standardized anomaly series contains deviations divided by a standard deviation.
- Anscombe transform – a variance-stabilizing transformation, named after Francis Anscombe, that transforms a Poisson-distributed random variable into one with an approximately standard Gaussian distribution; widely used in photon-limited imaging.
- Anscombe's quartet – four datasets, constructed in 1973 by the statistician F. J. Anscombe, that have identical simple statistical properties yet appear very different when graphed; each consists of eleven points.
- Antecedent variable – a variable that can help to explain the apparent relationship between other variables that are nominally in a cause-and-effect relationship.
- Antithetic variates – a variance reduction technique used in Monte Carlo methods; because the error in the simulated signal has square-root convergence, a very large number of sample paths would otherwise be required for an accurate result.
- Approximate Bayesian computation – a family of computational techniques in Bayesian statistics that operate on summary data to make broad inferences with less computation than a full analysis of all available data would require.
- Arcsine distribution
- Area chart – a chart, based on the line chart, that displays quantitative data graphically; the area between axis and line is commonly emphasized with colours, textures and hatchings.
- Area compatibility factor
- ARGUS distribution
- Arithmetic mean – often referred to simply as the mean or average when the context is clear; a measure of the central tendency of a sample.
- Armitage–Doll multistage model of carcinogenesis
- Arrival theorem
- Artificial neural network – usually called a neural network; a mathematical or computational model inspired by the structure and/or functional aspects of biological neural networks, consisting of an interconnected group of artificial neurons that processes information.
- Ascertainment bias
- ASReml – software; a statistical package for fitting linear mixed models using restricted maximum likelihood, a technique commonly used in plant and animal breeding and quantitative genetics as well as other fields.
- Association (statistics) – any relationship between two measured quantities that renders them statistically dependent; the narrower term "correlation" refers to a linear relationship between two quantities.
- Association mapping – also known as linkage disequilibrium mapping; a method of mapping quantitative trait loci that takes advantage of historic linkage disequilibrium to link phenotypes to genotypes.
- Association scheme – a structure that arose in statistics, in the theory of experimental design for the analysis of variance; in mathematics it belongs to both algebra and combinatorics, and in algebraic combinatorics it provides a unified approach to many topics.
- Assumed mean – a method for calculating the arithmetic mean and standard deviation of a data set that simplifies accurate calculation by hand; chiefly of historical interest, but usable for quick estimates.
- Asymptotic distribution – a hypothetical distribution that is, in a sense, the limiting distribution of a sequence of distributions.
- Asymptotic equipartition property (information theory) – a general property of the output samples of a stochastic source, fundamental to the concept of the typical set used in theories of compression.
- Asymptotic normality – redirects to Asymptotic distribution
- Asymptotic relative efficiency – redirects to Efficiency (statistics); an efficient estimator estimates the quantity of interest in some "best possible" manner, relative to a chosen loss function quantifying the undesirability of estimation errors.
- Asymptotic theory (statistics) – also called large sample theory; a generic framework for assessing properties of estimators and statistical tests.
- Atkinson index – a measure of income inequality developed by British economist Anthony Barnes Atkinson.
- Attack rate – in epidemiology, the cumulative incidence of infection in a group of people observed over a period of time during an epidemic, usually in relation to foodborne illness.
- Augmented Dickey–Fuller test
- Aumann's agreement theorem – states that two people acting rationally and with common knowledge of each other's beliefs cannot agree to disagree.
- Autocorrelation – the cross-correlation of a signal with itself; informally, the similarity between observations as a function of the time separation between them.
- Autocorrelation plot – redirects to Correlogram; an image of correlation statistics, for example a plot of the sample autocorrelations r_h versus the lag h in time series analysis.
- Autocovariance – for a real stochastic process X, the covariance of the variable with a time-shifted version of itself.
- Autoregressive conditional duration – in financial econometrics, a model for irregularly spaced and autocorrelated intertrade durations, analogous to GARCH.
- Autoregressive conditional heteroskedasticity – econometric models used to characterize observed time series whenever there is reason to believe that, at any point in the series, the terms will have a characteristic size or variance.
- Autoregressive fractionally integrated moving average – time series models that generalize ARIMA models by allowing non-integer values of the differencing parameter; useful in modeling time series with long memory.
- Autoregressive integrated moving average – a generalization of the autoregressive moving average model, fitted to time series data either to better understand the data or to predict future points.
- Autoregressive model – a type of random process often used in statistics and signal processing to model and predict various natural phenomena.
- Autoregressive moving average model – sometimes called a Box–Jenkins model after the iterative methodology usually used to estimate it; typically applied to autocorrelated time series data.
- Auxiliary particle filter – a particle filtering algorithm introduced by Pitt and Shephard in 1999 to improve on the sequential importance resampling algorithm when dealing with tailed observation densities.
- Average – a measure of the "middle" of a data set and one form of central tendency; not all central tendencies should be considered definitions of average.
- Average treatment effect
- Averaged one-dependence estimators
- Azuma's inequality
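The entry on algorithms for calculating variance above notes that naive sum-of-squares formulas can be numerically unstable. A minimal sketch of one standard remedy, Welford's online algorithm, in plain Python with illustrative data values only:

```python
# Minimal sketch of Welford's online algorithm, a numerically stable way to
# compute a running mean and sample variance in a single pass over the data.
def running_variance(xs):
    n = 0
    mean = 0.0
    m2 = 0.0          # sum of squared deviations from the current mean
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, (m2 / (n - 1) if n > 1 else float("nan"))

# Large offsets like these defeat the naive sum-of-squares formula in low
# precision, but the incremental update stays stable.
print(running_variance([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]))  # mean 1e9+10, variance 30.0
```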
B
- BA model – model for a random network; the Barabási–Albert algorithm generates random scale-free networks using a preferential attachment mechanism, and such networks are widely observed in natural and man-made systems, including the Internet, the World Wide Web, citation networks and some social networks.
- Backfitting algorithm – a simple iterative procedure used to fit a generalized additive model, introduced in 1985 by Leo Breiman and Jerome Friedman along with generalized additive models.
- Balance equation – in probability theory, an equation describing the probability flux associated with a Markov chain into and out of a state or set of states.
- Balanced incomplete block design – redirects to Block design
- Balanced repeated replication – a technique for estimating the sampling variability of a statistic obtained by stratified sampling; it begins by selecting balanced half-samples from the full sample.
- Balding–Nichols model
- Banburismus – related to Bayesian networks; a cryptanalytic process developed by Alan Turing at Bletchley Park during the Second World War, used by Hut 8 to help break German Kriegsmarine messages enciphered on Enigma machines by means of sequential conditional probability.
- Bapat–Beg theorem
- Bar chart – a chart with rectangular bars, plotted vertically or horizontally, whose lengths are proportional to the values they represent.
- Barabási–Albert model
- Barber–Johnson diagram
- Barnard's test – an exact test of the null hypothesis of independence of rows and columns in a contingency table; an alternative to Fisher's exact test, but more time-consuming to compute.
- Barnardisation – a method of disclosure control for tables of counts that involves randomly adding or subtracting 1 from some cells in the table.
- Barnes interpolation – named after Stanley L. Barnes; the interpolation of unstructured data points from a set of measurements of an unknown function in two dimensions into an analytic function of two variables.
- Bartlett's method
- Bartlett's test – a test of whether k samples are from populations with equal variances (homoscedasticity), an assumption made by some statistical tests such as the analysis of variance.
- Base rate – the class probabilities unconditioned on featural evidence, frequently also known as prior probabilities.
- Baseball statistics – statistics summarizing baseball performance and evaluating players; the sport's natural breaks and largely individual play lend themselves to easy record-keeping and statistics.
- Basu's theorem – the 1955 result of Debabrata Basu that any boundedly complete sufficient statistic is independent of any ancillary statistic.
- Bates distribution – the probability distribution of the mean of a number of statistically independent, uniformly distributed random variables on the unit interval.
- Baum–Welch algorithm
- Bayes' rule – relates the odds of event A1 to event A2 before and after conditioning on event B, with the relationship expressed in terms of the Bayes factor Λ; it is derived from and closely related to Bayes' theorem.
- Bayes' theorem – relates the conditional probabilities P(A|B) and P(B|A); commonly used in science and engineering, and named for Thomas Bayes.
- Evidence under Bayes theorem – the use of Bayes' theorem to update the probability of an event in the light of new information in the evidence-law context, for example updating the probability that a genetic sample found at a crime scene came from the defendant.
- Bayes estimator – an estimator or decision rule that minimizes the posterior expected value of a loss function, or equivalently maximizes the posterior expectation of a utility function.
- Bayes factor – a Bayesian alternative to classical hypothesis testing; Bayesian model comparison is a method of model selection based on Bayes factors.
- Bayes linear statistics
- Bayesian — disambiguation
- Bayesian additive regression kernels – a non-parametric statistical model for regression and statistical classification in which the unknown mean function is represented as a weighted sum of kernel functions constructed by a prior.
- Bayesian average – a method of estimating the mean of a population, consistent with the Bayesian interpretation, in which other existing information related to the data set is incorporated into the calculation instead of estimating the mean strictly from the available data.
- Bayesian brain – a term used in behavioural sciences and neuroscience for the ability of the nervous system to operate in situations of uncertainty in a fashion close to the optimum prescribed by Bayesian statistics.
- Bayesian econometrics – the branch of econometrics that applies Bayesian principles, based on a degree-of-belief interpretation of probability, to economic modelling.
- Bayesian experimental design – a general probability-theoretical framework, based on Bayesian inference from the observations acquired during an experiment, from which other theories of experimental design can be derived.
- Bayesian game – in game theory, a game in which information about the characteristics of the other players is incomplete; following John C. Harsanyi's framework, it can be modelled by introducing Nature as a player.
- Bayesian inference – a method of statistical inference often used in science and engineering to determine model parameters, make predictions about unknown variables and perform model selection.
- Bayesian inference in phylogeny – generates a posterior distribution for a parameter composed of a phylogenetic tree and a model of evolution, based on the prior for that parameter and the likelihood of the data generated by a multiple alignment.
- Bayesian information criterion
- Bayesian linear regression – an approach to linear regression in which the statistical analysis is undertaken within the context of Bayesian inference.
- Bayesian model comparison – redirects to Bayes factor
- Bayesian multivariate linear regression
- Bayesian network – also Bayes network, belief network or directed acyclic graphical model; a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.
- Bayesian probability – an interpretation of the concept of probability as evidential degree of belief; it can be seen as an extension of logic that enables reasoning with propositions whose truth or falsity is uncertain.
- Bayesian search theory – the application of Bayesian statistics to the search for lost objects; it has been used several times to find lost sea vessels, for example the USS Scorpion.
- Bayesian spam filtering – a statistical technique of e-mail filtering that uses a naive Bayes classifier, correlating the use of tokens with spam and non-spam e-mails and then using Bayesian inference to calculate the probability that a message is spam.
- Bayesian statistics – the subset of statistics in which the evidence about the true state of the world is expressed in terms of degrees of belief, or more specifically Bayesian probabilities.
- Bayesian VAR – Bayesian vector autoregression; Bayesian methods are used to estimate a vector autoregression, with the model parameters treated as random variables to which prior probabilities are assigned.
- BCMP network – queueing theory; a class of queueing network for which a product-form equilibrium distribution exists, named after the authors of the paper where it was first described: Baskett, Chandy, Muntz and Palacios.
- Bean machine – also known as the quincunx or Galton box; a device invented by Sir Francis Galton to demonstrate the central limit theorem, in particular that the normal distribution approximates the binomial distribution.
- Behrens–Fisher problem
- Belief propagation – a message-passing algorithm for performing inference on graphical models, such as Bayesian networks and Markov random fields; it calculates the marginal distribution of each unobserved node, conditional on any observed nodes.
- Belt transect – used in biology to investigate the distribution of organisms in relation to a certain area, such as the seashore or a meadow, by recording all species found between two lines, where they occur and how many there are.
- Benford's law – also called the first-digit law; in lists of numbers from many real-life sources of data, the leading digit is distributed in a specific, non-uniform way.
- Bennett's inequality – an upper bound on the probability that the sum of independent random variables deviates from its expected value by more than any specified amount.
- Berkson error model – a description of random error in measurement which, unlike classical error, causes little or no bias; proposed by Joseph Berkson in a 1950 paper entitled "Are there two regressions?".
- Berkson's paradox – also Berkson's fallacy; a counter-intuitive (veridical) result in conditional probability and statistics, and a complicating factor in statistical tests of proportions.
- Berlin procedure – a mathematical procedure for time series decomposition and seasonal adjustment of monthly and quarterly economic time series, whose mathematical foundations were developed in the 1960s at the Technical University of Berlin.
- Bernoulli distribution
- Bernoulli process – a finite or infinite sequence of binary random variables; a discrete-time stochastic process taking only two values, canonically 0 and 1, whose component Bernoulli variables are identical and independent.
- Bernoulli sampling – in finite population sampling, a sampling process in which each element of the population is subjected to an independent Bernoulli trial that determines whether it becomes part of the sample.
- Bernoulli scheme – also Bernoulli shift; a generalization of the Bernoulli process to more than two possible outcomes, important in the study of dynamical systems.
- Bernoulli trial – an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure".
- Bernstein inequalities (probability theory)
- Bernstein–von Mises theorem – in Bayesian inference, the basis for the result that the posterior distribution for unknown quantities is effectively independent of the prior distribution once the amount of information supplied by a sample of data is large.
- Berry–Esseen theorem – gives a quantitative bound on the rate at which the distribution of the sample mean converges to the normal distribution under the central limit theorem.
- Bertrand's ballot theorem
- Bertrand's box paradox – a classic paradox of elementary probability theory, first posed by Joseph Bertrand in his Calcul des probabilités (1889), involving three boxes, one of which contains two gold coins.
- Bessel process – a type of stochastic process named after Friedrich Bessel; the n-dimensional Bessel process is the real-valued process X_t = ‖W_t‖, the norm of an n-dimensional Wiener process W.
- Bessel's correction – named after Friedrich Bessel; the use of n − 1 instead of n in the formulas for the sample variance and sample standard deviation, which corrects the bias in the estimation of the population variance.
- Best linear unbiased prediction – used in linear mixed models for the estimation of random effects; derived by Charles Roy Henderson in 1950, although the term "best linear unbiased predictor" seems not to have been used until 1962.
- Beta (finance)
- Beta-binomial distribution
- Beta-binomial model
- Beta distribution
- Beta function – for incomplete beta function
- Beta negative binomial distribution
- Beta prime distribution
- Beverton–Holt model
- Bhatia–Davis inequality – an upper bound on the variance of any bounded probability distribution on the real line, named after Rajendra Bhatia and Chandler Davis.
- Bhattacharya coefficient – redirects to Bhattacharyya distance, which measures the similarity of two discrete or continuous probability distributions; the closely related Bhattacharyya coefficient measures the amount of overlap between two statistical samples or populations.
- Bias (statistics) – a statistic is biased if it is calculated in such a way that it is systematically different from the population parameter of interest.
- Bias of an estimator – the difference between an estimator's expected value and the true value of the parameter being estimated; an estimator or decision rule with zero bias is called unbiased, otherwise it is said to be biased.
- Biased random walk (biochemistry) – in cell biology, the motion by which bacteria search for food and flee from harm; bacteria propel themselves with flagella in a process called chemotaxis, and a typical trajectory has many characteristics of a random walk.
- Biased sample – redirects to Sampling bias
- Biclustering – also co-clustering or two-mode clustering; a data mining technique that allows simultaneous clustering of the rows and columns of a matrix.
- Big O in probability notation – order-in-probability notation, used in probability theory and statistical theory in direct parallel to the big-O notation standard in mathematics.
- Bienaymé–Chebyshev inequality – see Chebyshev's inequality: in any data sample or probability distribution, "nearly all" values are close to the mean, in the precise sense that no more than 1/k² of the values can be more than k standard deviations away from the mean.
- Bills of Mortality – the London Bills of Mortality were the main source of mortality statistics from the 17th century to the 1830s, designed mainly as a way of warning about plague epidemics.
- Bimodal distribution – a continuous probability distribution with two different modes, which appear as distinct peaks in the probability density function.
- Binary classification – the task of classifying the members of a given set of objects into two groups on the basis of whether they have some property or not.
- Bingham distribution – an antipodally symmetric probability distribution on the n-sphere, named after Christopher Bingham.
- Binomial distribution
- Binomial proportion confidence interval – a confidence interval for a proportion in a statistical population, computed from the proportion estimated in a statistical sample while allowing for sampling error; several formulas exist.
- Binomial regression – a technique in which the response is the result of a series of Bernoulli trials, i.e. a series of one of two possible disjoint outcomes.
- Binomial test – an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories.
- Bioinformatics – the application of computer science and information technology to biology and medicine, dealing with algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, and information and computation theory.
- Biometrics (statistics) – redirects to Biostatistics
- Biostatistics – the application of statistics to a wide range of topics in biology.
- Biplot – a type of exploratory graph generalizing the simple two-variable scatterplot; it displays information on both the samples and the variables of a data matrix, with samples displayed as points.
- Birnbaum–Saunders distribution
- Birth-death process – a special case of a continuous-time Markov process in which the states represent the current size of a population and the transitions are limited to births and deaths.
- Bispectrum – a statistic used to search for nonlinear interactions; the Fourier transform of the second-order cumulant (the autocorrelation function) is the traditional power spectrum, while the bispectrum is based on the third-order cumulant.
- Bivariate analysis – one of the simplest forms of quantitative analysis, involving two variables and determining the empirical relationship between them.
- Bivariate von Mises distribution – a probability distribution describing values on a torus, an analogue of the bivariate normal distribution belonging to the field of directional statistics.
- Black–Scholes
- Bland–Altman plot
- Blind deconvolution – in image processing and applied mathematics, a deconvolution technique that permits recovery of the target scene from a single image or set of "blurred" images in the presence of a poorly determined or unknown point spread function.
- Blind experiment
- Block design
- Blocking (statistics) – in the design of experiments, arranging experimental units in groups (blocks) that are similar to one another; for example, an experiment testing a new drug against placebo may be blocked by patient sex.
- BMDP – software; a statistical package developed in 1961 at UCLA, based on the older BIMED program for biomedical applications; the letter P was added when keyword parameters replaced fixed-format card input, although the name was later defined as an abbreviation.
- Bochner's theorem – characterizes the Fourier transform of a positive finite Borel measure on the real line.
- Bonferroni correction – a method used to counteract the problem of multiple comparisons, developed and introduced by the Italian mathematician Carlo Emilio Bonferroni.
- Bonferroni inequalities – redirects to Boole's inequality
- Boole's inequality – also known as the union bound; for any finite or countable set of events, the probability that at least one happens is no greater than the sum of the probabilities of the individual events.
- Boolean analysis – introduced by Flament; its goal is to detect deterministic dependencies, in the form of logical formulas connecting items, between the items of a questionnaire or similar data structures in observed response patterns.
- Bootstrap aggregating – a machine learning ensemble meta-algorithm that improves the stability and classification accuracy of statistical classification and regression models, reduces variance and helps to avoid overfitting; usually applied to decision tree methods.
- Bootstrap error-adjusted single-sample technique – a non-parametric method intended to allow an assessment of the validity of a single sample, based on estimating a probability distribution representing what can be expected from valid samples.
- Bootstrapping (statistics) – a computer-based method for assigning measures of accuracy to sample estimates; it allows the sampling distribution of almost any statistic to be estimated using only very simple resampling methods (see the sketch after this section).
- Bootstrapping populations
- Borel–Cantelli lemma
- Bose–Mesner algebra – a set of matrices, together with rules for combining those matrices, such that certain conditions apply.
- Box–Behnken design
- Box–Cox distribution
- Box–Cox transformation – redirects to Power transform; a family of rank-preserving power-function transformations of data used to stabilize variance, make the data more normal-distribution-like and improve correlation.
- Box–Jenkins
- Box–Muller transform
- Box–Pierce test
- Box plot – in descriptive statistics, a convenient way of graphically depicting groups of numerical data through their five-number summaries: the smallest observation, lower quartile, median, upper quartile and largest observation.
- Branching process – a Markov process that models a population in which each individual in generation n produces some random number of individuals in generation n + 1, according to a fixed probability distribution that does not vary from individual to individual.
- Bregman divergence – also Bregman distance; similar to a metric but satisfying neither the triangle inequality nor symmetry; Bregman divergences generalize squared Euclidean distance to a class of distances that share certain properties.
- Breusch–Godfrey test – assesses the validity of some of the modelling assumptions inherent in applying regression-like models to observed data series.
- Breusch–Pagan statistic – redirects to Breusch–Pagan test
- Breusch–Pagan test
- Brown–Forsythe test
- Brownian bridge – a continuous-time stochastic process B whose probability distribution is the conditional distribution of a Wiener process W given that it equals zero at both ends of the time interval; the expected value of the bridge is zero, with variance t.
- Brownian excursion – a stochastic process closely related to the Wiener process; realisations of Brownian excursion processes are essentially realisations of a Wiener process selected to satisfy certain conditions.
- Brownian motion – or pedesis; the seemingly random drifting of particles suspended in a fluid, or the mathematical model used to describe such random movements, which has several real-world applications.
- Brownian tree – named after Robert Brown via Brownian motion; a form of computer art, briefly popular in the 1990s, produced when home computers became powerful enough to simulate Brownian motion.
- Bruck–Ryser–Chowla theorem
- Burke's theorem – a queueing theory result by Paul J. Burke of Bell Telephone Laboratories: for an M/M/1, M/M/m or M/M/∞ queue in the steady state with arrivals forming a Poisson process with rate parameter λ, the departure process is also a Poisson process with rate λ.
- Burr distribution
- Business statistics – the science of good decision-making in the face of uncertainty, used in many disciplines such as financial analysis, econometrics, auditing, production and operations (including services improvement) and marketing research.
- Bühlmann model – a random effects model used in credibility theory in actuarial science to determine the appropriate premium for a group of insurance contracts.
- Buzen's algorithm – in queueing theory, an algorithm for calculating the normalization constant G in the Gordon–Newell theorem, first proposed by Jeffrey P. Buzen in 1973; once G is computed the equilibrium probability distributions can be obtained.
- BV4.1 (software) – a user-friendly tool, developed by the Federal Statistical Office of Germany, for decomposing and seasonally adjusting monthly or quarterly economic time series by the so-called Berlin procedure.
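The bootstrapping entry above describes a computer-based resampling method. A minimal sketch in plain Python, with made-up data values purely for illustration, that estimates the standard error of a sample mean by resampling with replacement:

```python
# Minimal sketch of nonparametric bootstrapping: estimate the standard error
# of the sample mean by repeatedly resampling the data with replacement.
# The data values below are invented for illustration only.
import random
import statistics

data = [2.3, 4.1, 3.8, 5.0, 2.9, 4.4, 3.6, 4.9]
n_resamples = 5000

boot_means = []
for _ in range(n_resamples):
    resample = [random.choice(data) for _ in data]   # n draws with replacement
    boot_means.append(statistics.mean(resample))

print("sample mean:", statistics.mean(data))
print("bootstrap standard error:", statistics.stdev(boot_means))
```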
C
- c-chart
- CàdlàgCàdlàgIn mathematics, a càdlàg , RCLL , or corlol function is a function defined on the real numbers that is everywhere right-continuous and has left limits everywhere...
- Calculating demand forecast accuracyCalculating Demand Forecast AccuracyCalculating demand forecast accuracy is the process of determining the accuracy of forecasts made regarding customer demand for a product.-Importance of forecasts:...
- Calculus of predispositionsCalculus of predispositionsCalculus of predispositions is a basic part of predispositioning theory and belongs to the indeterministic procedures.-Introduction:“The key component of any indeterministic procedure is the evaluation of a position...
- CalEstCalEstCalEst is a statistics package which also includes probability functions as well as tutorials to enhance the learning of Statistics and Probability...
– software - Calibrated probability assessmentCalibrated probability assessmentCalibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty. In other words, when a calibrated person says they are "80% confident" in each of 100 predictions they...
- Calibration (probability) – subjective probability, redirects to Calibrated probability assessmentCalibrated probability assessmentCalibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty. In other words, when a calibrated person says they are "80% confident" in each of 100 predictions they...
- Calibration (statistics)Calibration (statistics)There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. Thus "calibration" can mean...
– the statistical calibration problem
- Cancer clusterCancer clusterCancer cluster is a term used by epidemiologists, statisticians, and public health workers to define an occurrence of a greater-than-expected number of cancer cases within a group of people in a geographic area over a period of time....
- Candlestick chartCandlestick chartA candlestick chart is a style of bar-chart used primarily to describe price movements of a security, derivative, or currency over time.It is a combination of a line-chart and a bar-chart, in that each bar represents the range of price movement over a given time interval. It is most often used in...
- Canonical analysisCanonical analysisIn statistics, canonical analysis belongs to the family of regression methods for data analysis. Regression analysis quantifies a relationship between a predictor variable and a criterion variable by the coefficient of correlation r, coefficient of determination r², and the standard regression...
- Canonical correlationCanonical correlationIn statistics, canonical correlation analysis, introduced by Harold Hotelling, is a way of making sense of cross-covariance matrices. If we have two sets of variables, x_1, \dots, x_n and y_1, \dots, y_m, and there are correlations among the variables, then canonical correlation analysis will...
- Canopy clustering algorithmCanopy clustering algorithmThe canopy clustering algorithm is an unsupervised pre-clustering algorithm, often used as preprocessing step for the K-means algorithm or the Hierarchical clustering algorithm....
- Cantor distribution
- Carpet plotCarpet plotA carpet plot is any of a few different specific types of diagram.- Interaction of two independent variables :Probably the more common plot referred to as a carpet plot is one that illustrates the interacting behaviour of two independent variables, which among other things facilitates interpolation...
- CartogramCartogramA cartogram is a map in which some thematic mapping variable – such as travel time or Gross National Product – is substituted for land area or distance. The geometry or space of the map is distorted in order to convey the information of this alternate variable...
- Case-controlCase-controlA case-control study is a type of study design in epidemiology. Case-control studies are used to identify factors that may contribute to a medical condition by comparing subjects who have that condition with patients who do not have the condition but are otherwise similar .Case-control studies are...
– redirects to Case-control study
- Case-control study
- Catastro of EnsenadaCatastro of EnsenadaIn 1749 a large-scale census and statistical investigation was conducted in the Crown of Castile . It included population, territorial properties, buildings, cattle, offices, all kinds of revenue and trades, and even geographical information from each place...
– a census of part of Spain
- Categorical dataCategorical dataIn statistics, categorical data is that part of an observed dataset that consists of categorical variables, or for data that has been converted into that form, for example as grouped data...
- Categorical distributionCategorical distributionIn probability theory and statistics, a categorical distribution is a probability distribution that describes the result of a random event that can take on one of K possible outcomes, with the probability of each outcome separately specified...
- Categorical variable
- Cauchy distributionCauchy distributionThe Cauchy–Lorentz distribution, named after Augustin Cauchy and Hendrik Lorentz, is a continuous probability distribution. As a probability distribution, it is known as the Cauchy distribution, while among physicists, it is known as the Lorentz distribution, Lorentz function, or Breit–Wigner...
- Cauchy–Schwarz inequalityCauchy–Schwarz inequalityIn mathematics, the Cauchy–Schwarz inequality , is a useful inequality encountered in many different settings, such as linear algebra, analysis, probability theory, and other areas...
- Causal Markov conditionCausal Markov conditionThe Markov condition for a Bayesian network states that any node in a Bayesian network is conditionally independent of its nondescendents, given its parents.A node is conditionally independent of the entire network, given its Markov blanket....
- Ceiling effectCeiling effectThe term ceiling effect has two distinct meanings, referring to the level at which an independent variable no longer has an effect on a dependent variable, or to the level above which variance in an independent variable is no longer measured or estimated...
- Censored regression modelCensored regression modelCensored regression models commonly arise in econometrics in cases where the variable ofinterest is only observable under certain conditions. A common example is labor supply. Data are frequently available on the hours worked by employees, and a labor supply model estimates the relationship between...
- Censoring (clinical trials)Censoring (clinical trials)The term censoring is used in clinical trials to refer to mathematically removing a patient from the survival curve at the end of their follow-up time. Censoring a patient will reduce the sample size for analyzing after the time of the censorship...
- Censoring (statistics)Censoring (statistics)In statistics, engineering, and medical research, censoring occurs when the value of a measurement or observation is only partially known.For example, suppose a study is conducted to measure the impact of a drug on mortality. In such a study, it may be known that an individual's age at death is at...
- Centering matrixCentering matrixIn mathematics and multivariate statistics, the centering matrix is a symmetric and idempotent matrix, which when multiplied with a vector has the same effect as subtracting the mean of the components of the vector from every component.- Definition :...
- Centerpoint (geometry)Centerpoint (geometry)In statistics and computational geometry, the notion of centerpoint is a generalization of the median to data in higher-dimensional Euclidean space...
— Tukey median redirects here
- Central composite designCentral composite designIn statistics, a central composite design is an experimental design, useful in response surface methodology, for building a second order model for the response variable without needing to use a complete three-level factorial experiment....
- Central limit theoremCentral limit theoremIn probability theory, the central limit theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed. The central limit theorem has a number of variants. In its common...
- Central limit theorem (illustration) — redirects to Illustration of the central limit theoremIllustration of the central limit theoremThis article gives two concrete illustrations of the central limit theorem. Both involve the sum of independent and identically-distributed random variables and show how the probability distribution of the sum approaches the normal distribution as the number of terms in the sum increases.The first...
- Central limit theorem for directional statisticsCentral limit theorem for directional statisticsIn probability theory, the central limit theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed....
- Lyapunov's central limit theorem
- Martingale central limit theoremMartingale central limit theoremIn probability theory, the central limit theorem says that, under certain conditions, the sum of many independent identically-distributed random variables, when scaled appropriately, converges in distribution to a standard normal distribution...
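A small simulation makes the central limit theorem above concrete; this is only an illustrative sketch using NumPy, and the choice of 10,000 replications of samples of size 50 from a uniform distribution is arbitrary:

    # Means of i.i.d. uniform(0, 1) samples: the CLT predicts they are approximately
    # Normal(1/2, 1/(12*50)), even though the underlying distribution is far from normal.
    import numpy as np

    rng = np.random.default_rng(0)
    sample_means = rng.uniform(0.0, 1.0, size=(10_000, 50)).mean(axis=1)
    print(sample_means.mean(), sample_means.std())   # close to 0.5 and sqrt(1/600) ≈ 0.0408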
- Central momentCentral momentIn probability theory and statistics, central moments form one set of values by which the properties of a probability distribution can be usefully characterised...
- Central tendencyCentral tendencyIn statistics, the term central tendency relates to the way in which quantitative data is clustered around some value. A measure of central tendency is a way of specifying a central value...
- CensusCensusA census is the procedure of systematically acquiring and recording information about the members of a given population. It is a regularly occurring and official count of a particular population. The term is used mostly in connection with national population and housing censuses; other common...
- CepstrumCepstrumA cepstrum is the result of taking the Fourier transform of the logarithm of the spectrum of a signal. There is a complex cepstrum, a real cepstrum, a power cepstrum, and phase cepstrum....
- CHAIDCHAIDCHAID is a type of decision tree technique, based upon adjusted significance testing . The technique was developed in South Africa and was published in 1980 by Gordon V. Kass, who had completed a PhD thesis on this topic...
— CHi-squared Automatic Interaction Detector
- Chain rule for Kolmogorov complexityChain rule for Kolmogorov complexityThe chain rule for Kolmogorov complexity is an analogue of the chain rule for information entropy, which states: H(X,Y) = H(X) + H(Y|X). That is, the combined randomness of two sequences X and Y is the sum of the randomness of X plus whatever randomness is left in Y once we know X. This follows immediately from the...
- Challenge-dechallenge-rechallengeChallenge-dechallenge-rechallengeChallenge-dechallenge-rechallenge is a medical testing protocol in which a medicine or drug is administered, withdrawn, then re-administered, while being monitored for adverse effects at each stage...
- Change detectionChange detectionIn statistical analysis, change detection tries to identify changes in the probability distribution of a stochastic process or time series. In general the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times...
- Change detection (GIS)Change detection (GIS)Change detection for GIS is a process that measures how the attributes of a particular area have changed between two or more time periods. Change detection often involves comparing aerial photographs or satellite imagery of the area taken at different times...
- Chapman–Kolmogorov equation
- Chapman–Robbins boundChapman–Robbins boundIn statistics, the Chapman–Robbins bound or Hammersley–Chapman–Robbins bound is a lower bound on the variance of estimators of a deterministic parameter. It is a generalization of the Cramér–Rao bound; compared to the Cramér–Rao bound, it is both tighter and applicable to a wider range of problems...
- Characteristic function (probability theory)Characteristic function (probability theory)In probability theory and statistics, the characteristic function of any random variable completely defines its probability distribution. Thus it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative...
- Chauvenet's criterionChauvenet's criterionIn statistical theory, Chauvenet's criterion is a means of assessing whether one piece of experimental data — an outlier — from a set of observations is likely to be spurious....
- Chebyshev centerChebyshev centerIn geometry, the Chebyshev center of a bounded set Q having non-empty interior is the center of the minimal-radius ball enclosing the entire set Q, or, alternatively, the center of largest inscribed ball of Q ....
- Chebyshev's inequalityChebyshev's inequalityIn probability theory, Chebyshev’s inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean — the precise statement being that no more than 1/k² of the distribution’s values can be more than k standard deviations away from the mean...
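Written out, the bound stated in the entry above is, for any k > 0 (it is only informative for k > 1):

    P(|X - \mu| \ge k\sigma) \le 1/k^2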
- Checking if a coin is biased — redirects to Checking whether a coin is fair
- Checking whether a coin is fair
- Cheeger boundCheeger boundIn mathematics, the Cheeger bound is a bound of the second largest eigenvalue of the transition matrix of a finite-state, discrete-time, reversible stationary Markov chain. It can be seen as a special case of Cheeger inequalities in expander graphs....
- ChemometricsChemometricsChemometrics is the science of extracting information from chemical systems by data-driven means. It is a highly interfacial discipline, using methods frequently employed in core data-analytic disciplines such as multivariate statistics, applied mathematics, and computer science, in order to...
- Chernoff boundChernoff boundIn probability theory, the Chernoff bound, named after Herman Chernoff, gives exponentially decreasing bounds on tail distributions of sums of independent random variables...
– a special case of Chernoff's inequality
- Chernoff face
- Chernoff's distributionChernoff's distributionIn probability theory, Chernoff's distribution, named after Herman Chernoff, is the probability distribution of the random variable argmax_s (W(s) − s²), where W is a "two-sided" Wiener process satisfying W(0) = 0. If...
- Chernoff's inequality
- Chi distribution
- Chi-squared distribution
- Chi-squared test
- Chinese restaurant process
- Choropleth mapChoropleth mapA choropleth map (Greek χώρος + πληθαίνω, "area/region" + "multiply") is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita...
- Chow testChow testThe Chow test is a statistical and econometric test of whether the coefficients in two linear regressions on different data sets are equal. The Chow test was invented by economist Gregory Chow. In econometrics, the Chow test is most commonly used in time series analysis to test for the presence of...
- ChronuxChronuxChronux is an open-source software package developed for the loading, visualization and analysis of a variety of modalities / formats of neurobiological time series data...
– software
- Circular distribution
- Circular error probableCircular error probableIn the military science of ballistics, circular error probable is an intuitive measure of a weapon system's precision...
- Circular statisticsCircular statisticsDirectional statistics is the subdiscipline of statistics that deals with directions , axes or rotations in Rn...
– redirects to Directional statistics
- Circular uniform distributionCircular uniform distributionIn probability theory and directional statistics, a circular uniform distribution is a probability distribution on the unit circle whose density is uniform for all angles.- Description :The pdf of the circular uniform distribution is:...
- Clark–Ocone theorem
- Class membership probabilitiesClass membership probabilitiesIn general problems of classification, class membership probabilities reflect the uncertainty with which a given individual item can be assigned to any given class. Although statistical classification methods by definition generate such probabilities, applications of classification in machine...
- Classic data sets
- Classical definition of probabilityClassical definition of probabilityThe classical definition of probability is identified with the works of Pierre-Simon Laplace. As stated in his Théorie analytique des probabilités,This definition is essentially a consequence of the principle of indifference...
- Classical test theoryClassical test theoryClassical test theory is a body of related psychometric theory that predict outcomes of psychological testing such as the difficulty of items or the ability of test-takers. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological...
– psychometrics
- Classification ruleClassification ruleGiven a population whose members can be potentially separated into a number of different sets or classes, a classification rule is a procedure in which the elements of the population set are each assigned to one of the classes. A perfect test is such that every element in the population is assigned...
- Classifier (mathematics)
- Climate ensembleClimate ensembleIn physics, a statistical ensemble is a large set of copies of a system, considered all at once; each copy of the system representing a different possible detailed realisation of the system, consistent with the system's observed macroscopic properties....
- Clinical significanceClinical significanceIn medicine and psychology, clinical significance refers to either of two related but slightly dissimilar concepts whereby certain findings or differences, even if measurable or statistically confirmed, either may or may not have additional significance, either by being of a magnitude that conveys...
- Clinical study design
- Clinical trialClinical trialClinical trials are a set of procedures in medical research and drug development that are conducted to allow safety and efficacy data to be collected for health interventions...
- Clinical utility of diagnostic testsClinical utility of diagnostic testsThe clinical utility of a diagnostic test is its capacity to rule diagnosis in and/or out and to make a decision possible to adopt or to reject a therapeutic action. It can be integrated into clinical prediction rules for specific diseases or outcomes....
- CliodynamicsCliodynamicsCliodynamics is a new multidisciplinary area of research focused on mathematical modeling of historical dynamics.-Origins:The term was originally coined by Peter...
- Closed testing procedureClosed testing procedureIn statistics, the closed testing procedure is a general method for performing more than one hypothesis test simultaneously.-The closed testing principle:...
- Cluster analysis
- Cluster analysis (in marketing)Cluster analysis (in marketing)Cluster analysis is a class of statistical techniques that can be applied to data that exhibit “natural” groupings. Cluster analysis sorts through the raw data and groups them into clusters. A cluster is a group of relatively homogeneous cases or observations. Objects in a cluster are similar to...
- Cluster randomised controlled trialCluster randomised controlled trialA cluster randomised controlled trial is a type of randomised controlled trial in which groups of subjects are randomised...
- Cluster samplingCluster samplingCluster Sampling is a sampling technique used when "natural" groupings are evident in a statistical population. It is often used in marketing research. In this technique, the total population is divided into these groups and a sample of the groups is selected. Then the required information is...
- Cluster-weighted modelingCluster-weighted modelingIn statistics, cluster-weighted modeling is an algorithm-based approach to non-linear prediction of outputs from inputs based on density estimation using a set of models that are each notionally appropriate in a sub-region of the input space...
- Clustering high-dimensional dataClustering high-dimensional dataClustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, where DNA microarray technology can produce a large number of measurements at once, and...
- CMA-ESCMA-ESCMA-ES stands for Covariance Matrix Adaptation Evolution Strategy. Evolution strategies are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continuous optimization problems. They belong to the class of evolutionary algorithms and evolutionary computation...
(Covariance Matrix Adaptation Evolution Strategy)
- Coalescent theoryCoalescent theoryIn genetics, coalescent theory is a retrospective model of population genetics. It attempts to trace all alleles of a gene shared by all members of a population to a single ancestral copy, known as the most recent common ancestor...
- Cochran's C testCochran's C testIn statistics, Cochran's C test , named after William G. Cochran, is a one-sided upper limit variance outlier test. The C test is used to decide if a single estimate of a variance is significantly larger than a group of variances with which the single estimate is supposed to be comparable...
- Cochran's Q test
- Cochran's theoremCochran's theoremIn statistics, Cochran's theorem, devised by William G. Cochran, is a theorem used in to justify results relating to the probability distributions of statistics that are used in the analysis of variance.- Statement :...
- Cochran-Armitage test for trendCochran-Armitage test for trendThe Cochran-Armitage test for trend, named for William Cochran and Peter Armitage, is used in categorical data analysis when the aim is to assess for the presence of an association between a variable with two categories and a variable with k categories. It modifies the chi-squared test to...
- Cochran–Mantel–Haenszel statisticsCochran–Mantel–Haenszel statisticsIn statistics, the Cochran–Mantel–Haenszel statistics are a collection of test statistics used in the analysis of stratified categorical data.. They are named after William G Cochran, Nathan Mantel and William Haenszel. One of these test statistics is the Cochran–Mantel–Haenszel test, which allows...
- Cochrane–Orcutt estimation
- Coding (social sciences)Coding (social sciences)Coding refers to an analytical process in which data, in both quantitative form or qualitative are categorised to facilitate analysis....
- Coefficient of coherence — redirects to Coherence (statistics)Coherence (statistics)In probability theory and statistics, coherence can have two meanings.*When dealing with personal probability assessments, or supposed probabilities derived in nonstandard ways, it is a property of self-consistency across a whole set of such assessments...
- Coefficient of determinationCoefficient of determinationIn statistics, the coefficient of determination R2 is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model...
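For a least-squares fit with fitted values ŷ_i and data mean ȳ, the proportion of explained variability mentioned above is commonly computed as:

    R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}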
- Coefficient of dispersionCoefficient of dispersionIn probability theory and statistics, the index of dispersion, dispersion index, coefficient of dispersion, or variance-to-mean ratio , like the coefficient of variation, is a normalized measure of the dispersion of a probability distribution: it is a measure used to quantify whether a set of...
- Coefficient of variationCoefficient of variationIn probability theory and statistics, the coefficient of variation is a normalized measure of dispersion of a probability distribution. It is also known as unitized risk or the variation coefficient. The absolute value of the CV is sometimes known as relative standard deviation , which is...
- Cognitive pretestingCognitive pretestingCognitive interviewing is a field research method used primarily in pre-testing survey instruments developed in collaboration by psychologists and survey researchers. It allows survey researchers to collect verbal information regarding survey responses and is used in evaluating whether the...
- Cohen's class distribution functionCohen's class distribution functionBilinear time–frequency distributions, or quadratic time–frequency distributions, arise in a sub-field of signal analysis and signal processing called time–frequency signal processing, and, in the statistical analysis of time series data...
– a time-frequency distribution function
- Cohen's kappaCohen's kappaCohen's kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative items. It is generally thought to be a more robust measure than simple percent agreement calculation since κ takes into account the agreement occurring by chance. Some...
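The chance-corrected agreement behind Cohen's kappa is usually written as follows, where p_o is the observed proportion of agreement between the two raters and p_e is the agreement expected by chance:

    \kappa = \frac{p_o - p_e}{1 - p_e}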
- Coherence (statistics)Coherence (statistics)In probability theory and statistics, coherence can have two meanings.*When dealing with personal probability assessments, or supposed probabilities derived in nonstandard ways, it is a property of self-consistency across a whole set of such assessments...
- Cohort (statistics)Cohort (statistics)In statistics and demography, a cohort is a group of subjects who have shared a particular time together during a particular time span . Cohorts may be tracked over extended periods in a cohort study. The cohort can be modified by censoring, i.e...
- Cohort effectCohort effectThe term cohort effect is used in social science to describe variations in the characteristics of an area of study over time among individuals who are defined by some shared temporal experience or common life experience, such as year of birth, or year of exposure to radiation.Cohort effects are...
- Cohort studyCohort studyA cohort study or panel study is a form of longitudinal study used in medicine, social science, actuarial science, and ecology. It is an analysis of risk factors and follows a group of people who do not have the disease, and uses correlations to determine the absolute risk of subject contraction...
- CointegrationCointegrationCointegration is a statistical property of time series variables. Two or more time series are cointegrated if they share a common stochastic drift.-Introduction:...
- Collectively exhaustive events
- Collider (epidemiology)Collider (epidemiology)In epidemiology, a collider is a variable which is the effect of two other variables. It is known as a collider because, in graphical models, the other variables lead to the collider in a way that their arrow heads appear to collide on the same node that is the collider, e.g. M → P...
- Combinatorial data analysisCombinatorial data analysisCombinatorial data analysis is the study of data sets where the arrangement of objects is important. CDA can be used either to determine how well a given combinatorial construct reflects the observed data, or to search for a suitable combinatorial construct that does fit the data.-See...
- Combinatorial designCombinatorial designCombinatorial design theory is the part of combinatorial mathematics that deals with the existence and construction of systems of finite sets whose intersections have specified numerical properties....
- Combinatorial meta-analysis
- Common mode failure
- Common-cause and special-causeCommon-cause and special-causeCommon- and special-causes are the two distinct origins of variation in a process, as defined in the statistical thinking and methods of Walter A. Shewhart and W. Edwards Deming...
- Comparing meansComparing meansThe following tables provide guidance to the selection of the proper parametric or non-parametric statistical tests for a given data set.-Is there a difference ?:...
- Comparison of general and generalized linear models
- Comparison of statistical packagesComparison of statistical packagesThe following tables compare general and technical information for a number of statistical analysis packages.-General information:Basic information about each product...
- Comparisonwise error rate
- Complementary eventComplementary eventIn probability theory, the complement of any event A is the event [not A], i.e. the event that A does not occur. The event A and its complement [not A] are mutually exclusive and exhaustive. Generally, there is only one event B such that A and B are both mutually exclusive and...
- Complete-linkage clusteringComplete-linkage clusteringIn cluster analysis, complete linkage or farthest neighbour is a method of calculating distances between clusters in agglomerative hierarchical clustering...
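In complete linkage the distance between two clusters A and B is the farthest pair, d(A, B) = max over a in A, b in B of d(a, b); a minimal sketch with SciPy follows (assuming SciPy is installed, and using made-up toy points):

    # Agglomerative clustering with "complete" (farthest-neighbour) linkage.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])  # toy data
    Z = linkage(points, method="complete")            # merges use the maximum pairwise distance
    print(fcluster(Z, t=2, criterion="maxclust"))     # cut the dendrogram into 2 clusters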
- Complete spatial randomnessComplete spatial randomnessComplete spatial randomness describes a point process whereby point events occur within a given study area in a completely random fashion. Such a process is often modeled using only one parameter, i.e. the density of points ρ within the defined area...
- Completely randomized designCompletely randomized designIn the design of experiments, completely randomized designs are for studying the effects of one primary factor without the need to take other nuisance variables into account. This article describes completely randomized designs that have one primary factor. The experiment compares the values of a...
- Completeness (statistics)Completeness (statistics)In statistics, completeness is a property of a statistic in relation to a model for a set of observed data. In essence, it is a condition which ensures that the parameters of the probability distribution representing the model can all be estimated on the basis of the statistic: it ensures that the...
- Compositional dataCompositional dataIn statistics, compositional data are quantitative descriptions of the parts of some whole, conveying exclusively relative information.This definition, given by John Aitchison has several consequences:...
- Composite bar chartComposite bar chartComposite bar charts are bar charts which always total 100, but each element is shown as a percentage of the bar allowing different sample sizes to be more easily compared.-External links:...
- Compound Poisson distributionCompound Poisson distributionIn probability theory, a compound Poisson distribution is the probability distribution of the sum of a "Poisson-distributed number" of independent identically-distributed random variables...
- Compound Poisson processCompound Poisson processA compound Poisson process is a continuous-time stochastic process with jumps. The jumps arrive randomly according to a Poisson process and the size of the jumps is also random, with a specified probability distribution...
- Compound probability distributionCompound probability distributionIn probability theory, a compound probability distribution is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution F with an unknown parameter θ that is distributed according to some other distribution G, and then...
- Computational formula for the variance
- Computational learning theoryComputational learning theoryIn theoretical computer science, computational learning theory is a mathematical field related to the analysis of machine learning algorithms.-Overview:Theoretical results in machine learning mainly deal with a type of...
- Computational statisticsComputational statisticsComputational statistics, or statistical computing, is the interface between statistics and computer science. It is the area of computational science specific to the mathematical science of statistics....
- Computer experimentComputer experimentIn the scientific context, a computer experiment refer to mathematical modeling using computer simulation. It has become common to call such experiments in silico...
- Concordance correlation coefficientConcordance correlation coefficientIn statistics, the concordance correlation coefficient measures the agreement between two variables, e.g., to evaluate reproducibility or for inter-rater reliability.-Definition:...
- Concordant pair
- Concrete illustration of the central limit theorem
- Concurrent validityConcurrent validityConcurrent validity is a parameter used in sociology, psychology, and other psychometric or behavioral sciences. Concurrent validity is demonstrated where a test correlates well with a measure that has previously been validated. The two measures may be for the same construct, or for different, but...
- Conditional change modelConditional change modelThe conditional change model in statistics is the analytic procedure in which change scores are regressed on baseline values, together with the explanatory variables of interest . The method has some substantial advantages over the usual two-sample t-test recommended in textbooks.-References:*...
- Conditional distribution — redirects to Conditional probability distribution
- Conditional expectationConditional expectationIn probability theory, a conditional expectation is the expected value of a real random variable with respect to a conditional probability distribution....
- Conditional independenceConditional independenceIn probability theory, two events R and B are conditionally independent given a third event Y precisely if the occurrence or non-occurrence of R and the occurrence or non-occurrence of B are independent events in their conditional probability distribution given Y...
- Conditional probabilityConditional probabilityIn probability theory, the "conditional probability of A given B" is the probability of A if B is known to occur. It is commonly notated P(A|B), and sometimes P_B(A). P(A|B) can be visualised as the probability of event A when the sample space is restricted to event B...
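The defining relation, valid whenever P(B) > 0, is:

    P(A \mid B) = \frac{P(A \cap B)}{P(B)}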
- Conditional probability distribution
- Conditional random fieldConditional random fieldA conditional random field is a statistical modelling method often applied in pattern recognition.More specifically it is a type of discriminative undirected probabilistic graphical model. It is used to encode known relationships between observations and construct consistent interpretations...
- Conditional varianceConditional varianceIn probability theory and statistics, a conditional variance is the variance of a conditional probability distribution. Particularly in econometrics, the conditional variance is also known as the scedastic function or skedastic function...
- Conditionality principleConditionality principleThe conditionality principle is a Fisherian principle of statistical inference that Allan Birnbaum formally defined and studied in his 1962 JASA article. Together with the sufficiency principle, Birnbaum's version of the principle implies the famous likelihood principle...
- Confidence bandConfidence bandA confidence band is used in statistical analysis to represent the uncertainty in an estimate of a curve or function based on limited or noisy data. Confidence bands are often used as part of the graphical presentation of results in a statistical analysis...
- Confidence distributionConfidence distributionIn statistics, the concept of a confidence distribution has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest...
- Confidence intervalConfidence intervalIn statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...
- Confidence regionConfidence regionIn statistics, a confidence region is a multi-dimensional generalization of a confidence interval. It is a set of points in an n-dimensional space, often represented as an ellipsoid around a point which is an estimated solution to a problem, although other shapes can occur.The confidence region is...
- Configural frequency analysisConfigural frequency analysisConfigural frequency analysis is a method of exploratory data analysis. The goal of a configural frequency analysis is to detect patterns in the data that occur significantly more or significantly less often than expected by chance...
- Confirmation biasConfirmation biasConfirmation bias is a tendency for people to favor information that confirms their preconceptions or hypotheses regardless of whether the information is true.David Perkins, a geneticist, coined the term "myside bias" referring to a preference for "my" side of an issue...
- Confirmatory factor analysisConfirmatory factor analysisIn statistics, confirmatory factor analysis is a special form of factor analysis. It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct . In contrast to exploratory factor analysis, where all loadings are free to vary,...
- ConfoundingConfoundingIn statistics, a confounding variable is an extraneous variable in a statistical model that correlates with both the dependent variable and the independent variable...
- Confounding factor
- Confusion of the inverseConfusion of the inverseConfusion of the inverse, also called the conditional probability fallacy, is a logical fallacy whereby a conditional probability is equated with its inverse: that is, given two events A and B, the probability Pr(A|B) is assumed to be approximately equal to Pr(B|A).-Example 1:In one study, physicians...
- Conjoint analysisConjoint analysisConjoint analysis, also called multi-attribute compositional models or stated preference analysis, is a statistical technique that originated in mathematical psychology. Today it is used in many of the social sciences and applied sciences including marketing, product management, and operations...
- Conjoint analysis (in healthcare)Conjoint analysis (in healthcare)-Why conjoint in healthcare market research?:Pharmaceutical manufacturers need deeper and deeper market information they can rely on to make the right decisions and to identify the most promising market opportunities. They can obtain great benefits from understanding physicians’ prescription...
- Conjoint analysis (in marketing)Conjoint analysis (in marketing)Conjoint analysis is a statistical technique used in market research to determine how people value different features that make up an individual product or service....
- Conjugate priorConjugate priorIn Bayesian probability theory, if the posterior distributions p are in the same family as the prior probability distribution p, the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood...
- Consensus-based assessment
- Consensus forecast
- Consistency (statistics)Consistency (statistics)In statistics, consistency of procedures such as confidence intervals or hypothesis tests involves their behaviour as the number of items in the data-set to which they are applied increases indefinitely...
(disambiguation)
- Consistent estimatorConsistent estimatorIn statistics, a sequence of estimators for parameter θ0 is said to be consistent if this sequence converges in probability to θ0...
- Constant elasticity of substitutionConstant Elasticity of SubstitutionIn economics, Constant elasticity of substitution is a property of some production functions and utility functions.More precisely, it refers to a particular type of aggregator function which combines two or more types of consumption, or two or more types of productive inputs into an aggregate...
- Constant false alarm rateConstant false alarm rateConstant false alarm rate detection refers to a common form of adaptive algorithm used in radar systems to detect target returns against a background of noise, clutter and interference.Other detection algorithms are not adaptive...
- Constraint (information theory)Constraint (information theory)Constraint in information theory refers to the degree of statistical dependence between or among variables.Garner provides a thorough discussion of various forms of constraint with application to pattern recognition and psychology....
- Consumption distributionConsumption distributionIn economics, the consumption distribution is an alternative to the income distribution for judging economic inequality, comparing levels of consumption rather than income or wealth.-See also:* Economic inequality* Wealth condensation* Lorenz curve* Asset...
- Contact process (mathematics)
- Content validityContent validityIn psychometrics, content validity refers to the extent to which a measure represents all facets of a given social construct. For example, a depression scale may lack content validity if it only assesses the affective dimension of depression but fails to take into account the behavioral dimension...
- Contiguity (probability theory)Contiguity (probability theory)In probability theory, two sequences of probability measures are said to be contiguous if asymptotically they share the same support. Thus the notion of contiguity extends the concept of absolute continuity to the sequences of measures....
- Contingency tableContingency tableIn statistics, a contingency table is a type of table in a matrix format that displays the frequency distribution of the variables...
- Continuity correctionContinuity correctionIn probability theory, if a random variable X has a binomial distribution with parameters n and p, i.e., X is distributed as the number of "successes" in n independent Bernoulli trials with probability p of success on each trial, then...
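In the binomial setting described above, a standard form of the correction replaces the event X ≤ x by X ≤ x + 1/2 before applying the normal approximation, with Φ the standard normal distribution function:

    P(X \le x) \approx \Phi\!\left(\frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)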
- Continuous distribution — redirects to Continuous probability distribution
- Continuous mapping theoremContinuous mapping theoremIn probability theory, the continuous mapping theorem states that continuous functions are limit-preserving even if their arguments are sequences of random variables. A continuous function, in Heine’s definition, is such a function that maps convergent sequences into convergent sequences: if xn → x...
- Continuous probability distribution
- Continuous stochastic processContinuous stochastic processIn the probability theory, a continuous stochastic process is a type of stochastic process that may be said to be "continuous" as a function of its "time" or index parameter. Continuity is a nice property for a process to have, since it implies that they are well-behaved in some sense, and,...
- Continuous-time Markov process
- Continuous-time stochastic processContinuous-time stochastic processIn probability theory and statistics, a continuous-time stochastic process, or a continuous-space-time stochastic process is a stochastic process for which the index variable takes a continuous set of values, as contrasted with a discrete-time process for which the index variable takes only...
- Contrast (statistics)Contrast (statistics)In statistics, particularly analysis of variance, a contrast is a linear combination of two or more factor level means whose coefficients add up to zero. A simple contrast is the difference between two means...
- Control chartControl chartControl charts, also known as Shewhart charts or process-behaviour charts, in statistical process control are tools used to determine whether or not a manufacturing or business process is in a state of statistical control.- Overview :...
- Control event rateControl event rateIn epidemiology and biostatistics, the control event rate is a measure of how often a particular statistical event occurs within the scientific control group of an experiment ....
- Control limitsControl limitsControl limits, also known as natural process limits, are horizontal lines drawn on a statistical process control chart, usually at a distance of ±3 standard deviations of the plotted statistic from the statistic's mean....
- Control variateControl variateThe control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity.-Underlying principle:...
- Controlling for a variableControlling for a variableControlling for a variable refers to the deliberate varying of the experimental conditions in order to see the impact of a specific variable when predicting the outcome variable . Controlling tends to reduce the experimental error...
- Convergence of measuresConvergence of measuresIn mathematics, more specifically measure theory, there are various notions of the convergence of measures. Three of the most common notions of convergence are described below.-Total variation convergence of measures:...
- Convergence of random variablesConvergence of random variablesIn probability theory, there exist several different notions of convergence of random variables. The convergence of sequences of random variables to some limit random variable is an important concept in probability theory, and its applications to statistics and stochastic processes...
- Convex hullConvex hullIn mathematics, the convex hull or convex envelope for a set of points X in a real vector space V is the minimal convex set containing X....
- Convolution of probability distributionsConvolution of probability distributionsThe convolution of probability distributions arises in probability theory and statistics as the operation in terms of probability distributions that corresponds to the addition of independent random variables and, by extension, to forming linear combinations of random variables...
- Convolution random number generator
- Conway–Maxwell–Poisson distribution
- Cook's distanceCook's distanceIn statistics, Cook's distance is a commonly used estimate of the influence of a data point when doing least squares regression analysis. In a practical ordinary least squares analysis, Cook's distance can be used in several ways: to indicate data points that are particularly worth checking for...
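A common way to write the influence measure above, for an ordinary least squares fit with p coefficients, error mean square s², and ŷ_{j(i)} the fitted value for observation j when observation i is deleted, is:

    D_i = \frac{\sum_{j=1}^{n} \left(\hat{y}_j - \hat{y}_{j(i)}\right)^2}{p\, s^2}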
- Cophenetic correlationCophenetic correlationIn statistics, and especially in biostatistics, cophenetic correlation is a measure of how faithfully a dendrogram preserves the pairwise distances between the original unmodeled data points...
- Copula (statistics)Copula (statistics)In probability theory and statistics, a copula can be used to describe the dependence between random variables. Copulas derive their name from linguistics....
- Correct samplingCorrect samplingDuring sampling of particulate materials, correct sampling is defined in Gy's sampling theory as a sampling scenario in which all particles in a population have the same probability of ending up in the sample ....
- Correction for attenuationCorrection for attenuationCorrection for attenuation is a statistical procedure, due to Spearman , to "rid a correlation coefficient from the weakening effect of measurement error" , a phenomenon also known as regression dilution. In measurement and statistics, it is also called disattenuation...
- Correlate summation analysisCorrelate summation analysisCorrelate summation analysis is a data mining method. It is designed to find the variables that are most covariant with all of the other variables being studied, relative to clustering. Aggregate correlate summation is the product of the totaled negative logarithm of the p-values for all of the...
- CorrelationCorrelationIn statistics, dependence refers to any statistical relationship between two random variables or two sets of data. Correlation refers to any of a broad class of statistical relationships involving dependence....
- Correlation and dependence
- Correlation does not imply causationCorrelation does not imply causation"Correlation does not imply causation" is a phrase used in science and statistics to emphasize that correlation between two variables does not automatically imply that one causes the other. "Correlation does not imply causation" (related to "ignoring a common cause" and questionable cause) is a...
- Correlation clusteringCorrelation clusteringIn machine learning, correlation clustering or cluster editing operates in a scenario where the relationship between the objects are known instead of the actual representation of the objects...
- Correlation functionCorrelation functionA correlation function is the correlation between random variables at two different points in space or time, usually as a function of the spatial or temporal distance between the points...
- Correlation function (astronomy)Correlation function (astronomy)In astronomy, a correlation function describes the distribution of galaxies in the universe. By default, correlation function refers to the two-point autocorrelation function. For a given distance, the two-point autocorrelation function is a function of one variable which describes the...
- Correlation function (quantum field theory)Correlation function (quantum field theory)In quantum field theory, the matrix element computed by inserting a product of operators between two states, usually the vacuum states, is called a correlation function....
- Correlation function (statistical mechanics)Correlation function (statistical mechanics)In statistical mechanics, the correlation function is a measure of the order in a system, as characterized by a mathematical correlation function, and describes how microscopic variables at different positions are correlated....
- Correlation implies causationCorrelation implies causation"Correlation does not imply causation" is a phrase used in science and statistics to emphasize that correlation between two variables does not automatically imply that one causes the other. "Correlation does not imply causation" (related to "ignoring a common cause" and questionable cause) is a...
- Correlation inequalityCorrelation inequalityIn probability and statistics, a correlation inequality is one of a number of inequalities satisfied by the correlation functions of a model. Such inequalities are of particular use in statistical mechanics and in percolation theory.Examples include:...
- Correlation ratioCorrelation ratioIn statistics, the correlation ratio is a measure of the relationship between the statistical dispersion within individual categories and the dispersion across the whole population or sample. The measure is defined as the ratio of two standard deviations representing these types of variation...
- CorrelogramCorrelogramIn the analysis of data, a correlogram is an image of correlation statistics. For example, in time series analysis, a correlogram, also known as an autocorrelation plot, is a plot of the sample autocorrelations r_h\, versus h\, ....
- Correspondence analysisCorrespondence analysisCorrespondence analysis is a multivariate statistical technique proposed by Hirschfeld and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical rather than continuous data...
- Cosmic varianceCosmic varianceCosmic variance is the statistical uncertainty inherent in observations of the universe at extreme distances. It is based on the idea that it is only possible to observe part of the universe at one particular time, so it is difficult to make statistical statements about cosmology on the scale of...
- Cost-of-living indexCost-of-living indexCost of living is the cost of maintaining a certain standard of living. Changes in the cost of living over time are often operationalized in a cost of living index. Cost of living calculations are also used to compare the cost of maintaining a certain standard of living in different geographic areas...
- Count data
- CounternullCounternullIn statistics, and especially in the statistical analysis of psychological data, the counternull is a statistic used to aid the understanding and presentation of research results...
- Counting process
- CovarianceCovarianceIn probability theory and statistics, covariance is a measure of how much two variables change together. Variance is a special case of the covariance when the two variables are identical.- Definition :...
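In symbols, for random variables X and Y with finite second moments:

    \operatorname{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big] = E[XY] - E[X]\,E[Y]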
- Covariance and correlationCovariance and correlationIn probability theory and statistics, the mathematical descriptions of covariance and correlation are very similar. Both describe the degree of similarity between two random variables or sets of random variables....
- Covariance intersectionCovariance intersectionCovariance intersection is an algorithm for combining two or more estimates of state variables in a Kalman filter when the correlation between them is unknown.-Specification:...
- Covariance matrixCovariance matrixIn probability theory and statistics, a covariance matrix is a matrix whose element in the i, j position is the covariance between the i th and j th elements of a random vector...
- Covariance functionCovariance functionIn probability theory and statistics, covariance is a measure of how much two variables change together and the covariance function describes the variance of a random variable process or field...
- CovariateCovariateIn statistics, a covariate is a variable that is possibly predictive of the outcome under study. A covariate may be of direct interest or it may be a confounding or interacting variable....
- Cover's theoremCover's TheoremCover's Theorem is a statement in computational learning theory and is one of the primary theoretical motivations for the use of non-linear kernel methods in machine learning applications...
- Coverage probabilityCoverage probabilityIn statistics, the coverage probability of a confidence interval is the proportion of the time that the interval contains the true value of interest. For example, suppose our interest is in the mean number of months that people with a particular type of cancer remain in remission following...
- Cox processCox processA Cox process , also known as a doubly stochastic Poisson process or mixed Poisson process, is a stochastic process which is a generalization of a Poisson process...
- Cox's theoremCox's theoremCox's theorem, named after the physicist Richard Threlkeld Cox, is a derivation of the laws of probability theory from a certain set of postulates. This derivation justifies the so-called "logical" interpretation of probability. As the laws of probability derived by Cox's theorem are applicable to...
- Cox–Ingersoll–Ross model
- Cramér–Rao bound
- Cramér–von Mises criterion
- Cramér's theoremCramér's theoremIn mathematical statistics, Cramér's theorem is one of several theorems of Harald Cramér, a Swedish statistician and probabilist.- Normal random variables :...
- Cramér's VCramér's VIn statistics, Cramér's V is a popular measure of association between two nominal variables, giving a value between 0 and +1...
- Craps principleCraps principleIn probability theory, the craps principle is a theorem about event probabilities under repeated iid trials. Let E_1 and E_2 denote two mutually exclusive events which might occur on a given trial...
- Credible intervalCredible intervalIn Bayesian statistics, a credible interval is an interval in the domain of a posterior probability distribution used for interval estimation. The generalisation to multivariate problems is the credible region...
- Cricket statisticsCricket statisticsCricket is a sport that generates a large number of statistics.Statistics are recorded for each player during a match, and aggregated over a career. At the professional level, statistics for Test cricket, one-day internationals, and first-class cricket are recorded separately...
- Crime statisticsCrime statisticsCrime statistics attempt to provide statistical measures of the crime in societies. Given that crime is usually secretive by nature, measurements of it are likely to be inaccurate....
- Critical region — redirects to Statistical hypothesis testingStatistical hypothesis testingA statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study . In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold...
- Cromwell's ruleCromwell's ruleCromwell's rule, named by statistician Dennis Lindley, states that one should avoid using prior probabilities of 0 or 1, except when applied to statements that are logically true or false...
- Cronbach's αCronbach's alphaCronbach's α is a coefficient of reliability. It is commonly used as a measure of the internal consistency or reliability of a psychometric test score for a sample of examinees. It was first named alpha by Lee Cronbach in 1951, as he had intended to continue with further coefficients...
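For a test of k items with item-score variances σ²_{Y_i} and total-score variance σ²_X, the coefficient is commonly computed as:

    \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{Y_i}}{\sigma^2_X}\right)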
- Cross-correlationCross-correlationIn signal processing, cross-correlation is a measure of similarity of two waveforms as a function of a time-lag applied to one of them. This is also known as a sliding dot product or sliding inner-product. It is commonly used for searching a long-duration signal for a shorter, known feature...
- Cross-covariance
- Cross-entropy methodCross-entropy methodThe cross-entropy method attributed to Reuven Rubinstein is a general Monte Carlo approach to combinatorial and continuous multi-extremal optimization and importance sampling. The method originated from the field of rare event simulation, where...
- Cross-sectional dataCross-sectional dataCross-sectional data or cross section in statistics and econometrics is a type of one-dimensional data set. Cross-sectional data refers to data collected by observing many subjects at the same point of time, or without regard to differences in time...
- Cross-sectional regressionCross-sectional regressionA Cross-sectional regression is a type of regression model in which the explained and explanatory variables are associated with one period or point in time...
- Cross-sectional studyCross-sectional studyCross-sectional studies form a class of research methods that involve observation of all of a population, or a representative subset, at one specific point in time...
- Cross-spectrumCross-spectrumIn time series analysis, the cross-spectrum is used as part of a frequency domain analysis of the cross correlation or cross covariance between two time series.- Definition :...
- Cross tabulationCross tabulationCross tabulation is the process of creating a contingency table from the multivariate frequency distribution of statistical variables. Heavily used in survey research, cross tabulations can be produced by a range of statistical packages, including some that are specialised for the task. Survey...
- Cross-validation (statistics)
- Crystal Ball functionCrystal Ball functionThe Crystal Ball function, named after the Crystal Ball Collaboration, is a probability density function commonly used to model various lossy processes in high-energy physics. It consists of a Gaussian core portion and a power-law low-end tail, below a certain threshold... — a probability distribution
- CumulantCumulantIn probability theory and statistics, the cumulants κn of a probability distribution are a set of quantities that provide an alternative to the moments of the distribution. The moments determine the cumulants in the sense that any two probability distributions whose moments are identical will have... (see the short note at the end of this C section)
- Cumulant generating function — redirects to cumulantCumulantIn probability theory and statistics, the cumulants κn of a probability distribution are a set of quantities that provide an alternative to the moments of the distribution. The moments determine the cumulants in the sense that any two probability distributions whose moments are identical will have...
- Cumulative distribution functionCumulative distribution functionIn probability theory and statistics, the cumulative distribution function , or just distribution function, describes the probability that a real-valued random variable X with a given probability distribution will be found at a value less than or equal to x. Intuitively, it is the "area so far"...
- Cumulative frequency analysisCumulative frequency analysisCumulative frequency analysis is the application of estimation theory to exceedance probability. The complement, the non-exceedance probability, concerns the frequency of occurrence of values of a phenomenon staying below a reference value. The phenomenon may be time or space dependent...
- Cumulative incidenceCumulative incidenceCumulative incidence or incidence proportion is a measure of frequency, as in epidemiology, where it is a measure of disease frequency during a period of time...
- Cunningham functionCunningham functionIn statistics, the Cunningham function or Pearson–Cunningham function ωm,n is a generalisation of a special function introduced by and studied in the form here by...
- CURE data clustering algorithmCURE data clustering algorithmCURE is an efficient data clustering algorithm for large databases that is more robust to outliers and identifies clusters having non-spherical shapes and wide variances in size.- Drawbacks of traditional algorithms :...
- Curve fittingCurve fittingCurve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function...
- CUSUMCUSUMIn statistical quality control, the CUSUM is a sequential analysis technique due to E. S. Page of the University of Cambridge. It is typically used for monitoring change detection...
- Cuzick–Edwards test
- Cyclostationary process
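Note (illustrative sketch, not part of the original list): the Cumulant and Cumulant generating function entries above can be summarised by one standard identity. For a random variable X whose moment generating function exists, the cumulant generating function is
\[
K(t) = \log \operatorname{E}\!\left[e^{tX}\right] = \sum_{n \ge 1} \kappa_n \frac{t^n}{n!},
\]
so the cumulants \kappa_n are its Taylor coefficients at 0, with \kappa_1 the mean and \kappa_2 the variance.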
D
- d'D'The sensitivity index or d' is a statistic used in signal detection theory. It provides the separation between the means of the signal and the noise distributions, in units of the standard deviation of the noise distribution....
- d-separation
- D'Agostino's K-squared testD'Agostino's K-squared testIn statistics, D’Agostino’s K2 test is a goodness-of-fit measure of departure from normality; that is, the test aims to establish whether or not the given sample comes from a normally distributed population...
- Dagum distribution
- DAPDAP (software)Dap is a statistics and graphics program that performs data management, analysis, and graphical visualization tasks which are commonly required in statistical consulting practice... — open source software
- Data analysisData analysisAnalysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making...
- Data assimilationData assimilationApplications of data assimilation arise in many fields of geosciences, perhaps most importantly in weather forecasting and hydrology. Data assimilation proceeds by analysis cycles...
- Data binningData binningData binning is a data pre-processing technique used to reduce the effects of minor observation errors. The original data values which fall in a given small interval, a bin, are replaced by a value representative of that interval, often the central value...
- Data classification (business intelligence)Data classification (Business Intelligence)In business intelligence, data classification has close ties to data clustering, but where data clustering is descriptive, data classification is predictive. In essence data classification consists of using variables with known values to predict the unknown or future values of other variables. It...
- Data cleansingData cleansingData cleansing, data cleaning, or data scrubbing is the process of detecting and correcting corrupt or inaccurate records from a record set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, irrelevant, etc...
- Data clusteringData clusteringCluster analysis or clustering is the task of assigning a set of objects into groups so that the objects in the same cluster are more similar to each other than to those in other clusters....
- Data collectionData collectionData collection is a term used to describe a process of preparing and collecting data, for example, as part of a process improvement or similar project. The purpose of data collection is to obtain information to keep on record, to make decisions about important issues, to pass information on to...
- Data DeskData DeskData Desk is a software program for visual data analysis, visual data exploration, and statistics. It carries out Exploratory Data Analysis and standard statistical analyses by means of dynamically linked graphic data displays that update any change simultaneously.-History:Data Desk was developed... – software
- Data dredgingData dredgingData dredging is the inappropriate use of data mining to uncover misleading relationships in data. Data-snooping bias is a form of statistical bias that arises from this misuse of statistics...
- Data generating process (disambiguation)
- Data miningData miningData mining, a relatively young and interdisciplinary field of computer science, is the process of discovering new patterns from large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics and database systems...
- Data reductionData reductionData reduction is the transformation of numerical or alphabetical digital information derived empirically or experimentally into a corrected, ordered, and simplified form...
- Data pointData pointIn statistics, a data point is a set of measurements on a single member of a statistical population, or a subset of those measurements for a given individual...
- Data quality assuranceData quality assuranceData quality assurance is the process of profiling the data to discover inconsistencies and other anomalies in the data, and performing data cleansing activities...
- Data setData setA data set is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question. It lists values for each of the variables, such as height and weight of an object or values of random numbers. Each...
- Data-snooping bias
- Data transformation (statistics)Data transformation (statistics)In statistics, data transformation refers to the application of a deterministic mathematical function to each point in a data set — that is, each data point zi is replaced with the transformed value yi = f(zi), where f is a function...
- Data visualizationData visualizationData visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information"....
- DataDetectiveDataDetectiveDataDetective is a data mining platform developed by Sentient Information Systems. Since 1992, this software has been applied in organizations that need to retrieve patterns and relations in their typically large databases... – software
- DataplotDataplotDataplot is a public-domain software system for scientific visualization and statistical analysis. It was developed at the National Institute of Standards and Technology... – software
- Davies–Bouldin indexDavies–Bouldin indexThe Davies–Bouldin index, introduced in 1979, is a metric for evaluating clustering algorithms. This is an internal evaluation scheme, where the validation of how well the clustering has been done is made using quantities and features inherent to the dataset...
- Davis distribution
- De Finetti's game
- De Finetti's theoremDe Finetti's theoremIn probability theory, de Finetti's theorem explains why exchangeable observations are conditionally independent given some latent variable to which an epistemic probability distribution would then be assigned...
- de Moivre's lawDe Moivre's lawDe Moivre's Law is a survival model applied in actuarial science, named for Abraham de Moivre. It is a simple law of mortality based on a linear survival function.-Definition:De Moivre's law has a single parameter \omega called the ultimate age...
- De Moivre–Laplace theoremDe Moivre–Laplace theoremIn probability theory, the de Moivre–Laplace theorem is a normal approximation to the binomial distribution. It is a special case of the central limit theorem...
- Decision boundaryDecision boundaryIn a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class...
- Decision theoryDecision theoryDecision theory in economics, psychology, philosophy, mathematics, and statistics is concerned with identifying the values, uncertainties and other issues relevant in a given decision, its rationality, and the resulting optimal decision...
- Decomposition of time series
- Deep samplingDeep samplingDeep sampling is a variation of statistical sampling in which precision is sacrificed for insight. Small numbers of samples are taken, with each sample containing much information. The samples are taken approximately uniformly over the resource of interest, such as time or space...
- Degenerate distribution
- Degrees of freedom (statistics)Degrees of freedom (statistics)In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the...
- Delphi methodDelphi methodThe Delphi method is a structured communication technique, originally developed as a systematic, interactive forecasting method which relies on a panel of experts.In the standard version, the experts answer questionnaires in two or more rounds...
- Delta methodDelta methodIn statistics, the delta method is a method for deriving an approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator... (a worked statement appears at the end of this D section)
- Demand forecastingDemand forecastingDemand forecasting is the activity of estimating the quantity of a product or service that consumers will purchase. Demand forecasting involves techniques including both informal methods, such as educated guesses, and quantitative methods, such as the use of historical sales data or current data...
- Deming regressionDeming regressionIn statistics, Deming regression, named after W. Edwards Deming, is an errors-in-variables model which tries to find the line of best fit for a two-dimensional dataset...
- DemographicsDemographicsDemographics are the most recent statistical characteristics of a population. These types of data are used widely in sociology , public policy, and marketing. Commonly examined demographics include gender, race, age, disabilities, mobility, home ownership, employment status, and even location...
- DemographyDemographyDemography is the statistical study of human population. It can be a very general science that can be applied to any kind of dynamic human population, that is, one that changes over time or space...
- Demographic statisticsDemographic statisticsAmong the kinds of data that national leaders need are the demographic statistics of their population. Records of births, deaths, marriages, immigration and emigration and a regular census of population provide information that is key to making sound decisions about national policy.A useful summary...
- DendrogramDendrogramA dendrogram is a tree diagram frequently used to illustrate the arrangement of the clusters produced by hierarchical clustering...
- Density estimationDensity estimationIn probability and statistics,density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function...
- Dependent and independent variablesDependent and independent variablesThe terms "dependent variable" and "independent variable" are used in similar but subtly different ways in mathematics and statistics as part of the standard terminology in those subjects...
- Descriptive researchDescriptive researchDescriptive research, also known as statistical research, describes data and characteristics about the population or phenomenon being studied. Descriptive research answers the questions who, what, where, when, "why" and how......
- Descriptive statisticsDescriptive statisticsDescriptive statistics quantitatively describe the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics , in that descriptive statistics aim to summarize a data set, rather than use the data to learn about the population that the data are...
- Design effectDesign effectIn statistics, the design effect is an adjustment used in some kinds of studies, such as cluster randomised trials, to allow for the design structure. The adjustment inflates the variance of parameter estimates, and therefore their standard errors, which is necessary to allow for correlations among...
- Design matrixDesign matrixIn statistics, a design matrix is a matrix of explanatory variables, often denoted by X, that is used in certain statistical models, e.g., the general linear model....
- Design of experimentsDesign of experimentsIn general usage, design of experiments or experimental design is the design of any information-gathering exercises where variation is present, whether under the full control of the experimenter or not. However, in statistics, these terms are usually used for controlled experiments...
- The Design of ExperimentsThe Design of ExperimentsThe Design of Experiments is a 1935 book by the British statistician R.A. Fisher, which effectively founded the field of design of experiments. The book has been highly influential... (book by Fisher)
- Detailed balanceDetailed balanceThe principle of detailed balance is formulated for kinetic systems which are decomposed into elementary processes : At equilibrium, each elementary process should be equilibrated by its reverse process....
- Detection theoryDetection theoryDetection theory, or signal detection theory, is a means to quantify the ability to discern between information-bearing energy patterns and random energy patterns that distract from the information...
- Determining the number of clusters in a data setDetermining the number of clusters in a data setDetermining the number of clusters in a data set, a quantity often labeled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem....
- Detrended correspondence analysisDetrended Correspondence AnalysisDetrended correspondence analysis is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in large, species-rich but usually sparse data matrices that typify ecological community data. For example, Hill and Gauch analyse the data of a vegetation...
- Detrended fluctuation analysisDetrended fluctuation analysisIn stochastic processes, chaos theory and time series analysis, detrended fluctuation analysis is a method for determining the statistical self-affinity of a signal. It is useful for analysing time series that appear to be long-memory processes...
- Deviance (statistics)
- Deviance information criterionDeviance information criterionThe deviance information criterion is a hierarchical modeling generalization of the AIC and BIC . It is particularly useful in Bayesian model selection problems where the posterior distributions of the models have been obtained by Markov chain Monte Carlo simulation...
- Deviation (statistics)Deviation (statistics)In mathematics and statistics, deviation is a measure of difference for interval and ratio variables between the observed value and the mean. The sign of the deviation reports the direction of that difference...
- Deviation analysis (disambiguation)
- DFFITS — a regression diagnostic
- Dickey–Fuller test
- Difference in differencesDifference in differencesDifference in differences is a quasi-experimental technique used in econometrics that measures the effect of a treatment at a given period in time. It is often used to measure the change induced by a particular treatment or event, though may be subject to certain biases...
- Differential entropyDifferential entropyDifferential entropy is a concept in information theory that extends the idea of entropy, a measure of average surprisal of a random variable, to continuous probability distributions.-Definition:...
- Diffusion processDiffusion processIn probability theory, a branch of mathematics, a diffusion process is a solution to a stochastic differential equation. It is a continuous-time Markov process with continuous sample paths....
- Diffusion-limited aggregationDiffusion-limited aggregationDiffusion-limited aggregation is the process whereby particles undergoing a random walk due to Brownian motion cluster together to form aggregates of such particles. This theory, proposed by Witten and Sander in 1981, is applicable to aggregation in any system where diffusion is the primary means...
- Dimension reduction
- Dilution assayDilution assayThe term dilution assay is generally used to designate a special type of bioassay in which one or more preparations are administered to experimental units at different dose levels inducing a measurable biological response. The dose levels are prepared by dilution in a diluent that is inert in...
- Direct relationshipDirect relationshipIn mathematics and statistics, a positive or direct relationship is a relationship between two variables in which change in one variable is associated with a change in the other variable in the same direction. For example all linear relationships with a positive slope are direct relationships...
- Directional statistics
- Dirichlet distribution
- Dirichlet processDirichlet processIn probability theory, a Dirichlet process is a stochastic process that can be thought of as a probability distribution whose domain is itself a random distribution...
- Disattenuation
- Discrepancy functionDiscrepancy functionA discrepancy function is a mathematical function which describes how closely a structural model conforms to observed data. Larger values of the discrepancy function indicate a poor fit of the model to data. In general, the parameter estimates for a given model are chosen so as to make the...
- Discrete choiceDiscrete choiceIn economics, discrete choice problems involve choices between two or more discrete alternatives, such as entering or not entering the labor market, or choosing between modes of transport. Such choices contrast with standard consumption models in which the quantity of each good consumed is assumed...
- Discrete choice analysis
- Discrete distribution
- Discrete phase-type distributionDiscrete phase-type distributionThe discrete phase-type distribution is a probability distribution that results from a system of one or more inter-related geometric distributions occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stochastic process...
- Discrete probability distribution
- Discrete timeDiscrete timeDiscrete time is the discontinuity of a function's time domain that results from sampling a variable at a finite interval. For example, consider a newspaper that reports the price of crude oil once every day at 6:00AM. The newspaper is described as sampling the cost at a frequency of once per 24...
- Discretization of continuous featuresDiscretization of continuous featuresIn statistics and machine learning, discretization refers to the process of converting or partitioning continuous attributes, features or variables to discretized or nominal attributes/features/variables/intervals. This can be useful when creating probability mass functions – formally, in density...
- Discriminant function analysisDiscriminant function analysisDiscriminant function analysis is a statistical analysis to predict a categorical dependent variable by one or more continuous or binary independent variables. It is different from an ANOVA or MANOVA, which is used to predict one or multiple continuous dependent variables by one or more...
- Discriminative modelDiscriminative modelDiscriminative models are a class of models used in machine learning for modeling the dependence of an unobserved variable y on an observed variable x...
- Disorder problemDisorder problemIn the study of stochastic processes in mathematics, a disorder problem has been formulated by Kolmogorov. Specifically, the problem is use ongoing observations on a stochastic process to decide whether or not to raise an alarm that the probabilistic properties of the process have changed.An...
- Distance correlationDistance correlationIn statistics and in probability theory, distance correlation is a measure of statistical dependence between two random variables or two random vectors of arbitrary, not necessarily equal dimension. Its important property is that this measure of dependence is zero if and only if the random...
- Distributed lagDistributed lagIn statistics and econometrics, a distributed lag model is a model for time series data in which a regression equation is used to predict current values of a dependent variable based on both the current values of an explanatory variable and the lagged values of this explanatory variable.The...
- Divergence (statistics)Divergence (statistics)In statistics and information geometry, divergence or a contrast function is a function which establishes the “distance” of one probability distribution to the other on a statistical manifold...
- Diversity indexDiversity indexA diversity index is a statistic which is intended to measure the local members of a set consisting of various types of objects. Diversity indices can be used in many fields of study to assess the diversity of any population in which each member belongs to a unique group, type or species...
- Divisia indexDivisia indexA Divisia index is a theoretical construct to create index number series for continuous-time data on prices and quantities of goods exchanged.It is designed to incorporate quantity and price changes over time from subcomponents which are measured in different units -- e.g...
- Divisia monetary aggregates index
- Dixon's Q test
- Dominating decision ruleDominating decision ruleIn decision theory, a decision rule is said to dominate another if the performance of the former is sometimes better, and never worse, than that of the latter....
- Donsker's theoremDonsker's theoremIn probability theory, Donsker's theorem, named after M. D. Donsker, identifies a certain stochastic process as a limit of empirical processes. It is sometimes called the functional central limit theorem....
- Doob decomposition theoremDoob decomposition theoremIn the theory of discrete time stochastic processes, a part of the mathematical theory of probability, the Doob decomposition theorem gives a unique decomposition of any submartingale as the sum of a martingale and an increasing predictable process. The theorem was proved by and is named for J. L....
- Doob martingaleDoob martingaleA Doob martingale is a mathematical construction of a stochastic process which approximates a given random variable and has the martingale property with respect to the given filtration...
- Doob's martingale convergence theoremsDoob's martingale convergence theoremsIn mathematics — specifically, in stochastic analysis — Doob's martingale convergence theorems are a collection of results on the long-time limits of supermartingales, named after the American mathematician Joseph Leo Doob....
- Doob's martingale inequalityDoob's martingale inequalityIn mathematics, Doob's martingale inequality is a result in the study of stochastic processes. It gives a bound on the probability that a stochastic process exceeds any given value over a given interval of time...
- Doob–Meyer decomposition theorem
- Doomsday argumentDoomsday argumentThe Doomsday argument is a probabilistic argument that claims to predict the number of future members of the human species given only an estimate of the total number of humans born so far...
- Dot plot (bioinformatics)Dot plot (bioinformatics)A dot plot is a graphical method that allows the comparison of two biological sequences and identify regions of close similarity between them. It is a kind of recurrence plot.-Introduction:...
- Dot plot (statistics)Dot plot (statistics)A dot chart or dot plot is a statistical chart consisting of data points plotted on a simple scale, typically using filled in circles. There are two common, yet very different, versions of the dot chart. The first is described by Wilkinson as a graph that has been used in hand-drawn graphs to...
- Double counting (fallacy)Double counting (fallacy)Double counting is a fallacy in which, when counting events or occurrences in probability or in other areas, a solution counts events two or more times, resulting in an erroneous number of events or occurrences which is higher than the true result...
- Double exponential distribution — disambiguation
- Double mass analysisDouble mass analysisDouble mass analysis is a commonly used data analysis approach for investigating the behaviour of records made of hydrological or meteorological data at a number of locations. It is used to determine whether there is a need for corrections to the data to account for changes in data collection...
- Doubly stochastic modelDoubly stochastic modelIn statistics, a doubly stochastic model is a type of model that can arise in many contexts, but in particular in modelling time-series and stochastic processes....
- Drift rate — redirects to Stochastic driftStochastic driftIn probability theory, stochastic drift is the change of the average value of a stochastic process. A related term is the drift rate which is the rate at which the average changes. This is in contrast to the random fluctuations about this average value...
- Dudley's theoremDudley's theoremIn probability theory, Dudley’s theorem is a result relating the expected upper bound and regularity properties of a Gaussian process to its entropy and covariance structure. The result was proved in a landmark 1967 paper of Richard M...
- Dummy variable (statistics)
- Duncan's new multiple range testDuncan's new multiple range testIn statistics, Duncan's new multiple range test is a multiple comparison procedure developed by David B. Duncan in 1955. Duncan's MRT belongs to the general class of multiple comparison procedures that use the studentized range statistic qr to compare sets of means.Duncan's new multiple range test...
- Durbin testDurbin testIn the analysis of designed experiments, the Friedman test is the most common non-parametric test for complete block designs. The Durbin test is a nonparametric test for balanced incomplete designs that reduces to the Friedman test in the case of a complete block design.-Background:In a randomized...
- Durbin–Watson statistic
- Dutch bookDutch bookIn gambling a Dutch book or lock is a set of odds and bets which guarantees a profit, regardless of the outcome of the gamble. It is associated with probabilities implied by the odds not being coherent....
- Dvoretzky–Kiefer–Wolfowitz inequalityDvoretzky–Kiefer–Wolfowitz inequalityIn the theory of probability and statistics, the Dvoretzky–Kiefer–Wolfowitz inequality predicts how close an empirically determined distribution function will be to the distribution function from which the empirical samples are drawn...
- Dyadic distributionDyadic distributionA dyadic distribution is a specific type of discrete or categorical probability distribution that is of some theoretical importance in data compression.-Definition:...
- Dynamic Bayesian networkDynamic Bayesian networkA dynamic Bayesian network is a Bayesian network that represents sequences of variables. These sequences are often time-series or sequences of symbols . The hidden Markov model can be considered as a simple dynamic Bayesian network.- References :* , Zoubin Ghahramani, Lecture Notes In Computer...
- Dynamic factor
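Note (illustrative sketch, not part of the original list): the Delta method entry above corresponds to the standard first-order statement that if an estimator satisfies
\[
\sqrt{n}\,(\hat{\theta}_n - \theta) \xrightarrow{d} N(0, \sigma^2),
\]
and g is differentiable at \theta with g'(\theta) \neq 0, then
\[
\sqrt{n}\,\bigl(g(\hat{\theta}_n) - g(\theta)\bigr) \xrightarrow{d} N\!\bigl(0, \sigma^2\,[g'(\theta)]^2\bigr).
\]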
E
- E-statistic
- Earth mover's distanceEarth Mover's DistanceIn computer science, the earth mover's distance is a measure of the distance between two probability distributions over a region D. In mathematics, this is known as the Wasserstein metric...
- Ecological correlationEcological correlationIn statistics, an ecological correlation is a correlation between two variables that are group means, in contrast to a correlation between two variables that describe individuals. For example, one might study the correlation between physical activity and weight among sixth-grade children...
- Ecological fallacyEcological fallacyAn ecological fallacy is a logical fallacy in the interpretation of statistical data in an ecological study, whereby inferences about the nature of specific individuals are based solely upon aggregate statistics collected for the group to which those individuals belong...
- Ecological studyEcological studyAn ecological study is an epidemiological study in which the unit of analysis is a population rather than an individual. For instance, an ecological study may look at the association between smoking and lung cancer deaths in different countries...
- EconometricsEconometricsEconometrics has been defined as "the application of mathematics and statistical methods to economic data" and described as the branch of economics "that aims to give empirical content to economic relations." More precisely, it is "the quantitative analysis of actual economic phenomena based on...
- Econometric modelEconometric modelEconometric models are statistical models used in econometrics. An econometric model specifies the statistical relationship that is believed to hold between the various economic quantities pertaining to a particular economic phenomenon under study...
- Econometric software – a list of software articles
- Economic dataEconomic dataEconomic data or economic statistics may refer to data describing an actual economy, past or present. These are typically found in time-series form, that is, covering more than one time period, or in cross-sectional data in one time period...
- Economic epidemiologyEconomic epidemiologyEconomic epidemiology is a field at the intersection of epidemiology and economics. Its premise is to incorporate incentives for healthy behavior and their attendant behavioral responses into an epidemiological context to better understand how diseases are transmitted...
- Economic statisticsEconomic statisticsEconomic statistics is a topic in applied statistics that concerns the collection, processing, compilation, dissemination, and analysis of economic data. It is also common to call the data themselves 'economic statistics', but for this usage see economic data. The data of concern to economic ...
- Eddy covarianceEddy covarianceThe eddy covariance technique is a key atmospheric flux measurement technique to measure and calculate vertical turbulent fluxes within atmospheric boundary layers...
- Edgeworth seriesEdgeworth seriesThe Gram–Charlier A series and the Edgeworth series are series that approximate a probability distribution in terms of its cumulants...
- Effect sizeEffect sizeIn statistics, an effect size is a measure of the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity...
- Efficiency (statistics)Efficiency (statistics)In statistics, an efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors...
- Efficient estimator
- Ehrenfest modelEhrenfest modelThe Ehrenfest model of diffusion was proposed by Paul Ehrenfest to explain the second law of thermodynamics. The model considers N particles in two containers. Particles independently change container at a rate λ...
- EigenpollEigenpollAn eigenpoll is a type of statistical survey which gathers knowledge from the community. It differs from opinion polls by finding the best solution, rather than finding the most popular opinion.- Methodology :...
- Elastic mapElastic mapElastic maps provide a tool for nonlinear dimensionality reduction. By their construction, they are system of elastic springs embedded in the dataspace. This system approximates a low-dimensional manifold...
- Elliptical distributionElliptical distributionIn probability and statistics, an elliptical distribution is any member of a broad family of probability distributions that generalize the multivariate normal distribution and inherit some of its properties.-Definition:...
- Ellsberg paradoxEllsberg paradoxThe Ellsberg paradox is a paradox in decision theory and experimental economics in which people's choices violate the expected utility hypothesis.An alternate viewpoint is that expected utility theory does not properly describe actual human choices...
- Elston–Stewart algorithm
- EmpiricalEmpiricalThe word empirical denotes information gained by means of observation or experimentation. Empirical data are data produced by an experiment or observation....
- Empirical Bayes methodEmpirical Bayes methodEmpirical Bayes methods are procedures for statistical inference in which the prior distribution is estimated from the data. This approach stands in contrast to standardBayesian methods, for which the prior distribution is fixed before any data are observed...
- Empirical distribution functionEmpirical distribution functionIn statistics, the empirical distribution function, or empirical cdf, is the cumulative distribution function associated with the empirical measure of the sample. This cdf is a step function that jumps up by 1/n at each of the n data points. The empirical distribution function estimates the true...
- Empirical measureEmpirical measureIn probability theory, an empirical measure is a random measure arising from a particular realization of a sequence of random variables. The precise definition is found below. Empirical measures are relevant to mathematical statistics....
- Empirical orthogonal functionsEmpirical orthogonal functionsIn statistics and signal processing, the method of empirical orthogonal function analysis is a decomposition of a signal or data set in terms of orthogonal basis functions which are determined from the data. It is the same as performing a principal components analysis on the data, except that the...
- Empirical probabilityEmpirical probabilityEmpirical probability, also known as relative frequency, or experimental probability, is the ratio of the number of "favorable" outcomes to the total number of trials, not in a sample space but in an actual sequence of experiments...
- Empirical processEmpirical processThe study of empirical processes is a branch of mathematical statistics and a sub-area of probability theory. It is a generalization of the central limit theorem for empirical measures...
- Empirical statistical lawsEmpirical statistical lawsAn empirical statistical law or a law of statistics represents a type of behaviour that has been found across a number of datasets and, indeed, across a range of types of data sets. Many of these observances have been formulated and proved as statistical or probabilistic theorems and the term...
- Endogeneity (economics)Endogeneity (economics)In an econometric model, a parameter or variable is said to be endogenous when there is a correlation between the parameter or variable and the error term. Endogeneity can arise as a result of measurement error, autoregression with autocorrelated errors, simultaneity, omitted variables, and sample...
- End point of clinical trialsEnd point of clinical trialsAn endpoint is something which is measured in a clinical trial or study. Measuring the selected endpoints is the goal of a trial. The response rate and survival are examples of the endpoints....
- Energy distanceEnergy distanceEnergy distance is a statistical distance between probability distributions. If X and Y are independent random vectors in Rd, with cumulative distribution functions F and G respectively, then the energy distance between the distributions F and G is definedwhere X, X' are independent and identically...
- Energy statisticsEnergy statisticsEnergy statistics refers to collecting, compiling, analyzing and disseminating data on commodities such as coal, crude oil, natural gas, electricity, or renewable energy sources , when they are used for the energy they contain...
- Encyclopedia of Statistical SciencesEncyclopedia of Statistical SciencesThe Encyclopedia of Statistical Sciences is the largest-ever encyclopaedia of statistics. It is published by John Wiley & Sons. The first edition, in nine volumes, was edited by Norman Lloyd Johnson and Samuel Kotz and appeared in 1982. The second edition, in 16 volumes, was published in 2006... (book)
- Engineering statisticsEngineering statisticsEngineering statistics combines engineering and statistics. Design of experiments is a methodology for formulating scientific and engineering problems using statistical models. The protocol specifies a randomization procedure for the experiment and specifies the primary data-analysis,...
- Engineering tolerance
- Engset calculation
- Ensemble forecastingEnsemble forecastingEnsemble forecasting is a numerical prediction method that is used to attempt to generate a representative sample of the possible future states of a dynamical system...
- Ensemble Kalman filterEnsemble Kalman filterThe ensemble Kalman filter is a recursive filter suitable for problems with a large number of variables, such as discretizations of partial differential equations in geophysical models...
- Entropy (information theory)
- Entropy estimationEntropy estimationEstimating the differential entropy of a system or process, given some observations, is useful in various science/engineering applications, such as Independent Component Analysis, image analysis, genetic analysis, speech recognition, manifold learning, and time delay estimation...
- Entropy power inequalityEntropy power inequalityIn mathematics, the entropy power inequality is a result in probability theory that relates to so-called "entropy power" of random variables. It shows that the entropy power of suitably well-behaved random variables is a superadditive function. The entropy power inequality was proved in 1948 by...
- Environmental statisticsEnvironmental statisticsEnvironmental statistics is the application of statistical methods to environmental science. It covers procedures for dealing with questions concerning both the natural environment in its undistrurbed state and the interaction of humanity with the environment...
- Epi InfoEpi InfoEpi Info is public domain statistical software for epidemiology developed by Centers for Disease Control and Prevention in Atlanta, Georgia... — software
- EpidataEpidataEpiData refers to a group of applications used in combination for creating documented data structures and analysis of quantitative data. The EpiData Association, which created the software, was created in 1999 and is based in Denmark... — software
- Epidemic modelEpidemic modelAn epidemic model is a simplified means of describing the transmission of communicable disease through individuals.-Introduction:The outbreak and spread of disease has been questioned and studied for many years...
- Epidemiological methodsEpidemiological methodsThe science of epidemiology has matured significantly from the times of Hippocrates and John Snow. The techniques for gathering and analyzing epidemiological data vary depending on the type of disease being monitored but each study will have overarching similarities....
- EpilogismEpilogismEpilogism is a style of Inference invented by the ancient Empiric school of medicine. It is a theory-free method of looking at history by accumulating fact with minimal generalization and being conscious of the side effects of making causal claims .Epilogism is an inference which moves entirely...
- Epitome (image processing)Epitome (image processing)In image processing, an epitome is a condensed digital representation of the essential statistical properties of ordered datasets, such as matrices representing images, audio signals, videos, or genetic sequences...
- Epps effectEpps effectIn econometrics and time series analysis, the Epps effect, named after T. W. Epps, is the phenomenon that the empirical correlation between the returns of two different stocks decreases as the sampling frequency of data increases. The phenomenon is caused by non-synchronous/asynchronous...
- EquatingEquatingTest equating traditionally refers to the statistical process of determining comparable scores on different forms of an exam. It can be accomplished using either classical test theory or item response theory... – test equating
- EquipossibleEquipossibleEquipossibility is a philosophical concept in possibility theory that is a precursor to the notion of equiprobability in probability theory. It is used to distinguish what can occur in a probability experiment...
- EquiprobableEquiprobableEquiprobability is a philosophical concept in probability theory that allows one to assign equal probabilities to outcomes when they are judged to be equipossible or to be "equally likely" in some sense...
- Erdős–Rényi modelErdos–Rényi modelIn graph theory, the Erdős–Rényi model, named for Paul Erdős and Alfréd Rényi, is either of two models for generating random graphs, including one that sets an edge between each pair of nodes with equal probability, independently of the other edges...
- Erlang distribution
- Ergodic theoryErgodic theoryErgodic theory is a branch of mathematics that studies dynamical systems with an invariant measure and related problems. Its initial development was motivated by problems of statistical physics....
- ErgodicityErgodicityIn mathematics, the term ergodic is used to describe a dynamical system which, broadly speaking, has the same behavior averaged over time as averaged over space. In physics the term is used to imply that a system satisfies the ergodic hypothesis of thermodynamics.-Etymology:The word ergodic is...
- Error barError barError bars are a graphical representation of the variability of data and are used on graphs to indicate the error, or uncertainty in a reported measurement. They give a general idea of how accurate a measurement is, or conversely, how far from the reported value the true value might be...
- Error correction modelError correction modelAn error correction model is a dynamical system with the characteristics that the deviation of the current state from its long-run relationship will be fed into its short-run dynamics....
- Error functionError functionIn mathematics, the error function is a special function of sigmoid shape which occurs in probability, statistics and partial differential equations...
- Errors and residuals in statisticsErrors and residuals in statisticsIn statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...
- Errors-in-variables modelsErrors-in-variables modelsIn statistics and econometrics, errors-in-variables models or measurement errors models are regression models that account for measurement errors in the independent variables...
- An Essay towards solving a Problem in the Doctrine of ChancesAn Essay towards solving a Problem in the Doctrine of ChancesAn Essay towards solving a Problem in the Doctrine of Chances is a work on the mathematical theory of probability by the Reverend Thomas Bayes, published in 1763, two years after its author's death. It included a statement of a special case of what is now called Bayes' theorem. In 18th-century...
- Estimating equationsEstimating equationsIn statistics, the method of estimating equations is a way of specifying how the parameters of a statistical model should be estimated. This can be thought of as a generalisation of many classical methods --- the method of moments, least squares, and maximum likelihood --- as well as some recent...
- EstimationEstimationEstimation is the calculated approximation of a result which is usable even if input data may be incomplete or uncertain. In statistics, see estimation theory and estimator for topics involving inferences about probability distributions...
- Estimation theoryEstimation theoryEstimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the...
- Estimation of covariance matricesEstimation of covariance matricesIn statistics, sometimes the covariance matrix of a multivariate random variable is not known but has to be estimated. Estimation of covariance matrices then deals with the question of how to approximate the actual covariance matrix on the basis of a sample from the multivariate distribution...
- Estimation of signal parameters via rotational invariance techniques
- EstimatorEstimatorIn statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule and its result are distinguished....
- Etemadi's inequalityEtemadi's inequalityIn probability theory, Etemadi's inequality is a so-called "maximal inequality", an inequality that gives a bound on the probability that the partial sums of a finite collection of independent random variables exceed some specified bound...
- Ethical problems using children in clinical trialsEthical problems using children in clinical trialsIn health care, a clinical trial is a comparison test of a medication or other medical treatment, versus a placebo, other medications or devices, or the standard medical treatment for a patient's condition...
- Event (probability theory)Event (probability theory)In probability theory, an event is a set of outcomes to which a probability is assigned. Typically, when the sample space is finite, any subset of the sample space is an event...
- Event studyEvent studyAn Event study is a statistical method to assess the impact of an event on the value of a firm. For example, the announcement of a merger between two business entities can be analyzed to see whether investors believe the merger will create or destroy value...
- Evidence under Bayes theoremEvidence under Bayes theoremBayes' theorem provides a way of updating the probability of an event in the light of new information. In the evidence law context, for example, it could be used as a way of updating the probability that a genetic sample found at the scene of the crime came from the defendant in light of a genetic...
- Evolutionary data miningEvolutionary data miningEvolutionary data mining, or genetic data mining is an umbrella term for any data mining using evolutionary algorithms. While it can be used for mining data from DNA sequences, it is not limited to biological contexts and can be used in any classification-based prediction scenario, which helps...
- Ewens's sampling formulaEwens's sampling formulaIn population genetics, Ewens' sampling formula describes the probabilities associated with counts of how many different alleles are observed a given number of times in the sample.-Definition:...
- EWMA chartEWMA chartIn statistical quality control, the EWMA chart is a type of control chart used to monitor either variables or attributes-type data using the monitored business or industrial process's entire history of output...
- Exact statisticsExact statisticsExact statistics, such as that described in exact test, is a branch of statistics that was developed to provide more accurate results pertaining to statistical testing and interval estimation by eliminating procedures based on asymptotic and approximate statistical methods...
- Exact testExact testIn statistics, an exact test is a test where all assumptions upon which the derivation of the distribution of the test statistic is based are met, as opposed to an approximate test, in which the approximation may be made as close as desired by making the sample size big enough...
- Examples of Markov chainsExamples of Markov chains- Board games played with dice :A game of snakes and ladders or any other game whose moves are determined entirely by dice is a Markov chain, indeed, an absorbing Markov chain. This is in contrast to card games such as blackjack, where the cards represent a 'memory' of the past moves. To see the...
- Excess riskExcess riskIn statistics, excess risk is a measure of the association between a specified risk factor and a specified outcome...
- Exchange paradox
- Exchangeable random variables
- Expander walk samplingExpander walk samplingIn the mathematical discipline of graph theory, the expander walk sampling theorem states that sampling vertices in an expander graph by doing a random walk is almost as good as sampling the vertices independently from a uniform distribution....
- Expectation-maximization algorithmExpectation-maximization algorithmIn statistics, an expectation–maximization algorithm is an iterative method for finding maximum likelihood or maximum a posteriori estimates of parameters in statistical models, where the model depends on unobserved latent variables...
- Expectation propagationExpectation propagationExpectation propagation is a technique in Bayesian machine learning, developed by Thomas Minka.EP finds approximations to a probability distribution. It uses an iterative approach that leverages the factorization structure of the target distribution. It differs from other Bayesian approximation...
- Expected utility hypothesisExpected utility hypothesisIn economics, game theory, and decision theory the expected utility hypothesis is a theory of utility in which "betting preferences" of people with regard to uncertain outcomes are represented by a function of the payouts , the probabilities of occurrence, risk aversion, and the different utility...
- Expected valueExpected valueIn probability theory, the expected value of a random variable is the weighted average of all possible values that this random variable can take on...
- Expected value of sample informationExpected value of sample informationIn decision theory, the expected value of sample information is the expected increase in utility that you could obtain from gaining access to a sample of additional observations before making a decision. The additional information obtained from the sample may allow you to make a more informed,...
- ExperimentExperimentAn experiment is a methodical procedure carried out with the goal of verifying, falsifying, or establishing the validity of a hypothesis. Experiments vary greatly in their goal and scale, but always rely on repeatable procedure and logical analysis of the results...
- Experimental design diagramExperimental Design DiagramExperimental Design Diagram is a diagram used by scientists to design an experiment. This diagram helps to identify the essential components of an experiment...
- Experimental event rateExperimental event rateIn epidemiology and biostatistics, the experimental event rate is a measure of how often a particular statistical event occurs within the experimental group of an experiment ....
- Experimental research design
- Experimental uncertainty analysisExperimental uncertainty analysisThe purpose of this introductory article is to discuss the experimental uncertainty analysis of a derived quantity, based on the uncertainties in the experimentally measured quantities that are used in some form of mathematical relationship to calculate that derived quantity...
- Experimental techniques — redirects to Experimental research design
- Experimenter's biasExperimenter's biasIn experimental science, experimenter's bias is subjective bias towards a result expected by the human experimenter. David Sackett, in a useful review of biases in clinical studies, states that biases can occur in any one of seven stages of research:...
- Experimentwise error rateExperimentwise error rateIn statistics, during multiple comparisons testing, experimentwise error rate is the probability of at least one false rejection of the null hypothesis over an entire experiment. The α that is assigned applies to all of the hypothesis tests as a whole, not individually as in the comparisonwise...
- Explained sum of squaresExplained sum of squaresIn statistics, the explained sum of squares is a quantity used in describing how well a model, often a regression model, represents the data being modelled...
- Explained variationExplained variationIn statistics, explained variation or explained randomness measures the proportion to which a mathematical model accounts for the variation of a given data set...
- Explanatory variable
- Exploratory data analysisExploratory data analysisIn statistics, exploratory data analysis is an approach to analysing data sets to summarize their main characteristics in easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis...
- Exponential dispersion modelExponential dispersion modelExponential dispersion models are statistical models in which the probability distribution is of a special form. This class of models represents a generalisation of the exponential family of models which themselves play an important role in statistical theory because they have a special structure...
- Exponential distributionExponential distributionIn probability theory and statistics, the exponential distribution is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e...
- Exponential familyExponential familyIn probability and statistics, an exponential family is an important class of probability distributions sharing a certain form, specified below. This special form is chosen for mathematical convenience, on account of some useful algebraic properties, as well as for generality, as exponential...
- Exponential-logarithmic distribution
- Exponential power distribution — redirects to Generalized normal distribution
- Exponential random numbers — redirect to subsection of Exponential distributionExponential distributionIn probability theory and statistics, the exponential distribution is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e...
- Exponential smoothingExponential smoothingExponential smoothing is a technique that can be applied to time series data, either to produce smoothed data for presentation, or to make forecasts. The time series data themselves are a sequence of observations. The observed phenomenon may be an essentially random process, or it may be an... (a brief worked sketch of simple exponential smoothing follows this section of the list)
- Exponentiated Weibull distributionExponentiated weibull distributionIn statistics, the exponentiated Weibull family of probability distributions was introduced by Mudholkar and Srivastava as an extension of the Weibull family obtained by adding a second shape parameter....
- Exposure variable
- Extended Kalman filterExtended Kalman filterIn estimation theory, the extended Kalman filter is the nonlinear version of the Kalman filter which linearizes about the current mean and covariance...
- Extended negative binomial distributionExtended negative binomial distributionIn probability and statistics the extended negative binomial distribution is a discrete probability distribution extending the negative binomial distribution. It is a truncated version of the negative binomial distribution for which estimation methods have been studied.In the context of actuarial...
- Extensions of Fisher's methodExtensions of Fisher's methodIn statistics, extensions of Fisher's method are a group of approaches that allow approximately valid statistical inferences to be made when the assumptions required for the direct application of Fisher's method are not valid...
- External validityExternal validityExternal validity is the validity of generalized inferences in scientific studies, usually based on experiments as experimental validity....
- Extrapolation domain analysisExtrapolation domain analysisExtrapolation domain analysis is a methodology for identifying geographical areas that seem suitable for adoption of innovative ecosystem management practices on the basis of sites exhibiting similarity in conditions such as climatic, land use and socio-economic indicators...
- Extreme value theoryExtreme value theoryExtreme value theory is a branch of statistics dealing with the extreme deviations from the median of probability distributions. The general theory sets out to assess the type of probability distributions generated by processes...
- Extremum estimatorExtremum estimatorIn statistics and econometrics, extremum estimators is a wide class of estimators for parametric models that are calculated through maximization of a certain objective function, which depends on the data...
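The Exponential smoothing entry above describes a recursive weighting of recent observations. The following is a minimal Python sketch of simple (single) exponential smoothing, assuming nothing beyond the recursion s_t = α·x_t + (1 − α)·s_{t−1}; the data series and the value α = 0.3 are invented purely for illustration.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [series[0]]          # initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical demo data; the one-step-ahead forecast is the last smoothed value.
data = [3.0, 10.0, 12.0, 13.0, 12.0, 10.0, 12.0]
print(exponential_smoothing(data, alpha=0.3))
```

Smaller values of α smooth more aggressively and react more slowly to changes; under this simple form, forecasts more than one step ahead are flat at the last smoothed value.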
F
- F-distribution
- F-divergenceF-divergenceIn probability theory, an ƒ-divergence is a function Df that measures the difference between two probability distributions P and Q...
- F-statisticsF-statisticsIn population genetics, F-statistics describe the level of heterozygosity in a population; more specifically the degree of a reduction in heterozygosity when compared to Hardy–Weinberg expectation... – population genetics
- F-testF-testAn F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fit to a data set, in order to identify the model that best fits the population from which the data were sampled...
- F-test of equality of variancesF-test of equality of variancesIn statistics, an F-test for the null hypothesis that two normal populations have the same variance is sometimes used, although it needs to be used with caution as it can be sensitive to the assumption that the variables have this distribution....
- F1 scoreF1 ScoreIn statistics, the F1 score is a measure of a test's accuracy. It considers both the precision p and the recall r of the test to compute the score: p is the number of correct results divided by the number of all returned results and r is the number of correct results divided by the number of... (a short computational sketch follows this list)
- Factor analysisFactor analysisFactor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved, uncorrelated variables called factors. In other words, it is possible, for example, that variations in three or four observed variables...
- Factor regression model
- Factor graphFactor graphIn probability theory and its applications, a factor graph is a particular type of graphical model, with applications in Bayesian inference, that enables efficient computation of marginal distributions through the sum-product algorithm...
- Factorial codeFactorial codeMost real world data sets consist of data vectors whose individual components are not statistically independent, that is, they are redundant in the statistical sense. Then it is desirable to create a factorial code of the data, i...
- Factorial experimentFactorial experimentIn statistics, a full factorial experiment is an experiment whose design consists of two or more factors, each with discrete possible values or "levels", and whose experimental units take on all possible combinations of these levels across all such factors. A full factorial design may also be...
- Factorial moment
- Factorial moment generating function
- Failure rateFailure rateFailure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is often denoted by the Greek letter λ and is important in reliability engineering....
- Fair coinFair coinIn probability theory and statistics, a sequence of independent Bernoulli trials with probability 1/2 of success on each trial is metaphorically called a fair coin. One for which the probability is not 1/2 is called a biased or unfair coin...
- Falconer's formulaFalconer's formulaFalconer's formula is used in twin studies to determine the genetic heritability of a trait based on the difference between twin correlations. The formula is hb2 = 2(rmz − rdz), where hb2 is the broad sense heritability, rmz is the identical twin correlation, and rdz is the fraternal twin correlation... (a worked numerical sketch follows this list)
- False discovery rateFalse discovery rateFalse discovery rate control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. In a list of rejected hypotheses, FDR controls the expected proportion of incorrectly rejected null hypotheses...
- False negativeType I and type II errorsIn statistical test theory the notion of statistical error is an integral part of hypothesis testing. The test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or...
- False positiveType I and type II errorsIn statistical test theory the notion of statistical error is an integral part of hypothesis testing. The test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or...
- False positive rateFalse positive rateWhen performing multiple comparisons in a statistical analysis, the false positive rate is the probability of falsely rejecting the null hypothesis for a particular test among all the tests performed...
- False positive paradoxFalse positive paradoxThe false positive paradox is a statistical result where false positive tests are more probable than true positive tests, occurring when the overall population has a low incidence of a condition and the incidence rate is lower than the false positive rate...
- Familywise error rateFamilywise error rateIn statistics, familywise error rate is the probability of making one or more false discoveries, or type I errors among all the hypotheses when performing multiple pairwise tests.-Classification of m hypothesis tests:...
- Fan chart (time series)Fan chart (time series)In time series analysis, a fan chart is a chart that joins a simple line chart for observed past data, by showing ranges for possible values of future data together with a line showing a central estimate or most likely value for the future outcomes...
- Fano factor
- Fast Fourier transformFast Fourier transformA fast Fourier transform is an efficient algorithm to compute the discrete Fourier transform and its inverse. "The FFT has been called the most important numerical algorithm of our lifetime ." There are many distinct FFT algorithms involving a wide range of mathematics, from simple...
- Fast Kalman filterFast Kalman filterThe fast Kalman filter , devised by Antti Lange , is an extension of the Helmert-Wolf blocking method from geodesy to real-time applications of Kalman filtering such as satellite imaging of the Earth...
- FastICAFastICAFastICA is an efficient and popular algorithm for independent component analysis invented by Aapo Hyvärinen at Helsinki University of Technology. The algorithm is based on a fixed-point iteration scheme maximizing non-Gaussianity as a measure of statistical independence... – fast independent component analysis
- Fat tailFat tailA fat-tailed distribution is a probability distribution that has the property, along with the heavy-tailed distributions, that they exhibit extremely large skewness or kurtosis. This comparison is often made relative to the ubiquitous normal distribution, which itself is an example of an...
- Feasible generalized least squares
- Feature extractionFeature extractionIn pattern recognition and in image processing, feature extraction is a special form of dimensionality reduction.When the input data to an algorithm is too large to be processed and it is suspected to be notoriously redundant then the input data will be transformed into a reduced representation...
- Feller processFeller processIn probability theory relating to stochastic processes, a Feller process is a particular kind of Markov process.-Definitions:Let X be a locally compact topological space with a countable base...
- Feller's coin-tossing constantsFeller's coin-tossing constantsFeller's coin-tossing constants are a set of numerical constants which describe asymptotic probabilities that in n independent tosses of a fair coin, no run of k consecutive heads appears....
- Feller-continuous processFeller-continuous processIn mathematics, a Feller-continuous process is a continuous-time stochastic process for which the expected value of suitable statistics of the process at a given time in the future depend continuously on the initial condition of the process...
- Felsenstein's tree peeling algorithmFelsenstein's Tree Peeling AlgorithmIn statistical genetics, Felsenstein's tree-pruning algorithm, due to Joseph Felsenstein, is an algorithm for computing the likelihood of an evolutionary tree from nucleic acid sequence data... — statistical genetics
- Fides (reliability)Fides (reliability)Fides is a guide allowing estimated reliability calculation for electronic components and systems. The reliability prediction is generally expressed in FIT or MTBF...
- Fiducial inferenceFiducial inferenceFiducial inference is one of a number of different types of statistical inference. These are rules, intended for general application, by which conclusions can be drawn from samples of data. In modern statistical practice, attempts to work with fiducial inference have fallen out of fashion in...
- Field experimentField experimentA field experiment applies the scientific method to experimentally examine an intervention in the real world rather than in the laboratory...
- Fieller's theoremFieller's theoremIn statistics, Fieller's theorem allows the calculation of a confidence interval for the ratio of two means.-Approximate confidence interval:...
- File drawer problem
- Filtering problem (stochastic processes)Filtering problem (stochastic processes)In the theory of stochastic processes, the filtering problem is a mathematical model for a number of filtering problems in signal processing and the like. The general idea is to form some kind of "best estimate" for the true value of some system, given only some observations of that system...
- Financial econometricsFinancial econometricsPeople working in the finance industry often use econometric techniques in a range of activities. For example in support of portfolio management, risk management and in the analysis of securities...
- Financial models with long-tailed distributions and volatility clustering
- Finite-dimensional distributionFinite-dimensional distributionIn mathematics, finite-dimensional distributions are a tool in the study of measures and stochastic processes. A lot of information can be gained by studying the "projection" of a measure onto a finite-dimensional vector space .-Finite-dimensional distributions of a measure:Let be a measure space...
- First-hitting-time modelFirst-hitting-time modelIn statistics, first-hitting-time models are a sub-class of survival models. The first hitting time, also called first passage time, of a set A with respect to an instance of a stochastic process is the time until the stochastic process first enters A....
- First-in-man studyFirst-in-man studyA first-in-man study is a clinical trial where a medical procedure, previously developed and assessed through in vitro or animal testing, or through mathematical modelling is tested on human subjects for the first time....
- Fishburn–Shepp inequalityFishburn–Shepp inequalityIn combinatorial mathematics, the Fishburn–Shepp inequality is an inequality for the number of extensions of partial orders to linear orders, found by Fishburn and Shepp. It states that if x, y, and z are incomparable elements of a finite poset, then the probability that x precedes both y and z in a uniformly random linear extension is at least the product of the probabilities that x precedes each of them individually...
- Fisher consistencyFisher consistencyIn statistics, Fisher consistency, named after Ronald Fisher, is a desirable property of an estimator asserting that if the estimator were calculated using the entire population rather than a sample, the true value of the estimated parameter would be obtained...
- Fisher informationFisher informationIn mathematical statistics and information theory, the Fisher information is the variance of the score. In Bayesian statistics, the asymptotic distribution of the posterior mode depends on the Fisher information and not on the prior...
- Fisher information metricFisher information metricIn information geometry, the Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space....
- Fisher kernelFisher kernelIn statistical classification, the Fisher kernel, named in honour of Sir Ronald Fisher, is a function that measures the similarity of two objects on the basis of sets of measurements for each object and a statistical model...
- Fisher transformationFisher transformationIn statistics, hypotheses about the value of the population correlation coefficient ρ between variables X and Y can be tested using the Fisher transformation applied to the sample correlation coefficient r.-Definition:...
- Fisher's exact testFisher's exact testFisher's exact test is a statistical significance test used in the analysis of contingency tables where sample sizes are small. It is named after its inventor, R. A...
- Fisher's inequalityFisher's inequalityIn combinatorial mathematics, Fisher's inequality, named after Ronald Fisher, is a necessary condition for the existence of a balanced incomplete block design satisfying certain prescribed conditions....
- Fisher's linear discriminator
- Fisher's methodFisher's MethodIn statistics, Fisher's method, also known as Fisher's combined probability test, is a technique for data fusion or "meta-analysis" . It was developed by and named for Ronald Fisher...
- Fisher's noncentral hypergeometric distributionFisher's noncentral hypergeometric distributionIn probability theory and statistics, Fisher's noncentral hypergeometric distribution is a generalization of the hypergeometric distribution where sampling probabilities are modified by weight factors...
- Fisher's z-distribution
- Fisher-Tippett distribution — redirects to Generalized extreme value distribution
- Fisher–Tippett–Gnedenko theorem
- Five-number summaryFive-number summaryThe five-number summary is a descriptive statistic that provides information about a set of observations. It consists of the five most important sample percentiles:# the sample minimum # the lower quartile or first quartile...
- Fixed effects estimatorFixed effects estimatorIn econometrics and statistics, a fixed effects model is a statistical model that represents the observed quantities in terms of explanatory variables that are treated as if the quantities were non-random. This is in contrast to random effects models and mixed models in which either all or some of... and Fixed effects estimation — redirect to Fixed effects model
- FLAME clusteringFLAME clusteringFuzzy clustering by Local Approximation of MEmberships is a data clustering algorithm that defines clusters in the dense parts of a dataset and performs cluster assignment solely based on the neighborhood relationships among objects...
- Fleiss' kappaFleiss' kappaFleiss' kappa is a statistical measure for assessing the reliability of agreement between a fixed number of raters when assigning categorical ratings to a number of items or classifying items. This contrasts with other kappas such as Cohen's kappa, which only work when assessing the agreement...
- Fleming-Viot processFleming-Viot processIn probability theory, a Fleming–Viot process is a member of a particular subset of probability-measure valued Markov processes on compact metric spaces, as defined in the 1979 paper by Wendell Helms Fleming and Michel Viot...
- Flood risk assessmentFlood risk assessmentA flood risk assessment is an assessment of the risk of flooding, particularly in relation to residential, commercial and industrial land use.-England and Wales:...
- Floor effectFloor effectIn statistics, the term floor effect refers to when data cannot take on a value lower than some particular number, called the floor.An example of this is when an IQ test is given to young children who have either been given training or have been given no training...
- FNN algorithmFNN algorithmThe false nearest neighbor algorithm is an algorithm for estimating the embedding dimension... (false nearest neighbour algorithm)
- Focused information criterionFocused information criterionIn statistics, the focused information criterion is a method for selecting the most appropriate model among a set of competitors for a given data set...
- Fokker–Planck equation
- Folded normal distributionFolded Normal DistributionThe folded normal distribution is a probability distribution related to the normal distribution. Given a normally distributed random variable X with mean μ and variance σ2, the random variable Y = |X| has a folded normal distribution. Such a case may be encountered if only the magnitude of some...
- Forecast biasForecast biasA forecast bias occurs when there are consistent differences between actual outcomes and previously generated forecasts of those quantities; that is, forecasts may have a general tendency to be too high or too low...
- Forecast errorForecast errorIn statistics, a forecast error is the difference between the actual or real and the predicted or forecast value of a time series or any other phenomenon of interest....
- Forecast skillForecast skillSkill in forecasting is a scaled representation of forecast error that relates the forecast accuracy of a particular forecast model to some reference model....
- ForecastingForecastingForecasting is the process of making statements about events whose actual outcomes have not yet been observed. A commonplace example might be estimation for some variable of interest at some specified future date. Prediction is a similar, but more general term...
- Forest plotForest plotA forest plot is a graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. It was developed for use in medical research as a means of graphically representing a meta-analysis of the results of...
- Fork-join queueFork-join queueIn queueing theory, a discipline within the mathematical theory of probability, a fork-join queue is a queue where incoming jobs are split on arrival for service by numerous servers and joined before departure. The model is often used for parallel computations or systems where products need to be...
- Formation matrixFormation matrixIn statistics and information theory, the expected formation matrix of a likelihood function L is the matrix inverse of the Fisher information matrix of L, while the observed formation matrix of L is the inverse of the observed information matrix of L.Currently, no notation for dealing with...
- Forward measureForward measureIn finance, a T-forward measure is a pricing measure absolutely continuous with respect to a risk-neutral measure but rather than using the money market as numeraire, it uses a bond with maturity T...
- Foster's theoremFoster's theoremIn probability theory, Foster's theorem, named after F. G. Foster, is used to draw conclusions about the positive recurrence of Markov chains with countable state spaces...
- Foundations of statisticsFoundations of statisticsFoundations of statistics is the usual name for the epistemological debate in statistics over how one should conduct inductive inference from data...
- Founders of statisticsFounders of statisticsStatistics is the theory and application of mathematics to the scientific method including hypothesis generation, experimental design, sampling, data collection, data summarization, estimation, prediction and inference from those results to the population from which the experimental sample was drawn...
- Fourier analysis
- Fraction of variance unexplainedFraction of variance unexplainedIn statistics, the fraction of variance unexplained in the context of a regression task is the fraction of variance of the regressand Y which cannot be explained, i.e., which is not correctly predicted, by the explanatory variables X....
- Fractional Brownian motion
- Fractional factorial design
- Fréchet distribution
- Fréchet meanFréchet meanThe Fréchet mean , is the point, x, that minimizes the Fréchet function, in cases where such a unique minimizer exists. The value at a point p, of the Fréchet function associated to a random point X on a complete metric space is the expected squared distance from p to X...
- Free statistical softwareFree statistical softwareIn this article, the word free generally means can be legally obtained without paying any money . Just a few of the software packages mentioned here are also free as in the sense of free speech: they are not only open source but also free software in the sense that the source code of the software...
- Freedman's paradoxFreedman's paradoxIn statistical analysis, Freedman's paradox, named after David Freedman, describes a problem in model selection whereby predictor variables with no explanatory power can appear artificially important. Freedman demonstrated that this is a common occurrence when the number of variables is similar to...
- Freedman–Diaconis rule
- Freidlin–Wentzell theorem
- Frequency (statistics)Frequency (statistics)In statistics the frequency of an event i is the number ni of times the event occurred in the experiment or the study. These frequencies are often graphically represented in histograms....
- Frequency distributionFrequency distributionIn statistics, a frequency distribution is an arrangement of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of...
- Frequency domainFrequency domainIn electronics, control systems engineering, and statistics, frequency domain is a term used to describe the domain for analysis of mathematical functions or signals with respect to frequency, rather than time....
- Frequency probabilityFrequency probabilityFrequency probability is the interpretation of probability that defines an event's probability as the limit of its relative frequency in a large number of trials. The development of the frequentist account was motivated by the problems and paradoxes of the previously dominant viewpoint, the...
- Frequentist inferenceFrequentist inferenceFrequentist inference is one of a number of possible ways of formulating generally applicable schemes for making statistical inferences: that is, for drawing conclusions from statistical samples. An alternative name is frequentist statistics...
- Friedman testFriedman testThe Friedman test is a non-parametric statistical test developed by the U.S. economist Milton Friedman. Similar to the parametric repeated measures ANOVA, it is used to detect differences in treatments across multiple test attempts. The procedure involves ranking each row together, then...
- Friendship paradoxFriendship paradoxThe friendship paradox is the phenomenon first observed by the sociologist Scott L. Feld in 1991 that most people have fewer friends than their friends have, on average. It can be explained as a form of sampling bias in which people with greater numbers of friends have an increased likelihood of...
- Frisch–Waugh–Lovell theorem
- Fully crossed design
- Function approximationFunction approximationThe need for function approximations arises in many branches of applied mathematics, and computer science in particular. In general, a function approximation problem asks us to select a function among a well-defined class that closely matches a target function in a task-specific way.One can...
- Functional data analysisFunctional data analysisFunctional data analysis is a branch of statistics that analyzes data providing information about curves, surfaces or anything else varying over a continuum...
- Funnel plotFunnel plotA funnel plot is a useful graph designed to check the existence of publication bias in systematic reviews and meta-analyses. It assumes that the largest studies will be near the average, and small studies will be spread on both sides of the average...
- Fuzzy logicFuzzy logicFuzzy logic is a form of many-valued logic; it deals with reasoning that is approximate rather than fixed and exact. In contrast with traditional logic theory, where binary sets have two-valued logic: true or false, fuzzy logic variables may have a truth value that ranges in degree between 0 and 1...
- Fuzzy measure theoryFuzzy measure theoryFuzzy measure theory considers a number of special classes of measures, each of which is characterized by a special property. Some of the measures used in this theory are plausibility and belief measures, fuzzy set membership function and the classical probability measures...
- FWL theoremFWL theoremIn econometrics, the Frisch–Waugh–Lovell theorem is named after the econometricians Ragnar Frisch, Frederick V. Waugh, and Michael C. Lovell. The Frisch–Waugh–Lovell theorem states that if the regression we are concerned with is:... — relating regression and projection
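The Falconer's formula entry above reduces to arithmetic on two twin correlations, so a very small Python sketch suffices; the correlations used below are hypothetical and serve only to show the calculation.

```python
def falconer_broad_heritability(r_mz, r_dz):
    """Falconer's estimate of broad-sense heritability: h^2_b = 2*(r_mz - r_dz)."""
    return 2.0 * (r_mz - r_dz)

# Hypothetical twin correlations: identical (MZ) 0.80, fraternal (DZ) 0.50
print(falconer_broad_heritability(0.80, 0.50))   # about 0.60
```

With these made-up inputs the estimate is 2 × (0.80 − 0.50) = 0.60.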
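Likewise, the F1 score entry above defines the score as the harmonic mean of precision and recall; the sketch below computes it from hypothetical confusion counts (the function name and the counts are illustrative, not taken from any particular library).

```python
def f1_score(tp, fp, fn):
    """F1 = 2*p*r / (p + r), the harmonic mean of precision p and recall r."""
    precision = tp / (tp + fp)   # correct results / all returned results
    recall = tp / (tp + fn)      # correct results / all results that should have been returned
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives
print(round(f1_score(tp=8, fp=2, fn=4), 3))   # precision 0.8, recall ~0.667, F1 ~0.727
```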
G
- G-network
- G-testG-testIn statistics, G-tests are likelihood-ratio or maximum likelihood statistical significance tests that are increasingly being used in situations where chi-squared tests were previously recommended....
- Galbraith plotGalbraith plotIn statistics, a Galbraith plot , is one way of displaying several estimates of the same quantity that have different standard errors....
- Gallagher IndexGallagher IndexThe Gallagher Index is used to measure the disproportionality of an electoral outcome, that is the difference between the percentage of votes received and the percentage of seats a party gets in the resulting legislature. This is especially useful for comparing proportionality across electoral...
- Galton–Watson process
- Galton's problemGalton's problemGalton’s problem, named after Sir Francis Galton, is the problem of drawing inferences from cross-cultural data, due to the statistical phenomenon now called autocorrelation. The problem is now recognized as a general one that applies to all nonexperimental studies and to experimental design as well...
- Gambler's fallacyGambler's fallacyThe Gambler's fallacy, also known as the Monte Carlo fallacy , and also referred to as the fallacy of the maturity of chances, is the belief that if deviations from expected behaviour are observed in repeated independent trials of some random process, future deviations in the opposite direction are...
- Gambler's ruinGambler's ruinThe term gambler's ruin is used for a number of related statistical ideas:* The original meaning is that a gambler who raises his bet to a fixed fraction of bankroll when he wins, but does not reduce it when he loses, will eventually go broke, even if he has a positive expected value on each bet.*...
- Gambling and information theoryGambling and information theoryStatistical inference might be thought of as gambling theory applied to the world around. The myriad applications for logarithmic information measures tell us precisely how to take the best guess in the face of partial information. In that sense, information theory might be considered a formal...
- Game of chanceGame of chanceA game of chance is a game whose outcome is strongly influenced by some randomizing device, and upon which contestants may or may not wager money or anything of monetary value...
- Gamma distribution
- Gamma test (statistics)Gamma test (statistics)In statistics, a gamma test tests the strength of association of the cross tabulated data when both variables are measured at the ordinal level. It makes no adjustment for either table size or ties. Values range from −1 to +1...
- Gamma process
- Gamma variate
- GAUSS (software)GAUSS (software)GAUSS is a matrix programming language for mathematics and statistics, developed and marketed by Aptech Systems. Its primary purpose is the solution of numerical problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization...
- Gauss's inequalityGauss's inequalityIn probability theory, Gauss's inequality gives an upper bound on the probability that a unimodal random variable lies more than any given distance from its mode....
- Gauss–Kuzmin distribution
- Gauss–Markov processGauss–Markov processGauss–Markov stochastic processes are stochastic processes that satisfy the requirements for both Gaussian processes and Markov processes. The stationary Gauss–Markov process is a very special case because it is unique, except for some trivial exceptions...
- Gauss–Markov theoremGauss–Markov theoremIn statistics, the Gauss–Markov theorem, named after Carl Friedrich Gauss and Andrey Markov, states that in a linear regression model in which the errors have expectation zero and are uncorrelated and have equal variances, the best linear unbiased estimator of the coefficients is given by the...
- Gauss–Newton algorithm
- Gaussian function
- Gaussian isoperimetric inequality
- Gaussian measureGaussian measureIn mathematics, Gaussian measure is a Borel measure on finite-dimensional Euclidean space Rn, closely related to the normal distribution in statistics. There is also a generalization to infinite-dimensional spaces...
- Gaussian noiseGaussian noiseGaussian noise is statistical noise that has its probability density function equal to that of the normal distribution, which is also known as the Gaussian distribution. In other words, the values that the noise can take on are Gaussian-distributed. A special case is white Gaussian noise, in which...
- Gaussian processGaussian processIn probability theory and statistics, a Gaussian process is a stochastic process whose realisations consist of random values associated with every point in a range of times such that each such random variable has a normal distribution...
- Gaussian process emulatorGaussian process emulatorIn statistics, Gaussian process emulator is one name for a general type of statistical model that has been used in contexts where the problem is to make maximum use of the outputs of a complicated computer-based simulation model. Each run of the simulation model is computationally expensive and...
- Gaussian q-distributionGaussian q-distributionIn mathematical physics and probability and statistics, the Gaussian q-distribution is a family of probability distributions that includes, as limiting cases, the uniform distribution and the normal distribution...
- Geary's CGeary's CGeary's C is a measure of spatial autocorrelation. Like autocorrelation, spatial autocorrelation means that adjacent observations of the same phenomenon are correlated. However, autocorrelation is about proximity in time. Spatial autocorrelation is about proximity in space...
- GEHGEHThe GEH Statistic is a formula used in traffic engineering, traffic forecasting, and traffic modelling to compare two sets of traffic volumes. The GEH formula gets its name from Geoffrey E. Havers, who invented it in the 1970s while working as a transport planner in London, England. Although its... — a statistic comparing modelled and observed counts
- General linear modelGeneral linear modelThe general linear model is a statistical linear model. It may be written as Y = XB + U, where Y is a matrix with series of multivariate measurements, X is a matrix that might be a design matrix, B is a matrix containing parameters that are usually to be estimated and U is a matrix containing errors or...
- General matrix notation of a VAR(p)
- Generalizability theoryGeneralizability theoryGeneralizability theory, or G Theory, is a statistical framework for conceptualizing, investigating, and designing reliable observations. It is used to determine the reliability of measurements under specific conditions. It is particularly useful for assessing the reliability of performance...
- Generalized additive modelGeneralized additive modelIn statistics, the generalized additive model is a statistical model developed by Trevor Hastie and Rob Tibshirani for blending properties of generalized linear models with additive models....
- Generalized additive model for location, scale and shapeGeneralized additive model for location, scale and shapeIn statistics, the generalized additive model location, scale and shape is a class of statistical model that provides extended capabilities compared to the simpler generalized linear models and generalized additive models. These simpler models allow the typical values of a quantity being modelled...
- Generalized canonical correlationGeneralized canonical correlationIn statistics, the generalized canonical correlation analysis , is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. While a conventional CCA generalizes Principal component analysis to two sets of random variables, a gCCA ...
- Generalized chi-squared distribution
- Generalized Dirichlet distributionGeneralized Dirichlet distributionIn statistics, the generalized Dirichlet distribution is a generalization of the Dirichlet distribution with a more general covariance structure and twice the number of parameters...
- Generalized entropy indexGeneralized entropy indexThe generalized entropy index is a general formula for measuring redundancy in data. The redundancy can be viewed as inequality, lack of diversity, non-randomness, compressibility, or segregation in the data. The primary use is for income inequality...
- Generalized estimating equation
- Generalized expected utilityGeneralized expected utilityThe expected utility model developed by John von Neumann and Oskar Morgenstern dominated decision theory from its formulation in 1944 until the late 1970s, not only as a prescriptive, but also as a descriptive model, despite powerful criticism from Maurice Allais and Daniel Ellsberg who showed...
- Generalized extreme value distribution
- Generalized gamma distributionGeneralized gamma distributionThe generalized gamma distribution is a continuous probability distribution with three parameters. It is a generalization of the two-parameter gamma distribution...
- Generalized Gaussian distribution
- Generalised hyperbolic distribution
- Generalized inverse Gaussian distribution
- Generalized least squaresGeneralized least squaresIn statistics, generalized least squares is a technique for estimating the unknown parameters in a linear regression model. The GLS is applied when the variances of the observations are unequal , or when there is a certain degree of correlation between the observations...
- Generalized linear array modelGeneralized linear array modelIn statistics, the generalized linear array model is used for analyzing data sets with array structures. It is based on the generalized linear model with the design matrix written as a Kronecker product...
- Generalized linear mixed modelGeneralized linear mixed modelIn statistics, a generalized linear mixed model is a particular type of mixed model. It is an extension to the generalized linear model in which the linear predictor contains random effects in addition to the usual fixed effects...
- Generalized linear modelGeneralized linear modelIn statistics, the generalized linear model is a flexible generalization of ordinary linear regression. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to...
- Generalized logistic distributionGeneralized logistic distributionThe term generalized logistic distribution is used as the name for several different families of probability distributions. For example, Johnson et al. list four forms, which are listed below. One family described here has also been called the skew-logistic distribution...
- Generalized method of momentsGeneralized method of momentsIn econometrics, generalized method of moments is a generic method for estimating parameters in statistical models. Usually it is applied in the context of semiparametric models, where the parameter of interest is finite-dimensional, whereas the full shape of the distribution function of the data...
- Generalized multidimensional scaling
- Generalized normal distribution
- Generalized p-valueGeneralized p-valueIn statistics, a generalized p-value is an extended version of the classical p-value, which, except in a limited number of applications, provides only approximate solutions...
- Generalized Pareto distribution
- Generalized Procrustes analysisGeneralized Procrustes analysisGeneralized Procrustes analysis is a method of statistical analysis that can be used to compare the shapes of objects, or the results of surveys, interviews, panels. It was developed for analysing the results of free-choice profiling, a survey technique which allows respondents to describe a...
- Generalized randomized block designGeneralized randomized block designIn randomized statistical experiments, generalized randomized block designs are used to study the interaction between blocks and treatments...
- Generalized TobitGeneralized TobitA generalized Tobit is a generalization of the econometric Tobit model after James Tobin. It is also called Heckit after James Heckman. Another name is "type 2 Tobit model". Tobit models assume that a variable is truncated...
- Generalized Wiener processGeneralized Wiener processIn statistics, a generalized Wiener process is a continuous time random walk with drift and random jumps at every point in time...
- Generative modelGenerative modelIn probability and statistics, a generative model is a model for randomly generating observable data, typically given some hidden parameters. It specifies a joint probability distribution over observation and label sequences...
- Genetic epidemiologyGenetic epidemiologyGenetic epidemiology is the study of the role of genetic factors in determining health and disease in families and in populations, and the interplay of such genetic factors with environmental factors...
- GenStatGenStatGenStat is a general statistical package. Early versions were developed for large mainframe computers. Up until version 5, there was a Unix binary available, and this continues to be used by many universities and research institutions... – software
- Geo-imputationGeo-imputationIn data analysis involving geographical locations, geo-imputation or geographical imputation methods are steps taken to replace missing values for exact locations with approximate locations derived from associated data...
- Geodemographic segmentationGeodemographic SegmentationIn marketing, Geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into different groups by making quantitative comparisons of multiple characteristics with the assumption that the differences within any...
- Geometric Brownian motionGeometric Brownian motionA geometric Brownian motion is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion, also called a Wiener process...
- Geometric data analysisGeometric data analysisGeometric data analysis can refer to geometric aspects of image analysis, pattern analysis and shape analysis or the approach of multivariate statistics that treats arbitrary data sets as clouds of points in n-dimensional space...
- Geometric distribution
- Geometric medianGeometric medianThe geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points. This generalizes the median, which has the property of minimizing the sum of distances for one-dimensional data, and provides a central tendency in higher...
- Geometric standard deviationGeometric standard deviationIn probability theory and statistics, the geometric standard deviation describes how spread out a set of numbers is when their preferred average is the geometric mean...
- Geometric stable distribution
- Geospatial predictive modelingGeospatial predictive modelingGeospatial predictive modeling is conceptually rooted in the principle that the occurrences ofevents being modeled are limited in distribution...
- GeostatisticsGeostatisticsGeostatistics is a branch of statistics focusing on spatial or spatiotemporal datasets. Developed originally to predict probability distributions of ore grades for mining operations, it is currently applied in diverse disciplines including petroleum geology, hydrogeology, hydrology, meteorology,...
- German tank problemGerman tank problemIn the statistical theory of estimation, estimating the maximum of a uniform distribution is a common illustration of differences between estimation methods...
- Gerschenkron effectGerschenkron effectThe Gerschenkron effect was developed by Alexander Gerschenkron, and claims that changing the base year for an index determines the growth rate of the index.This description is from the OECD website :...
- Gibbs samplingGibbs samplingIn statistics and in statistical physics, Gibbs sampling or a Gibbs sampler is an algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables... (a minimal sketch follows this list)
- Gillespie algorithmGillespie algorithmIn probability theory, the Gillespie algorithm generates a statistically correct trajectory of a stochastic equation. It was created by Joseph L...
- Gini coefficientGini coefficientThe Gini coefficient is a measure of statistical dispersion developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper "Variability and Mutability"... (a short computational sketch follows this list)
- Girsanov theoremGirsanov theoremIn probability theory, the Girsanov theorem describes how the dynamics of stochastic processes change when the original measure is changed to an equivalent probability measure...
- Gittins indexGittins indexThe Gittins index is a measure of the reward that can be achieved by a process evolving from its present state onwards with the probability that it will be terminated in future...
- GLIM (software)GLIM (software)GLIM is a statistical software program for fitting generalized linear models. It was developed by the Royal Statistical Society's Working Party on Statistical Computing... – software
- Glivenko–Cantelli theorem
- GLUE (uncertainty assessment)GLUE (uncertainty assessment)In hydrology, Generalized Likelihood Uncertainty Estimation is a statistical method for quantifying the uncertainty of model predictions. The method has been introduced by Beven and Binley...
- Goldfeld–Quandt testGoldfeld–Quandt testIn statistics, the Goldfeld–Quandt test checks for homoscedasticity in regression analyses. It does this by dividing a dataset into two parts or groups, and hence the test is sometimes called a two-group test. The Goldfeld–Quandt test is one of two tests proposed in a 1965 paper by Stephen...
- Gompertz function
- Gompertz–Makeham law of mortality
- Good–Turing frequency estimation
- Goodhart's lawGoodhart's lawGoodhart's law, although it can be expressed in many ways, states that once a social or economic indicator or other surrogate measure is made a target for the purpose of conducting social or economic policy, then it will lose the information content that would qualify it to play that role...
- Goodman and Kruskal's lambdaGoodman and Kruskal's lambdaIn probability theory and statistics, Goodman & Kruskal's lambda is a measure of proportional reduction in error in cross tabulation analysis...
- Goodness of fitGoodness of fitThe goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g...
- Gordon–Newell network
- Gordon–Newell theoremGordon–Newell theoremIn queueing theory, a discipline within the mathematical theory of probability, the Gordon–Newell theorem is an extension of Jackson's theorem from open queueing networks to closed queueing networks of exponential servers. We cannot apply Jackson's theorem to closed networks because the queue...
- Graeco-Latin squareGraeco-Latin squareIn mathematics, a Graeco-Latin square or Euler square or orthogonal Latin squares of order n over two sets S and T, each consisting of n symbols, is an n×n arrangement of cells, each cell containing an ordered pair , where s is in S and t is in T, such that every row and every column contains...
- Grand meanGrand meanThe grand mean is the mean of the means of several subsamples. For example, consider several lots, each containing several items. The items from each lot are sampled for a measure of some variable and the means of the measurements from each lot are computed. The mean of the measures from each lot...
- Granger causalityGranger causalityThe Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another. Ordinarily, regressions reflect "mere" correlations, but Clive Granger, who won a Nobel Prize in Economics, argued that there is an interpretation of a set of tests...
- Graph cuts in computer visionGraph cuts in computer visionAs applied in the field of computer vision, graph cuts can be employed to efficiently solve a wide variety of low-level computer vision problems, such as image smoothing, the stereo correspondence problem, and many other computer vision problems that can be formulated in terms of energy minimization... – a potential application of Bayesian analysis
- Graphical modelGraphical modelA graphical model is a probabilistic model for which a graph denotes the conditional independence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning...
- Graphical models for protein structureGraphical models for protein structureGraphical models have become powerful frameworks for protein structure prediction, protein–protein interaction and free energy calculations for protein structures...
- GraphPad InStatGraphPad InStatGraphPad InStat is a commercial scientific statistics software published by GraphPad Software, Inc., a privately owned California corporation. InStat is available for both Windows and Macintosh computers... – software
- GraphPad PrismGraphPad PrismGraphPad Prism is a commercial scientific 2D graphing and statistics software published by GraphPad Software, Inc., a privately-held California corporation... – software
- Gravity model of tradeGravity model of tradeThe gravity model of trade in international economics, similar to other gravity models in social science, predicts bilateral trade flows based on the economic sizes of and distance between two units. The model was first used by Tinbergen in 1962...
- Greenwood statistic
- GretlGretlgretl is an open-source statistical package, mainly for econometrics. The name is an acronym for Gnu Regression, Econometrics and Time-series Library. It has a graphical user interface and can be used together with X-12-ARIMA, TRAMO/SEATS, R, Octave, and Ox. It is written in C, uses GTK as widget...
- Group familyGroup familyIn probability theory, especially as that field is used in statistics, a group family of probability distributions is a family obtained by subjecting a random variable with a fixed distribution to a suitable family of transformations such as a location-scale family, or otherwise a family of...
- Group method of data handlingGroup method of data handlingGroup method of data handling is a family of inductive algorithms for computer-based mathematical modeling of multi-parametric datasets that features fully automatic structural and parametric optimization of models....
- Group size measuresGroup size measuresMany animals, including humans, tend to live in groups, herds, flocks, bands, packs, shoals, or colonies of conspecific individuals. The size of these groups, as expressed by the number of participant individuals, is an important aspect of their social environment...
- Grouped dataGrouped dataGrouped data is a statistical term used in data analysis. A raw dataset can be organized by constructing a table showing the frequency distribution of the variable...
- Grubbs' test for outliersGrubbs' test for outliersGrubbs' test , also known as the maximum normed residual test, is a statistical test used to detect outliers in a univariate data set assumed to come from a normally distributed population.-Definition:...
- Guess valueGuess valueA guess value is more commonly called a starting value or initial value. These are necessary for most optimization problems which use search algorithms, because those algorithms are mainly deterministic and iterative, and they need to start somewhere...
- GuesstimateGuesstimateGuesstimate is an informal English contraction of guess and estimate, first used by American statisticians in 1934 or 1935. It is defined as an estimate made without using adequate or complete information, or, more strongly, as an estimate arrived at by guesswork or conjecture...
- Gumbel distribution
- Guttman scaleGuttman scaleIn statistical surveys conducted by means of structured interviews or questionnaires, a subset of the survey items having binary answers forms a Guttman scale if they can be ranked in some order so that, for a rational respondent, the response pattern can be captured by a single index on that...
- Gy's sampling theoryGy's sampling theoryGy's sampling theory is a theory about the sampling of materials, developed by Pierre Gy from the 1950s to beginning 2000s in articles and books including:* Sampling nomogram* Sampling of particulate materials; theory and practice...
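Following up on the Gibbs sampling entry above, here is a minimal Python sketch that samples from a standard bivariate normal distribution with correlation ρ by alternately drawing each coordinate from its conditional distribution (each conditional is normal with mean ρ times the other coordinate and variance 1 − ρ²). The choice of ρ = 0.8, the burn-in length, and the sample size are arbitrary illustrative values.

```python
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = (1 - rho ** 2) ** 0.5     # conditional standard deviation
    x, y = 0.0, 0.0                     # arbitrary starting point
    samples = []
    for i in range(n_samples + burn_in):
        x = rng.gauss(rho * y, cond_sd)  # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)  # draw y given the new x
        if i >= burn_in:
            samples.append((x, y))
    return samples

draws = gibbs_bivariate_normal(rho=0.8, n_samples=5000)
xs, ys = zip(*draws)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
cov = sum((a - mean_x) * (b - mean_y) for a, b in draws) / len(draws)
var_x = sum((a - mean_x) ** 2 for a in xs) / len(xs)
var_y = sum((b - mean_y) ** 2 for b in ys) / len(ys)
print(cov / (var_x * var_y) ** 0.5)   # empirical correlation of the draws
```

The printed empirical correlation should be close to 0.8, a quick sanity check that the chain is sampling from the intended joint distribution.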
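And for the Gini coefficient entry above, the sketch below uses the mean-absolute-difference form of the definition, G = Σᵢ Σⱼ |xᵢ − xⱼ| / (2 n² x̄); the income vectors are fabricated for illustration.

```python
def gini(values):
    """Gini coefficient via the mean absolute difference over all pairs."""
    n = len(values)
    mean = sum(values) / n
    abs_diff_sum = sum(abs(a - b) for a in values for b in values)
    return abs_diff_sum / (2 * n * n * mean)

# Hypothetical incomes: perfect equality gives 0; heavy concentration pushes G toward 1.
print(gini([10, 10, 10, 10]))             # 0.0
print(round(gini([1, 2, 3, 1000]), 3))    # roughly 0.745
```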
H
- h-indexH-indexThe h-index is an index that attempts to measure both the productivity and impact of the published work of a scientist or scholar. The index is based on the set of the scientist's most cited papers and the number of citations that they have received in other publications...
- Hájek–Le Cam convolution theoremHájek–Le Cam convolution theoremIn statistics, the Hájek–Le Cam convolution theorem states that any regular estimator in a parametric model is asymptotically equivalent to a sum of two independent random variables, one of which is normal with asymptotic variance equal to the inverse of Fisher information, and the other having...
- Half circle distribution
- Half-logistic distribution
- Half-normal distributionHalf-normal distributionThe half-normal distribution is the probability distribution of the absolute value of a random variable that is normally distributed with expected value 0 and variance σ². I.e...
- Halton sequence
- Hamburger moment problem
- Hannan–Quinn information criterion
- Harris chain
- Hardy–Weinberg principle – statistical genetics
- Hartley's testHartley's testIn statistics, Hartley's test, also known as the Fmax test or Hartley's Fmax, is used in the analysis of variance to verify that different groups have a similar variance, an assumption needed for other statistical tests.It was developed by H. O...
- Hat matrixHat matrixIn statistics, the hat matrix, H, maps the vector of observed values to the vector of fitted values. It describes the influence each observed value has on each fitted value...
- Hammersley–Clifford theoremHammersley–Clifford theoremThe Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics, that gives necessary and sufficient conditions under which a positive probability distribution can be represented as a Markov network...
- Hausdorff moment problem
- Hausman specification test — redirects to Hausman testHausman testThe Hausman test or Hausman specification test is a statistical test in econometrics named after Jerry A. Hausman. The test evaluates the significance of an estimator versus an alternative estimator...
- Haybittle–Peto boundaryHaybittle–Peto boundaryThe Haybittle–Peto boundary is a rule for deciding when to stop a clinical trial prematurely.The typical clinical trial compares two groups of patients. One group are given a placebo or conventional treatment, while the other group of patients are given the treatment that is being tested...
- Hazard function — redirects to Failure rateFailure rateFailure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is often denoted by the Greek letter λ and is important in reliability engineering....
- Hazard ratioHazard ratioIn survival analysis, the hazard ratio is the ratio of the hazard rates corresponding to the conditions described by two sets of explanatory variables. For example, in a drug study, the treated population may die at twice the rate per unit time as the control population. The hazard ratio would be...
- Heaps' lawHeaps' lawIn linguistics, Heaps' law is an empirical law which describes the portion of a vocabulary which is represented by an instance document consisting of words chosen from the vocabulary. This can be formulated as V_R = K n^β...
- Health care analyticsHealth care analyticsHealth care analytics is a rapidly evolving field of health care business solutions that makes extensive use of data, statistical and qualitative analysis, explanatory and predictive modeling.- Theory :...
- Heart rate variabilityHeart rate variabilityHeart rate variability is a physiological phenomenon where the time interval between heart beats varies. It is measured by the variation in the beat-to-beat interval....
- Heavy-tailed distributionHeavy-tailed distributionIn probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution...
- Heckman correctionHeckman correctionThe Heckman correction is any of a number of related statistical methods developed by James Heckman in 1976 through 1979 which allow the researcher to correct for selection bias...
- Hedonic regressionHedonic regressionIn economics, hedonic regression or hedonic demand theory is a revealed preference method of estimating demand or value. It decomposes the item being researched into its constituent characteristics, and obtains estimates of the contributory value of each characteristic...
- Hellin's lawHellin's LawHellin's Law is the principle that one in about 89 pregnancies ends in the birth of twins, triplets once in 89² births, and quadruplets once in 89³ births....
- Hellinger distanceHellinger distanceIn probability and statistics, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f-divergence...
- Helmert–Wolf blocking
- Herfindahl indexHerfindahl indexThe Herfindahl index is a measure of the size of firms in relation to the industry and an indicator of the amount of competition among them. Named after economists Orris C. Herfindahl and Albert O. Hirschman, it is an economic concept widely applied in competition law, antitrust and also...
- Heston modelHeston modelIn finance, the Heston model, named after Steven Heston, is a mathematical model describing the evolution of the volatility of an underlying asset...
- Heteroscedasticity
- Heteroscedasticity-consistent standard errorsHeteroscedasticity-consistent standard errorsThe topic of heteroscedasticity-consistent standard errors arises in statistics and econometrics in the context of linear regression and also time series analysis...
- Heteroskedasticity — redirects to Heteroscedasticity
- Hidden Markov modelHidden Markov modelA hidden Markov model is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved states. An HMM can be considered as the simplest dynamic Bayesian network. The mathematics behind the HMM was developed by L. E...
- Hidden Markov random fieldHidden Markov random fieldA hidden Markov random field is a generalization of a hidden Markov model. Instead of having an underlying Markov chain, hidden Markov random fields have an underlying Markov random field.Suppose that we observe a random variable Y_i , where i \in S ....
- Hidden semi-Markov modelHidden semi-Markov modelA hidden semi-Markov model is a statistical model with the same structure as a hidden Markov model except that the unobservable process is semi-Markov rather than Markov. This means that the probability of there being a change in the hidden state depends on the amount of time that has elapsed...
- Hierarchical Bayes modelHierarchical Bayes modelThe hierarchical Bayes model is a method in modern Bayesian statistical inference. It is a framework for describing statistical models that can capture dependencies more realistically than non-hierarchical models....
- Hierarchical clusteringHierarchical clusteringIn statistics, hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:...
- Hierarchical hidden Markov modelHierarchical hidden Markov modelThe hierarchical hidden Markov model is a statistical model derived from the hidden Markov model . In an HHMM each state is considered to be a self-contained probabilistic model. More precisely each state of the HHMM is itself an HHMM....
- Hierarchical linear modelingHierarchical linear modelingIn statistics, hierarchical linear modeling , a form of multi-level analysis, is a more advanced form of simple linear regression and multiple linear regression. Multilevel analysis allows variance in outcome variables to be analysed at multiple hierarchical levels, whereas in simple linear and...
- High-dimensional statisticsHigh-dimensional statisticsIn statistical theory, the field of high-dimensional statistics studies data whose dimension is larger than dimensions considered in classical multivariate analysis. High-dimensional statistics relies on the theory of random vectors...
- Higher-order factor analysisHigher-order factor analysisHigher-order factor analysis is a statistical method consisting of repeating the steps factor analysis – oblique rotation – factor analysis of rotated factors... Its merit is to enable the researcher to see the hierarchical structure of studied phenomena...
- Higher-order statisticsHigher-order statisticsHigher-order statistics are descriptive measures of, among other things, qualities of probability distributions and sample distributions, and are, themselves, extensions of first- and second-order measures to higher orders. Skewness and kurtosis are examples of this...
- Hirschman uncertaintyHirschman uncertaintyIn quantum mechanics, information theory, and Fourier analysis, the Hirschman uncertainty is defined as the sum of the temporal and spectral Shannon entropies. It turns out that Heisenberg's uncertainty principle can be expressed as a lower bound on the sum of these entropies...
- HistogramHistogramIn statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...
- HistoriometryHistoriometryHistoriometry is the historical study of human progress or individual personal characteristics, using statistics to analyze references to geniuses, their statements, behavior and discoveries in relatively neutral texts...
- History of randomnessHistory of randomnessIn ancient history, the concepts of chance and randomness were intertwined with that of fate. Many ancient peoples threw dice to determine fate, and this later evolved into games of chance...
- History of statisticsHistory of statisticsThe history of statistics can be said to start around 1749 although, over time, there have been changes to the interpretation of what the word statistics means. In early times, the meaning was restricted to information about states...
- Hitting timeHitting timeIn the study of stochastic processes in mathematics, a hitting time is a particular instance of a stopping time, the first time at which a given process "hits" a given subset of the state space...
- Hodges’ estimatorHodges’ estimatorIn statistics, Hodges’ estimator is a famous counterexample of an estimator which is "superefficient", i.e. it attains smaller asymptotic variance than regular efficient estimators...
- Hodges–Lehmann estimator
- Hoeffding's independence test
- Hoeffding's lemmaHoeffding's lemmaIn probability theory, Hoeffding's lemma is an inequality that bounds the moment-generating function of any bounded random variable. It is named after the Finnish–American mathematical statistician Wassily Hoeffding....
- Hoeffding's inequalityHoeffding's inequalityIn probability theory, Hoeffding's inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value. Hoeffding's inequality was proved by Wassily Hoeffding; a worked statement of the bound is given at the end of this letter's section...
- Holm–Bonferroni method
- Holtsmark distributionHoltsmark distributionThe Holtsmark distribution is a continuous probability distribution. The Holtsmark distribution is a special case of a stable distribution with the index of stability or shape parameter \alpha equal to 3/2 and skewness parameter \beta of zero. Since \beta equals zero, the distribution is...
- Homogeneity (statistics)Homogeneity (statistics)In statistics, homogeneity and its opposite, heterogeneity, arise in describing the properties of a dataset, or several datasets. They relate to the validity of the often convenient assumption that the statistical properties of any one part of an overall dataset are the same as any other part...
- HomoscedasticityHomoscedasticityIn statistics, a sequence or a vector of random variables is homoscedastic if all random variables in the sequence or vector have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity...
- Hoover index
- Horvitz–Thompson estimatorHorvitz–Thompson estimatorIn statistics, the Horvitz–Thompson estimator, named after Daniel G. Horvitz and Donovan J. Thompson, is a method for estimating the mean of a superpopulation in a stratified sample. Inverse probability weighting is applied to account for different proportions of observations within strata...
- Hosmer–Lemeshow testHosmer–Lemeshow testThe Hosmer–Lemeshow test is a statistical test for goodness of fit for logistic regression models. It is used frequently in risk prediction models. The test assesses whether or not the observed event rates match expected event rates in subgroups of the model population. The Hosmer–Lemeshow...
- Hotelling's T-squared distribution
- How to Lie with Statistics (book)How to Lie with StatisticsHow to Lie with Statistics is a book written by Darrell Huff in 1954 presenting an introduction to statistics for the general reader. Huff was a journalist who wrote many "how to" articles as a freelancer, but was not a statistician....
- Howland will forgery trialHowland will forgery trialThe Howland will forgery trial was a U.S. court case in 1868 to decide Henrietta Howland Robinson's contest of the will of Sylvia Ann Howland. It is famous for the forensic use of mathematics by Benjamin Peirce as an expert witness.-History:...
- Hubbert curveHubbert curveThe Hubbert curve is an approximation of the production rate of a resource over time. It is a symmetric logistic distribution curve, often confused with the "normal" gaussian function. It first appeared in "Nuclear Energy and the Fossil Fuels," geophysicist M...
- Huber–White standard error — redirects to Heteroscedasticity-consistent standard errorsHeteroscedasticity-consistent standard errorsThe topic of heteroscedasticity-consistent standard errors arises in statistics and econometrics in the context of linear regression and also time series analysis...
- Huber loss functionHuber Loss FunctionIn statistical theory, the Huber loss function is a function used in robust estimation that allows construction of an estimate which allows the effect of outliers to be reduced, while treating non-outliers in a more standard way.-Definition:...
- Human subject research
- Hurst exponentHurst exponentThe Hurst exponent is used as a measure of the long term memory of time series. It relates to the autocorrelations of the time series and the rate at which these decrease as the lag between pairs of values increases....
- Hyper-exponential distribution
- Hyper-Graeco-Latin square designHyper-Graeco-Latin square designIn the design of experiments, hyper-Graeco-Latin squares are efficient designs to study the effect of one primary factor in the presence of 4 blocking factors. They are restricted, however, to the case in which all the factors have the same number of levels.Designs for 4- and 5-level factors are...
- Hyperbolic distributionHyperbolic distributionThe hyperbolic distribution is a continuous probability distribution that is characterized by the fact that the logarithm of the probability density function is a hyperbola. Thus the distribution decreases exponentially, which is more slowly than the normal distribution...
- Hyperbolic secant distribution
- Hypergeometric distribution
- HyperparameterHyperparameterIn Bayesian statistics, a hyperparameter is a parameter of a prior distribution; the term is used to distinguish them from parameters of the model for the underlying system under analysis...
- HyperpriorHyperpriorIn Bayesian statistics, a hyperprior is a prior distribution on a hyperparameter, that is, on a parameter of a prior distribution.As with the term hyperparameter, the use of hyper is to distinguish it from a prior distribution of a parameter of the model for the underlying system...
- Hypoexponential distributionHypoexponential distributionIn probability theory the hypoexponential distribution or the generalized Erlang distribution is a continuous distribution, that has found use in the same fields as the Erlang distribution, such as queueing theory, teletraffic engineering and more generally in stochastic processes...
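A worked statement of Hoeffding's inequality, as a companion to the entry above: for independent random variables X_1, ..., X_n with a_i ≤ X_i ≤ b_i almost surely and S_n = X_1 + ... + X_n, the usual textbook form of the bound is

    \Pr\big(S_n - \mathbb{E}[S_n] \ge t\big) \;\le\; \exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right), \qquad t > 0.

The symbols a_i, b_i and t (the almost-sure bounds and the deviation threshold) are introduced here only for this illustration.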
I
- Idealised populationIdealised populationIn population genetics an idealised population, also sometimes called a Fisher-Wright population after R.A. Fisher and Sewall Wright, is a population whose members can mate and reproduce with any other member of the other gender, has a sex ratio of 1 and no...
- Idempotent matrixIdempotent matrixIn algebra, an idempotent matrix is a matrix which, when multiplied by itself, yields itself. That is, the matrix M is idempotent if and only if MM = M...
- IdentifiabilityIdentifiabilityIn statistics, identifiability is a property which a model must satisfy in order for inference to be possible. We say that the model is identifiable if it is theoretically possible to learn the true value of this model’s underlying parameter after obtaining an infinite number of observations from it...
- IgnorabilityIgnorabilityIn statistics, ignorability refers to an experiment design where the method of data collection does not depend on the missing data...
- Illustration of the central limit theoremIllustration of the central limit theoremThis article gives two concrete illustrations of the central limit theorem. Both involve the sum of independent and identically-distributed random variables and show how the probability distribution of the sum approaches the normal distribution as the number of terms in the sum increases.The first...
- Image denoisingImage denoisingImage denoising refers to the recovery of a digital image that has been contaminated by additive white Gaussian noise .-Technical description:...
- Importance samplingImportance samplingIn statistics, importance sampling is a general technique for estimating properties of a particular distribution, while only having samples generated from a different distribution rather than the distribution of interest. It is related to Umbrella sampling in computational physics...
- Imprecise probabilityImprecise probabilityImprecise probability generalizes probability theory to allow for partial probability specifications, and is applicable when information is scarce, vague, or conflicting, in which case a unique probability distribution may be hard to identify...
- Imputation (statistics)Imputation (statistics)In statistics, imputation is the substitution of some value for a missing data point or a missing component of a data point. Once all missing values have been imputed, the dataset can then be analysed using standard techniques for complete data...
- Incidence (epidemiology)Incidence (epidemiology)Incidence is a measure of the risk of developing some new condition within a specified period of time. Although sometimes loosely expressed simply as the number of new cases during some time period, it is better expressed as a proportion or a rate with a denominator.Incidence proportion is the...
- Inclusion probabilityInclusion probabilityIn statistics, in the theory relating to sampling from finite populations, the inclusion probability of an element or member of the population is its probability of becoming part of the sample during the drawing of a single sample....
- Increasing process
- Indecomposable distributionIndecomposable distributionIn probability theory, an indecomposable distribution is a probability distribution that cannot be represented as the distribution of the sum of two or more non-constant independent random variables: Z ≠ X + Y. If it can be so expressed, it is decomposable:...
- Independence of irrelevant alternativesIndependence of irrelevant alternativesIndependence of irrelevant alternatives is an axiom of decision theory and various social sciences.The word is used in different meanings in different contexts....
- Independent component analysisIndependent component analysisIndependent component analysis is a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals...
- Independent and identically distributed random variablesIndependent and identically distributed random variablesIn probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent....
- Index number
- Index of coincidenceIndex of coincidenceIn cryptography, coincidence counting is the technique of putting two texts side-by-side and counting the number of times that identical letters appear in the same position in both texts...
- Index of dispersion
- Indicators of spatial associationIndicators of spatial associationIndicators of spatial association are statistics that evaluate the existence of clusters in the spatial arrangement of a given variable. For instance if we are studying cancer rates among census tracts in a given city local clusters in the rates mean that there are areas that have higher or lower...
- Indirect least squares
- Inductive inferenceInductive inferenceAround 1960, Ray Solomonoff founded the theory of universal inductive inference, the theory of prediction based on observations; for example, predicting the next symbol based upon a given series of symbols...
- An inequality on location and scale parameters — redirects to Chebyshev's inequalityChebyshev's inequalityIn probability theory, Chebyshev’s inequality guarantees that in any data sample or probability distribution,"nearly all" values are close to the mean — the precise statement being that no more than 1/k2 of the distribution’s values can be more than k standard deviations away from the mean...
- InferenceInferenceInference is the act or process of deriving logical conclusions from premises known or assumed to be true. The conclusion drawn is also called an inference. The laws of valid inference are studied in the field of logic...
- Inferential statistics redirects to Statistical inferenceStatistical inferenceIn statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation...
- Infinite divisibility (probability)Infinite divisibility (probability)The concepts of infinite divisibility and the decomposition of distributions arise in probability and statistics in relation to seeking families of probability distributions that might be a natural choice in certain applications, in the same way that the normal distribution is...
- Infinite monkey theoremInfinite monkey theoremThe infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare....
- Influence diagramInfluence diagramAn influence diagram is a compact graphical and mathematical representation of a decision situation...
- Info-gap decision theoryInfo-gap decision theoryInfo-gap decision theory is a non-probabilistic decision theory that seeks to optimize robustness to failure – or opportuneness for windfall – under severe uncertainty, in particular applying sensitivity analysis of the stability radius type to perturbations in the value of a given estimate of the...
- Information bottleneck methodInformation bottleneck methodThe information bottleneck method is a technique introduced by Naftali Tishby et al. [1] for finding the best tradeoff between accuracy and complexity when summarizing a random variable X, given a joint probability distribution between X and an observed relevant variable Y...
- Information geometryInformation geometryInformation geometry is a branch of mathematics that applies the techniques of differential geometry to the field of probability theory. It derives its name from the fact that the Fisher information is used as the Riemannian metric when considering the geometry of probability distribution families...
- Information gain ratioInformation gain ratioLet Attr be the set of all attributes and Ex the set of all training examples; value(x, a) with x \in Ex defines the value of a specific example x for attribute a \in Attr, and H specifies the entropy; a short computational sketch is given at the end of this letter's section...
- Information ratio – financeInformation ratioThe Information ratio is a measure of the risk-adjusted return of a financial security. It is also known as Appraisal ratio and is defined as expected active return divided by tracking error, where active return is the difference between the return of the security and the return of a selected...
- Information source (mathematics)Information source (mathematics)In mathematics, an information source is a sequence of random variables ranging over a finite alphabet Γ, having a stationary distribution.The uncertainty, or entropy rate, of an information source is defined as...
- Information theoryInformation theoryInformation theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and...
- Inherent biasInherent biasThe term "inherent bias" refers to the effect of underlying factors or assumptions that skew viewpoints of a subject under discussion. There are multiple formal definitions of "inherent bias" which depend on the particular field of study....
- Inherent zeroInherent zeroIn statistics, an inherent zero is a reference point used to describe data sets which are indicative of magnitude of an absolute or relative nature. Inherent zeros are used on ratio scales....
- Injury prevention – applicationInjury preventionInjury prevention refers to efforts to prevent or reduce the severity of bodily injuries caused by external mechanisms, such as accidents, before they occur. Injury prevention is a component of safety and public health, and its goal is to improve the health of the population by preventing injuries and...
- Innovation (signal processing)Innovation (signal processing)In time series analysis — as conducted in statistics, signal processing, and many other fields — the innovation is the difference between the observed value of a variable at time t and the optimal forecast of that value based on information available prior to time t...
- Innovations vectorInnovations vectorThe innovations vector or residual vector is the difference between the measurement vector and the predicted measurement vector. Each difference represents the deviation of the observed random variable from the predicted response. The innovation vector is often used to check the validity of a...
- Institutional review boardInstitutional review boardAn institutional review board , also known as an independent ethics committee or ethical review board , is a committee that has been formally designated to approve, monitor, and review biomedical and behavioral research involving humans with the aim to protect the rights and welfare of the...
- Instrumental variableInstrumental variableIn statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables is used to estimate causal relationships when controlled experiments are not feasible....
- Intention to treat analysisIntention to treat analysisIn epidemiology, an intention to treat analysis is an analysis based on the initial treatment intent, not on the treatment eventually administered. ITT analysis is intended to avoid various misleading artifacts that can arise in intervention research...
- Interaction (statistics)Interaction (statistics)In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive...
- Interaction variable – see Interaction (statistics)Interaction (statistics)In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive...
- Interclass correlationInterclass correlationIn statistics, the interclass correlation measures a bivariate relation among variables.The Pearson correlation coefficient is the most commonly used interclass correlation....
- Interdecile rangeInterdecile rangeIn statistics, the interdecile range is the difference between the first and the ninth deciles . The interdecile range is a measure of statistical dispersion of the values in a set of data, similar to the range and the interquartile range....
- Interim analysisInterim analysisClinical trials are unique in that enrollment of patients is a continual process staggered in time. This means that if a treatment is particularly beneficial or harmful compared to the concurrent placebo group while the study is on-going, the investigators are ethically obliged to assess that...
- Internal consistencyInternal consistencyIn statistics and research, internal consistency is typically a measure based on the correlations between different items on the same test . It measures whether several items that propose to measure the same general construct produce similar scores...
- Internal validityInternal validityInternal validity is the validity of inferences in scientific studies, usually based on experiments as experimental validity.- Details :...
- Interquartile meanInterquartile meanThe interquartile mean is a statistical measure of central tendency, much like the mean , the median, and the mode....
- Interquartile rangeInterquartile rangeIn descriptive statistics, the interquartile range , also called the midspread or middle fifty, is a measure of statistical dispersion, being equal to the difference between the upper and lower quartiles...
- Inter-rater reliabilityInter-rater reliabilityIn statistics, inter-rater reliability, inter-rater agreement, or concordance is the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges. It is useful in refining the tools given to human judges, for example by...
- Interval estimationInterval estimationIn statistics, interval estimation is the use of sample data to calculate an interval of possible values of an unknown population parameter, in contrast to point estimation, which is a single number. Neyman identified interval estimation as distinct from point estimation...
- Intervening variableIntervening variableAn intervening variable is a hypothetical internal state that is used to explain relationships between observed variables, such as independent and dependent variables, in empirical research.- History :...
- Intra-rater reliabilityIntra-rater reliabilityIn statistics, intra-rater reliability is the degree of agreement among multiple repetitions of a diagnostic test performed by a single rater...
- Intraclass correlationIntraclass correlationIn statistics, the intraclass correlation is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other...
- Invariant estimatorInvariant estimatorIn statistics, the concept of being an invariant estimator is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities...
- Invariant extended Kalman filterInvariant extended Kalman filterThe invariant extended Kalman filter is a new version of the extended Kalman filter for nonlinear systems possessing symmetries . It combines the advantages of both the EKF and the recently introduced symmetry-preserving filters...
- Inverse distance weightingInverse distance weightingInverse distance weighting is a method for multivariate interpolation, a process of assigning values to unknown points by using values from usually scattered set of known points...
- Inverse Gaussian distribution
- Inverse Mills ratioInverse Mills ratioIn statistics, the inverse Mills ratio, named after John P. Mills, is the ratio of the probability density function to the cumulative distribution function of a distribution....
- Inverse probabilityInverse probabilityIn probability theory, inverse probability is an obsolete term for the probability distribution of an unobserved variable.Today, the problem of determining an unobserved variable is called inferential statistics, the method of inverse probability is called Bayesian probability, the "distribution"...
- Inverse relationshipInverse relationshipAn inverse or negative relationship is a mathematical relationship in which one variable, say y, decreases as another, say x, increases. For a linear relation, this can be expressed as y = a-bx, where -b is a constant value less than zero and a is a constant...
- Inverse-chi-squared distribution
- Inverse-gamma distribution
- Inverse transform sampling
- Inverse-variance weightingInverse-variance weightingIn statistics, inverse-variance weighting is a method of aggregating two or more random variables to minimize the variance of the sum. Each random variable in the sum is weighted in inverse proportion to its variance....
- Inverse-Wishart distribution
- Iris flower data setIris flower data setThe Iris flower data set or Fisher's Iris data set is a multivariate data set introduced by Sir Ronald Aylmer Fisher as an example of discriminant analysis...
- Irwin–Hall distribution
- IsomapIsomapIn statistics, Isomap is one of several widely used low-dimensional embedding methods, where geodesic distances on a weighted graph are incorporated with the classical scaling . Isomap is used for computing a quasi-isometric, low-dimensional embedding of a set of high-dimensional data points...
- Isotonic regression
- Item response theoryItem response theoryIn psychometrics, item response theory also known as latent trait theory, strong true score theory, or modern mental test theory, is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is based...
- Item-total correlationItem-total correlationThe item-total correlation test arises in psychometrics in contexts where a number of tests or questions are given to an individual and where the problem is to construct a useful single quantity for each individual that can be used to compare that individual with others in a given population...
- Item tree analysisItem tree analysisItem tree analysis is a data analytical method which allows constructing ahierarchical structure on the items of a questionnaire or test from observed responsepatterns. Assume that we have a questionnaire with m items and that subjects can...
- Iterative proportional fittingIterative proportional fittingThe iterative proportional fitting procedure is an iterative algorithm for estimating cell values of a contingency table such that the marginal totals remain fixed and the estimated table decomposes into an outer...
- Iteratively reweighted least squares
- Itō calculusIto calculusItō calculus, named after Kiyoshi Itō, extends the methods of calculus to stochastic processes such as Brownian motion . It has important applications in mathematical finance and stochastic differential equations....
- Itō isometryIto isometryIn mathematics, the Itō isometry, named after Kiyoshi Itō, is a crucial fact about Itō stochastic integrals. One of its main applications is to enable the computation of variances for stochastic processes....
- Itō's lemmaIto's lemmaIn mathematics, Itō's lemma is used in Itō stochastic calculus to find the differential of a function of a particular type of stochastic process. It is named after its discoverer, Kiyoshi Itō...
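To make the information gain ratio entry above concrete, here is a minimal Python sketch: it computes the information gain of an attribute and divides it by the attribute's split information. The names entropy and gain_ratio and the toy records are illustrative assumptions for this example, not any standard library API.

    import math
    from collections import Counter

    def entropy(labels):
        # Shannon entropy (base 2) of a sequence of class labels
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain_ratio(examples, attribute, target):
        # information gain of `attribute` divided by its split information (intrinsic value)
        n = len(examples)
        groups = {}
        for ex in examples:
            groups.setdefault(ex[attribute], []).append(ex[target])
        gain = entropy([ex[target] for ex in examples]) - sum(
            (len(g) / n) * entropy(g) for g in groups.values())
        split_info = -sum((len(g) / n) * math.log2(len(g) / n) for g in groups.values())
        return gain / split_info if split_info > 0 else 0.0

    # toy usage: how well does "outlook" separate the "play" labels?
    data = [
        {"outlook": "sunny", "play": "no"},
        {"outlook": "sunny", "play": "no"},
        {"outlook": "rain", "play": "yes"},
        {"outlook": "overcast", "play": "yes"},
    ]
    print(gain_ratio(data, "outlook", "play"))  # about 0.667 on this toy table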
J
- Jaccard indexJaccard indexThe Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for comparing the similarity and diversity of sample sets; a short computational sketch is given at the end of this letter's section....
- Jackknife (statistics) redirects to Resampling (statistics)Resampling (statistics)In statistics, resampling is any of a variety of methods for doing one of the following:# Estimating the precision of sample statistics by using subsets of available data or drawing randomly with replacement from a set of data points # Exchanging labels on data points when performing significance...
- Jackson network
- Jackson's theorem (queueing theory)
- Jadad scaleJadad scaleThe Jadad scale, sometimes known as Jadad scoring or the Oxford quality scoring system, is a procedure to independently assess the methodological quality of a clinical trial...
- James–Stein estimator
- Jarque–Bera testJarque–Bera testIn statistics, the Jarque–Bera test is a goodness-of-fit test of whether sample data have the skewness and kurtosis matching a normal distribution. The test is named after Carlos Jarque and Anil K. Bera...
- Jeffreys priorJeffreys priorIn Bayesian probability, the Jeffreys prior, named after Harold Jeffreys, is a non-informative prior distribution on parameter space that is proportional to the square root of the determinant of the Fisher information:...
- Jensen's inequalityJensen's inequalityIn mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906. Given its generality, the inequality appears in many forms depending on the context,...
- Jensen–Shannon divergenceJensen–Shannon divergenceIn probability theory and statistics, the Jensen–Shannon divergence is a popular method of measuring the similarity between two probability distributions. It is also known as information radius or total divergence to the average. It is based on the Kullback–Leibler divergence, with the notable ...
- JMulTi – softwareJMulTiJMulTi is an open-source interactive software for econometric analysis, specialised in univariate and multivariate time series analysis. It has a Java graphical user interface....
- Johansen testJohansen testIn statistics, the Johansen test, named after Søren Johansen, is a procedure for testing cointegration of several I(1) time series. This test permits more than one cointegrating relationship so is more generally applicable than the Engle–Granger test which is based on the Dickey–Fuller test for...
- Joint probability distribution
- JMP (statistical software)JMP (statistical software)JMP is a computer program that was first developed by John Sall and others to perform simple and complex statistical analyses.It dynamically links statistics with graphics to interactively explore, understand, and visualize data...
- Jump processJump processA jump process is a type of stochastic process that has discrete movements, called jumps, rather than small continuous movements.In physics, jump processes result in diffusion...
- Jump-diffusion model
- Junction tree algorithm
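The Jaccard index entry above reduces to a one-line computation; the sketch below shows it on two small sets. The helper name jaccard is chosen only for this illustration and is not a library function.

    def jaccard(a, b):
        # Jaccard similarity of two finite sets: |A intersect B| / |A union B|
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if (a or b) else 1.0

    print(jaccard({"x", "y", "z"}, {"y", "z", "w"}))  # 2 shared items out of 4 distinct -> 0.5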
K
- K-distributionK-distributionThe K-distribution is a probability distribution that arises as the consequence of a statistical or probabilistic model used in Synthetic Aperture Radar imagery...
- K-means algorithm — redirects to k-means clusteringK-means algorithmIn statistics and data mining, k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean; a minimal sketch of the standard iteration is given at the end of this letter's section...
- K-means++K-means++In applied statistics, k-means++ is an algorithm for choosing the initial values for the k-means clustering algorithm. It was proposed in 2007 by David Arthur and Sergei Vassilvitskii, as an approximation algorithm for the NP-hard k-means problem—a way of avoiding the sometimes poor...
- K-medians clusteringK-medians clusteringIn statistics and machine learning, k-medians clustering is a variation of k-means clustering where instead of calculating the mean for each cluster to determine its centroid, one instead calculates the median...
- K-medoidsK-medoidsThe k-medoids algorithm is a clustering algorithm related to the k-means algorithm and the medoidshift algorithm. Both the k-means and k-medoids algorithms are partitional and both attempt to minimize squared error, the distance between points labeled to be in a cluster and a point designated as the...
- Kalman filterKalman filterIn statistics, the Kalman filter is a mathematical method named after Rudolf E. Kálmán. Its purpose is to use measurements observed over time, containing noise and other inaccuracies, and produce values that tend to be closer to the true values of the measurements and their associated calculated...
- Kaplan–Meier estimator
- Kappa coefficient
- Kappa statistic
- Karhunen–Loève theorem
- Kendall tau distanceKendall tau distanceThe Kendall tau distance is a metric that counts the number of pairwise disagreements between two lists. The larger the distance, the more dissimilar the two lists are. Kendall tau distance is also called bubble-sort distance since it is equivalent to the number of swaps that the bubble sort...
- Kendall tau rank correlation coefficientKendall tau rank correlation coefficientIn statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's tau coefficient, is a statistic used to measure the association between two measured quantities...
- Kendall's notationKendall's notationIn queueing theory, Kendall's notation is the standard system used to describe and classify the queueing model that a queueing system corresponds to. First suggested by D. G...
- Kendall's W – Kendall's coefficient of concordanceKendall's WKendall's W is a non-parametric statistic. It is a normalization of the statistic of the Friedman test, and can be used for assessing agreement among raters...
- Kent distribution
- Kernel density estimationKernel density estimationIn statistics, kernel density estimation is a non-parametric way of estimating the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample...
- Kernel methodsKernel methodsIn computer science, kernel methods are a class of algorithms for pattern analysis, whose best known elementis the support vector machine...
- Kernel principal component analysis
- Kernel regressionKernel regressionThe kernel regression is a non-parametric technique in statistics to estimate the conditional expectation of a random variable. The objective is to find a non-linear relation between a pair of random variables X and Y....
- Kernel smootherKernel smootherA kernel smoother is a statistical technique for estimating a real valued function f by using its noisy observations, when no parametric model for this function is known...
- Kernel (statistics)Kernel (statistics)A kernel is a weighting function used in non-parametric estimation techniques. Kernels are used in kernel density estimation to estimate random variables' density functions, or in kernel regression to estimate the conditional expectation of a random variable. Kernels are also used in time-series,...
- Khmaladze transformation (probability theory)Khmaladze transformationThe Khmaladze Transformation is a statistical tool.Consider the sequence of empirical distribution functions F_n based on a sequence of i.i.d random variables, X_1,\ldots, X_n, as n increases.Suppose F is the hypothetical distribution function of...
- Killed processKilled processIn probability theory — specifically, in stochastic analysis — a killed process is a stochastic process that is forced to assume an undefined or "killed" state at some time.-Definition:...
- Khintchine inequalityKhintchine inequalityIn mathematics, the Khintchine inequality, named after Aleksandr Khinchin and spelled in multiple ways in the Roman alphabet, is a theorem from probability, and is also frequently used in analysis...
- Kingman's formula
- Kirkwood approximationKirkwood approximationThe Kirkwood superposition approximation was introduced by Matsuda as a means of representing a discrete probability distribution. The name apparently refers to a 1942 paper by John G. Kirkwood...
- Kish gridKish gridThe Kish grid is a method for selecting members within a household to be interviewed. In telephone surveys, the next-birthday method is sometimes preferred to the Kish grid.- References :...
- Kitchen sink regressionKitchen sink regressionA kitchen sink regression is an informal and usually pejorative term for a regression analysis which uses a long list of possible independent variables to attempt to explain variance in a dependent variable. In economics, psychology, and other social sciences, regression analysis is typically used...
- Knightian uncertaintyKnightian uncertaintyIn economics, Knightian uncertainty is risk that is immeasurable, not possible to calculate.Knightian uncertainty is named after University of Chicago economist Frank Knight , who distinguished risk and uncertainty in his work Risk, Uncertainty, and Profit:- Common-cause and special-cause :The...
- Kolmogorov backward equationKolmogorov backward equationThe Kolmogorov backward equation and its adjoint sometimes known as the Kolmogorov forward equation are partial differential equations that arise in the theory of continuous-time continuous-state Markov processes. Both were published by Andrey Kolmogorov in 1931...
- Kolmogorov continuity theoremKolmogorov continuity theoremIn mathematics, the Kolmogorov continuity theorem is a theorem that guarantees that a stochastic process that satisfies certain constraints on the moments of its increments will be continuous...
- Kolmogorov extension theoremKolmogorov extension theoremIn mathematics, the Kolmogorov extension theorem is a theorem that guarantees that a suitably "consistent" collection of finite-dimensional distributions will define a stochastic process...
- Kolmogorov’s criterionKolmogorov’s criterionIn probability theory, Kolmogorov's criterion, named after Andrey Kolmogorov, is a theorem in Markov processes concerning stationary Markov chains...
- Kolmogorov’s generalized criterion
- Kolmogorov's inequalityKolmogorov's inequalityIn probability theory, Kolmogorov's inequality is a so-called "maximal inequality" that gives a bound on the probability that the partial sums of a finite collection of independent random variables exceed some specified bound...
- Kolmogorov's zero-one lawKolmogorov's zero-one lawIn probability theory, Kolmogorov's zero-one law, named in honor of Andrey Nikolaevich Kolmogorov, specifies that a certain type of event, called a tail event, will either almost surely happen or almost surely not happen; that is, the probability of such an event occurring is zero or one.Tail...
- Kolmogorov–Smirnov test
- KPSS testKPSS testIn econometrics, Kwiatkowski–Phillips–Schmidt–Shin tests are used for testing a null hypothesis that an observable time series is stationary around a deterministic trend. Such models were proposed in 1982 by Alok Bhargava in his Ph.D. thesis where several John von Neumann or Durbin–Watson type...
- KrigingKrigingKriging is a group of geostatistical techniques to interpolate the value of a random field at an unobserved location from observations of its value at nearby locations....
- Kruskal–Wallis one-way analysis of variance
- Kuder-Richardson Formula 20Kuder-Richardson Formula 20In statistics, the Kuder-Richardson Formula 20, first published in 1937, is a measure of internal consistency reliability for measures with dichotomous choices. It is analogous to Cronbach's α, except Cronbach's α is also used for non-dichotomous measures...
- Kuiper's testKuiper's testKuiper's test is used in statistics to test whether a given distribution, or family of distributions, is contradicted by evidence from a sample of data. It is named after Dutch mathematician Nicolaas Kuiper....
- Kullback's inequalityKullback's inequalityIn information theory and statistics, Kullback's inequality is a lower bound on the Kullback–Leibler divergence expressed in terms of the large deviations rate function. If P and Q are probability distributions on the real line, such that P is absolutely continuous with respect to Q, i.e...
- Kullback–Leibler divergenceKullback–Leibler divergenceIn probability theory and information theory, the Kullback–Leibler divergence is a non-symmetric measure of the difference between two probability distributions P and Q...
- Kumaraswamy distributionKumaraswamy distributionIn probability and statistics, the Kumaraswamy's double bounded distribution is a family of continuous probability distributions defined on the interval [0,1] differing in the values of their two non-negative shape parameters, a and b....
- KurtosisKurtosisIn probability theory and statistics, kurtosis is any measure of the "peakedness" of the probability distribution of a real-valued random variable...
- Kushner equationKushner equationIn filtering theory the Kushner equation is an equation for the conditional probability density of the state of a stochastic non-linear dynamical system, given noisy measurements of the state. It therefore provides the solution of the nonlinear filtering problem in estimation theory...
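For the k-means entries above, the following is a minimal Python sketch of the standard assign-then-update (Lloyd's) iteration, assuming NumPy is available. It is meant only to illustrate the loop structure, not to serve as a production implementation; the function name kmeans and the toy points are chosen for this example.

    import numpy as np

    def kmeans(points, k, n_iter=100, seed=0):
        # Alternate between assigning each point to its nearest centroid and
        # recomputing each centroid as the mean of the points assigned to it.
        rng = np.random.default_rng(seed)
        centroids = points[rng.choice(len(points), size=k, replace=False)]
        for _ in range(n_iter):
            # distances from every point to every centroid, then nearest-centroid labels
            dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            new_centroids = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return labels, centroids

    # tiny 2-D usage example with two well-separated groups
    pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
    print(kmeans(pts, k=2))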
L
- L-estimator
- L-momentL-momentIn statistics, L-moments are statistics used to summarize the shape of a probability distribution. They are analogous to conventional moments in that they can be used to calculate quantities analogous to standard deviation, skewness and kurtosis, termed the L-scale, L-skewness and L-kurtosis...
- Labour Force SurveyLabour Force SurveyLabour Force Surveys are statistical surveys conducted in a number of countries designed to capture data about the labour market. All European Union member states are required to conduct a Labour Force Survey annually. Labour Force Surveys are also carried out in some non-EU countries. They are...
- Lack-of-fit sum of squaresLack-of-fit sum of squaresIn statistics, a sum of squares due to lack of fit, or more tersely a lack-of-fit sum of squares, is one of the components of a partition of the sum of squares in an analysis of variance, used in the numerator in an F-test of the null hypothesis that says that a proposed model fits well.- Sketch of...
- Lady tasting teaLady tasting teaIn the design of experiments in statistics, the lady tasting tea is a famous randomized experiment devised by Ronald A. Fisher and reported in his book Statistical methods for research workers . The lady in question was Dr...
- Lag operator
- Lag windowingLag windowingLag windowing is a technique that consists of windowing the auto-correlation coefficients prior to estimating Linear prediction coefficients . The windowing in the auto-correlation domain has the same effect as a convolution in the power spectral domain and helps stabilizing the result of the...
- Lambda distribution — disambiguation
- Landau distribution
- Lander–Green algorithmLander–Green algorithmThe Lander–Green algorithm is an algorithm, due to Eric Lander and Philip Green for computing the likelihood of observed genotype data given a pedigree. It is appropriate for relatively small pedigrees and a large number of markers. It is used in the analysis of genetic linkage....
- Language modelLanguage modelA statistical language model assigns a probability to a sequence of m words P by means of a probability distribution.Language modeling is used in many natural language processing applications such as speech recognition, machine translation, part-of-speech tagging, parsing and information...
- Laplace distribution
- Laplace principle (large deviations theory)Laplace principle (large deviations theory)In mathematics, Laplace's principle is a basic theorem in large deviations theory, similar to Varadhan's lemma. It gives an asymptotic expression for the Lebesgue integral of exp over a fixed set A as θ becomes large...
- Large deviations theoryLarge deviations theoryIn probability theory, the theory of large deviations concerns the asymptotic behaviour of remote tails of sequences of probability distributions. Some basic ideas of the theory can be tracked back to Laplace and Cramér, although a clear unified formal definition was introduced in 1966 by Varadhan...
- Large deviations of Gaussian random functionsLarge deviations of Gaussian random functionsA random function – of either one variable , or two or more variables – is called Gaussian if every finite-dimensional distribution is a multivariate normal distribution. Gaussian random fields on the sphere are useful when analysing* the anomalies in the cosmic microwave background...
- LARS — see least-angle regressionLeast-angle regressionIn statistics, least-angle regression is a regression algorithm for high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani....
- Latent variableLatent variableIn statistics, latent variables are variables that are not directly observed but are rather inferred from other variables that are observed. Mathematical models that aim to explain observed variables in terms of latent variables are called latent variable models...
- Latent variable modelLatent variable modelA latent variable model is a statistical model that relates a set of variables to a set of latent variables.It is assumed that 1) the responses on the indicators or manifest variables are the result of...
- Latent class modelLatent class modelIn statistics, a latent class model relates a set of observed discrete multivariate variables to a set of latent variables. It is a type of latent variable model. It is called a latent class model because the latent variable is discrete...
- Latent Dirichlet allocationLatent Dirichlet allocationIn statistics, latent Dirichlet allocation is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar...
- Latent growth modelingLatent growth modelingLatent growth modeling is a statistical technique used in the structural equation modeling framework to estimate growth trajectory. It is a longitudinal analysis technique to estimate growth over a period of time. It is widely used in the field of behavioral science, education and social science. ...
- Latent semantic analysisLatent semantic analysisLatent semantic analysis is a technique in natural language processing, in particular in vectorial semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close...
- Latin rectangleLatin rectangleIn combinatorial mathematics, a Latin rectangle is an r × n matrix that has the numbers 1, 2, 3, ..., n as its entries with no number occurring more than once in any row or column where r ≤ n. An n × n Latin rectangle is called a...
- Latin squareLatin squareIn combinatorics and in experimental design, a Latin square is an n × n array filled with n different symbols, each occurring exactly once in each row and exactly once in each column...
- Latin hypercube samplingLatin hypercube samplingLatin hypercube sampling is a statistical method for generating a distribution of plausible collections of parameter values from a multidimensional distribution. The sampling method is often applied in uncertainty analysis....
- Law (stochastic processes)Law (stochastic processes)In mathematics, the law of a stochastic process is the measure that the process induces on the collection of functions from the index set into the state space...
- Law of averagesLaw of averagesThe law of averages is a lay term used to express a belief that outcomes of a random event will "even out" within a small sample.As invoked in everyday life, the "law" usually reflects bad statistics or wishful thinking rather than any mathematical principle...
- Law of comparative judgmentLaw of comparative judgmentThe law of comparative judgment was conceived by L. L. Thurstone. In modern day terminology, it is more aptly described as a model that is used to obtain measurements from any process of pairwise comparison...
- Law of large numbersLaw of large numbersIn probability theory, the law of large numbers is a theorem that describes the result of performing the same experiment a large number of times...
- Law of the iterated logarithmLaw of the iterated logarithmIn probability theory, the law of the iterated logarithm describes the magnitude of the fluctuations of a random walk. The original statement of the law of the iterated logarithm is due to A. Y. Khinchin . Another statement was given by A.N...
- Law of the unconscious statisticianLaw of the unconscious statisticianIn probability theory and statistics, the law of the unconscious statistician is a theorem used to calculate the expected value of a function g of a random variable X when one knows the probability distribution of X but one does not explicitly know the distribution of g.The form of the law can...
- Law of total covarianceLaw of total covarianceIn probability theory, the law of total covariance or covariance decomposition formula states that if X, Y, and Z are random variables on the same probability space, and the covariance of X and Y is finite, then...
- Law of total cumulanceLaw of total cumulanceIn probability theory and mathematical statistics, the law of total cumulance is a generalization to cumulants of the law of total probability, the law of total expectation, and the law of total variance. It has applications in the analysis of time series...
- Law of total expectationLaw of total expectationThe proposition in probability theory known as the law of total expectation, the law of iterated expectations, the tower rule, the smoothing theorem, among other names, states that if X is an integrable random variable...
- Law of total probabilityLaw of total probabilityIn probability theory, the law of total probability is a fundamental rule relating marginal probabilities to conditional probabilities...
- Law of total varianceLaw of total varianceIn probability theory, the law of total variance or variance decomposition formula states that if X and Y are random variables on the same probability space, and the variance of Y is finite, then...
- Law of Truly Large NumbersLaw of Truly Large NumbersThe law of truly large numbers, attributed to Persi Diaconis and Frederick Mosteller, states that with a sample size large enough, any outrageous thing is likely to happen. Because we never find it notable when likely events occur, we highlight unlikely events and notice them more...
- Layered hidden Markov modelLayered hidden Markov modelThe layered hidden Markov model is a statistical model derived from the hidden Markov model .A layered hidden Markov model consists of N levels of HMMs, where the HMMs on level i + 1 correspond to observation symbols or probability generators at level i.Every level i of the LHMM...
- Le Cam's theoremLe Cam's theoremIn probability theory, Le Cam's theorem, named after Lucien le Cam, is as follows. Suppose: X_1, ..., X_n are independent random variables, each with a Bernoulli distribution, not necessarily identically distributed; Pr(X_i = 1) = p_i for i = 1, 2, 3, ...; \lambda_n = p_1 + \cdots + p_n; S_n = X_1...
- Lead time biasLead time biasLead time is the length of time between the detection of a disease and its usual clinical presentation and diagnosis ....
- Least absolute deviationsLeast absolute deviationsLeast absolute deviations , also known as Least Absolute Errors , Least Absolute Value , or the L1 norm problem, is a mathematical optimization technique similar to the popular least squares technique that attempts to find a function which closely approximates a set of data...
- Least-angle regressionLeast-angle regressionIn statistics, least-angle regression is a regression algorithm for high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani....
- Least squaresLeast squaresThe method of least squares is a standard approach to the approximate solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns. "Least squares" means that the overall solution minimizes the sum of the squares of the errors made in solving every...
- Least-squares spectral analysisLeast-squares spectral analysisLeast-squares spectral analysis is a method of estimating a frequency spectrum, based on a least squares fit of sinusoids to data samples, similar to Fourier analysis...
- Least squares support vector machineLeast squares support vector machineLeast squares support vector machines are least squares versions of support vector machines , which are a set of related supervised learning methods that analyze data and recognize patterns, and which are used for classification and regression analysis...
- Least trimmed squaresLeast Trimmed SquaresLeast trimmed squares , or least trimmed sum of squares, is a robust statistical method that attempts to fit a function to a set of data whilst not being unduly affected by the presence of outliers...
- Learning theory (statistics)
- Leftover hash-lemmaLeftover hash-lemmaThe leftover hash lemma is a lemma in cryptography first stated by Russell Impagliazzo, Leonid Levin, and Michael Luby.Imagine that you have a secret key X that has n uniform random bits, and you would like to use this secret key to encrypt a message. Unfortunately, you were a bit careless with the...
- Lehmann–Scheffé theoremLehmann–Scheffé theoremIn statistics, the Lehmann–Scheffé theorem is prominent in mathematical statistics, tying together the ideas of completeness, sufficiency, uniqueness, and best unbiased estimation...
- Length time biasLength time biasLength time bias is a form of selection bias, a statistical distortion of results which can lead to incorrect conclusions about the data. Length time bias can occur when the lengths of intervals are analysed by selecting intervals that occupy randomly chosen points in time or space...
- Levene's testLevene's testIn statistics, Levene's test is an inferential statistic used to assess the equality of variances in different samples. Some common statistical procedures assume that variances of the populations from which different samples are drawn are equal. Levene's test assesses this assumption. It tests the...
- Level of measurementLevel of measurementThe "levels of measurement", or scales of measure are expressions that typically refer to the theory of scale types developed by the psychologist Stanley Smith Stevens. Stevens proposed his theory in a 1946 Science article titled "On the theory of scales of measurement"...
- Levenberg–Marquardt algorithm
- Leverage (statistics)Leverage (statistics)In statistics, leverage is a term used in connection with regression analysis and, in particular, in analyses aimed at identifying those observations that are far away from corresponding average predictor values...
- Levey–Jennings chart — redirects to Laboratory quality controlLaboratory quality controlLaboratory quality control is designed to detect, reduce, and correct deficiencies in a laboratory's internal analytical process prior to the release of patient results and improve the quality of the results reported by the laboratory. Quality control is a measure of precision or how well the...
- Lévy's convergence theorem
- Lévy's continuity theoremLévy's continuity theoremIn probability theory, the Lévy’s continuity theorem, named after the French mathematician Paul Lévy, connects convergence in distribution of the sequence of random variables with pointwise convergence of their characteristic functions...
- Lévy arcsine lawLévy arcsine lawIn probability theory, the Lévy arcsine law, found by , states that the probability distribution of the proportion of the time that a Wiener process is positive is a random variable whose probability distribution is the arcsine distribution...
- Lévy distribution
- Lévy flightLévy flightA Lévy flight is a random walk in which the step-lengths have a probability distribution that is heavy-tailed. When defined as a walk in a space of dimension greater than one, the steps made are in isotropic random directions...
- Lévy processLévy processIn probability theory, a Lévy process, named after the French mathematician Paul Lévy, is any continuous-time stochastic process that starts at 0, admits càdlàg modification and has "stationary independent increments" — this phrase will be explained below...
- Lewontin's FallacyLewontin's FallacyHuman genetic diversity: Lewontin's fallacy is a 2003 paper by A. W. F. Edwards that refers to an argument first made by Richard Lewontin in his 1972 article The apportionment of human diversity, which argued that race is not a valid taxonomic construct for humans. Edwards' paper criticized and...
- Lexis diagramLexis diagramIn demography a Lexis diagram is a two dimensional diagram that is used to represent events that occur to individuals belonging to different cohorts...
- Lexis ratioLexis ratioThe Lexis ratio is used in statistics as a measure which seeks to evaluate differences between the statistical properties of random mechanisms where the outcome is two-valued — for example "success" or "failure", "win" or "lose"...
- Lies, damned lies, and statisticsLies, damned lies, and statistics"Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments...
- Life expectancyLife expectancyLife expectancy is the expected number of years of life remaining at a given age. It is denoted by ex, which means the average number of subsequent years of life for someone now aged x, according to a particular mortality experience...
- Life tableLife tableIn actuarial science, a life table is a table which shows, for each age, what the probability is that a person of that age will die before his or her next birthday...
- Lift (data mining)Lift (data mining)In data mining, lift is a measure of the performance of a model at predicting or classifying cases, measuring against a random choice model.For example, suppose a population has a predicted response rate of 5%, but a certain model has identified a segment with a predicted response rate of 20%...
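With the figures in that example, the segment's lift is 0.20 / 0.05 = 4: the model identifies customers who respond at four times the overall rate.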
- Likelihood functionLikelihood functionIn statistics, a likelihood function is a function of the parameters of a statistical model, defined as follows: the likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values...
- Likelihood principleLikelihood principleIn statistics, the likelihood principle is a controversial principle of statistical inference which asserts that all of the information in a sample is contained in the likelihood function....
- Likelihood-ratio testLikelihood-ratio testIn statistics, a likelihood ratio test is a statistical test used to compare the fit of two models, one of which is a special case of the other . The test is based on the likelihood ratio, which expresses how many times more likely the data are under one model than the other...
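In the nested case, with \Theta_0 \subset \Theta, the statistic is \Lambda = \sup_{\theta \in \Theta_0} L(\theta) \,/\, \sup_{\theta \in \Theta} L(\theta), and under regularity conditions -2 \ln \Lambda is asymptotically \chi^2-distributed with degrees of freedom equal to the difference in the number of free parameters (Wilks' theorem).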
- Likelihood ratios in diagnostic testingLikelihood ratios in diagnostic testingIn evidence-based medicine, likelihood ratios are used for assessing the value of performing a diagnostic test. They use the sensitivity and specificity of the test to determine whether a test result usefully changes the probability that a condition exists.-Calculation:Two versions of the...
- Likert scaleLikert scaleA Likert scale is a psychometric scale commonly involved in research that employs questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, or more accurately the Likert-type scale, even though...
- Lilliefors testLilliefors testIn statistics, the Lilliefors test, named after Hubert Lilliefors, professor of statistics at George Washington University, is an adaptation of the Kolmogorov–Smirnov test...
- Limited dependent variableLimited dependent variableA limited dependent variable is a variable whose range of possible values is "restricted in some important way." In econometrics, the term is often used when estimation of the relationship between the limited dependent variable...
- Limiting density of discrete pointsLimiting density of discrete pointsIn information theory, the limiting density of discrete points is an adjustment to the formula of Claude Elwood Shannon for differential entropy.It was formulated by Edwin Thompson Jaynes to address defects in the initial definition of differential entropy....
- Lincoln indexLincoln IndexThe Lincoln index is a statistical measure used in several fields to estimate the number of cases that have not yet been observed, based on two independent sets of observed cases. It is also sometimes known as the Lincoln-Petersen method.-Applications:...
- Lindeberg's conditionLindeberg's conditionIn probability theory, Lindeberg's condition is a sufficient condition for the central limit theorem to hold for a sequence of independent random variables...
- Lindley equationLindley equationIn probability theory, the Lindley equation, Lindley recursion or Lindley process is a discrete-time stochastic process A_n where n takes integer values and...
- Lindley's paradoxLindley's paradoxLindley's paradox is a counterintuitive situation in statistics in which the Bayesian and frequentist approaches to a hypothesis testing problem give opposite results for certain choices of the prior distribution...
- Line chartLine chartA line chart or line graph is a type of graph, which displays information as a series of data points connected by straight line segments. It is a basic type of chart common in many fields. It is an extension of a scatter graph, and is created by connecting a series of points that represent...
- Line-intercept samplingLine-intercept samplingIn statistics, line-intercept sampling is a method of sampling elements in a region whereby an element is sampled if a chosen line segment, called a “transect”, intersects the element ....
- Linear classifierLinear classifierIn the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics...
- Linear discriminant analysisLinear discriminant analysisLinear discriminant analysis and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events...
- Linear least squares — disambiguation
- Linear least squares (mathematics)
- Linear modelLinear modelIn statistics, the term linear model is used in different ways according to the context. The most common occurrence is in connection with regression models and the term is often taken as synonymous with linear regression model. However the term is also used in time series analysis with a different...
- Linear predictionLinear predictionLinear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples....
- Linear probability model
- Linear regressionLinear regressionIn statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more explanatory variables denoted X. The case of one explanatory variable is called simple regression...
- Linguistic demography
- LISRELLISRELLISREL, an acronym for linear structural relations, is a statistical software package used in structural equation modeling. LISREL was developed in 1970s by Karl Jöreskog, then a scientist at Educational Testing Service in Princeton, NJ, and Dag Sörbom, later both professors of Uppsala University,...
— proprietary statistical software package - List of basic statistics topics — redirects to Outline of statistics
- List of convolutions of probability distributions
- List of graphical methods
- List of information graphics software
- List of probability topics
- List of random number generators
- List of scientific journals in statistics
- List of statistical packages
- List of statisticians
- Listwise deletion
- Little's lawLittle's lawIn the mathematical theory of queues, Little's result, theorem, lemma, law or formula says:It is a restatement of the Erlang formula, based on the work of Danish mathematician Agner Krarup Erlang...
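In the usual notation the law states L = \lambda W, where L is the long-run average number of customers in the system, \lambda the long-run arrival rate, and W the long-run average time a customer spends in the system.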
- Littlewood's lawLittlewood's lawLittlewood's Law states that individuals can expect a "miracle" to happen to them at the rate of about one per month.-History:The law was framed by Cambridge University Professor J. E...
- Ljung–Box testLjung–Box testThe Ljung–Box test is a type of statistical test of whether any of a group of autocorrelations of a time series are different from zero...
- Local convex hull
- Local independenceLocal independenceLocal independence is the underlying assumption of latent variable models.The observed items are conditionally independent of each other given an individual score on the latent variable. This means that the latent variable explains why the observed items are related to another...
- Local martingaleLocal martingaleIn mathematics, a local martingale is a type of stochastic process, satisfying the localized version of the martingale property. Every martingale is a local martingale; every bounded local martingale is a martingale; however, in general a local martingale is not a martingale, because its...
- Local regressionLocal regressionLOESS, or LOWESS , is one of many "modern" modeling methods that build on "classical" methods, such as linear and nonlinear least squares regression. Modern regression methods are designed to address situations in which the classical procedures do not perform well or cannot be effectively applied...
- Location estimation redirects to Location parameterLocation parameterIn statistics, a location family is a class of probability distributions that is parametrized by a scalar- or vector-valued parameter μ, which determines the "location" or shift of the distribution...
- Location estimation in sensor networksLocation estimation in sensor networksLocation estimation in wireless sensor networks is the problem of estimating the location of an object from a set of noisy measurements, when the measurements are acquired in a distributedmanner by a set of sensors.-Motivation:...
- Location parameterLocation parameterIn statistics, a location family is a class of probability distributions that is parametrized by a scalar- or vector-valued parameter μ, which determines the "location" or shift of the distribution...
- Location testLocation testA location test is a statistical hypothesis test that compares the location parameter of a statistical population to a given constant, or that compares the location parameters of two statistical populations to each other...
- Location-scale familyLocation-scale familyIn probability theory, especially as that field is used in statistics, a location-scale family is a family of univariate probability distributions parametrized by a location parameter and a non-negative scale parameter; if X is any random variable whose probability distribution belongs to such a...
- Local asymptotic normalityLocal asymptotic normalityIn statistics, local asymptotic normality is a property of a sequence of statistical models, which allows this sequence to be asymptotically approximated by a normal location model, after a rescaling of the parameter...
- Locality (statistics)
- Loess curve redirects to Local regressionLocal regressionLOESS, or LOWESS , is one of many "modern" modeling methods that build on "classical" methods, such as linear and nonlinear least squares regression. Modern regression methods are designed to address situations in which the classical procedures do not perform well or cannot be effectively applied...
- Log-Cauchy distributionLog-Cauchy distributionIn probability theory, a log-Cauchy distribution is a probability distribution of a random variable whose logarithm is distributed in accordance with a Cauchy distribution...
- Log-Laplace distributionLog-Laplace distributionIn probability theory and statistics, the log-Laplace distribution is the probability distribution of a random variable whose logarithm has a Laplace distribution. If X has a Laplace distribution with parameters μ and b, then Y = eX has a log-Laplace distribution...
- Log-normal distribution
- Log-linear model
- Log-linear modeling
- Log-log graphLog-log graphIn science and engineering, a log-log graph or log-log plot is a two-dimensional graph of numerical data that uses logarithmic scales on both the horizontal and vertical axes...
- Log-logistic distributionLog-logistic distributionIn probability and statistics, the log-logistic distribution is a continuous probability distribution for a non-negative random variable. It is used in survival analysis as a parametric model for events whose rate increases initially and decreases later, for example mortality from cancer following...
- Logarithmic distribution
- Logarithmic meanLogarithmic meanIn mathematics, the logarithmic mean is a function of two non-negative numbers which is equal to their difference divided by the logarithm of their quotient...
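For distinct positive numbers x and y the logarithmic mean is L(x, y) = \frac{y - x}{\ln y - \ln x}, extended by continuity so that L(x, x) = x.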
- Logistic distribution
- Logistic functionLogistic functionA logistic function or logistic curve is a common sigmoid curve, given its name in 1844 or 1845 by Pierre François Verhulst who studied it in relation to population growth. It can model the "S-shaped" curve of growth of some population P...
- Logistic regressionLogistic regressionIn statistics, logistic regression is used for prediction of the probability of occurrence of an event by fitting data to a logistic curve. It is a generalized linear model used for binomial regression...
- LogitLogitThe logit function is the inverse of the sigmoidal "logistic" function used in mathematics, especially in statistics.Log-odds and logit are synonyms.-Definition:The logit of a number p between 0 and 1 is given by the formula:...
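For 0 < p < 1 the logit is the log-odds transform \operatorname{logit}(p) = \ln\frac{p}{1-p}, whose inverse is the logistic function p = 1/(1 + e^{-x}).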
- Logit analysis in marketingLogit analysis in marketingLogit analysis is a statistical technique used by marketers to assess the scope of customer acceptance of a product, particularly a new product. It attempts to determine the intensity or magnitude of customers' purchase intentions and translates that into a measure of actual buying behaviour...
- Logit-normal distribution
- Lognormal distribution
- Logrank testLogrank testIn statistics, the logrank test is a hypothesis test to compare the survival distributions of two samples. It is a nonparametric test and appropriate to use when the data are right skewed and censored...
- Lomax distribution
- Long-range dependencyLong-range dependencyLong-range dependency is a phenomenon that may arise in the analysis of spatial or time series data. It relates to the rate of decay of statistical dependence, with the implication that this decays more slowly than an exponential decay, typically a power-like decay...
- Long TailLong tailLong tail may refer to:*The Long Tail, a consumer demographic in business*Power law's long tail, a statistics term describing certain kinds of distribution*Long-tail boat, a type of watercraft native to Southeast Asia...
- Long-tail trafficLong-tail trafficThe analysis of long-tail traffic draws on a range of tools from different disciplines that may be used in the important science of determining the probability of rare events....
- Longitudinal studyLongitudinal studyA longitudinal study is a correlational research study that involves repeated observations of the same variables over long periods of time — often many decades. It is a type of observational study. Longitudinal studies are often used in psychology to study developmental trends across the...
- Lorenz curveLorenz curveIn economics, the Lorenz curve is a graphical representation of the cumulative distribution function of the empirical probability distribution of wealth; it is a graph showing the proportion of the distribution assumed by the bottom y% of the values...
- Loss functionLoss functionIn statistics and decision theory a loss function is a function that maps an event onto a real number intuitively representing some "cost" associated with the event. Typically it is used for parameter estimation, and the event in question is some function of the difference between estimated and...
- Lot quality assurance samplingLot Quality Assurance SamplingLot quality assurance sampling is a simple, low-cost random sampling methodology developed in the 1920s to control the quality of output in industrial production processes....
- Lotka's law
- Low birth weight paradoxLow birth weight paradoxThe low birth weight paradox is an apparently paradoxical observation relating to the birth weights and mortality of children born to tobacco smoking mothers. Low birth weight children born to smoking mothers have a lower infant mortality rate than the low birth weight children of non-smokers...
- Lucia de BerkLucia de BerkLucia de Berk, often called Lucia de B. or Lucy de B., is a Dutch licensed paediatric nurse who was the subject of a miscarriage of justice. She was sentenced to life imprisonment in 2003 for four murders and three attempted murders of patients in her care...
– prob/stats related court case - Lukacs's proportion-sum independence theoremLukacs's proportion-sum independence theoremIn statistics, Lukacs's proportion-sum independence theorem is a result that is used when studying proportions, in particular the Dirichlet distribution...
- LumpabilityLumpabilityIn probability theory, lumpability is a method for reducing the size of the state space of some continuous-time Markov chains, first published by Kemeny and Snell.-Definition:...
- Lusser's lawLusser's LawLusser's law, named after Robert Lusser, is a prediction of reliability. It is also called the "probability product law of series components". It states that the reliability of a series system is equal to the product of the reliability of its component subsystems, if their...
- Lyapunov's central limit theorem
M
- M/G/1 model
- M/M/1 modelM/M/1 modelIn queueing theory, a discipline within the mathematical theory of probability, an M/M/1 queue represents the queue length in a system having a single server, where arrivals are determined by a Poisson process and job service times have an exponential distribution. The model name is written in...
- M/M/c modelM/M/c modelIn the mathematical theory of random processes, the M/M/c queue is a multi-server queue model. It is a generalisation of the M/M/1 queue.Following Kendall's notation it indicates a system where:*Arrivals are a Poisson process...
- M-estimatorM-estimatorIn statistics, M-estimators are a broad class of estimators, which are obtained as the minima of sums of functions of the data. Least-squares estimators and many maximum-likelihood estimators are M-estimators. The definition of M-estimators was motivated by robust statistics, which contributed new...
- Redescending M-estimatorRedescending M-estimatorIn statistics, Redescending M-estimators are Ψ-type M-estimators which have Ψ functions that are non-decreasing near the origin, but decreasing toward 0 far from the origin...
- M-separationM-separationIn statistics, m-separation is a measure of disconnectedness in ancestral graphs and a generalization of d-separation for directed acyclic graphs. It is the opposite of m-connectedness....
- Machine learningMachine learningMachine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases...
- Mahalanobis distanceMahalanobis distanceIn statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on correlations between variables by which different patterns can be identified and analyzed. It gauges similarity of an unknown sample set to a known one. It differs from Euclidean...
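For an observation x, mean vector \mu and covariance matrix \Sigma, the distance is D_M(x) = \sqrt{(x - \mu)^{\mathsf{T}} \Sigma^{-1} (x - \mu)}; with \Sigma = I it reduces to Euclidean distance.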
- Main effectMain effectIn the design of experiments and analysis of variance, a main effect is the effect of an independent variable on a dependent variable averaging across the levels of any other independent variables...
- Mallows' CpMallows' CpIn statistics, Mallows' Cp, named for Colin L. Mallows, is used to assess the fit of a regression model that has been estimated using ordinary least squares. It is applied in the context of model selection, where a number of predictor variables are available for predicting some outcome, and the...
- MANCOVAMANCOVAMultivariate analysis of covariance is an extension of analysis of covariance methods to cover cases where there is more than one dependent variable and where the dependent variables cannot simply be combined....
- Manhattan plotManhattan plotA Manhattan plot is a type of scatter plot, usually used to display data with a large number of data-points - many of non-zero amplitude, and with a distribution of higher-magnitude values, for instance in genome-wide association studies...
- Mann–Whitney U
- MANOVAMANOVAMultivariate analysis of variance is a generalized form of univariate analysis of variance . It is used when there are two or more dependent variables. It helps to answer : 1. do changes in the independent variable have significant effects on the dependent variables; 2. what are the interactions...
- Mantel testMantel testThe Mantel test, named after Nathan Mantel, is a statistical test of the correlation between two matrices. The matrices must be of the same rank, in most applications they are matrices of interrelations between the same vectors of objects....
- MAP estimator — redirects to Maximum a posteriori estimation
- Marchenko–Pastur distributionMarchenko–Pastur distributionIn random matrix theory, the Marchenko–Pastur distribution, or Marchenko–Pastur law, describes the asymptotic behavior of singular values of large rectangular random matrices...
- Marcinkiewicz–Zygmund inequalityMarcinkiewicz–Zygmund inequalityIn mathematics, the Marcinkiewicz–Zygmund inequality, named after Józef Marcinkiewicz and Antoni Zygmund, gives relations between moments of a collection of independent random variables...
- Marcum Q-function
- Margin of errorMargin of errorThe margin of error is a statistic expressing the amount of random sampling error in a survey's results. The larger the margin of error, the less faith one should have that the poll's reported results are close to the "true" figures; that is, the figures for the whole population...
- Marginal distributionMarginal distributionIn probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. The term marginal variable is used to refer to those variables in the subset of variables being retained...
- Marginal likelihoodMarginal likelihoodIn statistics, a marginal likelihood function, or integrated likelihood, is a likelihood function in which some parameter variables have been marginalised...
- Marginal modelMarginal modelIn statistics, marginal models are a technique for obtaining regression estimates in multilevel modeling, also called hierarchical linear models....
- Marginal variable — redirects to Marginal distributionMarginal distributionIn probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. The term marginal variable is used to refer to those variables in the subset of variables being retained...
- Mark and recaptureMark and recaptureMark and recapture is a method commonly used in ecology to estimate population size. This method is most valuable when a researcher fails to detect all individuals present within a population of interest every time that researcher visits the study area...
- Markov additive process
- Markov blanketMarkov blanketIn machine learning, the Markov blanket for a node A in a Bayesian network is the set of nodes \partial A composed of A's parents, its children, and its children's other parents. In a Markov network, the Markov blanket of a node is its set of neighbouring nodes...
- Markov chainMarkov chainA Markov chain, named after Andrey Markov, is a mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process characterized as memoryless: the next state depends only on the current state and not on the...
- Markov chain geostatisticsMarkov chain geostatisticsMarkov chain geostatistics refer to the Markov chain models, simulation algorithms and associated spatial correlation measures based on the Markov chain random field theory, which extends a single Markov chain into a multi-dimensional field for geostatistical modeling. A Markov chain random field...
- Markov chain mixing timeMarkov chain mixing timeIn probability theory, the mixing time of a Markov chain is the time until the Markov chain is "close" to its steady state distribution.More precisely, a fundamental result about Markov chains is that a finite state irreducible aperiodic chain has a unique stationary distribution π and,...
- Markov chain Monte CarloMarkov chain Monte CarloMarkov chain Monte Carlo methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample of the...
- Markov decision processMarkov decision processMarkov decision processes , named after Andrey Markov, provide a mathematical framework for modeling decision-making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying a wide range of optimization problems solved via...
- Markov information sourceMarkov information sourceIn mathematics, a Markov information source, or simply, a Markov source, is an information source whose underlying dynamics are given by a stationary finite Markov chain.-Formal definition:...
- Markov kernelMarkov kernelIn probability theory, a Markov kernel is a map that plays the role, in the general theory of Markov processes, that the transition matrix does in the theory of Markov processes with a finite state space.- Formal definition :...
- Markov logic networkMarkov logic networkA Markov logic network is a probabilistic logic which applies the ideas of a Markov network to first-order logic, enabling uncertain inference...
- Markov modelMarkov modelIn probability theory, a Markov model is a stochastic model that assumes the Markov property. Generally, this assumption enables reasoning and computation with the model that would otherwise be intractable.-Introduction:...
- Markov networkMarkov networkA Markov random field, Markov network or undirected graphical model is a set of variables having a Markov property described by an undirected graph. A Markov random field is similar to a Bayesian network in its representation of dependencies...
- Markov processMarkov processIn probability theory and statistics, a Markov process, named after the Russian mathematician Andrey Markov, is a time-varying random phenomenon for which a specific property holds...
- Markov propertyMarkov propertyIn probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process. It was named after the Russian mathematician Andrey Markov....
- Markov random field
- Markov's inequalityMarkov's inequalityIn probability theory, Markov's inequality gives an upper bound for the probability that a non-negative function of a random variable is greater than or equal to some positive constant...
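In its simplest form: if X \geq 0 and a > 0, then \Pr(X \geq a) \leq \operatorname{E}[X]/a.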
- Markovian arrival processesMarkovian arrival processesIn queueing theory, Markovian arrival processes are used to model the arrival of customers to a queue. Some of the most common include the Poisson process, Markov arrival process and the batch Markov arrival process.-Background:...
- Marsaglia polar methodMarsaglia polar methodThe polar method is a pseudo-random number sampling method for generating a pair of independent standard normal random variables...
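A minimal Python sketch of the rejection step described above (the function name normal_pair is illustrative, not taken from any particular library):

    import math
    import random

    def normal_pair():
        """Return two independent standard normal variates via the polar method."""
        while True:
            u = 2.0 * random.random() - 1.0   # uniform on (-1, 1)
            v = 2.0 * random.random() - 1.0
            s = u * u + v * v
            if 0.0 < s < 1.0:                 # accept only points strictly inside the unit circle
                factor = math.sqrt(-2.0 * math.log(s) / s)
                return u * factor, v * factor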
- Martingale (probability theory)Martingale (probability theory)In probability theory, a martingale is a model of a fair game where no knowledge of past events can help to predict future winnings. In particular, a martingale is a sequence of random variables for which, at a particular time in the realized sequence, the expectation of the next value in the...
- Martingale difference sequenceMartingale difference sequenceIn probability theory, a martingale difference sequence is related to the concept of the martingale. A stochastic series Y is an MDS if its expectation with respect to past values of another stochastic series X is zero...
- Martingale representation theoremMartingale representation theoremIn probability theory, the martingale representation theorem states that a random variable which is measurable with respect to the filtration generated by a Brownian motion can be written in terms of an Itô integral with respect to this Brownian motion....
- Master equationMaster equationIn physics and chemistry and related fields, master equations are used to describe the time-evolution of a system that can be modelled as being in exactly one of countable number of states at any given time, and where switching between states is treated probabilistically...
- Matched filterMatched filterIn telecommunications, a matched filter is obtained by correlating a known signal, or template, with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template...
- Matching pursuitMatching pursuitMatching pursuit is a type of numerical technique which involves finding the "best matching" projections of multidimensional data onto an over-complete dictionary D...
- Matching (statistics)Matching (statistics)Matching is a statistical technique which is used to evaluate the effect of a treatment by comparing the treated and the non-treated in non experimental design . People use this technique with observational data...
- Matérn covariance functionMatérn covariance functionIn statistics, the Matérn covariance is a covariance function used in spatial statistics, geostatistics, machine learning, image analysis, and other applications of multivariate statistical analysis on metric spaces...
- MathematicaMathematicaMathematica is a computational software program used in scientific, engineering, and mathematical fields and other areas of technical computing...
– software - Mathematical biologyMathematical biologyMathematical and theoretical biology is an interdisciplinary scientific research field with a range of applications in biology, medicine and biotechnology...
- Mathematical modelling in epidemiologyMathematical modelling in epidemiologyIt is possible to mathematically model the progress of most infectious diseases to discover the likely outcome of an epidemic or to help manage them by vaccination...
- Mathematical modelling of infectious disease
- Mathematical statisticsMathematical statisticsMathematical statistics is the study of statistics from a mathematical standpoint, using probability theory as well as other branches of mathematics such as linear algebra and analysis...
- Matthews correlation coefficientMatthews Correlation CoefficientThe Matthews correlation coefficient is used in machine learning as a measure of the quality of binary classifications. It takes into account true and false positives and negatives and is generally regarded as a balanced measure which can be used even if the classes are of very different sizes...
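In terms of the counts of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN): \mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}.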
- Matrix normal distributionMatrix normal distributionThe matrix normal distribution is a probability distribution that is a generalization of the normal distribution to matrix-valued random variables.- Definition :...
- Matrix population modelsMatrix population modelsPopulation models are used in population ecology to model the dynamics of wildlife or human populations. Matrix population models are a specific type of population model that uses matrix algebra...
- Mauchly's sphericity testMauchly's sphericity testMauchly's sphericity test is a statistical test used to validate repeated measures factor ANOVAs. The test was introduced by ENIAC co-inventor John Mauchly in 1940.-What is sphericity?:...
- Maximal ergodic theorem
- Maximum a posteriori estimation
- Maximum entropy classifier redirects to Logistic regressionLogistic regressionIn statistics, logistic regression is used for prediction of the probability of occurrence of an event by fitting data to a logistic curve. It is a generalized linear model used for binomial regression...
- Maximum entropy Markov modelMaximum entropy Markov modelIn machine learning, a maximum-entropy Markov model , or conditional Markov model , is a graphical model for sequence labeling that combines features of hidden Markov models and maximum entropy models...
- Maximum entropy method redirects to Principle of maximum entropyPrinciple of maximum entropyIn Bayesian probability, the principle of maximum entropy is a postulate which states that, subject to known constraints , the probability distribution which best represents the current state of knowledge is the one with largest entropy.Let some testable information about a probability distribution...
- Maximum entropy probability distributionMaximum entropy probability distributionIn statistics and information theory, a maximum entropy probability distribution is a probability distribution whose entropy is at least as great as that of all other members of a specified class of distributions....
- Maximum entropy spectral estimationMaximum entropy spectral estimationThe maximum entropy method applied to spectral density estimation. The overall idea is that the maximum entropy rate stochastic process that satisfies the given constant autocorrelation and variance constraints, is a linear Gauss-Markov process with i.i.d...
- Maximum likelihoodMaximum likelihoodIn statistics, maximum-likelihood estimation is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters....
- Maximum likelihood sequence estimationMaximum Likelihood Sequence EstimationMaximum likelihood sequence estimation is a mathematical algorithm to extract useful data out of a noisy data stream.-Theory:For an optimized detector for digital signals the priority is not to reconstruct the transmitter signal, but it should do a best estimation of the transmitted data with the...
- Maximum parsimonyMaximum parsimonyParsimony is a non-parametric statistical method commonly used in computational phylogenetics for estimating phylogenies. Under parsimony, the preferred phylogenetic tree is the tree that requires the least evolutionary change to explain some observed data....
- Maximum spacing estimationMaximum spacing estimationIn statistics, maximum spacing estimation , or maximum product of spacing estimation , is a method for estimating the parameters of a univariate statistical model...
- Maxwell speed distributionMaxwell Speed DistributionClassically, an ideal gas' molecules bounce around with somewhat arbitrary velocities, never interacting with each other. In reality, however, an ideal gas is subjected to intermolecular forces. It is to be noted that the aforementioned classical treatment of an ideal gas is only useful when...
- Maxwell–Boltzmann distribution
- Maxwell’s theorem
- MCARMCARIn statistical analysis, data-values in a data set are missing completely at random if the events that lead to any particular data-item being missing are independent both of observable variables and of unobservable parameters of interest....
(missing completely at random) - McCullagh's parametrization of the Cauchy distributions
- McDiarmid's inequality
- McDonald–Kreitman test — statistical genetics
- McNemar's testMcNemar's testIn statistics, McNemar's test is a non-parametric method used on nominal data. It is applied to 2 × 2 contingency tables with a dichotomous trait, with matched pairs of subjects, to determine whether the row and column marginal frequencies are equal...
- Meadow's lawMeadow's lawMeadow's Law was a precept much in use until recently in the field of child protection, specifically by those investigating cases of multiple cot or crib death — SIDS — within a single family.-History:...
- MeanMeanIn statistics, mean has two related meanings:* the arithmetic mean .* the expected value of a random variable, which is also called the population mean....
- Mean – see also expected valueExpected valueIn probability theory, the expected value of a random variable is the weighted average of all possible values that this random variable can take on...
- Mean absolute errorMean absolute errorIn statistics, the mean absolute error is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error is given by...
- Mean absolute percentage errorMean Absolute Percentage ErrorMean absolute percentage error is measure of accuracy in a fitted time series value in statistics, specifically trending. It usually expresses accuracy as a percentage, and is defined by the formula:...
- Mean absolute scaled errorMean absolute scaled errorIn statistics, the mean absolute scaled error is a measure of the accuracy of forecasts . It was proposed in 2006 by Australian statistician Rob Hyndman, who described it as a "generally applicable measurement of forecast accuracy without the problems seen in the other measurements."The mean...
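For forecasts \hat{y}_t of actual values y_t, t = 1, \ldots, n, these accuracy measures take the forms \mathrm{MAE} = \frac{1}{n}\sum_t |y_t - \hat{y}_t| and \mathrm{MAPE} = \frac{100}{n}\sum_t \left|\frac{y_t - \hat{y}_t}{y_t}\right|; in the non-seasonal case the MASE scales the absolute errors by the in-sample mean absolute error of the naive one-step forecast, \mathrm{MASE} = \frac{\frac{1}{n}\sum_t |y_t - \hat{y}_t|}{\frac{1}{n-1}\sum_{t=2}^{n} |y_t - y_{t-1}|}.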
- Mean and predicted responseMean and predicted responseIn linear regression mean response and predicted response are values of the dependent variable calculated from the regression parameters and a given value of the independent variable...
- Mean deviation
- Mean differenceMean differenceThe mean difference is a measure of statistical dispersion equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean difference, which is the mean difference divided by the arithmetic mean...
- Mean integrated squared error
- Mean of circular quantitiesMean of circular quantitiesIn mathematics, a mean of circular quantities is a mean which is suited for quantities like angles, daytimes, and fractional parts of real numbers. This is necessary since most of the usual means fail on circular quantities...
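For angles \theta_1, \ldots, \theta_n the circular mean is \bar\theta = \operatorname{atan2}\!\left(\tfrac{1}{n}\sum_i \sin\theta_i,\; \tfrac{1}{n}\sum_i \cos\theta_i\right), which averages the corresponding unit vectors rather than the raw angles.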
- Mean percentage errorMean Percentage ErrorIn statistics, the mean percentage error is the computed average of percentage errors by which estimated forecasts differ from actual values of the quantity being forecast.Formula for mean percentage error calculation is:...
- Mean preserving spread
- Mean reciprocal rankMean reciprocal rankMean reciprocal rank is a statistic for evaluating any process that produces a list of possible responses to a query, ordered by probability of correctness. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer...
- Mean signed difference
- Mean square quantization errorMean square quantization errorMean square quantization error is a figure of merit for the process of analog to digital conversion.As the input is varied, the input's value is recorded when the digital output changes. For each digital output, the input's difference from ideal is normalized to the value of the least significant...
- Mean square weighted deviationMean square weighted deviationMean square weighted deviation is used extensively in geochronology, the science of obtaining information about the time of formation of, for example, rocks, minerals, bones, corals, or charcoal, or the time at which particular processes took place in a rock mass, for example recrystallization and...
- Mean squared errorMean squared errorIn statistics, the mean squared error of an estimator is one of many ways to quantify the difference between values implied by a kernel density estimator and the true values of the quantity being estimated. MSE is a risk function, corresponding to the expected value of the squared error loss or...
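For an estimator \hat\theta of a parameter \theta, \mathrm{MSE}(\hat\theta) = \operatorname{E}\bigl[(\hat\theta - \theta)^2\bigr] = \operatorname{Var}(\hat\theta) + \bigl(\operatorname{Bias}(\hat\theta)\bigr)^{2}.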
- Mean squared prediction error
- Mean time between failures
- Mean-reverting process — redirects to Ornstein–Uhlenbeck process
- Mean value analysisMean value analysisIn queueing theory, a specialty within the mathematical theory of probability, mean value analysis is a technique for computing expected queue lengths in equilibrium for a closed separable system of queues...
- Measurement, level of — see level of measurementLevel of measurementThe "levels of measurement", or scales of measure are expressions that typically refer to the theory of scale types developed by the psychologist Stanley Smith Stevens. Stevens proposed his theory in a 1946 Science article titled "On the theory of scales of measurement"...
- MedCalcMedCalcMedCalc is a statistical software package designed for the biomedical sciences. It has an integrated spreadsheet for data input and can import files in several formats...
– software - MedianMedianIn probability theory and statistics, a median is described as the numerical value separating the higher half of a sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to...
- Median absolute deviationMedian absolute deviationIn statistics, the median absolute deviation is a robust measure of the variability of a univariate sample of quantitative data. It can also refer to the population parameter that is estimated by the MAD calculated from a sample....
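For a sample x_1, \ldots, x_n, \mathrm{MAD} = \operatorname{median}_i\bigl(|x_i - \operatorname{median}_j(x_j)|\bigr); for normally distributed data the standard deviation can be estimated as \sigma \approx 1.4826 \cdot \mathrm{MAD}.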
- Median polishMedian polishThe median polish is an exploratory data analysis procedure proposed by the statistician John Tukey. It finds an additively-fit model for data in a two-way layout table of the form row effect + column effect + overall median.-References:* Frederick Mosteller and John Tukey . "Data Analysis and...
- Median testMedian testIn statistics, Mood's median test is a special case of Pearson's chi-squared test. It is a nonparametric test that tests the null hypothesis that the medians of the populations from which two samples are drawn are identical...
- Mediation (statistics)Mediation (Statistics)In statistics, a mediation model is one that seeks to identify and explicate the mechanism that underlies an observed relationship between an independent variable and a dependent variable via the inclusion of a third explanatory variable, known as a mediator variable...
- Medical statisticsMedical statisticsMedical statistics deals with applications of statistics to medicine and the health sciences, including epidemiology, public health, forensic medicine, and clinical research...
- MedoidMedoidMedoids are representative objects of a data set or a cluster with a data set whose average dissimilarity to all the objects in the cluster is minimal. Medoids are similar in concept to means or centroids, but medoids are always members of the data set...
- MemorylessnessMemorylessnessIn probability and statistics, memorylessness is a property of certain probability distributions: the exponential distributions of non-negative real numbers and the geometric distributions of non-negative integers....
- Mendelian randomizationMendelian randomizationIn epidemiology, Mendelian randomization is a method of using measured variation in genes of known function to examine the causal effect of a modifiable exposure on disease in non-experimental studies...
- Mentor (statistics)Mentor (statistics)Mentor is a flexible and sophisticated statistical analysis system produced by CfMC. It specializes in the tabulation and graphical display of market and opinion research data, and is integrated with their Survent data collection software....
– software - Meta-analysisMeta-analysisIn statistics, a meta-analysis combines the results of several studies that address a set of related research hypotheses. In its simplest form, this is normally by identification of a common measure of effect size, for which a weighted average might be the output of a meta-analyses. Here the...
- Meta-analytic thinking
- Method of moments (statistics)
- Method of simulated momentsMethod of simulated momentsIn econometrics, the method of simulated moments is a structural estimation technique introduced by Daniel McFadden. It extends the generalized method of moments to cases where theoretical moment functions cannot be evaluated directly, such as when moment functions involve high-dimensional...
- Method of supportMethod of supportIn statistics, the method of support is a technique that is used to make inferences from datasets.According to A. W. F. Edwards, the method of support aims to make inferences about unknown parameters in terms of the relative support, or log likelihood, induced by a set of data for a particular...
- Metropolis–Hastings algorithm
- Mexican paradoxMexican paradoxThe Mexican paradox is the observation that the Mexican people exhibit a surprisingly low incidence of low birth weight, contrary to what would be expected from their socioeconomic status...
- Microdata (statistics)Microdata (statistics)In the study of survey and census data, microdata is information at the level of individual respondents. For instance, a national census might collect age, home address, educational level, employment status, and many other variables, recorded separately for every person who responds; this is...
- MidhingeMidhingeIn statistics, the midhinge is the average of the first and third quartiles and is thus a measure of location.Equivalently, it is the 25% trimmed mid-range; it is an L-estimator....
- Mid-range
- MinHashMinHashIn computer science, MinHash is a technique for quickly estimating how similar two sets are...
- MinimaxMinimaxMinimax is a decision rule used in decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case scenario. Alternatively, it can be thought of as maximizing the minimum gain...
- Minimax estimator
- Minimisation (clinical trials)
- Minimum distance estimationMinimum distance estimationMinimum distance estimation is a statistical method for fitting a mathematical model to data, usually the empirical distribution.-Definition:...
- Minimum mean square error
- Minimum-variance unbiased estimatorMinimum-variance unbiased estimatorIn statistics a uniformly minimum-variance unbiased estimator or minimum-variance unbiased estimator is an unbiased estimator that has lower variance than any other unbiased estimator for all possible values of the parameter.The question of determining the UMVUE, if one exists, for a particular...
- Minimum viable populationMinimum Viable PopulationMinimum viable population is a lower bound on the population of a species, such that it can survive in the wild. This term is used in the fields of biology, ecology, and conservation biology...
- MinitabMinitabMinitab is a statistics package. It was developed at the Pennsylvania State University by researchers Barbara F. Ryan, Thomas A. Ryan, Jr., and Brian L. Joiner in 1972...
- MINQUEMinqueIn statistics, the theory of minimum norm quadratic unbiased estimation was developed by C.R. Rao. Its application was originally to the estimation of variance components in random effects models. The theory involves three stages:...
– minimum norm quadratic unbiased estimation - Missing completely at random
- Missing data
- Missing values — redirects to Missing data
- Mittag–Leffler distribution
- Mixed logitMixed logitMixed logit is a fully general statistical model for examining discrete choices. The motivation for the mixed logit model arises from the limitations of the standard logit model...
- Misuse of statisticsMisuse of statisticsA misuse of statistics occurs when a statistical argument asserts a falsehood. In some cases, the misuse may be accidental. In others, it is purposeful and for the gain of the perpetrator. When the statistical reason involved is false or misapplied, this constitutes a statistical fallacy. The false...
- Mixed data samplingMixed data samplingMixed data sampling is an econometric regression or filtering method developed by Ghysels et al. A simple regression example has the regressor appearing at a higher frequency than the regressand:...
- Mixed-design analysis of varianceMixed-design analysis of varianceIn statistics, a mixed-design analysis of variance model is used to test for differences between two or more independent groups whilst subjecting participants to repeated measures...
- Mixed modelMixed modelA mixed model is a statistical model containing both fixed effects and random effects, that is mixed effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences....
- Mixing (mathematics)Mixing (mathematics)In mathematics, mixing is an abstract concept originating from physics: the attempt to describe the irreversible thermodynamic process of mixing in the everyday world: mixing paint, mixing drinks, etc....
- Mixture distribution
- Mixture modelMixture modelIn statistics, a mixture model is a probabilistic model for representing the presence of sub-populations within an overall population, without requiring that an observed data-set should identify the sub-population to which an individual observation belongs...
- Mixture (probability)Mixture (probability)In probability theory and statistics, a mixture is a combination of two or more probability distributions. The concept arises in two contexts:* A mixture defining a new probability distribution from some existing ones, as in a mixture density...
- MLwiNMLwiNMLwiN is a statistical software package for fitting multilevel models. It uses both maximum likelihood estimation and Markov Chain Monte Carlo methods...
- Mode (statistics)Mode (statistics)In statistics, the mode is the value that occurs most frequently in a data set or a probability distribution. In some fields, notably education, sample data are often called scores, and the sample mode is known as the modal score....
- Model output statisticsModel output statisticsModel Output Statistics is a widely used statistical technique that forms the backbone of modern weather forecasting. The technique, pioneered in the 1960s and early 1970s, is used to post-process output from numerical weather forecast models...
- Model selectionModel selectionModel selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered...
- Moderator variable redirects to Moderation (statistics)Moderation (statistics)In statistics, moderation occurs when the relationship between two variables depends on a third variable. The third variable is referred to as the moderator variable or simply the moderator...
- Modifiable areal unit problemModifiable Areal Unit ProblemThe modifiable areal unit problem is a source of statistical bias that can radically affect the results of statistical hypothesis tests. It affects results when point-based measures of spatial phenomena are aggregated into districts. The resulting summary values are influenced by the choice of...
- Moffat distributionMoffat distributionThe Moffat distribution, named after the physicist Anthony Moffat, is a continuous probability distribution based upon the Lorentzian distribution...
- Moment (mathematics)Moment (mathematics)In mathematics, a moment is, loosely speaking, a quantitative measure of the shape of a set of points. The "second moment", for example, is widely used and measures the "width" of a set of points in one dimension or in higher dimensions measures the shape of a cloud of points as it could be fit by...
- Moment-generating functionMoment-generating functionIn probability theory and statistics, the moment-generating function of any random variable is an alternative definition of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or...
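It is defined as M_X(t) = \operatorname{E}\bigl[e^{tX}\bigr] wherever the expectation exists, and when it exists in a neighbourhood of 0 the moments are recovered as \operatorname{E}[X^{n}] = M_X^{(n)}(0).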
- Moments, method of — see method of moments (statistics)
- Moment problem
- Monotone likelihood ratio
- Monte Carlo integration
- Monte Carlo methodMonte Carlo methodMonte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results. Monte Carlo methods are often used in computer simulations of physical and mathematical systems... (see the short sketch at the end of this section)
- Monte Carlo method for photon transportMonte Carlo method for photon transportModeling photon propagation with Monte Carlo methods is a flexible yet rigorous approach to simulate photon transport. In the method, local rules of photon transport are expressed as probability distributions which describe the step size of photon movement between sites of photon-tissue interaction...
- Monte Carlo methods for option pricing
- Monte Carlo methods in financeMonte Carlo methods in financeMonte Carlo methods are used in finance and mathematical finance to value and analyze instruments, portfolios and investments by simulating the various sources of uncertainty affecting their value, and then determining their average value over the range of resultant outcomes. This is usually done...
- Monte Carlo molecular modelingMonte Carlo molecular modelingMonte Carlo molecular modeling is the application of Monte Carlo methods to molecular problems. These problems can also be modeled by the molecular dynamics method. The difference is that this approach relies on statistical mechanics rather than molecular dynamics. Instead of trying to reproduce...
- Moral graphMoral graphA moral graph is a concept in graph theory, used to find the equivalent undirected form of a directed acyclic graph. It is a key step of the junction tree algorithm, used in belief propagation on graphical models....
- Moran processMoran processA Moran process, named after Patrick Moran, is a stochastic process used in biology to describe finite populations. It can be used to model variety-increasing processes such as mutation as well as variety-reducing effects such as genetic drift and natural selection...
- Moran's IMoran's IIn statistics, Moran's I is a measure of spatial autocorrelation developed by Patrick A.P. Moran. Spatial autocorrelation is characterized by a correlation in a signal among nearby locations in space. Spatial autocorrelation is more complex than one-dimensional autocorrelation because spatial...
- Morisita's overlap index
- Morris methodMorris methodIn applied statistics, the Morris method for global sensitivity analysis is a so-called one-step-at-a-time method , meaning that in each run only one input parameter is given a new value. It facilitates a global sensitivity analysis by making a number r of local changes at different points x of the...
- Mortality rateMortality rateMortality rate is a measure of the number of deaths in a population, scaled to the size of that population, per unit time...
- Most probable numberMost probable numberThe most probable number method, otherwise known as the method of Poisson zeroes, is a method of getting quantitative data on concentrations of discrete items from positive/negative data....
- Moving average
- Moving average modelMoving average modelIn time series analysis, the moving-average model is a common approach for modeling univariate time series. The notation MA refers to the moving average model of order q:...
- Moving average representation — redirects to Wold's theorem
- Moving least squaresMoving least squaresMoving least squares is a method of reconstructing continuous functions from a set of unorganized point samples via the calculation of a weighted least squares measure biased towards the region around the point at which the reconstructed value is requested....
- Multi-armed banditMulti-armed banditIn statistics, particularly in the design of sequential experiments, a multi-armed bandit takes its name from a traditional slot machine . Multiple levers are considered in the motivating applications in statistics. When pulled, each lever provides a reward drawn from a distribution associated...
- Multi-vari chartMulti-vari chartIn quality control, multi-vari charts are a visual way of presenting variability through a series of charts. The content and format of the charts have evolved over time...
- Multiclass classificationMulticlass classificationIn machine learning, multiclass or multinomial classification is the problem of classifying instances into more than two classes.While some classification algorithms naturally permit the use of more than two classes, others are by nature binary algorithms; these can, however, be turned into...
- Multiclass LDA (Linear discriminant analysis) — redirects to Linear discriminant analysisLinear discriminant analysisLinear discriminant analysis and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events...
- MulticollinearityMulticollinearityMulticollinearity is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated. In this situation the coefficient estimates may change erratically in response to small changes in the model or the data...
- Multidimensional analysisMultidimensional analysisIn statistics, econometrics, and related fields, multidimensional analysis is a data analysis process that groups data into two or more categories: data dimensions and measurements. For example, a data set consisting of the number of wins for a single football team at each of several years is a...
- Multidimensional Chebyshev's inequalityMultidimensional Chebyshev's inequalityIn probability theory, the multidimensional Chebyshev's inequality is a generalization of Chebyshev's inequality, which puts a bound on the probability of the event that a random variable differs from its expected value by more than a specified amount....
- Multidimensional panel dataMultidimensional panel dataIn econometrics, panel data is data observed over two dimensions . A panel data set is termed "multidimensional" when the phenomenon is observed over three or more dimensions...
- Multidimensional scalingMultidimensional scalingMultidimensional scaling is a set of related statistical techniques often used in information visualization for exploring similarities or dissimilarities in data. MDS is a special case of ordination. An MDS algorithm starts with a matrix of item–item similarities, then assigns a location to each...
- Multifactor design of experiments softwareMultifactor design of experiments softwareSoftware that is used for designing factorial experiments plays an important role in scientific experiments generally and represents a route to the implementation of design of experiments procedures that derive from statistical and combinatoric theory...
- Multifactor dimensionality reductionMultifactor dimensionality reductionMultifactor dimensionality reduction is a data mining approach for detecting and characterizing combinations of attributes or independent variables that interact to influence a dependent or class variable...
- Multilevel modelMultilevel modelMultilevel models are statistical models of parameters that vary at more than one level...
- Multinomial distribution
- Multinomial logitMultinomial logitIn statistics, economics, and genetics, a multinomial logit model, also known as multinomial logistic regression, is a regression model which generalizes logistic regression by allowing more than two discrete outcomes...
- Multinomial probitMultinomial probitIn econometrics and statistics, the multinomial probit model, a popular alternative to the multinomial logit model, is a generalization of the probit model that allows more than two discrete, unordered outcomes. It is not to be confused with the multivariate probit model, which is used to model...
- Multinomial testMultinomial testIn statistics, the multinomial test is the test of the null hypothesis that the parameters of a multinomial distribution equal specified values. It is used for categorical data; see Read and Cressie....
- Multiple baseline designMultiple Baseline DesignA multiple baseline design is a style of research involving the careful measurement of multiple persons, traits or settings both before and after a treatment. This design is used in medical, psychological and biological research to name a few areas. It has several advantages over AB designs which...
- Multiple comparisonsMultiple comparisonsIn statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly...
- Multiple correlationMultiple correlationIn statistics, multiple correlation is a linear relationship among more than two variables. It is measured by the coefficient of multiple determination, denoted as R2, which is a measure of the fit of a linear regression...
- Multiple correspondence analysisMultiple correspondence analysisIn statistics, multiple correspondence analysis is a data analysis technique for nominal categorical data, used to detect and represent underlying structures in a data set. It does this by representing data as points in a low-dimensional Euclidean space. The procedure thus appears to be the...
- Multiple discriminant analysisMultiple discriminant analysisMultiple Discriminant Analysis is a method for compressing a multivariate signal to yield a lower dimensional signal amenable to classification....
- Multiple-indicator krigingMultiple-indicator krigingMultiple-indicator kriging is a recent advance on other techniques for mineral deposit modeling and resource block model estimation, such as ordinary kriging....
- Multiple Indicator Cluster SurveyMultiple Indicator Cluster SurveyThe Multiple Indicator Cluster Surveys are a survey program developed by the United Nations Children's Fund to provide internationally comparable, statistically rigorous data on the situation of children and women. The first round of surveys was carried out in over 60 countries in 1995 in...
- Multiple of the medianMultiple of the medianA multiple of the median is a measure of how far an individual test result deviates from the median. MoM is commonly used to report the results of medical screening tests, particularly where the results of the individual tests are highly variable....
- Multiple testing correction redirects to Multiple comparisonsMultiple comparisonsIn statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly...
- Multiple-try MetropolisMultiple-try MetropolisIn Markov chain Monte Carlo, the Metropolis–Hastings algorithm can be used to sample from a probability distribution which is difficult to sample from directly. However, the MH algorithm requires the user to supply a proposal distribution, which can be relatively arbitrary...
- Multiresolution analysisMultiresolution analysisA multiresolution analysis or multiscale approximation is the design method of most of the practically relevant discrete wavelet transforms and the justification for the algorithm of the fast wavelet transform...
- Multiscale decision makingMultiscale decision makingMultiscale decision making, also referred to as Multiscale decision theory , is a recently developed approach in operations research that fuses game theory, multi-agent influence diagrams, in particular dependency graphs, and Markov decision processes to solve multiscale challenges across...
- Multiscale geometric analysisMultiscale geometric analysisMultiscale geometric analysis or geometric multiscale analysis is an emerging area of high-dimensional signal processing and data analysis...
- Multistage testingMultistage testingMultistage testing is an algorithm-based approach to administering tests. It is very similar to computer-adaptive testing in that items are interactively selected for each examinee by the algorithm, but rather than selecting individual items, groups of items are selected, building the test in stages...
- Multitrait-multimethod matrixMultitrait-multimethod matrixThe multitrait-multimethod matrix is an approach to examining Construct Validity developed by Campbell and Fiske. There are six major considerations when examining a construct's validity through the MTMM matrix, which are as follows:...
- Multivariate adaptive regression splinesMultivariate adaptive regression splinesMultivariate adaptive regression splines is a form of regression analysis introduced by Jerome Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that...
- Multivariate analysisMultivariate analysisMultivariate analysis is based on the statistical principle of multivariate statistics, which involves observation and analysis of more than one statistical variable at a time...
- Multivariate analysis of variance
- Multivariate distribution – redirects to Joint probability distribution
- Multivariate kernel density estimationMultivariate kernel density estimationKernel density estimation is a nonparametric technique for density estimation, i.e., estimation of probability density functions, which is one of the fundamental questions in statistics. It can be viewed as a generalisation of histogram density estimation with improved statistical properties...
- Multivariate normal distribution
- Multivariate Pólya distributionMultivariate Polya distributionThe multivariate Pólya distribution, named after George Pólya, also called the Dirichlet compound multinomial distribution, is a compound probability distribution, where a probability vector p is drawn from a Dirichlet distribution with parameter vector \alpha, and a set of discrete samples is...
- Multivariate probitMultivariate probitIn statistics and econometrics, the multivariate probit model is a generalization of the probit model used to estimate several correlated binary outcomes jointly...
- Multivariate random variableMultivariate random variableIn mathematics, probability, and statistics, a multivariate random variable or random vector is a list of mathematical variables each of whose values is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value.More formally, a multivariate random...
- Multivariate stable distribution
- Multivariate statisticsMultivariate statisticsMultivariate statistics is a form of statistics encompassing the simultaneous observation and analysis of more than one statistical variable. The application of multivariate statistics is multivariate analysis...
- Multivariate Student distribution
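The Monte Carlo method and Monte Carlo integration entries in this section describe estimating quantities by repeated random sampling. As a minimal, generic sketch (written in Python purely for illustration and not taken from any of the listed articles), the integral of f(x) = x^2 over [0, 1] can be estimated by averaging f at uniformly drawn points:

    import random

    def monte_carlo_integral(f, a, b, n=100_000, seed=0):
        """Estimate the integral of f over [a, b] by averaging f at n uniform draws."""
        rng = random.Random(seed)
        total = sum(f(rng.uniform(a, b)) for _ in range(n))
        return (b - a) * total / n

    # The exact value of the integral of x**2 over [0, 1] is 1/3.
    estimate = monte_carlo_integral(lambda x: x ** 2, 0.0, 1.0)
    print(round(estimate, 4))  # close to 0.3333; the error shrinks roughly like 1/sqrt(n)

The averaging step is the whole method: the estimator's standard error decreases in proportion to 1/sqrt(n) regardless of the dimension of the integrand, which is why Monte Carlo integration is favoured for high-dimensional problems.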
N
- n = 1 fallacy
- Naive Bayes classifierNaive Bayes classifierA naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions...
- Nakagami distribution
- National and international statistical services
- Nash–Sutcliffe model efficiency coefficient
- National Health Interview SurveyNational Health Interview SurveyThe National Health Interview Survey is an annual, cross-sectional survey intended to provide nationally-representative estimates on a wide range of health status and utilization measures among the nonmilitary, noninstitutionalized population of the United States...
- Natural experimentNatural experimentA natural experiment is an observational study in which the assignment of treatments to subjects has been haphazard: That is, the assignment of treatments has been made "by nature", but not by experimenters. Thus, a natural experiment is not a controlled experiment...
- Natural exponential familyNatural exponential familyIn probability and statistics, the natural exponential family is a class of probability distributions that is a special case of an exponential family...
- Natural process variation
- NCSS (statistical software)
- Negative binomial distributionNegative binomial distributionIn probability theory and statistics, the negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified number of failures occur... (see the short sketch at the end of this section)
- Negative multinomial distribution
- Negative predictive valueNegative predictive valueIn statistics and diagnostic testing, the negative predictive value is a summary statistic used to describe the performance of a diagnostic testing procedure. It is defined as the proportion of subjects with a negative test result who are correctly diagnosed. A high NPV means that when the test...
- Negative relationshipNegative relationshipIn statistics, a relationship between two variables is negative if the slope in a corresponding graph is negative, or—what is in some contexts equivalent—if the correlation between them is negative...
- NegentropyNegentropyThe negentropy, also negative entropy or syntropy, of a living system is the entropy that it exports to keep its own entropy low; it lies at the intersection of entropy and life...
- Neighbourhood components analysisNeighbourhood components analysisNeighbourhood components analysis is a supervised learning method for clustering multivariate data into distinct classes according to a given distance metric over the data...
- Nelson rulesNelson rulesNelson rules are a method in process control of determining if some measured variable is out of control . Rules, for detecting "out-of-control" or non-random conditions were first postulated by Walter A. Shewhart in the 1920s...
- Nelson–Aalen estimatorNelson–Aalen estimatorThe Nelson–Aalen estimator is a non-parametric estimator of the cumulative hazard rate function in case of censored data or incomplete data. It is used in survival theory, reliability engineering and life insurance to estimate the cumulative number of expected events. An event can be a failure of a...
- Nested case-control studyNested case-control studyA nested case control study is a variation of a case-cohort study in which only a subset of controls from the cohort are compared to the incident cases. In a case-cohort study, all incident cases in the cohort are compared to a random subset of participants who do not develop the disease of interest...
- Nested sampling algorithmNested sampling algorithmThe nested sampling algorithm is a computational approach to the problem of comparing models in Bayesian statistics, developed in 2004 by physicist John Skilling...
- Network probability matrixNetwork Probability MatrixThe network probability matrix describes the probability structure of a network based on the historical presence or absence of edges in a network. For example, individuals in a social network are not connected to other individuals with uniform random probability. The probability structure is much...
- Neural networkNeural networkThe term neural network was traditionally used to refer to a network or circuit of biological neurons. The modern usage of the term often refers to artificial neural networks, which are composed of artificial neurons or nodes...
- Neutral vectorNeutral vectorIn statistics, and specifically in the study of the Dirichlet distribution, a neutral vector of random variables is one that exhibits a particular type of statistical independence amongst its elements...
- Newcastle–Ottawa scaleNewcastle–Ottawa scaleIn statistics, the Newcastle–Ottawa scale is a method for assessing the quality of nonrandomised studies in meta-analyses. The scale allocates stars, to a maximum of nine, for the quality of selection, comparability, exposure and outcome of study participants. The method was developed as a collaboration...
- Newey–West estimatorNewey–West estimatorA Newey–West estimator is used in statistics and econometrics to provide an estimate of the covariance matrix of the parameters of a regression-type model when this model is applied in situations where the standard assumptions of regression analysis do not apply. It was devised by Whitney K. Newey...
- Newman–Keuls methodNewman–Keuls methodIn statistics, the Newman–Keuls method is a post-hoc test used for comparisons after the performed F-test is found to be significant...
- Neyer d-optimal testNeyer d-optimal testThe Neyer D-Optimal Test is one way of analyzing a sensitivity test of explosives as described by Barry T. Neyer in 1994. This method has replaced the earlier Bruceton analysis or "Up and Down Test" that was devised by Dixon and Mood in 1948 to allow computation with pencil and paper. Samples are...
- Neyman constructionNeyman constructionNeyman construction is a frequentist method to construct an interval at a confidence level C such that, if we repeat the experiment many times, the interval will contain the true value a fraction C of the time. The probability that the interval contains the true value is called the coverage...
- Neyman–Pearson lemma
- Nicholson–Bailey model
- Nominal categoryNominal categoryA nominal category or a nominal group is a group of objects or ideas that can be collectively grouped on the basis of shared, arbitrary characteristic....
- Noncentral beta distributionNoncentral beta distributionIn probability theory and statistics, the noncentral beta distribution is a continuous probability distribution that is a generalization of the beta distribution...
- Noncentral chi distribution
- Noncentral chi-squared distribution
- Noncentral F-distributionNoncentral F-distributionIn probability theory and statistics, the noncentral F-distribution is a continuous probability distribution that is a generalization of the F-distribution...
- Noncentral hypergeometric distributionsNoncentral hypergeometric distributionsIn statistics, the hypergeometric distribution is the discrete probability distribution generated by picking colored balls at random from an urn without replacement....
- Noncentral t-distributionNoncentral t-distributionIn probability and statistics, the noncentral t-distribution generalizes Student's t-distribution using a noncentrality parameter. Like the central t-distribution, the noncentral t-distribution is primarily used in statistical inference, although it may also be used in robust modeling for data...
- Noncentrality parameterNoncentrality parameterNoncentrality parameters are parameters of families of probability distributions which are related to other "central" families of distributions. If the noncentrality parameter of a distribution is zero, the distribution is identical to a distribution in the central family...
- Nonlinear autoregressive exogenous modelNonlinear autoregressive exogenous modelIn time series modeling, a nonlinear autoregressive exogenous model is a nonlinear autoregressive model which has exogenous inputs. This means that the model relates the current value of a time series which one would like to explain or predict to both:...
- Nonlinear dimensionality reductionNonlinear dimensionality reductionHigh-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lies on an embedded non-linear manifold within the higher-dimensional space...
- Non-linear iterative partial least squaresNon-linear iterative partial least squaresIn statistics, non-linear iterative partial least squares is an algorithm for computing the first few components in a principal component or partial least squares analysis. For very high-dimensional datasets, such as those generated in the 'omics sciences it is usually only necessary to compute...
- Nonlinear regressionNonlinear regressionIn statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables...
- Non-homogeneous Poisson processNon-homogeneous Poisson processIn probability theory, a non-homogeneous Poisson process is a Poisson process with rate parameter \lambda such that the rate parameter of the process is a function of time...
- Non-linear least squaresNon-linear least squaresNon-linear least squares is the form of least squares analysis which is used to fit a set of m observations with a model that is non-linear in n unknown parameters . It is used in some forms of non-linear regression. The basis of the method is to approximate the model by a linear one and to...
- Non-negative matrix factorization
- Non-parametric statisticsNon-parametric statisticsIn statistics, the term non-parametric statistics has at least two different meanings:The first meaning of non-parametric covers techniques that do not rely on data belonging to any particular distribution. These include, among others:...
- Non-response biasNon-response biasNon-response bias occurs in statistical surveys if the answers of respondents differ from the potential answers of those who did not answer.- Example :...
- Non-sampling errorNon-sampling errorIn statistics, non-sampling error is a catch-all term for the deviations from the true value that are not a function of the sample chosen, including various systematic errors and any random errors that are not due to sampling. Non-sampling errors are much harder to quantify than sampling errors ....
- Nonparametric regressionNonparametric regressionNonparametric regression is a form of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data...
- Nonprobability samplingNonprobability samplingSampling is the use of a subset of the population to represent the whole population. Probability sampling, or random sampling, is a sampling technique in which the probability of getting any particular sample may be calculated. Nonprobability sampling does not meet this criterion and should be...
- Normal curve equivalentNormal curve equivalentIn educational statistics, a normal curve equivalent (NCE), developed for the United States Department of Education by the RMC Research Corporation, is a way of standardizing scores received on a test. It is...
- Normal distribution
- Normal probability plotNormal probability plotThe normal probability plot is a graphical technique for normality testing: assessing whether or not a data set is approximately normally distributed....
– see also rankitRankitIn statistics, rankits of a set of data are the expected values of the order statistics of a sample from the standard normal distribution the same size as the data. They are primarily used in the normal probability plot, a graphical technique for normality testing...
- Normal scoreNormal scoreThe term normal score is used with two different meanings in statistics. One of them relates to creating a single value which can be treated as if it had arisen from a standard normal distribution... – see also rankit and Z score
- Normal variance-mean mixture
- Normal-exponential-gamma distributionNormal-exponential-gamma distributionIn probability theory and statistics, the normal-exponential-gamma distribution is a three-parameter family of continuous probability distributions...
- Normal-gamma distribution
- Normal-inverse Gaussian distribution
- Normal-scaled inverse gamma distribution
- Normality testNormality testIn statistics, normality tests are used to determine whether a data set is well-modeled by a normal distribution or not, or to compute how likely an underlying random variable is to be normally distributed....
- Normalization (statistics)Normalization (statistics)In one usage in statistics, normalization is the process of isolating statistical error in repeated measured data. A normalization is sometimes based on a property...
- Normally distributed and uncorrelated does not imply independentNormally distributed and uncorrelated does not imply independentIn probability theory, two random variables being uncorrelated does not imply their independence. In some contexts, uncorrelatedness implies at least pairwise independence ....
- Notation in probability and statistics
- Novikov's conditionNovikov's conditionIn probability theory, Novikov's condition is the sufficient condition for a stochastic process which takes the form of the Radon-Nikodym derivative in Girsanov's theorem to be a martingale...
- np-chart
- Null distributionNull distributionIn statistical hypothesis testing, the null distribution is the probability distribution of the test statistic when the null hypothesis is true.In an F-test, the null distribution is an F-distribution....
- Null hypothesisNull hypothesisThe practice of science involves formulating and testing hypotheses, assertions that are capable of being proven false using a test of observed data. The null hypothesis typically corresponds to a general or default position...
- Null resultNull resultIn science, a null result is a result without the expected content: that is, the proposed result is absent. It is an experimental outcome which does not show an otherwise expected effect. This does not imply a result of zero or nothing, simply a result that does not support the hypothesis...
- Nuisance parameter
- Nuisance variableNuisance variableIn statistics, a nuisance parameter is any parameter which is not of immediate interest but which must be accounted for in the analysis of those parameters which are of interest...
- Numerical dataNumerical dataNumerical data is data measured or identified on a numerical scale. Numerical data can be analyzed using statistical methods, and results can be displayed using tables, charts, histograms and graphs. For example, a researcher will ask a questions to a participant that include words how often, how...
- Numerical methods for linear least squares
- Numerical parameter
- Numerical smoothing and differentiationNumerical smoothing and differentiationAn experimental datum value can be conceptually described as the sum of a signal and some noise, but in practice the two contributions cannot be separated. The purpose of smoothing is to increase the Signal-to-noise ratio without greatly distorting the signal...
- NumXLNumXLNumXL is an econometrics/time series analysis add-in for Microsoft Excel. Developed by Spider Financial, NumXL provides a wide variety of statistical and time series analysis techniques, including linear and nonlinear time series modeling, statistical tests and others... — software (Excel addin)
- Nuremberg CodeNuremberg CodeThe Nuremberg Code is a set of research ethics principles for human experimentation set as a result of the Subsequent Nuremberg Trials at the end of the Second World War...
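Many of the entries in this section are distribution families. To make one of them concrete, the negative binomial distribution entry above, the following generic Python sketch (standard library only, using the "number of successes before a fixed number of failures" convention quoted in that entry) evaluates the probability mass function and draws samples by simulating Bernoulli trials:

    import math
    import random

    def negbin_pmf(k, r, p):
        """P(K = k): probability of exactly k successes occurring before the
        r-th failure, when each Bernoulli trial succeeds with probability p."""
        return math.comb(k + r - 1, k) * (p ** k) * ((1 - p) ** r)

    def negbin_sample(r, p, rng):
        """Draw one value by running Bernoulli trials until r failures occur."""
        successes = failures = 0
        while failures < r:
            if rng.random() < p:
                successes += 1
            else:
                failures += 1
        return successes

    rng = random.Random(1)
    draws = [negbin_sample(5, 0.4, rng) for _ in range(10_000)]
    print(sum(draws) / len(draws))          # empirical mean, close to r*p/(1-p) = 10/3
    print(round(negbin_pmf(3, 5, 0.4), 4))  # probability of exactly 3 successes

The simulation and the closed-form mass function agree in the sense that the empirical frequency of each sampled value approaches negbin_pmf(k, r, p) as the number of draws grows.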
O
- Observable variableObservable variableIn statistics, observable variables or manifest variables, as opposed to latent variables, are those variables that can be observed and directly measured...
- Observational equivalenceObservational equivalenceIn econometrics, two parameter values are considered observationally equivalent if they both result in the same probability distribution of observable data...
- Observational errorObservational errorObservational error is the difference between a measured value of a quantity and its true value. In statistics, an error is not a "mistake". Variability is an inherent part of things being measured and of the measurement process...
- Observational studyObservational studyIn epidemiology and statistics, an observational study draws inferences about the possible effect of a treatment on subjects, where the assignment of subjects into a treated group versus a control group is outside the control of the investigator...
- Observed informationObserved informationIn statistics, the observed information, or observed Fisher information, is the negative of the second derivative of the "log-likelihood"...
- Occupancy frequency distributionOccupancy frequency distributionIn macroecology and community ecology, an occupancy frequency distribution is the distribution of the numbers of species occupying different numbers of areas. It was first reported in 1918 by the Danish botanist Christen C. Raunkiær in his study on plant communities...
- OddsOddsThe odds in favor of an event or a proposition are expressed as the ratio of a pair of integers, which is the ratio of the probability that an event will happen to the probability that it will not happen...
- Odds algorithmOdds algorithmThe odds-algorithm is a mathematical method for computing optimal strategies for a class of problems that belong to the domain of optimal stopping problems. Their solution follows from the odds-strategy, and the importance of the...
- Odds ratioOdds ratioThe odds ratio is a measure of effect size, describing the strength of association or non-independence between two binary data values. It is used as a descriptive statistic, and plays an important role in logistic regression...
- Official statisticsOfficial statisticsOfficial statistics are statistics published by government agencies or other public bodies such as international organizations. They provide quantitative or qualitative information on all major areas of citizens' lives, such as economic and social development, living conditions, health, education,...
- Ogden tablesOgden tablesOgden tables are a set of statistical tables and other information for use in court cases in the UK.Their purpose is to make it easier to calculate future losses in personal injury and fatal accident cases. The tables take into account life expectancy and provide a range of discount rates from...
- OgiveOgiveIn statistics, an ogive is a graph of cumulative frequency or of a cumulative distribution function, typically drawn by plotting the cumulative total up to each class boundary and joining the points. The term is borrowed from architecture and ballistics, where an ogive is the roundly tapered end of a two-dimensional or three-dimensional object...
- Omitted-variable biasOmitted-variable biasIn statistics, omitted-variable bias occurs when a model is created which incorrectly leaves out one or more important causal factors. The 'bias' is created when the model compensates for the missing factor by over- or under-estimating one of the other factors.More specifically, OVB is the bias...
- Omnibus testOmnibus testOmnibus tests are a kind of statistical test. They test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall. One example is the F-test in the analysis of variance. There can be legitimate significant effects within a model even if the...
- One-class classificationOne-class classificationOne-class classification tries to distinguish one class of objects from all other possible objects, by learning from a training set containing only the objects of that class. This is different from and more difficult than the traditional classification problem, which tries to distinguish between...
- One-factor-at-a-time methodOne-factor-at-a-time methodThe one-factor-at-a-time method is a method of designing experiments involving the testing of factors, or causes, one at a time instead of all simultaneously. Prominent text books and academic papers currently favor factorial experimental designs, a method pioneered by Sir Ronald A. Fisher, where...
- One-tailed test — redirects to two-tailed testTwo-tailed testThe two-tailed test is a statistical test used in inference, in which a given statistical hypothesis, H0 , will be rejected when the value of the test statistic is either sufficiently small or sufficiently large...
- One-way ANOVAOne-way ANOVAIn statistics, one-way analysis of variance is a technique used to compare means of two or more samples . This technique can be used only for numerical data....
- Online NMFOnline NMFOnline NMF (Online Non-negative Matrix Factorisation) is a recently developed method for real time data analysis in an online context. Non-negative matrix factorization in the past has been used for static data analysis and pattern recognition...
- Open-label trialOpen-label trialAn open-label trial or open trial is a type of clinical trial in which both the researchers and participants know which treatment is being administered...
- OpenEpiOpenEpiOpenEpi is a free, web-based, open source, operating system-independent series of programs for use in epidemiology, biostatistics, public health, and medicine, providing a number of epidemiologic and statistical tools for summary data. OpenEpi was developed in JavaScript and HTML, and can be run in... – software
- OpenBUGSOpenBUGSOpenBUGS is computer software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo methods. OpenBUGS is the open source variant of WinBUGS. It runs under Windows and Linux, as well as from inside the R statistical package... – software
- Operational confound
- Operational sex ratioOperational sex ratioIn the evolutionary biology of sexual reproduction, the operational sex ratio is the ratio of sexually competing males that are ready to mate to sexually competing females that are ready to mate...
- Operations researchOperations researchOperations research is an interdisciplinary mathematical science that focuses on the effective use of technology by organizations...
- Opinion pollOpinion pollAn opinion poll, sometimes simply referred to as a poll is a survey of public opinion from a particular sample. Opinion polls are usually designed to represent the opinions of a population by conducting a series of questions and then extrapolating generalities in ratio or within confidence...
- Optimal decisionOptimal decisionAn optimal decision is a decision such that no other available decision options will lead to a better outcome. It is an important concept in decision theory. In order to compare the different decision outcomes, one commonly assigns a relative utility to each of them...
- Optimal designOptimal designOptimal designs are a class of experimental designs that are optimal with respect to some statistical criterion.In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated without bias and with minimum-variance...
- Optimal discriminant analysisOptimal discriminant analysisOptimal discriminant analysis and the related classification tree analysis are statistical methods that maximize predictive accuracy...
- Optimal matchingOptimal matchingOptimal matching is a sequence analysis method used in social science, to assess the dissimilarity of ordered arrays of tokens that usually represent a time-ordered sequence of socio-economic states two individuals have experienced. Once such distances have been calculated for a set of observations...
- Optimal stoppingOptimal stoppingIn mathematics, the theory of optimal stopping is concerned with the problem of choosing a time to take a particular action, in order to maximise an expected reward or minimise an expected cost. Optimal stopping problems can be found in areas of statistics, economics, and mathematical finance...
- Optimality criterionOptimality criterionIn statistics, an optimality criterion provides a measure of the fit of the data to a given hypothesis. The selection process is determined by the solution that optimizes the criteria used to evaluate the alternative hypotheses...
- Optional stopping theoremOptional stopping theoremIn probability theory, the optional stopping theorem says that, under certain conditions, the expected value of a martingale at a stopping time is equal to its initial value...
- Order of a kernelOrder of a kernelThe order of a kernel is the first non-zero moment of a kernel....
- Order of integrationOrder of integrationOrder of integration, denoted I, is a summary statistic for a time series. It reports the minimum number of differences required to obtain a stationary series.- Integration of order zero :...
- Order statisticOrder statisticIn statistics, the kth order statistic of a statistical sample is equal to its kth-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference....
- Ordered logitOrdered logitIn statistics, the ordered logit model , is a regression model for ordinal dependent variables...
- Ordered probitOrdered probitIn statistics, ordered probit is a generalization of the popular probit analysis to the case of more than two outcomes of an ordinal dependent variable. Similarly, the popular logit method also has a counterpart ordered logit....
- Ordered subset expectation maximization
- Ordinary least squaresOrdinary least squaresIn statistics, ordinary least squares or linear least squares is a method for estimating the unknown parameters in a linear regression model. This method minimizes the sum of squared vertical distances between the observed responses in the dataset and the responses predicted by the linear... (see the short sketch at the end of this section)
- Ordination (statistics)Ordination (statistics)In multivariate analysis, ordination is a method complementary to data clustering, and used mainly in exploratory data analysis . Ordination orders objects that are characterized by values on multiple variables so that similar objects are near each other and dissimilar objects are farther from...
- Ornstein–Uhlenbeck process
- Orthogonal array testing
- OrthogonalityOrthogonalityOrthogonality occurs when two things can vary independently, they are uncorrelated, or they are perpendicular.-Mathematics:In mathematics, two vectors are orthogonal if they are perpendicular, i.e., they form a right angle...
- Orthogonality principleOrthogonality principleIn statistics and signal processing, the orthogonality principle is a necessary and sufficient condition for the optimality of a Bayesian estimator. Loosely stated, the orthogonality principle says that the error vector of the optimal estimator is orthogonal to any possible estimator...
- OutlierOutlierIn statistics, an outlier is an observation that is numerically distant from the rest of the data. Grubbs defined an outlier as: An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs....
- Outliers in statistics – redirects to Robust statisticsRobust statisticsRobust statistics provides an alternative approach to classical statistical methods. The motivation is to produce estimators that are not unduly affected by small departures from model assumptions... (section)
- Outliers ratioOutliers RatioIn objective video quality assessment, the outliers ratio is a measure of the performance of an objective video quality metric. It is the ratio of "false" scores given by the objective metric to the total number of scores. The "false" scores are the scores that lie outside the interval where MOS...
- Outline of probabilityOutline of probabilityProbability is the likelihood or chance that something is the case or will happen. Probability theory is used extensively in statistics, mathematics, science and philosophy to draw conclusions about the likelihood of potential events and the underlying mechanics of complex systems.The following...
- Outline of regression analysisOutline of regression analysisIn statistics, regression analysis includes any technique for learning about the relationship between one or more dependent variables Y and one or more independent variables X....
- Outline of statistics
- OverdispersionOverdispersionIn statistics, overdispersion is the presence of greater variability in a data set than would be expected based on a given simple statistical model....
- OverfittingOverfittingIn statistics, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations...
- Owen's T function
- OxMetricsOxMetricsOxMetrics is an econometric software package including the Ox programming language for econometrics and statistics, developed by Jurgen Doornik and David Hendry... – software
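The ordinary least squares entry in this section describes minimising the sum of squared vertical distances between observed responses and the fitted line. For a single predictor this minimisation has a well-known closed form; the sketch below (a generic Python illustration, not drawn from any listed article) computes it directly:

    def ols_simple(x, y):
        """Closed-form ordinary least squares fit of y = a + b*x.
        Returns the intercept a and slope b that minimise the sum of squared residuals."""
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        b = sxy / sxx
        a = mean_y - b * mean_x
        return a, b

    # Noise-free example: data generated from y = 1 + 2x recovers intercept 1.0 and slope 2.0.
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [1.0, 3.0, 5.0, 7.0, 9.0]
    print(ols_simple(x, y))  # (1.0, 2.0)

With several predictors the same criterion leads to the familiar matrix solution (X^T X)^(-1) X^T y, computed in practice by the techniques covered under numerical methods for linear least squares.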
P
- p-chart
- p-repP-repIn statistical hypothesis testing, p-rep or prep has been proposed as a statistical alternative to the classic p-value. Whereas a p-value is the probability of obtaining a result under the null hypothesis, p-rep computes the probability of replicating an effect...
- P-valueP-valueIn statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. One often "rejects the null hypothesis" when the p-value is less than the significance level α ,...
- P-P plot
- Page's trend testPage's trend testIn statistics, the Page test for multiple comparisons between ordered correlated variables is the counterpart of Spearman's rank correlation coefficient which summarizes the association of continuous variables. It is also known as Page's trend test or Page's L test...
- Paid surveyPaid surveyA paid or incentivized survey is a type of statistical survey where the participants/members are rewarded through an incentive program, generally entry into a sweepstakes program or a small cash reward, for completing one or more surveys...
- Paired comparison analysisPaired comparison analysisIn paired-comparison analysis, also known as paired-choice analysis, a range of options are compared and the results are tallied to find an overall winner. A range of plausible options is listed. Each option is compared against each of the other options, determining the preferred option in each case...
- Paired difference testPaired difference testIn statistics, a paired difference test is a type of location test that is used when comparing two sets of measurements to assess whether their population means differ...
- Pairwise comparisonPairwise comparisonPairwise comparison generally refers to any process of comparing entities in pairs to judge which of each entity is preferred, or has a greater amount of some quantitative property. The method of pairwise comparison is used in the scientific study of preferences, attitudes, voting systems, social...
- Pairwise independencePairwise independenceIn probability theory, a pairwise independent collection of random variables is a set of random variables any two of which are independent. Any collection of mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent...
- Panel analysisPanel analysisPanel analysis is statistical method, widely used in social science, epidemiology, and econometrics, which deals with two-dimensional panel data. The data are usually collected over time and over the same individuals and then a regression is run over these two dimensions...
- Panel dataPanel dataIn statistics and econometrics, the term panel data refers to multi-dimensional data. Panel data contains observations on multiple phenomena observed over multiple time periods for the same firms or individuals....
- Panjer recursionPanjer recursionThe Panjer recursion is an algorithm to compute the probability distribution of a compound random variable S = X_1 + ... + X_N, where both N and the X_i are random variables of special types. In more general cases the distribution of S is a compound distribution. The recursion for the special cases considered was... – a class of discrete compound distributions
- PaleostatisticsPaleostatisticsPaleontology often faces phenomena so vast and complex they can be described only through statistics. First applied to the study of a population in 1662, statistics is today a basic tool for natural sciences practitioners, and a solid acquaintance with methods and applications is essential for...
- Paley–Zygmund inequalityPaley–Zygmund inequalityIn mathematics, the Paley–Zygmund inequality bounds the probability that a positive random variable is small, in terms of its mean and variance...
- Parabolic fractal distributionParabolic fractal distributionIn probability and statistics, the parabolic fractal distribution is a type of discrete probability distribution in which the logarithm of the frequency or size of entities in a population is a quadratic polynomial of the logarithm of the rank...
- PARAFAC (parallel factor analysis)
- Parallel factor analysis redirects to PARAFAC
- Paradigm (experimental)Paradigm (experimental)In the behavioural sciences, e.g. Psychology, Biology, Neurosciences, an experimental paradigm is an experimental setup that is defined by certain fine-tuned standards and often has a theoretical background...
- Parameter identification problemParameter identification problemThe parameter identification problem is a problem which can occur in the estimation of multiple-equation econometric models where the equations have variables in common....
- Parameter spaceParameter spaceIn science, a parameter space is the set of values of parameters encountered in a particular mathematical model. Often the parameters are inputs of a function, in which case the technical term for the parameter space is domain of a function....
- Parametric familyParametric familyIn mathematics and its applications, a parametric family or a parameterized family is a family of objects whose definitions depend on a set of parameters....
- Parametric modelParametric modelIn statistics, a parametric model or parametric family or finite-dimensional model is a family of distributions that can be described using a finite number of parameters...
- Parametric statisticsParametric statisticsParametric statistics is a branch of statistics that assumes that the data has come from a type of probability distribution and makes inferences about the parameters of the distribution. Most well-known elementary statistical methods are parametric....
- Pareto analysisPareto analysisPareto analysis is a statistical technique in decision making that is used for selection of a limited number of tasks that produce significant overall effect. It uses the Pareto principle – the idea that by doing 20% of work, 80% of the advantage of doing the entire job can be generated...
- Pareto chart
- Pareto distribution
- Pareto indexPareto indexIn economics the Pareto index, named after the Italian economist and sociologist Vilfredo Pareto, is a measure of the breadth of income or wealth distribution. It is one of the parameters specifying a Pareto distribution and embodies the Pareto principle...
- Pareto interpolationPareto interpolationPareto interpolation is a method of estimating the median and other properties of a population that follows a Pareto distribution. It is used in economics when analysing the distribution of incomes in a population, when one must base estimates on a relatively small random sample taken from the...
- Pareto principlePareto principleThe Pareto principle states that, for many events, roughly 80% of the effects come from 20% of the causes.Business-management consultant Joseph M...
- Partial autocorrelation — redirects to Partial autocorrelation functionPartial autocorrelation functionIn time series analysis, the partial autocorrelation function plays an important role in data analyses aimed at identifying the extent of the lag in an autoregressive model...
- Partial autocorrelation functionPartial autocorrelation functionIn time series analysis, the partial autocorrelation function plays an important role in data analyses aimed at identifying the extent of the lag in an autoregressive model...
- Partial correlationPartial correlationIn probability theory and statistics, partial correlation measures the degree of association between two random variables, with the effect of a set of controlling random variables removed.-Formal definition:...
- Partial least squares
- Partial least squares regressionPartial least squares regressionPartial least squares regression is a statistical method that bears some relation to principal components regression; instead of finding hyperplanes of maximum variance between the response and independent variables, it finds a linear regression model by projecting the predicted variables and the...
- Partial leverage
- Partial regression plotPartial regression plotIn applied statistics, a partial regression plot attempts to show the effect of adding an additional variable to the model...
- Partial residual plotPartial residual plotIn applied statistics, a partial residual plot is a graphical technique that attempts to show the relationship between a given independent variable and the response variable given that other independent variables are also in the model.-Background:...
- Particle filterParticle filterIn statistics, particle filters, also known as Sequential Monte Carlo methods , are sophisticated model estimation techniques based on simulation...
- Partition of sums of squares
- Parzen window
- Path analysis (statistics)
- Path coefficient
- Path spacePath spaceIn mathematics, the term path space refers to any topological space of paths from one specified set into another. In particular, it may refer to* the classical Wiener space of continuous paths;* the Skorokhod space of càdlàg paths....
- Pattern recognitionPattern recognitionIn machine learning, pattern recognition is the assignment of some sort of output value to a given input value , according to some specific algorithm. An example of pattern recognition is classification, which attempts to assign each input value to one of a given set of classes...
- Pearson's chi-squared testPearson's chi-squared testPearson's chi-squared test is the best-known of several chi-squared tests – statistical procedures whose results are evaluated by reference to the chi-squared distribution. Its properties were first investigated by Karl Pearson in 1900... (one of various chi-squared tests)
- Pearson distributionPearson distributionThe Pearson distribution is a family of continuous probability distributions. It was first published by Karl Pearson in 1895 and subsequently extended by him in 1901 and 1916 in a series of articles on biostatistics...
- Pearson product-moment correlation coefficientPearson product-moment correlation coefficientIn statistics, the Pearson product-moment correlation coefficient is a measure of the correlation between two variables X and Y, giving a value between +1 and −1 inclusive...
- People v. CollinsPeople v. CollinsThe People of the State of California v. Collins was a 1968 jury trial in California, USA that made notorious forensic use of mathematics and probability... (prob/stats related court case)
- Per capitaPer capitaPer capita is a Latin prepositional phrase: per and capita. The phrase thus means "by heads" or "for each head", i.e. per individual or per person...
- Per-comparison error rate
- Per-protocol analysisPer-protocol analysisIn epidemiology, per-protocol analysis is a strategy of analysis in which only patients who complete the entire clinical trial are counted towards the final results. Intention to treat analysis uses data from all patients, including those who did not complete the study...
- PercentilePercentileIn statistics, a percentile is the value of a variable below which a certain percent of observations fall. For example, the 20th percentile is the value below which 20 percent of the observations may be found...
- Percentile rankPercentile rankThe percentile rank of a score is the percentage of scores in its frequency distribution that are the same or lower than it. For example, a test score that is greater than 75% of the scores of people taking the test is said to be at the 75th percentile....
- Periodic variation — redirects to SeasonalitySeasonalityIn statistics, many time series exhibit cyclic variation known as seasonality, periodic variation, or periodic fluctuations. This variation can be either regular or semi regular....
- PeriodogramPeriodogramThe periodogram is an estimate of the spectral density of a signal. The term was coined by Arthur Schuster in 1898 as in the following quote:...
- Peirce's criterionPeirce's criterionIn robust statistics, Peirce's criterion is a rule for eliminating outliers from data sets, which was devised by Benjamin Peirce.-The problem of outliers:...
- Pensim2Pensim2Pensim2 is a dynamic microsimulation model to simulate the income of pensioners, owned by the British Department for Work and Pensions.Pensim2 is the second version of Pensim which was developed in the 1990s. The time horizon of the model is 100 years, by which time today's school leavers will...
— an econometric model - Percentage pointPercentage pointPercentage points are the unit for the arithmetic difference of two percentages.Consider the following hypothetical example: in 1980, 40 percent of the population smoked, and in 1990 only 30 percent smoked...
- Permutation test — redirects to Resampling (statistics)Resampling (statistics)In statistics, resampling is any of a variety of methods for doing one of the following:# Estimating the precision of sample statistics by using subsets of available data or drawing randomly with replacement from a set of data points # Exchanging labels on data points when performing significance...
- Pharmaceutical statisticsPharmaceutical StatisticsPharmaceutical Statistics is a peer-reviewed scientific journal that publishes papers related to pharmaceutical statistics. It is the official journal of Statisticians in the Pharmaceutical Industry and is published by John Wiley & Sons....
- Phase dispersion minimizationPhase dispersion minimizationPhase dispersion minimization is a data analysis technique that searches for periodic components of a time series data set. It is useful for data sets with gaps, non-sinusoidal variations, poor time coverage or other problems that would make Fourier techniques unusable...
- Phase-type distributionPhase-type distributionA phase-type distribution is a probability distribution that results from a system of one or more inter-related Poisson processes occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stochastic process. The distribution can be represented by a random...
- Phi coefficientPhi coefficientIn statistics, the phi coefficient is a measure of association for two binary variables introduced by Karl Pearson. This measure is similar to the Pearson correlation coefficient in its interpretation...
- Phillips–Perron test
- Philosophy of probability
- Philosophy of statisticsPhilosophy of statisticsThe philosophy of statistics involves the meaning, justification, utility, use and abuse of statistics and its methodology, and ethical and epistemological issues involved in the consideration of choice and interpretation of data and methods of Statistics....
- Pie chartPie chartA pie chart is a circular chart divided into sectors, illustrating proportion. In a pie chart, the arc length of each sector , is proportional to the quantity it represents. When angles are measured with 1 turn as unit then a number of percent is identified with the same number of centiturns...
- Pignistic probabilityPignistic probabilityPignistic probability, in decision theory, is a probability that a rational person will assign to an option when required to make a decision.A person may have, at one level certain beliefs or a lack of knowledge, or uncertainty, about the options and their actual likelihoods...
- Pinsker's inequalityPinsker's inequalityIn information theory, Pinsker's inequality, named after its inventor Mark Semenovich Pinsker, is an inequality that relates Kullback-Leibler divergence and the total variation distance...
- Pitman–Koopman–Darmois theorem
- Pitman–Yor process
- Pivotal quantityPivotal quantityIn statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters whose probability distribution does not depend on unknown parameters....
- Placebo-controlled study
- Plackett–Burman design
- Plate notationPlate notationPlate notation is a method of representing variables that repeat in a graphical model. Instead of drawing each repeated variable individually, a plate or rectangle is used to group variables into a subgraph that repeat together, and a number is drawn on the plate to represent the number of...
- Player winsPlayer winsPlayer wins is a stat used to estimate the number of games a player won for his team developed by Dean Oliver, the first full-time statistical analyst in the NBA.The formula used to calculate player wins is Player Games * Player Winning Percentage....
- Plot (graphics)Plot (graphics)A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables. The plot can be drawn by hand or by a mechanical or electronic plotter. Graphs are a visual representation of the relationship between variables, very useful for...
- Pocock boundaryPocock boundaryThe Pocock boundary is a method for determining whether to stop a clinical trial prematurely. The typical clinical trial compares two groups of patients. One group is given a placebo or conventional treatment, while the other group of patients is given the treatment that is being tested...

- Poincaré plotPoincaré plotA Poincaré plot, named after Henri Poincaré, is used to quantify self-similarity in processes, usually periodic functions. It is also known as a return map.Given a time series of the form...
- Point-biserial correlation coefficientPoint-biserial correlation coefficientThe point biserial correlation coefficient is a correlation coefficient used when one variable is dichotomous; Y can either be "naturally" dichotomous, like gender, or an artificially dichotomized variable. In most situations it is not advisable to artificially dichotomize variables...
- Point estimationPoint estimationIn statistics, point estimation involves the use of sample data to calculate a single value which is to serve as a "best guess" or "best estimate" of an unknown population parameter....
- Point pattern analysis
- Point processPoint processIn statistics and probability theory, a point process is a type of random process for which any one realisation consists of a set of isolated points either in time or geographical space, or in even more general spaces...
- Poisson binomial distribution
- Poisson distributionPoisson distributionIn probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since...
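As a small worked example for the preceding entry, the Poisson probability mass function P(X = k) = λ^k e^(−λ) / k! can be evaluated directly; the rate of two events per interval below is chosen purely for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with rate `lam`: lam^k * e^(-lam) / k!."""
    return lam ** k * exp(-lam) / factorial(k)

# Probability of observing exactly 0, 1, ..., 5 events when 2 are expected on average.
for k in range(6):
    print(k, round(poisson_pmf(k, 2.0), 4))
```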
- Poisson hidden Markov modelPoisson hidden Markov modelIn statistics, Poisson hidden Markov models are a special case of hidden Markov models where a Poisson process has a rate which varies in association with changes between the different states of a Markov model...
- Poisson limit theoremPoisson limit theoremThe Poisson theorem gives a Poisson approximation to the binomial distribution, under certain conditions. The theorem was named after Siméon-Denis Poisson .- The theorem :If...
- Poisson processPoisson processA Poisson process, named after the French mathematician Siméon-Denis Poisson , is a stochastic process in which events occur continuously and independently of one another...
- Poisson regressionPoisson regressionIn statistics, Poisson regression is a form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown...
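A minimal sketch of the model described in the preceding entry, assuming the statsmodels and NumPy packages are available; the data are synthetic, generated so that the logarithm of E[Y] is a linear function of a single predictor.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Synthetic count data: log E[Y] = 0.5 + 0.8 * x (coefficients chosen only for illustration).
x = rng.uniform(0, 2, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))

X = sm.add_constant(x)                          # design matrix with an intercept column
result = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(result.params)                            # should be roughly [0.5, 0.8]
```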
- Poisson random numbers — redirects to section of Poisson distributionPoisson distributionIn probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since...
- Poisson samplingPoisson samplingIn the theory of finite population sampling, Poisson sampling is a sampling process where each element of the population that is sampled is subjected to an independent Bernoulli trial which determines whether the element becomes part of the sample during the drawing of a single sample.Each element...
- Polar distribution — redirects to Circular distribution
- Policy capturingPolicy capturingPolicy capturing or "the PC technique" is a statistical method used in social psychology to quantify the relationship between a person's judgement and the information that was used to make that judgement. Policy capturing assessments rely upon regression analysis models...
- Political forecastingPolitical forecastingPolitical forecasting aims at predicting the outcome of elections. Models include:- Opinion polls :Polls are an integral part of political forecasting. However, incorporating poll results into political forecasting models can cause problems in predicting the outcome of elections...
- Pollaczek–Khinchine formula
- Pollyanna CreepPollyanna creepPollyanna Creep is a phrase that originated with John Williams, a California-based economic analyst and statistician. It describes the way the U.S. government has modified the way important economic measures are calculated with the purpose of giving a better impression of economic development. This...
- Poly-Weibull distributionPoly-Weibull distributionIn probability theory and statistics, the poly-Weibull distribution is a continuous probability distribution. The distribution is defined to be that of a random variable defined to be the smallest of a number of statistically independent random variables having non-identical Weibull...
- Polychoric correlationPolychoric correlationIn statistics, polychoric correlation is a technique for estimating the correlation between two theorised normally distributed continuous latent variables, from two observed ordinal variables. Tetrachoric correlation is a special case of the polychoric correlation applicable when both observed...
- Polynomial and rational function modelingPolynomial and rational function modelingIn statistical modeling , polynomial functions and rational functions are sometimes used as an empirical technique for curve fitting.-Polynomial function models:A polynomial function is one that has the form...
- Polynomial chaosPolynomial chaosPolynomial chaos , also called "Wiener Chaos expansion", is a non-sampling based method to determine evolution of uncertainty in dynamical system, when there is probabilistic uncertainty in the system parameters....
- Polynomial regressionPolynomial regressionIn statistics, polynomial regression is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth order polynomial...
- PolytreePolytreeIn graph theory, a polytree is a directed graph with at most one undirected path between any two vertices. In other words, a polytree is a directed acyclic graph for which there are no undirected cycles either...
(Bayesian networks) - Pooled standard deviation — redirects to Pooled variancePooled varianceIn statistics, data are often collected for a dependent variable, y, over a range of values of the independent variable, x. For example, the observation of fuel consumption might be studied as a function of engine speed while the engine load is held constant...
- Pooling designPooling designA pooling design is an algorithm to intelligently classify items by testing them in groups or pools rather than individually. The result from the pools is usually binary — either positive or negative. A negative result can imply that all the items tested in that pool were failures, if the...
- Popoviciu's inequality on variancesPopoviciu's inequality on variancesIn probability theory, Popoviciu's inequality, named after Tiberiu Popoviciu, is an upper bound on the variance of any bounded probability distribution. Let M and m be upper and lower bounds on the values of any random variable with a particular probability distribution...
- PopulationStatistical populationA statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. For example, if we were interested in generalizations about crows, then we would describe the set of crows that is of interest...
- Population dynamicsPopulation dynamicsPopulation dynamics is the branch of life sciences that studies short-term and long-term changes in the size and age composition of populations, and the biological and environmental processes influencing those changes...
- Population ecologyPopulation ecologyPopulation ecology is a sub-field of ecology that deals with the dynamics of species populations and how these populations interact with the environment. It is the study of how the population sizes of species living together in groups change over time and space....
– application - Population modelingPopulation modelingA population model is a type of mathematical model that is applied to the study of population dynamics.Models allow a better understanding of how complex interactions and processes work. Modeling of dynamic interactions in nature can provide a manageable way of understanding how numbers change over...
- Population processPopulation processIn applied probability, a population process is a Markov chain in which the state of the chain is analogous to the number of individuals in a population , and changes to the state are analogous to the addition or removal of individuals from the population.Although named by analogy to biological...
- Population pyramidPopulation pyramidA population pyramid, also called an age structure diagram, is a graphical illustration that shows the distribution of various age groups in a population , which forms the shape of a pyramid when the population is growing...
- Population statisticsPopulation statisticsPopulation statistics is the use of statistics to analyze characteristics or changes to a population. It is related to social demography and demography.Population statistics can analyze anything from global demographic changes to local small scale changes...
- Population variance
- Population viability analysisPopulation viability analysisPopulation viability analysis is a species-specific method of risk assessment frequently used in conservation biology.It is traditionally defined as the process that determines the probability that a population will go extinct within a given number of years.More recently, PVA has been described...
- Portmanteau testPortmanteau testA portmanteau test is a type of statistical hypothesis test in which the null hypothesis is well specified, but the alternative hypothesis is more loosely specified. Tests constructed in this context can have the property of being at least moderately powerful against a wide range of departures from...
- Positive predictive valuePositive predictive valueIn statistics and diagnostic testing, the positive predictive value, or precision rate is the proportion of subjects with positive test results who are correctly diagnosed. It is a critical measure of the performance of a diagnostic method, as it reflects the probability that a positive test...
- Post-hoc analysisPost-hoc analysisPost-hoc analysis , in the context of design and analysis of experiments, refers to looking at the data—after the experiment has concluded—for patterns that were not specified a priori. It is sometimes called by critics data dredging to evoke the sense that the more one looks the more likely...
- Posterior probabilityPosterior probabilityIn Bayesian statistics, the posterior probability of a random event or an uncertain proposition is the conditional probability that is assigned after the relevant evidence is taken into account...
- Power lawPower lawA power law is a special kind of mathematical relationship between two quantities. When the frequency of an event varies as a power of some attribute of that event , the frequency is said to follow a power law. For instance, the number of cities having a certain population size is found to vary...
- Power transformPower transformIn statistics, the power transform is from a family of functions that are applied to create a rank-preserving transformation of data using power functions. This is a useful data processing technique used to stabilize variance, make the data more normal distribution-like, improve the correlation...
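The Box–Cox transform is one widely used member of this family. Below is a minimal sketch, assuming SciPy is available (scipy.stats.boxcox both applies the transform and estimates the exponent by maximum likelihood), using synthetic right-skewed data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
skewed = rng.lognormal(mean=0.0, sigma=0.8, size=500)   # strictly positive, right-skewed

transformed, lam = stats.boxcox(skewed)   # boxcox estimates lambda by maximum likelihood
print("estimated lambda:", round(lam, 3))
print("skewness before:", round(stats.skew(skewed), 2),
      "after:", round(stats.skew(transformed), 2))
```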
- Prais–Winsten estimation
- Pre- and post-test probabilityPre- and post-test probabilityPre-test probability and post-test probability are the subjective probabilities of the presence of a condition before and after a diagnostic test, respectively...
- Precision (statistics)Precision (statistics)In statistics, the term precision can mean a quantity defined in a specific way. This is in addition to its more general meaning in the contexts of accuracy and precision and of precision and recall....
- Precision and recallPrecision and recallIn pattern recognition and information retrieval, precision is the fraction of retrieved instances that are relevant, while recall is the fraction of relevant instances that are retrieved. Both precision and recall are therefore based on an understanding and measure of relevance...
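A small worked example of the two definitions in the preceding entry, with hypothetical retrieval counts chosen only for illustration.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical results: 40 relevant documents retrieved, 10 irrelevant retrieved,
# 20 relevant documents missed.
p, r = precision_recall(40, 10, 20)
print(round(p, 2), round(r, 2))  # 0.8 precision, 0.67 recall
```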
- Prediction intervalPrediction intervalIn statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed...
- Predictive analyticsPredictive analyticsPredictive analytics encompasses a variety of statistical techniques from modeling, machine learning, data mining and game theory that analyze current and historical facts to make predictions about future events....
- Predictive inferencePredictive inferencePredictive inference is an approach to statistical inference that emphasizes the prediction of future observations based on past observations.Initially, predictive inference was based on observable parameters and it was the main purpose of studying probability, but it fell out of favor in the 20th...
- Predictive informaticsPredictive informaticsPredictive informatics is the combination of predictive modeling and informatics applied to healthcare, pharmaceutical, life sciences and business industries....
- Predictive modellingPredictive modellingPredictive modelling is the process by which a model is created or chosen to try to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data, for example given an...
- Predictive validityPredictive validityIn psychometrics, predictive validity is the extent to which a score on a scale or test predicts scores on some criterion measure.For example, the validity of a cognitive test for job performance is the correlation between test scores and, for example, supervisor performance ratings...
- Preference regression (in marketing)Preference regression (in marketing)Preference regression is a statistical technique used by marketers to determine consumers’ preferred core benefits. It usually supplements product positioning techniques like multi dimensional scaling or factor analysis and is used to create ideal vectors on perceptual maps.-Application:Starting...
- Preferential attachment process — redirects to Preferential attachmentPreferential attachmentA preferential attachment process is any of a class of processes in which some quantity, typically some form of wealth or credit, is distributed among a number of individuals or objects according to how much they already have, so that those who are already wealthy receive more than those who are not...
- PrevalencePrevalenceIn epidemiology, the prevalence of a health-related state in a statistical population is defined as the total number of cases of the risk factor in the population at a given time, or the total number of cases in the population, divided by the number of individuals in the population...
- Principal component analysis
- Multilinear principal-component analysis
- Principal component regressionPrincipal component regressionIn statistics, principal component regression is a regression analysis that uses principal component analysis when estimating regression coefficients...
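A minimal NumPy sketch of the idea in the preceding entry: centre the predictors, take the leading principal components via the singular value decomposition, and regress the response on those component scores. The data and the choice of two components are purely illustrative.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the first n_components PC scores."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # centre the predictors
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T             # projections onto leading components
    coef, *_ = np.linalg.lstsq(scores, y - np.mean(y), rcond=None)
    return Vt[:n_components].T @ coef             # coefficients mapped back to X's columns

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=100)
print(np.round(pcr_fit(X, y, n_components=2), 2))
```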
- Principal geodesic analysisPrincipal geodesic analysisIn geometric data analysis and statistical shape analysis, principal geodesic analysis is a generalization of principal component analysis to a non-Euclidean, non-linear setting of manifolds suitable for use with shape descriptors such as medial representations....
- Principal stratificationPrincipal stratificationPrincipal stratification is a statistical technique used in causal inference.-References: * Zhang, Junni L.; Rubin, Donald B. "Estimation of Causal Effects via Principal Stratification When Some Outcomes are Truncated by “Death”", Journal of Educational and Behavioral Statistics, 28: 353–368...
- Principle of indifferencePrinciple of indifferenceThe principle of indifference is a rule for assigning epistemic probabilities.Suppose that there are n > 1 mutually exclusive and collectively exhaustive possibilities....
- Principle of marginalityPrinciple of marginalityIn statistics, the principle of marginality refers to the fact that the average effects of variables in an analysis are marginal to their interaction effect...
- Principle of maximum entropyPrinciple of maximum entropyIn Bayesian probability, the principle of maximum entropy is a postulate which states that, subject to known constraints , the probability distribution which best represents the current state of knowledge is the one with largest entropy.Let some testable information about a probability distribution...
- Prior knowledge for pattern recognitionPrior knowledge for pattern recognitionPattern recognition is a very active field of research intimately bound to machine learning. Also known as classification or statistical classification, pattern recognition aims at building a classifier that can determine the class of an input pattern...
- Prior probabilityPrior probabilityIn Bayesian statistical inference, a prior probability distribution, often called simply the prior, of an uncertain quantity p is the probability distribution that would express one's uncertainty about p before the "data"...
- Prior probability distribution — redirects to Prior probabilityPrior probabilityIn Bayesian statistical inference, a prior probability distribution, often called simply the prior, of an uncertain quantity p is the probability distribution that would express one's uncertainty about p before the "data"...
- Probabilistic causationProbabilistic causationProbabilistic causation designates a group of philosophical theories that aim to characterize the relationship between cause and effect using the tools of probability theory...
- Probabilistic designProbabilistic designProbabilistic design is a discipline within engineering design. It deals primarily with the consideration of the effects of random variability upon the performance of an engineering system during the design phase. Typically, these effects are related to quality and reliability...
- Probabilistic forecastingProbabilistic forecastingProbabilistic forecasting summarises what is known, or opinions about, future events. In contrast to a single-valued forecasts , probabilistic forecasts assign a probability to each of a number of different outcomes,...
- Probabilistic latent semantic analysisProbabilistic latent semantic analysisProbabilistic latent semantic analysis , also known as probabilistic latent semantic indexing is a statistical technique for the analysis of two-mode and co-occurrence data. PLSA evolved from latent semantic analysis, adding a sounder probabilistic model...
- Probabilistic metric spaceProbabilistic metric spaceA probabilistic metric space is a generalization of metric spaces where the distance is no longer defined on positive real numbers, but on distribution functions....
- Probabilistic propositionProbabilistic propositionA probabilistic proposition is a proposition with a measured probability of being true for an arbitrary person at an arbitrary time.These are some examples of probabilistic propositions collected by the Mindpixel project:* You are not human 0.17...
- Probabilistic relational modelProbabilistic relational modelA Probabilistic relational model is the counterpart of a Bayesian network in statistical relational learning.-References:*Friedman N, Getoor L, Koller D, Pfeffer A....
- ProbabilityProbabilityProbability is ordinarily used to describe an attitude of mind towards some proposition of whose truth we are not certain. The proposition of interest is usually of the form "Will a specific event occur?" The attitude of mind is of the form "How certain are we that the event will occur?" The...
- Probability and statisticsProbability and statisticsSee the separate articles on probability or the article on statistics. Statistical analysis often uses probability distributions, and the two topics are often studied together. However, probability theory contains much that is of mostly mathematical interest and not directly relevant to statistics...
- Probability density functionProbability density functionIn probability theory, a probability density function , or density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point. The probability for the random variable to fall within a particular region is given by the...
- Probability distributionProbability distributionIn probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values....
- Probability distribution functionProbability distribution functionDepending upon which text is consulted, a probability distribution function is any of: a probability distribution, a cumulative distribution function, a probability mass function, or a probability density function....
(disambiguation) - Probability integral transformProbability integral transformIn statistics, the probability integral transform or transformation relates to the result that data values that are modelled as being random variables from any given continuous distribution can be converted to random variables having a uniform distribution...
- Probability interpretationsProbability interpretationsThe word probability has been used in a variety of ways since it was first coined in relation to games of chance. Does probability measure the real, physical tendency of something to occur, or is it just a measure of how strongly one believes it will occur? In answering such questions, we...
- Probability mass functionProbability mass functionIn probability theory and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value...
- Probability matchingProbability matchingProbability matching is a suboptimal decision strategy in which predictions of class membership are proportional to the class base rates. Thus, if in the training set positive examples are observed 60% of the time, and negative examples are observed 40% of the time, the observer using a...
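A small simulation of the 60%/40% example in the preceding entry, illustrating why matching is suboptimal: always guessing the majority class is correct about 60% of the time, whereas matching the base rates is correct only about 0.6×0.6 + 0.4×0.4 = 52% of the time.

```python
import random

random.seed(0)
base_rate = 0.6            # fraction of positive examples, as in the entry's example
trials = 100_000

match_correct = max_correct = 0
for _ in range(trials):
    outcome = random.random() < base_rate       # the true class of this trial
    match_guess = random.random() < base_rate   # probability matching: guess positive 60% of the time
    max_guess = True                            # maximizing: always guess the majority class
    match_correct += (match_guess == outcome)
    max_correct += (max_guess == outcome)

print("matching accuracy:", match_correct / trials)    # near 0.52
print("maximizing accuracy:", max_correct / trials)    # near 0.60
```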
- Probability metric
- Probability of errorProbability of errorIn statistics, the term "error" arises in two ways. Firstly, it arises in the context of decision making, where the probability of error may be considered as being the probability of making a wrong decision and which would have a different value for each type of error...
- Probability of precipitationProbability of PrecipitationA probability of precipitation is a formal measure of the likelihood of precipitation that is often published from weather forecasting models. Its definition varies.-U.S. usage:...
- Probability plotProbability plotIn statistics, a P-P plot is a probability plot for assessing how closely two data sets agree, which plots the two cumulative distribution functions against each other....
- Probability plot correlation coefficient — redirects to Q-Q plotQ-Q plotIn statistics, a Q-Q plot is a probability plot, which is a graphical method for comparing two probability distributions by plotting their quantiles against each other. First, the set of intervals for the quantiles are chosen...
- Probability plot correlation coefficient plotProbability plot correlation coefficient plotMany statistical analyses are based on distributional assumptions about the population from which the data have been obtained. However, distributional families can have radically different shapes depending on the value of the shape parameter. Therefore, finding a reasonable choice for the shape...
- Probability spaceProbability spaceIn probability theory, a probability space or a probability triple is a mathematical construct that models a real-world process consisting of states that occur randomly. A probability space is constructed with a specific kind of situation or experiment in mind...
- Probability theoryProbability theoryProbability theory is the branch of mathematics concerned with analysis of random phenomena. The central objects of probability theory are random variables, stochastic processes, and events: mathematical abstractions of non-deterministic events or measured quantities that may either be single...
- Probability-generating functionProbability-generating functionIn probability theory, the probability-generating function of a discrete random variable is a power series representation of the probability mass function of the random variable...
- Probable errorProbable error-Statistics:In statistics, the probable error of a quantity is a value describing the probability distribution of that quantity. It defines the half-range of an interval about a central point for the distribution, such that half of the values from the distribution will lie within the interval and...
- ProbitProbitIn probability theory and statistics, the probit function is the inverse cumulative distribution function , or quantile function associated with the standard normal distribution...
- Probit modelProbit modelIn statistics, a probit model is a type of regression where the dependent variable can only take two values, for example married or not married....
- Procedural confound
- Process Window IndexProcess Window IndexProcess Window Index is a statistical measure that quantifies the robustness of a manufacturing process which involves heating and cooling, known as a thermal process...
- Procrustes analysisProcrustes analysisIn statistics, Procrustes analysis is a form of statistical shape analysis used to analyse the distribution of a set of shapes. The name Procrustes refers to a bandit from Greek mythology who made his victims fit his bed either by stretching their limbs or cutting them off.To compare the shape of...
- Proebsting's paradoxProebsting's paradoxIn probability theory, Proebsting's paradox is an argument that appears to show that the Kelly criterion can lead to ruin. Although it can be resolved mathematically, it raises some interesting issues about the practical application of Kelly, especially in investing. It was named and first...
- Product distributionProduct distributionA product distribution is a probability distribution constructed as the distribution of the product of random variables having two other known distributions...
- Product form solutionProduct form solutionIn probability theory, a product form solution is a particularly efficient form of solution for determining some metric of a system with distinct sub-components, where the metric for the collection of components can be written as a product of the metric across the different components...
- Profile likelihood — redirects to Likelihood functionLikelihood functionIn statistics, a likelihood function is a function of the parameters of a statistical model, defined as follows: the likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values...
- Progressively measurable processProgressively measurable processIn mathematics, progressive measurability is a property of stochastic processes. A progressively measurable process is one for which events defined in terms of values of the process across a range of times can be assigned probabilities . Being progressively measurable is a strictly stronger...
- PrognosticsPrognosticsPrognostics is an engineering discipline focused on predicting the time at which a system or a component will no longer perform its intended function . This lack of performance is most often a failure beyond which the system can no longer be used to meet desired performance...
- Projection pursuitProjection pursuitProjection pursuit is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a Normal distribution are considered to be more interesting...
- Projection pursuit regressionProjection pursuit regressionIn statistics, projection pursuit regression is a statistical model developed by Jerome H. Friedman and Werner Stuetzle which is an extension of additive models...
- Proof of Stein's exampleProof of Stein's exampleStein's example is an important result in decision theory. The following is an outline of its proof; the reader is referred to the main article for more information.-Sketched proof:...
- Propagation of uncertaintyPropagation of uncertaintyIn statistics, propagation of error is the effect of variables' uncertainties on the uncertainty of a function based on them...
- Propensity probabilityPropensity probabilityThe propensity theory of probability is one interpretation of the concept of probability. Theorists who adopt this interpretation think of probability as a physical propensity, or disposition, or tendency of a given type of physical situation to yield an outcome of a certain kind, or to yield a...
- Propensity scorePropensity scoreIn the design of experiments, a propensity score is the probability of a unit being assigned to a particular condition in a study given a set of known covariates...
- Propensity score matchingPropensity score matchingIn the statistical analysis of observational data, propensity score matching is a methodology attempting to provide unbiased estimation of treatment-effects...
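A minimal sketch of one common workflow (not necessarily the only one): estimate propensity scores with a logistic regression, then greedily match each treated unit to the control with the nearest score. It assumes statsmodels and NumPy are available, and the confounded observational data are synthetic.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Synthetic observational data (illustration only): one confounder x affects both
# the chance of treatment and the outcome; the true treatment effect is 2.0.
n = 500
x = rng.normal(size=n)
treated = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(int)
outcome = 2.0 * treated + 1.5 * x + rng.normal(size=n)

# Step 1: propensity scores from a logistic regression of treatment on x.
ps = sm.Logit(treated, sm.add_constant(x)).fit(disp=0).predict()

# Step 2: greedy 1-to-1 nearest-neighbour matching on the propensity score.
treated_idx = np.where(treated == 1)[0]
control_idx = np.where(treated == 0)[0]
matches = [control_idx[np.argmin(np.abs(ps[control_idx] - ps[i]))] for i in treated_idx]

# Step 3: compare outcomes across the matched pairs (roughly recovers the effect of 2.0).
print(round(float(np.mean(outcome[treated_idx] - outcome[matches])), 2))
```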
- Proper linear modelProper linear modelIn statistics, a proper linear model is a linear regression model in which the weights given to the predictor variables are chosen in such a way as to optimize the relationship between the prediction and the criterion. Simple regression analysis is the most common example of a proper linear model...
- Proportional hazards modelsProportional hazards modelsProportional hazards models are a class of survival models in statistics. Survival models relate the time that passes before some event occurs to one or more covariates that may be associated with that quantity. In a proportional hazards model, the unique effect of a unit increase in a covariate...
- Proportional reduction in lossProportional reduction in lossProportional reduction in loss refers to a general framework for developing and evaluating measures of the reliability of particular ways of making observations which are possibly subject to errors of all types...
- Prosecutor's fallacyProsecutor's fallacyThe prosecutor's fallacy is a fallacy of statistical reasoning made in law where the context in which the accused has been brought to court is falsely assumed to be irrelevant to judging how confident a jury can be in evidence against them with a statistical measure of doubt...
- Proxy (statistics)Proxy (statistics)In statistics, a proxy variable is something that is probably not in itself of any great interest, but from which a variable of interest can be obtained...
- PsephologyPsephologyPsephology is that branch of political science which deals with the study and scientific analysis of elections. Psephology uses historical precinct voting data, public opinion polls, campaign finance information and similar statistical data. The term was coined in the United Kingdom in 1952 by...
- Pseudo-determinantPseudo-determinantIn linear algebra and statistics, the pseudo-determinant is the product of all non-zero eigenvalues of a square matrix. It coincides with the regular determinant when the matrix is non-singular.- Definition :...
- PseudocountPseudocountA pseudocount is an amount added to the number of observed cases in order to change the expected probability in a model of those data, when not known to be zero. Depending on the prior knowledge, which is sometimes a subjective value, a pseudocount may have any non-negative finite value...
- PseudolikelihoodPseudolikelihoodIn statistical theory, a pseudolikelihood is an approximation to the joint probability distribution of a collection of random variables. The practical use of this is that it can provide an approximation to the likelihood function of a set of observed data which may either provide a computationally...
- PseudomedianPseudomedianIn statistics, the pseudomedian is defined as the median of all possible midpoints of pairs of observations. It is the Hodges–Lehmann one-sample estimate of the central location for a probability distribution.-References:...
- PseudoreplicationPseudoreplicationHurlbert defined pseudoreplication as the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated or replicates are not statistically independent....
- PSPPPSPPPSPP is a free software application for analysis of sampled data. It has a graphical user interface and conventional command line interface. It is written in C, uses GNU Scientific Library for its mathematical routines, and plotutils for generating graphs....
(free software) - Psychological statisticsPsychological statisticsPsychological statistics is the application of statistics to psychology. Some of the more common applications include:#psychometrics#learning theory#perception#human development#abnormal psychology#Personality test#psychological tests...
- PsychometricsPsychometricsPsychometrics is the field of study concerned with the theory and technique of psychological measurement, which includes the measurement of knowledge, abilities, attitudes, personality traits, and educational measurement...
- Pythagorean expectationPythagorean expectationPythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won based on the number of runs they scored and allowed. Comparing a team's actual and Pythagorean winning percentage can be used to evaluate how lucky that team was...
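A small sketch using the commonly cited form of James's formula, estimated win fraction ≈ RS² / (RS² + RA²), where RS and RA are runs scored and runs allowed (refinements use other exponents); the season totals below are hypothetical.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage: RS^k / (RS^k + RA^k)."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical season totals, for illustration only.
win_pct = pythagorean_expectation(800, 700)
print(round(win_pct, 3), "->", round(win_pct * 162), "expected wins over a 162-game season")
```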
Q
- Q testQ testIn statistics, Dixon's Q test, or simply the Q test, is used for identification and rejection of outliers. Per Dean and Dixon, and others, this test should be used sparingly and never more than once in a data set...
- Q research softwareQ research softwareQ research software is computer software for the analysis of market research data. Launched in 2007, Q is developed by Numbers International Pty Ltd.- Interactive data analysis :...
- Q-exponential distribution
- Q-functionQ-functionIn statistics, the Q-function is the tail probability of the standard normal distribution. In other words, Q is the probability that a standard normal random variable will obtain a value larger than x...
- Q-Gaussian distributionQ-Gaussian distributionIn q-analog theory, the q-Gaussian is a probability distribution arising from the maximization of the Tsallis entropy under appropriate constraints. It is one example of a Tsallis distribution. The q-Gaussian is a generalization of the Gaussian in the same way that Tsallis entropy is a...
- Q-Q plotQ-Q plotIn statistics, a Q-Q plot is a probability plot, which is a graphical method for comparing two probability distributions by plotting their quantiles against each other. First, the set of intervals for the quantiles are chosen...
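A minimal sketch of the construction described in the preceding entry, assuming NumPy and matplotlib are available: compute matching quantiles of two samples and plot them against each other, with a reference line for equality.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
sample_a = rng.normal(loc=0.0, scale=1.0, size=300)
sample_b = rng.normal(loc=0.5, scale=1.5, size=300)   # shifted and rescaled on purpose

# Plot matching quantiles of the two samples against each other.
probs = np.linspace(0.01, 0.99, 99)
qa = np.quantile(sample_a, probs)
qb = np.quantile(sample_b, probs)

plt.scatter(qa, qb, s=10)
plt.plot(qa, qa, color="grey")       # the y = x reference line
plt.xlabel("quantiles of sample A")
plt.ylabel("quantiles of sample B")
plt.show()                           # points off the reference line signal differing distributions
```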
- Q-statisticQ-statisticThe Q-statistic is a test statistic output by either the Box-Pierce test or, in a modified version which provides better small sample properties, by the Ljung-Box test. It follows the chi-squared distribution...
- QuadratQuadratA quadrat is a square used in ecology and geography to isolate a sample, usually about 1 m² or 0.25 m². The quadrat is suitable for sampling plants, slow-moving animals, and some aquatic organisms. When an ecologist wants to know how many organisms there are in a particular habitat, it would not be...
- Quadratic classifierQuadratic classifierA quadratic classifier is used in machine learning and statistical classification to separate measurements of two or more classes of objects or events by a quadric surface...
- Quadratic form (statistics)Quadratic form (statistics)If $\epsilon$ is a vector of $n$ random variables, and $\Lambda$ is an $n$-dimensional symmetric matrix, then the scalar quantity $\epsilon^T \Lambda \epsilon$ is known as a quadratic form in $\epsilon$.-Expectation:It can be shown that...
- Quadratic variationQuadratic variationIn mathematics, quadratic variation is used in the analysis of stochastic processes such as Brownian motion and martingales. Quadratic variation is just one kind of variation of a process.- Definition :...
- Qualitative comparative analysisQualitative comparative analysisQualitative Comparative Analysis is a technique, developed by Charles Ragin in 1987, for solving the problems that are caused by making causal inferences on the basis of only a small number of cases...
- Qualitative data
- Qualitative variationQualitative variationAn index of qualitative variation is a measure of statistical dispersion in nominal distributions. There are a variety of these, but they have been relatively little-studied in the statistics literature...
- Quality controlQuality controlQuality control, or QC for short, is a process by which entities review the quality of all factors involved in production. This approach places an emphasis on three aspects:...
- QuantileQuantileQuantiles are points taken at regular intervals from the cumulative distribution function of a random variable. Dividing ordered data into q essentially equal-sized data subsets is the motivation for q-quantiles; the quantiles are the data values marking the boundaries between consecutive subsets...
- Quantile functionQuantile functionIn probability and statistics, the quantile function of the probability distribution of a random variable specifies, for a given probability, the value which the random variable will be at, or below, with that probability...
- Quantile normalizationQuantile normalizationIn statistics, quantile normalization is a technique for making two distributions identical in statistical properties. To quantile-normalize a test distribution to a reference distribution of the same length, sort the test distribution and sort the reference distribution...
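A short sketch of the procedure described in the preceding entry, for two sequences of equal length with no ties: sort the reference distribution and assign its sorted values to the test distribution by rank.

```python
import numpy as np

def quantile_normalize(test, reference):
    """Give `test` the sorted values of `reference`, assigned by rank."""
    test = np.asarray(test, dtype=float)
    reference = np.sort(np.asarray(reference, dtype=float))
    ranks = np.argsort(np.argsort(test))      # rank of each test value, 0-based
    return reference[ranks]

test = [5.0, 2.0, 3.0, 4.0]
reference = [10.0, 20.0, 30.0, 40.0]
print(quantile_normalize(test, reference))    # [40. 10. 20. 30.] — same ordering, reference values
```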
- Quantile regressionQuantile regressionQuantile regression is a type of regression analysis used in statistics. Whereas the method of least squares results in estimates that approximate the conditional mean of the response variable given certain values of the predictor variables, quantile regression results in estimates approximating...
- Quantitative marketing researchQuantitative marketing researchQuantitative marketing research is the application of quantitative research techniques to the field of marketing. It has roots in both the positivist view of the world, and the modern marketing viewpoint that marketing is an interactive process in which both the buyer and seller reach a satisfying...
- Quantitative parasitologyQuantitative parasitology-Counting parasites:Quantifying parasites in a sample of hosts or comparing measures of infection across two or more samples can be challenging.The parasitic infection of a sample of hosts inherently exhibits a complex pattern that cannot be adequately quantified by a single statistical measure...
- Quantitative psychological researchQuantitative psychological researchQuantitative psychological research is defined as psychological research which performs mathematical modeling and statistical estimation or statistical inference. This definition distinguishes it from so-called qualitative psychological research; however, many psychologists do not acknowledge any...
- Quantitative researchQuantitative researchIn the social sciences, quantitative research refers to the systematic empirical investigation of social phenomena via statistical, mathematical or computational techniques. The objective of quantitative research is to develop and employ mathematical models, theories and/or hypotheses pertaining to...
- Quantum (Statistical programming language)
- QuartileQuartileIn descriptive statistics, the quartiles of a set of values are the three points that divide the data set into four equal groups, each representing a fourth of the population being sampled...
- Quartile coefficient of dispersionQuartile coefficient of dispersionIn statistics, the quartile coefficient of dispersion is a descriptive statistic which measures dispersion and which is used to make comparisons within and between data sets....
- Quasi-birth–death process
- Quasi-experimentQuasi-experimentA quasi-experiment is an empirical study used to estimate the causal impact of an intervention on its target population. Quasi-experimental research designs share many similarities with the traditional experimental design or randomized controlled trial, but they specifically lack the element of...
- Quasi-experimental design — redirects to Design of quasi-experiments
- Quasi-likelihoodQuasi-likelihoodIn statistics, quasi-likelihood estimation is one way of allowing for overdispersion, that is, greater variability in the data than would be expected from the statistical model used. It is most often used with models for count data or grouped binary data, i.e...
- Quasi-maximum likelihoodQuasi-maximum likelihoodA quasi-maximum likelihood estimate is an estimate of a parameter θ in a statistical model that is formed by maximizing a function that is related to the logarithm of the likelihood function, but is not equal to it...
- QuasireversibilityQuasireversibilityIn probability theory, specifically queueing theory, quasireversibility is a property of some queues. The concept was first identified by Richard R. Muntz and further developed by Frank Kelly. Quasireversibility differs from reversibility in that a stronger condition is imposed on arrival rates...
- Queueing modelQueueing modelIn queueing theory, a queueing model is used to approximate a real queueing situation or system, so the queueing behaviour can be analysed mathematically...
- Queueing theoryQueueing theoryQueueing theory is the mathematical study of waiting lines, or queues. The theory enables mathematical analysis of several related processes, including arriving at the queue, waiting in the queue , and being served at the front of the queue...
- Queuing delayQueuing delayIn telecommunication and computer engineering, the queuing delay is the time a job waits in a queue until it can be executed. It is a key component of network delay....
- Queuing theory in teletraffic engineering
- Quota samplingQuota samplingQuota sampling is a method for selecting survey participants. In quota sampling, a population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. Then judgment is used to select the subjects or units from each segment based on a specified proportion. For example,...
R
- R programming language — redirects to R (programming language)R (programming language)R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians for developing statistical software, and R is widely used for statistical software development and data analysis....
- R v AdamsR v AdamsR v Adams [1996] 2 Cr App R 467, [1996] Crim LR 898, CA and R v Adams [1998] 1 Cr App R 377, The Times, 3 November 1997, CA, are rulings that ousted explicit Bayesian statistics from the reasoning admissible before a jury in DNA cases.-Facts:...
(prob/stats related court case) - Radar chartRadar chartA radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point...
- Rademacher distribution
- Radial basis function networkRadial basis function networkA radial basis function network is an artificial neural network that uses radial basis functions as activation functions. It is a linear combination of radial basis functions...
- Raikov's theoremRaikov's theoremIn probability theory, Raikov’s theorem, named after Dmitry Raikov, states that if the sum of two independent random variables X and Y has a Poisson distribution, then both X and Y themselves must have the Poisson distribution. It says the same thing about the Poisson distribution that Cramér's...
- Raised cosine distribution
- Ramsey RESET testRamsey reset testThe Ramsey Regression Equation Specification Error Test (RESET) is a general specification test for the linear regression model. More specifically, it tests whether non-linear combinations of the estimated values help explain the endogenous variable...
— the Ramsey Regression Equation Specification Error Test - Rand indexRand indexThe Rand index or Rand measure in statistics, and in particular in data clustering, is a measure of the similarity between two data clusterings...
- Random assignmentRandom assignmentRandom assignment or random placement is an experimental technique for assigning subjects to different treatments . The thinking behind random assignment is that by randomizing treatment assignment, then the group attributes for the different treatments will be roughly equivalent and therefore any...
- Random compact setRandom compact setIn mathematics, a random compact set is essentially a compact set-valued random variable. Random compact sets are useful in the study of attractors for random dynamical systems.-Definition:...
- Random data — see randomnessRandomnessRandomness has somewhat differing meanings as used in various fields. It also has common meanings which are connected to the notion of predictability of events....
- Random effects estimation — redirects to Random effects model
- Random effects model
- Random elementRandom elementIn probability theory, random element is a generalization of the concept of random variable to more complicated spaces than the simple real line...
- Random fieldRandom fieldA random field is a generalization of a stochastic process such that the underlying parameter need no longer be a simple real or integer valued "time", but can instead take values that are multidimensional vectors, or points on some manifold....
- Random graphRandom graphIn mathematics, a random graph is a graph that is generated by some random process. The theory of random graphs lies at the intersection between graph theory and probability theory, and studies the properties of typical random graphs.-Random graph models:...
- Random matrixRandom matrixIn probability theory and mathematical physics, a random matrix is a matrix-valued random variable. Many important properties of physical systems can be represented mathematically as matrix problems...
- Random measure
- Random multinomial logitRandom multinomial logitIn statistics and machine learning, random multinomial logit is a technique for statistical classification using repeated multinomial logit analyses via Leo Breiman's random forests.-Rationale for the new method:...
- Random naive BayesRandom naive BayesRandom naive Bayes extends the Naive Bayes classifier by adopting the random forest principles: random input selection, bagging , and random feature selection .- Naive Bayes classifier :...
- Random permutation statisticsRandom permutation statisticsThe statistics of random permutations, such as the cycle structure of a random permutation are of fundamental importance in the analysis of algorithms, especially of sorting algorithms, which operate on random permutations. Suppose, for example, that we are using quickselect to select a random...
- Random regular graph
- Random sampleRandom sampleIn statistics, a sample is a subject chosen from a population for investigation; a random sample is one chosen by a method involving an unpredictable component...
- Random sampling
- Random sequenceRandom sequenceThe concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables and many statistical discussions begin with the words "let X1,...,Xn be independent random variables...". Yet as D. H. Lehmer stated in...
- Random variableRandom variableIn probability and statistics, a random variable or stochastic variable is, roughly speaking, a variable whose value results from a measurement on some type of random process. Formally, it is a function from a probability space, typically to the real numbers, which is measurable functionmeasurable...
- Random variateRandom variateA random variate is a particular outcome of a random variable: the random variates which are other outcomes of the same random variable would have different values. Random variates are used when simulating processes driven by random influences...
- Random walkRandom walkA random walk, sometimes denoted RW, is a mathematical formalisation of a trajectory that consists of taking successive random steps. For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the...
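A minimal simulation of the simplest case mentioned in the preceding entry, a one-dimensional walk of successive ±1 steps, assuming NumPy is available.

```python
import numpy as np

rng = np.random.default_rng(5)

# A simple one-dimensional random walk: successive +1/-1 steps, position = cumulative sum.
steps = rng.choice([-1, 1], size=1000)
position = np.cumsum(steps)

print("final position:", int(position[-1]))
print("farthest excursion from the start:", int(np.max(np.abs(position))))
```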
- Random walk hypothesisRandom walk hypothesisThe random walk hypothesis is a financial theory stating that stock market prices evolve according to a random walk and thus the prices of the stock market cannot be predicted. It is consistent with the efficient-market hypothesis....
- RandomizationRandomizationRandomization is the process of making something random; this means:* Generating a random permutation of a sequence .* Selecting a random sample of a population ....
- Randomized block designRandomized block designIn the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups that are similar to one another. Typically, a blocking factor is a source of variability that is not of primary interest to the experimenter...
- Randomized controlled trialRandomized controlled trialA randomized controlled trial is a type of scientific experiment - a form of clinical trial - most commonly used in testing the safety and efficacy or effectiveness of healthcare services or health technologies A randomized controlled trial (RCT) is a type of scientific experiment - a form of...
- Randomized experimentRandomized experimentIn science, randomized experiments are the experiments that allow the greatest reliability and validity of statistical estimates of treatment effects...
- Randomized responseRandomized responseRandomized response is a research method used in structured survey interview. It was first proposed by S. L. Warner in 1965 and later modified by B. G. Greenberg in 1969. It allows respondents to respond to sensitive issues while maintaining confidentiality...
- RandomnessRandomnessRandomness has somewhat differing meanings as used in various fields. It also has common meanings which are connected to the notion of predictability of events....
- Randomness testsRandomness testsThe issue of randomness is an important philosophical and theoretical question.Many random number generators in use today generate what are called "random sequences" but they are actually the result of prescribed algorithms and so they are called pseudo-random number generators.These generators do...
- Range (statistics)Range (statistics)In the descriptive statistics, the range is the length of the smallest interval which contains all the data. It is calculated by subtracting the smallest observation from the greatest and provides an indication of statistical dispersion.It is measured in the same units as the data...
- Rank abundance curveRank abundance curveA rank abundance curve or "Whittaker plot" is a chart used by ecologists to display relative species abundance, a component of biodiversity. It can also be used to visualize species richness and species evenness...
- Rank correlationRank correlationIn statistics, a rank correlation is the relationship between different rankings of the same set of items. A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to assess its significance.... Mainly links to the two following:
- Kendall tau rank correlation coefficientKendall tau rank correlation coefficientIn statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's tau coefficient, is a statistic used to measure the association between two measured quantities...
- Spearman's rank correlation coefficientSpearman's rank correlation coefficientIn statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter \rho or as r_s, is a non-parametric measure of statistical dependence between two variables. It assesses how well the relationship between two variables can...
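A hedged illustration of the two coefficients listed above, assuming SciPy is available; the paired values are invented:

```python
from scipy import stats

x = [1.2, 2.4, 3.1, 4.8, 5.0, 6.3]   # invented paired measurements
y = [2.0, 3.5, 3.0, 6.1, 5.9, 8.2]

rho, p_rho = stats.spearmanr(x, y)    # Spearman's rho: correlation of the ranks
tau, p_tau = stats.kendalltau(x, y)   # Kendall's tau: based on concordant/discordant pairs
print(rho, tau)
```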
- Rank productRank productThe rank product is a biologically motivated test for the detection of differentially expressed genes in replicated microarray experiments.It is a simple non-parametric statistical method based on ranks of fold changes...
- Rank-size distributionRank-size distributionRank-size distribution or the rank-size rule describes the remarkable regularity in many phenomena including the distribution of city sizes around the world, sizes of businesses, particle sizes , lengths of rivers, frequencies of word usage, wealth among individuals, etc...
- RankingRankingA ranking is a relationship between a set of items such that, for any two items, the first is either 'ranked higher than', 'ranked lower than' or 'ranked equal to' the second....
- RankitRankitIn statistics, rankits of a set of data are the expected values of the order statistics of a sample from the standard normal distribution the same size as the data. They are primarily used in the normal probability plot, a graphical technique for normality testing.-Example:This is perhaps most...
- RankletRankletIn statistics, a ranklet is an orientation-selective non-parametric feature which is based on the computation of Mann–Whitney–Wilcoxon rank-sum test statistics...
- RANSACRANSACRANSAC is an abbreviation for "RANdom SAmple Consensus". It is an iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers. It is a non-deterministic algorithm in the sense that it produces a reasonable result only with a certain...
- Rational quadratic covariance functionRational quadratic covariance functionIn statistics, the rational quadratic covariance function is used in spatial statistics, geostatistics, machine learning, image analysis, and other fields where multivariate statistical analysis is conducted on metric spaces. It is commonly used to define the statistical covariance between...
- Rao–Blackwell theoremRao–Blackwell theoremIn statistics, the Rao–Blackwell theorem, sometimes referred to as the Rao–Blackwell–Kolmogorov theorem, is a result which characterizes the transformation of an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar...
- Rao-Blackwellisation — redirects to Rao–Blackwell theoremRao–Blackwell theoremIn statistics, the Rao–Blackwell theorem, sometimes referred to as the Rao–Blackwell–Kolmogorov theorem, is a result which characterizes the transformation of an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar...
- Rasch modelRasch modelRasch models are used for analysing data from assessments to measure variables such as abilities, attitudes, and personality traits. For example, they may be used to estimate a student's reading ability from answers to questions on a reading assessment, or the extremity of a person's attitude to...
- Polytomous Rasch modelPolytomous Rasch modelThe polytomous Rasch model is a generalization of the dichotomous Rasch model. It is a measurement model that has potential application in any context in which the objective is to measure a trait or ability through a process in which responses to items are scored with successive integers...
- Rasch model estimationRasch model estimationEstimation of a Rasch model is used to estimate the parameters of the Rasch model. Various techniques are employed to estimate the parameters from matrices of response data. The most common approaches are types of maximum likelihood estimation, such as joint and conditional maximum likelihood...
- Ratio distributionRatio distributionA ratio distribution is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions....
- Rayleigh distribution
- Raw scoreRaw scoreIn statistics and data analysis, a raw score is an original datum that has not been transformed. This may include, for example, the original result obtained by a student on a test as opposed to that score after transformation to a standard score or percentile rank or the like. Often the conversion...
- Realization (probability)Realization (probability)In probability and statistics, a realization, or observed value, of a random variable is the value that is actually observed . The random variable itself should be thought of as the process how the observation comes about...
- Recall biasRecall biasIn psychology, recall bias is a type of systematic bias which occurs when the way a survey respondent answers a question is affected not just by the correct answer, but also by the respondent's memory. This can affect the results of the survey. As a hypothetical example, suppose that a survey in...
- Receiver operating characteristicReceiver operating characteristicIn signal detection theory, a receiver operating characteristic , or simply ROC curve, is a graphical plot of the sensitivity, or true positive rate, vs. false positive rate , for a binary classifier system as its discrimination threshold is varied...
- Rectified Gaussian distributionRectified Gaussian DistributionIn probability theory, the rectified Gaussian distribution is a modification of the Gaussian distribution when its negative elements are reset to 0...
- Recurrence period density entropyRecurrence period density entropyRecurrence period density entropy is a method, in the fields of dynamical systems, stochastic processes, and time series analysis, for determining the periodicity, or repetitiveness of a signal.- Overview :...
- Recurrence plotRecurrence plotIn descriptive statistics and chaos theory, a recurrence plot is a plot showing, for a given moment in time, the times at which a phase space trajectory visits roughly the same area in the phase space...
- Recurrence quantification analysisRecurrence quantification analysisRecurrence quantification analysis is a method of nonlinear data analysis for the investigation of dynamical systems. It quantifies the number and duration of recurrences of a dynamical system presented by its phase space trajectory....
- Recursive Bayesian estimationRecursive Bayesian estimationRecursive Bayesian estimation, also known as a Bayes filter, is a general probabilistic approach for estimating an unknown probability density function recursively over time using incoming measurements and a mathematical process model.-In robotics:...
- Recursive least squares
- Recursive partitioningRecursive partitioningRecursive partitioning is a statistical method for multivariable analysis. Recursive partitioning creates a decision tree that strives to correctly classify members of the population based on several dichotomous dependent variables....
- Reduced formReduced formIn statistics, and particularly in econometrics, the reduced form of a system of equations is the result of solving the system for the endogenous variables. This gives the latter as a function of the exogenous variables, if any...
- Reference class problemReference class problemIn statistics, the reference class problem is the problem of deciding what class to use when calculating the probability applicable to a particular case...
- Regenerative processRegenerative processIn applied probability, a regenerative process is a special type of stochastic process that is defined by having a property whereby certain portions of the process can be treated as being statistically independent of each other...
- Regression analysisRegression analysisIn statistics, regression analysis includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables...
— see also linear regressionLinear regressionIn statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more explanatory variables denoted X. The case of one explanatory variable is called simple regression...
- Regression Analysis of Time Series — proprietary software
- Regression control chartRegression control chartIn statistical quality control, the regression control chart allows for monitoring a change in a process where two or more variables are correlated. The change in a dependent variable can be detected and compensatory change in the independent variable can be recommended...
- Regression dilutionRegression dilutionRegression dilution is a statistical phenomenon also known as "attenuation".Consider fitting a straight line for the relationship of an outcome variable y to a predictor variable x, and estimating the gradient of the line...
- Regression discontinuity design
- Regression estimationRegression estimationRegression estimation is a technique used to replace missing values in data. The variable with missing data is treated as the dependent variable, while the rest of the cases are treated as independent variables. A regression equation is then generated which can be used to predict missing values...
- Regression fallacyRegression fallacyThe regression fallacy is an informal fallacy. It ascribes cause where none exists. The flaw is failing to account for natural fluctuations. It is frequently a special kind of the post hoc fallacy.-Explanation:...
- Regression model validation
- Regression toward the meanRegression toward the meanIn statistics, regression toward the mean is the phenomenon that if a variable is extreme on its first measurement, it will tend to be closer to the average on a second measurement, and—a fact that may superficially seem paradoxical—if it is extreme on a second measurement, will tend...
- Regret (decision theory)Regret (decision theory)Regret is defined as the difference between the actual payoff and the payoff that would have been obtained if a different course of action had been chosen. This is also called difference regret...
- Reification (statistics)Reification (statistics)In statistics, reification is the use of an idealized model of a statistical process. The model is then used to make inferences connecting model results, which imperfectly represent the actual process, with experimental observations....
- Rejection samplingRejection samplingIn mathematics, rejection sampling is a basic pseudo-random number sampling technique used to generate observations from a distribution. It is also commonly called the acceptance-rejection method or "accept-reject algorithm"....
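A minimal sketch of the accept–reject idea described in this entry, drawing from a uniform proposal on [0, 1]; the target (a Beta(2, 2) density) and the bounding constant M are chosen purely for illustration:

```python
import random

def target_density(x):
    # Beta(2, 2) density on [0, 1], used only as an illustrative target.
    return 6.0 * x * (1.0 - x)

M = 1.5  # upper bound on the target density over [0, 1]

def rejection_sample():
    while True:
        x = random.random()             # draw from the uniform proposal
        u = random.random()             # uniform acceptance variate
        if u * M <= target_density(x):  # accept with probability f(x) / M
            return x

samples = [rejection_sample() for _ in range(1000)]
```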
- Relationships among probability distributionsRelationships among probability distributionsMany statistical distributions have close relationships. Some examples include:* Bernoulli distribution, binomial distribution, and normal distribution.* exponential distribution and Poisson distribution....
- Relative change and differenceRelative change and differenceThe relative difference, percent difference, relative percent difference, or percentage difference between two quantities is the difference between them, expressed as a comparison to the size of one or both of them. Such measures are unitless numbers...
- Relative efficiency redirects to Efficiency (statistics)Efficiency (statistics)In statistics, an efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors...
- Relative index of inequalityRelative index of inequalityThe relative index of inequality is a regression-based index which summarizes the magnitude of socio-economic status as a source of inequalities in health. RII is useful because it takes into account the size of the population and the relative disadvantage experienced by different groups...
- Relative riskRelative riskIn statistics and mathematical epidemiology, relative risk is the risk of an event relative to exposure. Relative risk is a ratio of the probability of the event occurring in the exposed group versus a non-exposed group....
- Relative risk reductionRelative risk reductionIn epidemiology, the relative risk reduction is a measure calculated by dividing the absolute risk reduction by the control event rate.The relative risk reduction can be more useful than the absolute risk reduction in determining an appropriate treatment plan, because it accounts not only for the...
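A hedged numeric illustration of the two ratios defined in the entries above, using invented event counts:

```python
# Invented trial data: events / group size.
exposed_events, exposed_n = 15, 100   # exposed (treated) group
control_events, control_n = 25, 100   # control group

risk_exposed = exposed_events / exposed_n   # 0.15
risk_control = control_events / control_n   # 0.25

relative_risk = risk_exposed / risk_control                         # 0.6
absolute_risk_reduction = risk_control - risk_exposed               # 0.10
relative_risk_reduction = absolute_risk_reduction / risk_control    # 0.4
print(relative_risk, relative_risk_reduction)
```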
- Relative standard deviationRelative standard deviationIn probability theory and statistics, the relative standard deviation is the absolute value of the coefficient of variation. It is often expressed as a percentage. A similar term that is sometimes used is the relative variance which is the square of the coefficient of variation...
- Relative standard error — redirects to Relative standard deviationRelative standard deviationIn probability theory and statistics, the relative standard deviation is the absolute value of the coefficient of variation. It is often expressed as a percentage. A similar term that is sometimes used is the relative variance which is the square of the coefficient of variation...
- Relative variance — redirects to Relative standard deviationRelative standard deviationIn probability theory and statistics, the relative standard deviation is the absolute value of the coefficient of variation. It is often expressed as a percentage. A similar term that is sometimes used is the relative variance which is the square of the coefficient of variation...
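A small sketch of the definition above (the absolute value of the coefficient of variation, usually quoted as a percentage); the data are invented:

```python
import statistics

data = [9.8, 10.1, 10.4, 9.9, 10.3]   # invented measurements
mean = statistics.mean(data)
sd = statistics.stdev(data)           # sample standard deviation

rsd_percent = abs(sd / mean) * 100    # relative standard deviation in %
print(round(rsd_percent, 2))          # ~2.52
```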
- Relative survivalRelative survivalWhen describing the survival experience of a group of people or patients typically the method of overall survival is used, and it presents estimates of the proportion of people or patients alive at a certain point in time...
- Relativistic Breit–Wigner distribution
- Relevance vector machineRelevance Vector MachineRelevance vector machine is a machine learning technique that uses Bayesian inference to obtain parsimonious solutions for regression and classification...
- Reliability (statistics)Reliability (statistics)In statistics, reliability is the consistency of a set of measurements or of a measuring instrument, often used to describe a test. Reliability is inversely related to random error. There are several general classes of reliability estimates...
- Reliability block diagramReliability block diagramA reliability block diagram is a diagrammatic method for showing how component reliability contributes to the success or failure of a complex system. RBD is also known as a dependence diagram ....
- Reliability engineeringReliability engineeringReliability engineering is an engineering field that deals with the study, evaluation, and life-cycle management of reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time. It is often measured as a probability of...
- Reliability theoryReliability theoryReliability theory describes the probability of a system completing its expected function during an interval of time. It is the basis of reliability engineering, which is an area of study focused on optimizing the reliability, or probability of successful functioning, of systems, such as airplanes,...
- Reliability theory of aging and longevityReliability theory of aging and longevityReliability theory of aging and longevity is a scientific approach aimed to gain theoretical insights into mechanisms of biological aging and species survival patterns by applying a general theory of systems failure, known as reliability theory.-Overview:...
- Rencontres numbers — a discrete distribution
- Renewal theoryRenewal theoryRenewal theory is the branch of probability theory that generalizes Poisson processes for arbitrary holding times. Applications include calculating the expected time for a monkey who is randomly tapping at a keyboard to type the word Macbeth and comparing the long-term benefits of different...
- RepeatabilityRepeatabilityRepeatability or test-retest reliability is the variation in measurements taken by a single person or instrument on the same item and under the same conditions. A less-than-perfect test-retest reliability causes test-retest variability. Such variability can be caused by, for...
- Repeated measures designRepeated measures designThe repeated measures design uses the same subjects with every condition of the research, including the control. For instance, repeated measures are collected in a longitudinal study in which change over time is assessed. Other studies compare the same measure under two or more different conditions...
- Replication (statistics)Replication (statistics)In engineering, science, and statistics, replication is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated. ASTM, in standard E1847, defines replication as "the repetition of the set of all the treatment combinations to be compared in...
- Representation validityRepresentation validityRepresentation validity is concerned about how well the constructs or abstractions translate into observable measures. There are two primary questions to be answered:...
- ReproducibilityReproducibilityReproducibility is the ability of an experiment or study to be accurately reproduced, or replicated, by someone else working independently...
- Resampling (statistics)Resampling (statistics)In statistics, resampling is any of a variety of methods for doing one of the following:# Estimating the precision of sample statistics by using subsets of available data or drawing randomly with replacement from a set of data points # Exchanging labels on data points when performing significance...
- Rescaled rangeRescaled rangeThe rescaled range is a statistical measure of the variability of a time series introduced by the British hydrologist Harold Edwin Hurst...
- Resentful demoralizationResentful demoralizationResentful demoralization is an issue in controlled experiments in which those in the control group become resentful of not receiving the experimental treatment. Alternatively, the experimental group could be resentful of the control group, if the experimental group perceive its treatment as...
– experimental design
- Residual — see errors and residuals in statisticsErrors and residuals in statisticsIn statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...
- Residual sum of squaresResidual sum of squaresIn statistics, the residual sum of squares is the sum of squares of residuals. It is also known as the sum of squared residuals or the sum of squared errors of prediction. It is a measure of the discrepancy between the data and an estimation model...
- Response biasResponse biasResponse bias is a type of cognitive bias which can affect the results of a statistical survey if respondents answer questions in the way they think the questioner wants them to answer rather than according to their true beliefs...
- Response rateResponse rateResponse rate in survey research refers to the number of people who answered the survey divided by the number of people in the sample...
- Response surface methodologyResponse surface methodologyIn statistics, response surface methodology explores the relationships between several explanatory variables and one or more response variables. The method was introduced by G. E. P. Box and K. B. Wilson in 1951. The main idea of RSM is to use a sequence of designed experiments to obtain an...
- Response variable
- Restricted maximum likelihoodRestricted maximum likelihoodIn statistics, the restricted maximum likelihood approach is a particular form of maximum likelihood estimation which does not base estimates on a maximum likelihood fit of all the information, but instead uses a likelihood function calculated from a transformed set of data, so that nuisance...
- Restricted randomizationRestricted randomizationMany processes have more than one source of variation in them. In order to reduce variation in processes, these multiple sources must be understood, and that often leads to the concept of nested or hierarchical data structures. For example, in the semiconductor industry, a batch process may operate...
- Reversible-jump Markov chain Monte Carlo
- Reversible dynamicsReversible dynamics- Mathematics :In mathematics, a dynamical system is invertible if the forward evolution is one-to-one, not many-to-one; so that for every state there exists a well-defined reverse-time evolution operator....
- Rind et al. controversy – interpretations of paper involving meta-analysis
- Rice distributionRice distributionIn probability theory, the Rice distribution or Rician distribution is the probability distribution of the absolute value of a circular bivariate normal random variable with potentially non-zero mean. It was named after Stephen O...
- Richardson–Lucy deconvolution
- Ridge regression redirects to Tikhonov regularizationTikhonov regularizationTikhonov regularization, named for Andrey Tikhonov, is the most commonly used method of regularization of ill-posed problems. In statistics, the method is known as ridge regression, and, with multiple independent discoveries, it is also variously known as the Tikhonov-Miller method, the...
- Risk factorRisk factorIn epidemiology, a risk factor is a variable associated with an increased risk of disease or infection. Sometimes, determinant is also used, being a variable associated with either increased or decreased risk.-Correlation vs causation:...
- Risk functionRisk functionIn decision theory and estimation theory, the risk function R of a decision rule, δ, is the expected value of a loss function L:...
- Risk perceptionRisk perceptionRisk perception is the subjective judgment that people make about the characteristics and severity of a risk. The phrase is most commonly used in reference to natural hazards and threats to the environment or health, such as nuclear power. Several theories have been proposed to explain why...
- Risk theoryRisk theoryRisk theory connotes the study usually by actuaries and insurers of the financial impact on a carrier of a portfolio of insurance policies. For example, if the carrier has 100 policies that insures against a total loss of $1000, and if each policy's chance of loss is independent and has a...
- Risk-benefit analysisRisk-benefit analysisRisk–benefit analysis is the comparison of the risk of a situation to its related benefits. Exposure to personal risk is recognized as a normal aspect of everyday life. We accept a certain level of risk in our lives as necessary to achieve certain benefits. In most of these risks we feel as though...
- Robbins lemmaRobbins lemmaIn statistics, the Robbins lemma, named after Herbert Robbins, states that if X is a random variable with a Poisson distribution, and f is any function for which the expected value E exists, then...
- Robin Hood indexRobin Hood indexThe Hoover index is a measure of income inequality. It is equal to the portion of the total community income that would have to be redistributed for there to be perfect equality....
- Robust confidence intervalsRobust confidence intervalsIn statistics a robust confidence interval is a robust modification of confidence intervals, meaning that one modifies the non-robust calculations of the confidence interval so that they are not badly affected by outlying or aberrant observations in a data-set.- Example :In the process of weighing...
- Robust regressionRobust regressionIn robust statistics, robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. Regression analysis seeks to find the effect of one or more independent variables upon a dependent variable...
- Robust statisticsRobust statisticsRobust statistics provides an alternative approach to classical statistical methods. The motivation is to produce estimators that are not unduly affected by small departures from model assumptions.- Introduction :...
- Root mean squareRoot mean squareIn mathematics, the root mean square , also known as the quadratic mean, is a statistical measure of the magnitude of a varying quantity. It is especially useful when variates are positive and negative, e.g., sinusoids...
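The quadratic mean described in this entry, sketched directly on invented values:

```python
import math

values = [1.0, -2.0, 3.0, -4.0]   # invented values, positive and negative

# Root mean square: square root of the arithmetic mean of the squares.
rms = math.sqrt(sum(v * v for v in values) / len(values))
print(rms)  # ~2.7386
```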
- Root mean square deviationRoot mean square deviationThe root-mean-square deviation (RMSD), or root-mean-square error, is a measure of the differences between values predicted by a model or estimator and the values actually observed; in bioinformatics the same term denotes the average distance between the atoms of superimposed proteins...
- Root mean square deviation (bioinformatics)
- Root mean square fluctuation
- Robust measures of scaleRobust measures of scaleIn statistics, a robust measure of scale is a robust statistic that quantifies the statistical dispersion in a set of quantitative data. Robust measures of scale are used to complement or replace conventional estimates of scale such as the sample variance or sample standard deviation...
- Rossmo's formulaRossmo's formulaRossmo's formula is a geographic profiling formula to predict where a serial criminal lives. The formula was developed by criminologist Kim Rossmo.-Formula:...
- Rothamsted Experimental StationRothamsted Experimental StationThe Rothamsted Experimental Station, one of the oldest agricultural research institutions in the world, is located at Harpenden in Hertfordshire, England. It is now known as Rothamsted Research...
- Round robin testRound robin testIn experimental methodology, a round robin test is an interlaboratory test performed independently several times. This can involve multiple independent scientists performing the test with the use of the same method in different equipment, or a variety of methods and equipment...
- Rubin causal modelRubin Causal ModelThe Rubin Causal Model is an approach to the statistical analysis of cause and effect based on the framework of potential outcomes. RCM is named after Donald Rubin, Professor of Statistics at Harvard University...
- Ruin theoryRuin theoryRuin theory, sometimes referred to as collective risk theory, is a branch of actuarial science that studies an insurer's vulnerability to insolvency based on mathematical modeling of the insurer's surplus....
- Rule of successionRule of successionIn probability theory, the rule of succession is a formula introduced in the 18th century by Pierre-Simon Laplace in the course of treating the sunrise problem....
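Laplace's rule of succession estimates the probability of a further success as (s + 1) / (n + 2) after s successes in n trials; a minimal sketch, with the counts invented:

```python
def rule_of_succession(successes, trials):
    # Laplace's estimate of the probability that the next trial succeeds.
    return (successes + 1) / (trials + 2)

# Sunrise-problem style example: a success on every one of n observed trials.
print(rule_of_succession(successes=10000, trials=10000))  # ~0.9999
```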
- Rule of three (medicine)Rule of three (medicine)In the statistical analysis of clinical trials, the rule of three states that if no major adverse events occurred in a group of n people, there can be 95% confidence that the chance of major adverse events is less than one in n / 3...
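The bound quoted in the entry above, an approximate 95% upper confidence limit of 3/n on the event probability when no events are seen in n subjects, sketched with an invented n:

```python
def rule_of_three_upper_bound(n):
    # Approximate 95% upper confidence limit on the event probability
    # when 0 events were observed in n independent subjects.
    return 3.0 / n

print(rule_of_three_upper_bound(300))  # 0.01, i.e. about 1 in 100
```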
- Run chartRun ChartA run chart, also known as a run-sequence plot is a graph that displays observed data in a time sequence. Often, the data displayed represent some aspect of the output or performance of a manufacturing or other business process.- Overview :...
- RV coefficientRV coefficientIn statistics, the RV coefficientis a multivariate generalization of the Pearson correlation coefficient.It measures the closeness of two set of points that may each be represented in a matrix....
S
- S (programming language)
- S-PLUSS-PLUSS-PLUS is a commercial implementation of the S programming language sold by TIBCO Software Inc. It features object-oriented programming capabilities and advanced analytical algorithms...
- Safety in numbersSafety in numbersSafety in numbers is the hypothesis that, by being part of a large physical group or mass, an individual is proportionally less likely to be the victim of a mishap, accident, attack, or other bad event...
- Sally ClarkSally ClarkSally Clark was a British solicitor who became the victim of an infamous miscarriage of justice when she was wrongly convicted of the murder of two of her sons in 1999...
(prob/stats related court case)
- Sammon projection
- Sample mean and covariance redirects to Sample mean and sample covarianceSample mean and sample covarianceThe sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables that is, each of whose elements is the average of the...
- Sample mean and sample covarianceSample mean and sample covarianceThe sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables that is, each of whose elements is the average of the...
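A brief sketch of these two statistics, assuming NumPy is available; the data matrix is invented, with rows as observations and columns as variables:

```python
import numpy as np

# Invented data: 5 observations of 2 variables.
X = np.array([[1.0, 2.1],
              [2.0, 3.9],
              [3.0, 6.2],
              [4.0, 8.1],
              [5.0, 9.8]])

sample_mean = X.mean(axis=0)           # vector of per-variable averages
sample_cov = np.cov(X, rowvar=False)   # unbiased sample covariance matrix (divides by n - 1)
print(sample_mean)
print(sample_cov)
```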
- Sample maximum and minimumSample maximum and minimumIn statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample....
- Sample size determination
- Sample space
- Sample standard deviation — disambiguation
- Sample (statistics)Sample (statistics)In statistics, a sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size...
- Sample-continuous process
- Sampling (statistics)Sampling (statistics)In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population....
- simple random sampling
- Snowball samplingSnowball samplingIn sociology and statistics research, snowball sampling is a non-probability sampling technique where existing study subjects recruit future subjects from among their acquaintances. Thus the sample group appears to grow like a rolling snowball...
- systematic samplingSystematic samplingSystematic sampling is a statistical method involving the selection of elements from an ordered sampling frame. The most common form of systematic sampling is an equal-probability method, in which every kth element in the frame is selected, where k, the sampling interval , is calculated as:k =...
- stratified samplingStratified samplingIn statistics, stratified sampling is a method of sampling from a population.In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample each subpopulation independently. Stratification is the process of dividing members of the population into...
- cluster samplingCluster samplingCluster Sampling is a sampling technique used when "natural" groupings are evident in a statistical population. It is often used in marketing research. In this technique, the total population is divided into these groups and a sample of the groups is selected. Then the required information is...
- multistage samplingMultistage samplingMultistage sampling is a complex form of cluster sampling.Advantages * cost and speed that the survey can be done in* convenience of finding the survey sample* normally more accurate than cluster sampling for the same size sampleDisadvantages...
- nonprobability samplingNonprobability samplingSampling is the use of a subset of the population to represent the whole population. Probability sampling, or random sampling, is a sampling technique in which the probability of getting any particular sample may be calculated. Nonprobability sampling does not meet this criterion and should be...
- slice samplingSlice samplingSlice sampling is a type of Markov chain Monte Carlo algorithm for pseudo-random number sampling, i.e. for drawing random samples from a statistical distribution...
- Sampling bias
- Sampling designSampling designIn the theory of finite population sampling, a sampling design specifies for every possible sample its probability of being drawn.Mathematically, a sampling design is denoted by the function P which gives the probability of drawing a sample S....
- Sampling distributionSampling distributionIn statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given statistic based on a random sample. Sampling distributions are important in statistics because they provide a major simplification on the route to statistical inference...
- Sampling errorSampling error-Random sampling:In statistics, sampling error or estimation error is the error caused by observing a sample instead of the whole population. The sampling error can be found by subtracting the value of a parameter from the value of a statistic...
- Sampling fractionSampling fractionIn sampling theory, sampling fraction is the ratio of sample size to population size or, in the context of stratified sampling, the ratio of the sample size to the size of the stratum....
- Sampling frameSampling frameIn statistics, a sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions....
- Sampling riskSampling riskIn auditing, sampling is an inevitable means of testing. However, sampling is always associated with sampling risks which auditors have to control....
- Samuelson's inequalitySamuelson's inequalityIn statistics, Samuelson's inequality, named after the economist Paul Samuelson, also called the Laguerre–Samuelson inequality, after the mathematician Edmond Laguerre, states that every one of any collection x1, ..., xn is within √(n − 1) standard deviations of their mean...
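A quick numeric check of the bound just stated; here the standard deviation is computed with denominator n, the form for which the √(n − 1) factor applies, and the data are invented:

```python
import math

data = [3.0, 5.0, 7.0, 9.0, 60.0]   # invented data with one large value
n = len(data)
mean = sum(data) / n
sd_n = math.sqrt(sum((x - mean) ** 2 for x in data) / n)  # SD with denominator n

bound = math.sqrt(n - 1) * sd_n
# Every observation lies within the Samuelson bound of the mean.
assert all(abs(x - mean) <= bound + 1e-12 for x in data)
print(mean, sd_n, bound)
```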
- Sargan testSargan testThe Sargan test is a statistical test used to check for over-identifying restrictions in a statistical model. The Sargan test is based on the observation that the residuals should be uncorrelated with the set of exogenous variables if the instruments are truly exogenous...
- SAS (software)
- SAS languageSAS languageThe SAS language is a data processing and statistical analysis language. See more on the origins of the SAS language at SAS System and at Barr Systems. The SAS language basically divides data processing and analysis into two kinds of steps....
- SAS System — redirects to SAS (software)
- Savitzky–Golay smoothing filterSavitzky–Golay smoothing filterThe Savitzky–Golay smoothing filter is a type of filter first described in 1964 by Abraham Savitzky and Marcel J. E. Golay.The Savitzky–Golay method essentially performs a local polynomial regression on a series of values to determine the smoothed value for each point...
- Sazonov's theoremSazonov's theoremIn mathematics, Sazonov's theorem, named after Vyacheslav Vasilievich Sazonov , is a theorem in functional analysis.It states that a bounded linear operator between two Hilbert spaces is γ-radonifying if it is Hilbert–Schmidt...
- Saturated arraySaturated arrayIn experiments in which additional factors are not likely to interact with any of the other factors, a saturated array can be used. In a saturated array, a controllable factor is substituted for the interaction of two or more by-products. Using a saturated array, a two-factor test matrix could be...
- Scale analysis (statistics)Scale analysis (statistics)In statistics, scale analysis is a set of methods to analyse survey data, in which responses to questions are combined to measure a latent variable. These items can be dichotomous or polytomous...
- Scale parameterScale parameterIn probability theory and statistics, a scale parameter is a special kind of numerical parameter of a parametric family of probability distributions...
- Scaled-inverse-chi-squared distribution
- Scaling pattern of occupancyScaling pattern of occupancyIn spatial ecology and macroecology, scaling pattern of occupancy , also known as the area-of-occupancy is the way in which species distribution changes across spatial scales. In physical geography and image analysis, it is similar to the modifiable areal unit problem. Simon A...
- Scatter matrixScatter matrixIn multivariate statistics and probability theory, the scatter matrix is a statistic that is used to make estimates of the covariance matrix of the multivariate normal distribution.-Definition:...
- Scatter plot
- Scatterplot smoothing
- Scheffé's methodScheffé's methodIn statistics, Scheffé's method, named after Henry Scheffé, is a method for adjusting significance levels in a linear regression analysis to account for multiple comparisons...
- Schilder's theoremSchilder's theoremIn mathematics, Schilder's theorem is a result in the large deviations theory of stochastic processes. Roughly speaking, Schilder's theorem gives an estimate for the probability that a sample path of Brownian motion will stray far from the mean path . This statement is made precise using rate...
- Schramm–Loewner evolution
- Schuette–Nesbitt formulaSchuette–Nesbitt formulaIn probability theory, the Schuette–Nesbitt formula is a generalization of the probabilistic version of the inclusion-exclusion principle. It is named after Donald R. Schuette and Cecil J...
- Schwarz criterionSchwarz criterionIn statistics, the Bayesian information criterion or Schwarz criterion is a criterion for model selection among a finite set of models...
- Score (statistics)Score (statistics)In statistics, the score, score function, efficient score or informant plays an important role in several aspects of inference...
- Score testScore testA score test is a statistical test of a simple null hypothesis that a parameter of interest \theta isequal to some particular value \theta_0. It is the most powerful test when the true value of \theta is close to \theta_0. The main advantage of the Score-test is that it does not require an...
- Scoring algorithmScoring algorithmIn statistics, Fisher's scoring algorithm is a form of Newton's method used to solve maximum likelihood equations numerically.-Sketch of Derivation:...
- Scoring ruleScoring ruleIn decision theory a score function, or scoring rule, is a measure of the performance of an entity, be it person or machine, that repeatedly makes decisions under uncertainty. For example, every evening a TV weather forecaster may give the probability of rain on the next day, in a type of...
- SCORUSSCORUSAn acronym for "Standing Committee of Regional and Urban Statistics", SCORUS is a sub-committee of the International Association for Official Statistics which is a section of the International Statistical Institute. The sub-committee has specific responsibility for regional and urban statistics...
- Scott's PiScott's PiScott's pi is a statistic for measuring inter-rater reliability for nominal data in communication studies. Textual entities are annotated with categories by different annotators, and various measures are used to assess the extent of agreement between the annotators, one of which is Scott's pi...
- SDMXSDMXSDMX is an initiative to foster standards for the exchange of statistical information. It started in 2001 and aims at fostering standards for Statistical Data and Metadata eXchange...
– a standard for exchanging statistical data
- Seasonal adjustmentSeasonal adjustmentSeasonal adjustment is a statistical method for removing the seasonal component of a time series that is used when analyzing non-seasonal trends. It is normal to report un-adjusted data for current unemployment rates, as these reflect the actual current situation...
- SeasonalitySeasonalityIn statistics, many time series exhibit cyclic variation known as seasonality, periodic variation, or periodic fluctuations. This variation can be either regular or semi regular....
- Seasonal subseries plotSeasonal subseries plotSeasonal subseries plots are a tool for detecting seasonality in a time series. This plot allows one to detect both between-group and within-group patterns. This plot is only useful if the period of the seasonality is already known. In many cases, this will in fact be known. For example, monthly...
- Seasonal variation
- Seasonally adjusted annual rateSeasonally adjusted annual rateThe Seasonally Adjusted Annual Rate refers to the rate adjustment employed when drawing comparisons between various sets of statistical data. As the name suggests, it takes into account fluctuations of values in such data which might occur due to seasonality...
- Second moment methodSecond moment methodIn mathematics, the second moment method is a technique used in probability theory and analysis to show that a random variable has positive probability of being positive...
- Secretary problemSecretary problemThe secretary problem is one of many names for a famous problem of the optimal stopping theory. The problem has been studied extensively in the fields of applied probability, statistics, and decision theory...
- Secular trend
- Secular variationSecular variationThe secular variation of a time series is its long-term non-periodic variation . Whether something is perceived as a secular variation or not depends on the available timescale: a secular variation over a time scale of centuries may be part of a periodic variation over a time scale of millions of...
- Seemingly unrelated regressions
- Seismic to simulationSeismic to simulationSeismic to Simulation is the process and associated techniques used to develop highly accurate static and dynamic 3D models of hydrocarbon reservoirs for use in predicting future production, placing additional wells, and evaluating alternative reservoir management scenarios...
- Selection biasSelection biasSelection bias is a statistical bias in which there is an error in choosing the individuals or groups to take part in a scientific study. It is sometimes referred to as the selection effect. The term "selection bias" most often refers to the distortion of a statistical analysis, resulting from the...
- Selective recruitmentSelective recruitmentSelective recruitment is an observed effect in traffic safety. When safety belt laws are passed, belt wearing rates increase, but casualties decline by smaller percentages than estimated in a simple calculation. This is because those converted from non-use to use are not “recruited” random...
- Self-selection bias
- Self-similar processSelf-similar processSelf-similar processes are types of stochastic processes that exhibit the phenomenon of self-similarity. A self-similar phenomenon behaves the same when viewed at different degrees of magnification, or different scales on a dimension . Self-similar processes can sometimes be described using...
- Segmented regressionSegmented regressionSegmented regression is a method in regression analysis in which the independent variable is partitioned into intervals and a separate line segment is fit to each interval. Segmented or piecewise regression analysis can also be performed on multivariate data by partitioning the various independent...
- Seismic inversionSeismic inversionSeismic inversion, in Geophysics , is the process of transforming seismic reflection data into a quantitative rock-property description of a reservoir...
- Self-similarity matrixSelf-similarity matrixIn data analysis, the self-similarity matrix is a graphical representation of similar sequences in a data series. Similarity can be explained by different measures, like spatial distance , correlation, or comparison of local histograms or spectral properties...
- Semantic mapping (statistics)Semantic mapping (statistics)The semantic mapping is a dimensionality reduction method that extracts new features by clustering the original features in semantic clusters and combining features mapped in the same cluster to generate an extracted feature...
- Semantic relatedness
- Semantic similaritySemantic similaritySemantic similarity or semantic relatedness is a concept whereby a set of documents or terms within term lists are assigned a metric based on the likeness of their meaning / semantic content....
- Semi-Markov processSemi-Markov processA continuous-time stochastic process is called a semi-Markov process or 'Markov renewal process' if the embedded jump chain is a Markov chain, and where the holding times are random variables with any distribution, whose distribution function may depend on the two states between which the move is...
- Semi-log graph
- Semidefinite embeddingSemidefinite embeddingSemidefinite embedding or maximum variance unfolding is an algorithm in computer science, that uses semidefinite programming to perform non-linear dimensionality reduction of high-dimensional vectorial input data....
- SemimartingaleSemimartingaleIn probability theory, a real valued process X is called a semimartingale if it can be decomposed as the sum of a local martingale and an adapted finite-variation process....
- Semiparametric model
- Semiparametric regressionSemiparametric regressionIn statistics, semiparametric regression includes regression models that combine parametric and nonparametric models. They are often used in situations where the fully nonparametric model may not perform well or when the researcher wants to use a parametric model but the functional form with...
- Semivariance
- Sensitivity (tests)
- Sensitivity analysisSensitivity analysisSensitivity analysis is the study of how the variation in the output of a statistical model can be attributed to different variations in the inputs of the model. Put another way, it is a technique for systematically changing variables in a model to determine the effects of such changes.In any...
- Sensitivity and specificitySensitivity and specificitySensitivity and specificity are statistical measures of the performance of a binary classification test, also known in statistics as classification function. Sensitivity measures the proportion of actual positives which are correctly identified as such...
- Separation testSeparation testA separation test is a statistical procedure for early-phase research, to decide whether or not to pursue further research. It is designed to avoid the prevalent situation in early-phase research, when a statistically underpowered test gives a negative result....
- Sequential analysisSequential analysisIn statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead data are evaluated as they are collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results...
- Sequential estimationSequential estimationIn statistics, sequential estimation refers to estimation methods in sequential analysis where the sample size is not fixed in advance. Instead, data is evaluated as it is collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results are...
- Sequential Monte Carlo methods redirects to Particle filterParticle filterIn statistics, particle filters, also known as Sequential Monte Carlo methods , are sophisticated model estimation techniques based on simulation...
- Sequential probability ratio testSequential probability ratio testThe sequential probability ratio test is a specific sequential hypothesis test, developed by Abraham Wald. Neyman and Pearson's 1933 result inspired Wald to reformulate it as a sequential analysis problem...
- Serial dependenceSerial dependenceIn statistics and signal processing, random variables in a time series have serial dependence if the value at some time t in the series is statistically dependent on the value at another time s...
- Seriation (archaeology)Seriation (archaeology)In archaeology, seriation is a relative dating method in which assemblages or artifacts from numerous sites, in the same culture, are placed in chronological order. Where absolute dating methods, such as carbon dating, cannot be applied, archaeologists have to use relative dating methods to date...
- SETAR (model)SETAR (model)In statistics, Self-Exciting Threshold AutoRegressive models are typically applied to time series data as an extension of autoregressive models, in order to allow for higher degree of flexibility in model parameters through a regime switching behaviour.Given a time series of data xt, the SETAR...
— a time series model
- Sethi modelSethi modelThe Sethi model was developed by Suresh P. Sethi and describes the process of how sales evolve over time in response to advertising. The rate of change in sales depends on three effects: response to advertising that acts positively on the unsold portion of the market, the loss due to forgetting or...
- Seven-number summarySeven-number summaryIn descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is a modification or extension of the five-number summary...
- Sexual dimorphism measuresSexual dimorphism measuresAlthough the subject of sexual dimorphism is not in itself controversial, the measures by which it is assessed differ widely. Most of the measures are used on the assumption that a random variable is considered so that probability distributions should be taken into account...
- Shannon–Hartley theoremShannon–Hartley theoremIn information theory, the Shannon–Hartley theorem tells the maximum rate at which information can be transmitted over a communications channel of a specified bandwidth in the presence of noise. It is an application of the noisy channel coding theorem to the archetypal case of a continuous-time...
- Shape of the distributionShape of the distributionIn statistics, the concept of the shape of the distribution refers to the shape of a probability distribution and it most often arises in questions of finding an appropriate distribution to use to model the statistical properties of a population, given a sample from that population...
- Shape parameterShape parameterIn probability theory and statistics, a shape parameter is a kind of numerical parameter of a parametric family of probability distributions.- Definition :...
- Shapiro–Wilk test
- Sharpe ratioSharpe ratioThe Sharpe ratio or Sharpe index or Sharpe measure or reward-to-variability ratio is a measure of the excess return per unit of deviation in an investment asset or a trading strategy, typically referred to as risk , named after William Forsyth Sharpe...
- SHAZAM (software)SHAZAM (software)SHAZAM is a comprehensive econometrics and statistics package for estimating, testing, simulating and forecasting many types of econometrics and statistical models...
- Shewhart individuals control chart
- Shifted Gompertz distribution
- Shifted log-logistic distribution
- Shifting baseline
- Shrinkage (statistics)Shrinkage (statistics)In statistics, shrinkage has two meanings:*In relation to the general observation that, in regression analysis, a fitted relationship appears to perform less well on a new data set than on the data set used for fitting. In particular the value of the coefficient of determination 'shrinks'...
- Shrinkage estimatorShrinkage estimatorIn statistics, a shrinkage estimator is an estimator that, either explicitly or implicitly, incorporates the effects of shrinkage. In loose terms this means that a naïve or raw estimate is improved by combining it with other information. The term relates to the notion that the improved estimate is...
- Sichel distribution
- Siegel–Tukey test
- Sieve estimatorSieve estimatorIn statistics, sieve estimators are a class of nonparametric estimator which use progressively more complex models to estimate an unknown high-dimensional function as more data becomes available, with the aim of asymptotically reducing error towards zero as the amount of data increases. This method...
- Sigma-algebraSigma-algebraIn mathematics, a σ-algebra is a technical concept for a collection of sets satisfying certain properties. The main use of σ-algebras is in the definition of measures; specifically, the collection of sets over which a measure is defined is a σ-algebra...
- SigmaStatSigmaStatSigmaStat is a statistical software package, which was originally developed by Jandel Scientific Software in the 1980s. As of October 1996, Systat Software is now based in San Jose, California. SigmaStat users have the ability to compare effects among groups. This includes before and after or...
– software
- Sign testSign testIn statistics, the sign test can be used to test the hypothesis that there is "no difference in medians" between the continuous distributions of two random variables X and Y, in the situation when we can draw paired samples from X and Y...
- Signal-to-noise ratioSignal-to-noise ratioSignal-to-noise ratio is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. It is defined as the ratio of signal power to the noise power. A ratio higher than 1:1 indicates more signal than noise...
- Signal-to-noise statistic
- Signed differential mappingSigned differential mappingSigned differential mapping or SDM is a statistical technique for meta-analyzing studies on differences in brain activity or structure which used neuroimaging techniques such as fMRI, VBM, DTI or PET...
- Significance analysis of microarraysSignificance Analysis of MicroarraysSignificance analysis of microarrays is a statistical technique, established in 2001 by Tusher, Tibshirani and Chu, for determining whether changes in gene expression are statistically significant. With the advent of DNA microarrays it is now possible to measure the expression of thousands of...
- Silhouette (clustering)Silhouette (clustering)Silhouette refers to a method of interpretation and validation of clusters of data. The technique provides a succinct graphical representation of how well each object lies within its cluster. It was first described by Peter J. Rousseeuw in 1986.- Method :...
- SimfitSimfitSimfit is a free Open Source Windows package for simulation, curve fitting, statistics, and plotting, using a library of models or user-defined equations. Simfit has been in continuous development for many years by Dr Bill Bardsley of the University of Manchester...
– software
- Similarity matrixSimilarity matrixA similarity matrix is a matrix of scores which express the similarity between two data points. Similarity matrices are strongly related to their counterparts, distance matrices and substitution matrices...
- Simon modelSimon modelAiming to account for the wide range of empirical distributions following a power-law, Herbert Simon proposed a class of stochastic models that results in a power-law distribution function. It models the dynamics of a system...
- Simple linear regressionSimple linear regressionIn statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model as...
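A minimal sketch of the least-squares fit described in the entry above, computed from the usual closed-form expressions; the paired observations are invented:

```python
# Invented paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for the line y = a + b*x.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
    sum((xi - mean_x) ** 2 for xi in x)
a = mean_y - b * mean_x
print(a, b)  # ~0.14 and ~1.96
```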
- Simple moving average crossoverSimple moving average crossoverIn the statistics of time series, and in particular the analysis of financial time series for stock trading purposes, a moving-average crossover occurs when, on plotting two moving averages each based on different degrees of smoothing, the traces of these moving averages cross...
- Simple random sampleSimple random sampleIn statistics, a simple random sample is a subset of individuals chosen from a larger set . Each individual is chosen randomly and entirely by chance, such that each individual has the same probability of being chosen at any stage during the sampling process, and each subset of k individuals has...
- Simpson's paradoxSimpson's paradoxIn probability and statistics, Simpson's paradox is a paradox in which a correlation present in different groups is reversed when the groups are combined. This result is often encountered in social-science and medical-science statistics, and it occurs when frequencydata are hastily given causal...
- Simulated annealingSimulated annealingSimulated annealing is a generic probabilistic metaheuristic for the global optimization problem of locating a good approximation to the global optimum of a given function in a large search space. It is often used when the search space is discrete...
- Simultaneous equation methods (econometrics)
- Simultaneous equations modelSimultaneous equations modelSimultaneous equation models are a form of statistical model in the form of a set of linear simultaneous equations. They are often used in econometrics.- Structural and reduced form :...
- Single equation methods (econometrics)Single equation methods (econometrics)A variety of methods are used in econometrics to estimate models consisting of a single equation. The oldest and still the most commonly used is the ordinary least squares method used to estimate linear regressions....
- Singular distributionSingular distributionIn probability, a singular distribution is a probability distribution concentrated on a set of Lebesgue measure zero, where the probability of each point in that set is zero. These distributions are sometimes called singular continuous distributions...
- Singular spectrum analysisSingular Spectrum AnalysisSingular spectrum analysis combines elements of classical time series analysis, multivariate statistics, multivariate geometry, dynamical systems and signal processing...
- Sinusoidal modelSinusoidal modelIn statistics, signal processing, and time series analysis, a sinusoidal model to approximate a sequence Yi is: Y_i = C + \alpha\sin(\omega T_i + \phi) + E_i...
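As a sketch only, one way to fit the sinusoidal model above with SciPy's curve_fit; the simulated series and the starting values p0 are assumptions made for illustration:

    import numpy as np
    from scipy.optimize import curve_fit

    def sinusoid(t, C, alpha, omega, phi):
        # Y = C + alpha*sin(omega*t + phi); the error term E shows up as residuals.
        return C + alpha * np.sin(omega * t + phi)

    t = np.linspace(0, 10, 200)
    y = 2.0 + 1.5 * np.sin(1.3 * t + 0.4) + np.random.normal(scale=0.2, size=t.size)
    params, _ = curve_fit(sinusoid, t, y, p0=[2.0, 1.0, 1.0, 0.0])
    print(params)  # estimates of C, alpha, omega, phi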
- Sinkov statisticSinkov statisticSinkov statistics, also known as log-weight statistics, is a specialized field of statistics that was developed by Abraham Sinkov, while working for the small Signal Intelligence Service organization, the primary mission of which was to compile codes and ciphers for use by the U.S. Army...
- Skellam distribution
- Skew normal distribution
- SkewnessSkewnessIn probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. The skewness value can be positive or negative, or even undefined...
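For illustration, a small Python function computing the (biased) sample skewness as the third standardized moment; note that some packages apply an additional small-sample correction:

    def sample_skewness(xs):
        n = len(xs)
        mean = sum(xs) / n
        m2 = sum((x - mean) ** 2 for x in xs) / n   # second central moment
        m3 = sum((x - mean) ** 3 for x in xs) / n   # third central moment
        return m3 / m2 ** 1.5

    print(sample_skewness([1, 2, 2, 3, 10]))  # positive: long right tail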
- Skorokhod's representation theoremSkorokhod's representation theoremIn mathematics and statistics, Skorokhod's representation theorem is a result that shows that a weakly convergent sequence of probability measures whose limit measure is sufficiently well-behaved can be represented as the distribution/law of a pointwise convergent sequence of random variables...
- Slash distributionSlash distributionIn probability theory, the slash distribution is the probability distribution of a standard normal variate divided by an independent standard uniform variate...
- Slice samplingSlice samplingSlice sampling is a type of Markov chain Monte Carlo algorithm for pseudo-random number sampling, i.e. for drawing random samples from a statistical distribution...
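A minimal sketch of univariate slice sampling with a step-out bracket; the target density (an unnormalized standard normal) and the width w are illustrative assumptions, not part of the entry:

    import math, random

    def slice_sample(f, x0, n_samples, w=1.0):
        # Draw samples from a density proportional to f.
        samples, x = [], x0
        for _ in range(n_samples):
            y = random.uniform(0, f(x))   # pick a vertical level under the curve at x
            lo, hi = x - w, x + w         # step out until the slice {x: f(x) > y} is bracketed
            while f(lo) > y:
                lo -= w
            while f(hi) > y:
                hi += w
            while True:                   # shrink the bracket until a point inside the slice is found
                x_new = random.uniform(lo, hi)
                if f(x_new) > y:
                    x = x_new
                    break
                if x_new < x:
                    lo = x_new
                else:
                    hi = x_new
            samples.append(x)
        return samples

    draws = slice_sample(lambda x: math.exp(-0.5 * x * x), x0=0.0, n_samples=1000)
    print(sum(draws) / len(draws))  # should be near 0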
- Sliced inverse regression
- Slutsky's theoremSlutsky's theoremIn probability theory, Slutsky’s theorem extends some properties of algebraic operations on convergent sequences of real numbers to sequences of random variables.The theorem was named after Eugen Slutsky. Slutsky’s theorem is also attributed to Harald Cramér....
- Small area estimationSmall area estimationSmall area estimation is any of several statistical techniques involving the estimation of parameters for small sub-populations, generally used when the sub-population of interest is included in a larger survey....
- Smearing retransformationSmearing retransformationThe Smearing retransformation is used in regression analysis, after estimating the logarithm of a variable. Estimating the logarithm of a variable instead of the variable itself is a common technique to more closely approximate normality...
- SmoothingSmoothingIn statistics and image processing, to smooth a data set is to create an approximating function that attempts to capture important patterns in the data, while leaving out noise or other fine-scale structures/rapid phenomena. Many different algorithms are used in smoothing...
- Smoothing splineSmoothing splineThe smoothing spline is a method of smoothing using a spline function.-Definition: Let x_1...
- Smoothness (probability theory)Smoothness (probability theory)In probability theory and statistics, smoothness of a density function is a measure which determines how many times the density function can be differentiated, or equivalently the limiting behavior of the distribution’s characteristic function....
- Snowball samplingSnowball samplingIn sociology and statistics research, snowball sampling is a non-probability sampling technique where existing study subjects recruit future subjects from among their acquaintances. Thus the sample group appears to grow like a rolling snowball...
- Social network change detectionSocial network change detectionSocial network change detection is a process of monitoring social networks to determine when significant changes to their organizational structure occur and what caused them. This scientific approach combines analytical techniques from social network analysis with those from statistical process...
- Social statisticsSocial statisticsSocial statistics is the use of statistical measurement systems to study human behavior in a social environment. This can be accomplished through polling a particular group of people, evaluating a particular subset of data obtained about a group of people, or by observation and statistical...
- SOFA StatisticsSOFA StatisticsSOFA Statistics is an open-source statistical package, with an emphasis on ease of use, learn as you go, and beautiful output. The name stands for Statistics Open For All. It has a graphical user interface and can connect directly to MySQL, PostgreSQL, SQLite, MS Access, and Microsoft SQL Server...
– software - Soliton distributionSoliton distributionA soliton distribution is a type of discrete probability distribution that arises in the theory of erasure correcting codes. A paper by Luby introduced two forms of such distributions, the ideal soliton distribution and the robust soliton distribution.-Ideal distribution:The ideal soliton...
– redirects to Luby transform code - Sørensen similarity indexSørensen similarity indexThe Sørensen index, also known as Sørensen’s similarity coefficient, is a statistic used for comparing the similarity of two samples. It was developed by the botanist Thorvald Sørensen and published in 1948....
- Spaghetti plot
- Sparse binary polynomial hashingSparse binary polynomial hashingSparse binary polynomial hashing is a generalization of Bayesian filtering that can match mutating phrases as well as single words. SBPH is a way of generating a large number of features from an incoming text automatically, and then using statistics to determine the weights for each of those...
- Sparse PCASparse PCASparse PCA is a specialised technique used in statistical analysis and, in particular, in the analysis of multivariate data sets....
– sparse principal components analysis - Sparsity-of-effects principleSparsity-of-effects principleThe sparsity-of-effects principle states that a system is usually dominated by main effects and low-order interactions. Thus it is most likely that main effects and two-factor interactions are the most significant responses. In other words, higher-order interactions such as three-factor...
- Spatial analysisSpatial analysisSpatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties...
- Spatial dependenceSpatial dependenceIn applications of statistics, spatial dependence is the existence of statistical dependence in a collection of random variables or a collection time series of random variables, each of which is associated with a different geographical location...
- Spatial descriptive statisticsSpatial descriptive statisticsSpatial descriptive statistics are used for a variety of purposes in geography, particularly in quantitative data analyses involving Geographic Information Systems .-Types of spatial data:...
- Spatial distributionSpatial distributionA spatial distribution is the arrangement of a phenomenon across the Earth's surface and a graphical display of such an arrangement is an important tool in geographical and environmental statistics. A graphical display of a spatial distribution may summarize raw data directly or may reflect the...
- Spatial econometricsSpatial econometricsSpatial Econometrics is the field where spatial analysis and econometrics intersect. In general, econometrics differs from other branches of statistics in focusing on theoretical models, whose parameters are estimated using regression analysis...
- Spatial statistics redirects to Spatial analysisSpatial analysisSpatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties...
- Spatial variabilitySpatial variabilitySpatial variability occurs when a quantity that is measured at different spatial locations exhibits values that differ across the locations. Spatial variability can be assessed using spatial descriptive statistics such as the range.- References :...
- SPC XLSPC XLSPC XL is a statistical add-in for Microsoft Excel. SPC XL is a replacement for SPC KISS, which was released in 1993, making it one of the oldest statistical add-ins for Excel...
– software - Spearman's rank correlation coefficientSpearman's rank correlation coefficientIn statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter \rho or as r_s, is a non-parametric measure of statistical dependence between two variables. It assesses how well the relationship between two variables can...
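For illustration, Spearman's rho computed as the Pearson correlation of the ranks; this simple ranking ignores ties (tied data would need midranks):

    def ranks(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        r = [0] * len(xs)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    def spearman_rho(xs, ys):
        return pearson(ranks(xs), ranks(ys))

    print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))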
- Spearman–Brown prediction formula
- Species discovery curveSpecies discovery curveIn ecology, the species discovery curve is a graph recording the cumulative number of species of living things recorded in a particular environment as a function of the cumulative effort expended searching for them...
- Specification (regression)Specification (regression)In regression analysis and related fields such as econometrics, specification is the process of converting a theory into a regression model. This process consists of selecting an appropriate functional form for the model and choosing which variables to include. Model specification is one of the...
- Specificity (tests)
- Spectral density estimationSpectral density estimationIn statistical signal processing, the goal of spectral density estimation is to estimate the spectral density of a random signal from a sequence of time samples of the signal. Intuitively speaking, the spectral density characterizes the frequency content of the signal...
- Spectrum biasSpectrum biasInitially identified in 1978, spectrum bias refers to the phenomenon that the performance of a diagnostic test may change between different clinical settings owing to changes in the patient case-mix thereby affecting the transferability of study results in clinical practice...
- Spectrum continuation analysisSpectrum continuation analysisSpectrum continuation analysis is a generalization of the concept of Fourier series to non-periodic functions of which only a fragment has been sampled in the time domain....
- Speed priorSpeed priorJürgen Schmidhuber's speed prior is a complexity measure similar to Kolmogorov complexity, except that it is based on computation speed as well as program length. The speed prior complexity of a program is its...
- Spherical designSpherical designA spherical design, part of combinatorial design theory in mathematics, is a finite set of N points on the d-dimensional unit hypersphere Sd such that the average value of any polynomial f of degree t or less on the set equals the average value of f on the whole sphere...
- Split normal distributionSplit normal distributionIn probability theory and statistics, the split normal distribution also known as the two-piece normal distribution results from joining at the mode the corresponding halves of two normal distributions with the same mode but different variances...
- SPRT — redirects to Sequential probability ratio testSequential probability ratio testThe sequential probability ratio test is a specific sequential hypothesis test, developed by Abraham Wald. Neyman and Pearson's 1933 result inspired Wald to reformulate it as a sequential analysis problem...
- SPSSSPSSSPSS is a computer program used for survey authoring and deployment, data mining, text analytics, statistical analysis, and collaboration and deployment....
– software - SPSS ClementineSPSS ClementineSPSS Modeler is a data mining software tool by SPSS Inc., an IBM company. It was originally named SPSS Clementine by SPSS, after which it was renamed PASW Modeler in 2009 by SPSS. It was since acquired by IBM in its acquisition of SPSS Inc.-Overview:...
– software (data mining) - Spurious relationshipSpurious relationshipIn statistics, a spurious relationship is a mathematical relationship in which two events or variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a certain third, unseen factor In statistics, a spurious relationship...
- Square root biased samplingSquare root biased samplingSquare root biased sampling is a sampling method proposed by William H. Press, a professor in the fields of computer sciences and computational biology, for use in airport screenings as a mathematically efficient compromise between simple random sampling and strong profiling.Using this method, if a...
- Squared deviationsSquared deviationsIn probability theory and statistics, the definition of variance is either the expected value, or average value, of squared deviations from the mean. Computations for analysis of variance involve the partitioning of a sum of squared deviations...
- St. Petersburg paradoxSt. Petersburg paradoxIn economics, the St. Petersburg paradox is a paradox related to probability theory and decision theory. It is based on a particular lottery game that leads to a random variable with infinite expected value, i.e., infinite expected payoff, but would nevertheless be considered to be worth only a...
- Stability (probability)Stability (probability)In probability theory, the stability of a random variable is the property that a linear combination of two independent copies of the variable has the same distribution, up to location and scale parameters. The distributions of random variables having this property are said to be "stable...
- Stable distribution
- Stable and tempered stable distributions with volatility clustering – financial applications
- Standard deviationStandard deviationStandard deviation is a widely used measure of variability or diversity used in statistics and probability theory. It shows how much variation or "dispersion" there is from the average...
- Standard error (statistics)Standard error (statistics)The standard error is the standard deviation of the sampling distribution of a statistic. The term may also be used to refer to an estimate of that standard deviation, derived from a particular sample used to compute the estimate....
- Standard normal deviateStandard normal deviateA standard normal deviate is a normally distributed random variable with expected value 0 and variance 1. A fuller term is standard normal random variable...
- Standard normal tableStandard normal tableA standard normal table also called the "Unit Normal Table" is a mathematical table for the values of Φ, the cumulative distribution function of the normal distribution....
- Standard probability spaceStandard probability spaceIn probability theory, a standard probability space is a probability space satisfying certain assumptions introduced by Vladimir Rokhlin in 1940...
- Standard scoreStandard scoreIn statistics, a standard score indicates how many standard deviations an observation or datum is above or below the mean. It is a dimensionless quantity derived by subtracting the population mean from an individual raw score and then dividing the difference by the population standard deviation...
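A one-line worked illustration of the definition, assuming the population mean and standard deviation are known:

    def z_score(x, mu, sigma):
        # standard score: how many standard deviations x lies above the mean
        return (x - mu) / sigma

    print(z_score(130, mu=100, sigma=15))  # 2.0, i.e. two standard deviations above the mean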
- Standardized coefficientStandardized coefficientIn statistics, standardized coefficients or beta coefficients are the estimates resulting from an analysis carried out on variables that have been standardized so that their variances are 1. Therefore, standardized coefficients refer to how many standard deviations a dependent variable will change,...
- Standardized moment
- Standardised mortality rateStandardised mortality rateThe standardised mortality rate tells how many persons, per thousand of the population, will die in a given year and what the causes of death will be...
- Standardized mortality ratioStandardized mortality ratioThe standardized mortality ratio or SMR in epidemiology is the ratio of observed deaths to expected deaths, where expected deaths are calculated for a typical area with the same age and gender mix by looking at the death rates for different ages and genders in the larger population.The SMR may be...
- Standardized rateStandardized rateStandardized rates are a statistical measure of any rates in a population. The most common are birth, death and unemployment rates.The formula for standardized rates is as follows:...
- StanineStanineStanine is a method of scaling test scores on a nine-point standard scale with a mean of five and a standard deviation of two.Some web sources attribute stanines to the U.S. Army Air Forces during World War II...
- STAR modelSTAR modelIn statistics, Smooth Transition Autoregressive models are typically applied to time series data as an extension of autoregressive models, in order to allow for higher degree of flexibility in model parameters through a smooth transition.Given a time series of data xt, the STAR model is a tool for...
— a time series model - Star plot — redirects to Radar chartRadar chartA radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point...
- StataStataStata is a general-purpose statistical software package created in 1985 by StataCorp. It is used by many businesses and academic institutions around the world...
- StatgraphicsStatgraphicsStatgraphics is a statistics package that performs and explains basic and advanced statistical functions. The software was created in 1980 by Dr. Neil Polhemus...
– software - Static analysisStatic analysisStatic analysis, static projection, and static scoring are terms for simplified analysis wherein the effect of an immediate change to a system is calculated without respect to the longer term response of the system to that change...
- Stationary distributionStationary distributionStationary distribution may refer to:* The limiting distribution in a Markov chain* The marginal distribution of a stationary process or stationary time series* The set of joint probability distributions of a stationary process or stationary time series...
- Stationary ergodic processStationary ergodic processIn probability theory, stationary ergodic process is a stochastic process which exhibits both stationarity and ergodicity. In essence this implies that the random process will not change its statistical properties with time and that its statistical properties can be deduced from a single,...
- Stationary processStationary processIn the mathematical sciences, a stationary process is a stochastic process whose joint probability distribution does not change when shifted in time or space...
- Stationary sequenceStationary sequenceIn probability theory – specifically in the theory of stochastic processes, a stationary sequence is a random sequence whose joint probability distribution is invariant over time...
- Stationary subspace analysisStationary subspace analysisStationary Subspace Analysis is a blind source separation algorithm which factorizes a multivariate time series into stationary and non-stationary components.- Introduction :...
- StatisticStatisticA statistic is a single measure of some attribute of a sample . It is calculated by applying a function to the values of the items comprising the sample which are known together as a set of data.More formally, statistical theory defines a statistic as a function of a sample where the function...
- STATISTICASTATISTICASTATISTICA is a statistics and analytics software package developed by StatSoft. STATISTICA provides data analysis, data management, data mining, and data visualization procedures...
– software - Statistical arbitrageStatistical arbitrageIn the world of finance and investments, statistical arbitrage is used in two related but distinct ways:* In academic literature, "statistical arbitrage" is opposed to arbitrage. In deterministic arbitrage, a sure profit can be obtained from being long some securities and short others...
- Statistical assemblyStatistical assemblyIn statistics, for example in statistical quality control, a statistical assembly is a collection of parts or components which makes up a statistical unit. Thus a statistical unit, which would be the prime item of concern, is made of discrete components like organs or machine parts...
- Statistical assumption
- Statistical benchmarkingStatistical benchmarkingIn statistics, benchmarking is a method of using auxiliary information to adjust the sampling weights used in an estimation process, in order to yield more accurate estimates of totals....
- Statistical classification
- Statistical conclusion validityStatistical conclusion validityStatistical conclusion validity refers to the appropriate use of statistics to infer whether the presumed independent and dependent variables covary...
- Statistical consultantStatistical consultantA statistical consultant provides statistical advice and guidance to clients interested in making decisions through the analysis or collection of data. Clients often need statistical advice to answer questions in business, medicine, biology, genetics, forestry, agriculture, fisheries, wildlife...
- Statistical deviance—see deviance (statistics)
- Statistical dispersionStatistical dispersionIn statistics, statistical dispersion is variability or spread in a variable or a probability distribution...
- Statistical distanceStatistical distanceIn statistics, probability theory, and information theory, a statistical distance quantifies the distance between two statistical objects, which can be two samples, two random variables, or two probability distributions, for example.-Metrics:...
- Statistical efficiency
- Statistical epidemiologyStatistical epidemiologyStatistical epidemiology is an emerging branch of the disciplines of epidemiology and biostatistics that aims to:* Bring more statistical rigour to bear in the field of epidemiology...
- Statistical estimation — redirects to Estimation theoryEstimation theoryEstimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the...
- Statistical financeStatistical financeStatistical finance, sometimes called econophysics, is an empirical attempt to shift finance from its normative roots to a positivist framework using exemplars from statistical physics with an emphasis on emergent or collective properties of financial markets...
- Statistical genetics — redirects to population geneticsPopulation geneticsPopulation genetics is the study of allele frequency distribution and change under the influence of the four main evolutionary processes: natural selection, genetic drift, mutation and gene flow. It also takes into account the factors of recombination, population subdivision and population...
- Statistical geographyStatistical geographyStatistical geography is the study and practice of collecting, analysing and presenting data that has a geographic or areal dimension, such as census or demographics data. It uses techniques from spatial analysis, but also encompasses geographical activities such as the defining and naming of...
- Statistical graphicsStatistical graphicsStatistical graphics, also known as graphical techniques, are information graphics in the field of statistics used to visualize quantitative data.- Overview :...
- Statistical hypothesis testingStatistical hypothesis testingA statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study . In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold...
- Statistical independenceStatistical independenceIn probability theory, to say that two events are independent intuitively means that the occurrence of one event makes it neither more nor less probable that the other occurs...
- Statistical inferenceStatistical inferenceIn statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation...
- Statistical interferenceStatistical interferenceWhen two probability distributions overlap, statistical interference exists. Knowledge of the distributions can be used to determine the likelihood that one parameter exceeds another, and by how much....
- Statistical LabStatistical LabThe computer program Statistical Lab is an explorative and interactive toolbox for statistical analysis and visualization of data. It supports educational applications of statistics in business sciences, economics, social sciences and humanities. The program is developed and constantly advanced by...
– software - Statistical learning theory
- Statistical literacyStatistical literacyStatistical literacy is a term used to describe an individual's or group's ability to understand statistics. Statistical literacy is necessary for citizens to understand material presented in publications such as newspapers, television, and the Internet. Numeracy is a prerequisite to being...
- Statistical modelStatistical modelA statistical model is a formalization of relationships between variables in the form of mathematical equations. A statistical model describes how one or more random variables are related to one or more other variables. The model is statistical as the variables are not deterministically but...
- Statistical model validation — redirects to Regression model validation
- Statistical noiseStatistical noiseStatistical noise is the colloquialism for recognized amounts of unexplained variation in a sample. See errors and residuals in statistics....
- Statistical package
- Statistical parameterStatistical parameterA statistical parameter is a parameter that indexes a family of probability distributions. It can be regarded as a numerical characteristic of a population or a model....
- Statistical parametric mappingStatistical parametric mappingStatistical parametric mapping or SPM is a statistical technique created by Karl Friston for examining differences in brain activity recorded during functional neuroimaging experiments using neuroimaging technologies such as fMRI or PET...
- Statistical parsingStatistical parsingStatistical parsing is a group of parsing methods within natural language processing. The methods have in common that they associate grammar rules with a probability. Grammar rules are traditionally viewed in computational linguistics as defining the valid sentences in a language...
- Statistical populationStatistical populationA statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. For example, if we were interested in generalizations about crows, then we would describe the set of crows that is of interest...
- Statistical powerStatistical powerThe power of a statistical test is the probability that the test will reject the null hypothesis when the null hypothesis is actually false . The power is in general a function of the possible distributions, often determined by a parameter, under the alternative hypothesis...
- Statistical probability
- Statistical process controlStatistical process controlStatistical process control is the application of statistical methods to the monitoring and control of a process to ensure that it operates at its full potential to produce conforming product. Under SPC, a process behaves predictably to produce as much conforming product as possible with the least...
- Statistical process control softwareStatistical process control softwareThere are a number of software programs designed to aid in statistical process control .Typically the software program undertakes two functions: data collection and data analysis.-Data collection:...
- Statistical proofStatistical proofStatistical proof is the rational demonstration of the degree of certainty for a proposition, hypothesis or theory to convince others, subsequent to a statistical test of the increased understanding of the facts. Statistical methods are used to demonstrate the validity and logic of inference with...
- Statistical randomnessStatistical randomnessA numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll, or the digits of π exhibit statistical randomness....
- Statistical range – see range (statistics)Range (statistics)In the descriptive statistics, the range is the length of the smallest interval which contains all the data. It is calculated by subtracting the smallest observation from the greatest and provides an indication of statistical dispersion.It is measured in the same units as the data...
- Statistical regularityStatistical regularityStatistical regularity is a notion in statistics and probability theory that random events exhibit regularity when repeated enough times or that enough sufficiently similar random events exhibit regularity...
- Statistical sample
- Statistical semanticsStatistical semanticsStatistical semantics is the study of "how the statistical patterns of human word usage can be used to figure out what people mean, at least to a level sufficient for information access"...
- Statistical shape analysisStatistical shape analysisStatistical shape analysis is a geometrical analysis from a set of shapes in which statistics are measured to describe geometrical properties from similar shapes or different groups, for instance, the difference between male and female Gorilla skull shapes, normal and pathological bone shapes, etc...
- Statistical signal processingStatistical signal processingStatistical signal processing is an area of Applied Mathematics and Signal Processing that treats signals as stochastic processes, dealing with their statistical properties...
- Statistical significanceStatistical significanceIn statistics, a result is called statistically significant if it is unlikely to have occurred by chance. The phrase test of significance was coined by Ronald Fisher....
- Statistical surveyStatistical surveySurvey methodology is the field that studies surveys, that is, the sample of individuals from a population with a view towards making statistical inferences about the population using the sample. Polls about public opinion, such as political beliefs, are reported in the news media in democracies....
- Statistical syllogismStatistical syllogismA statistical syllogism is a non-deductive syllogism. It argues from a generalization true for the most part to a particular case .-Introduction:Statistical syllogisms may use qualifying words like "most", "frequently", "almost never", "rarely",...
- Statistical theoryStatistical theoryThe theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics. The theory covers approaches to statistical-decision problems and to statistical inference, and the actions and deductions that...
- Statistical unitStatistical unitA unit in a statistical analysis refers to one member of a set of entities being studied. It is the material source for the mathematical abstraction of a "random variable"...
- Statisticians' and engineers' cross-reference of statistical termsStatisticians' and engineers' cross-reference of statistical termsThe following terms are used by electrical engineers in statistical signal processing studies instead of typical statistician's terms....
- StatisticsStatisticsStatistics is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments....
- Statistics educationStatistics educationStatistics education is concerned with the teaching and learning of statistics.Statistics is both a formal science and a practical theory of scientific inquiry, and both aspects are considered in statistics education. Education in statistics has similar concerns as does education in other...
- Statistics Online Computational Resource – training materials
- StatPlusStatPlusStatPlus is a software product that includes basic and multivariate statistical analysis , including time series analysis, nonparametric statistics, survival analysis and ability to build different charts ....
- StatXactStatXactStatXact is a statistical software package for exact statistics. It calculates exact p-values and confidence intervals for contingency tables and non-parametric procedures. It is marketed by Cytel Inc.-References:...
– software - Stein's exampleStein's exampleStein's example , in decision theory and estimation theory, is the phenomenon that when three or more parameters are estimated simultaneously, there exist combined estimators more accurate on average than any method that handles the parameters separately...
- Proof of Stein's exampleProof of Stein's exampleStein's example is an important result in decision theory which can be stated asThe following is an outline of its proof. The reader is referred to the main article for more information.-Sketched proof:...
- Stein's lemmaStein's lemmaStein's lemma, named in honor of Charles Stein, is a theorem of probability theory that is of interest primarily because of its applications to statistical inference — in particular, to James–Stein estimation and empirical Bayes methods — and its applications to portfolio choice...
- Stein's unbiased risk estimateStein's unbiased risk estimateIn statistics, Stein's unbiased risk estimate is an unbiased estimator of the mean-squared error of "a nearly arbitrary, nonlinear biased estimator." In other words, it provides an indication of the accuracy of a given estimator...
- Steiner systemSteiner systemThe Fano plane is an S(2,3,7) Steiner triple system. The blocks are the 7 lines, each containing 3 points. Every pair of points belongs to a unique line....
- StemplotStemplotA stemplot , in statistics, is a device for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution. They evolved from Arthur Bowley's work in the early 1900s, and are useful tools in exploratory data analysis...
- Stepwise regressionStepwise regressionIn statistics, stepwise regression includes regression models in which the choice of predictive variables is carried out by an automatic procedure...
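A minimal forward-selection sketch, one common flavour of stepwise regression; the residual-sum-of-squares criterion, the fixed number of steps, and the synthetic data are illustrative assumptions rather than any package's actual procedure:

    import numpy as np

    def rss(X, y):
        # residual sum of squares of an ordinary least-squares fit
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    def forward_selection(X, y, max_vars=3):
        n, p = X.shape
        chosen, remaining = [], list(range(p))
        while remaining and len(chosen) < max_vars:
            scores = [(rss(np.column_stack([np.ones(n)] + [X[:, j] for j in chosen + [k]]), y), k)
                      for k in remaining]
            _, best_k = min(scores)      # greedily add the variable that lowers RSS the most
            chosen.append(best_k)
            remaining.remove(best_k)
        return chosen

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = 2 * X[:, 1] - 3 * X[:, 4] + rng.normal(size=100)
    print(forward_selection(X, y))  # columns 1 and 4 should be picked first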
- Stetson–Harrison method
- Stieltjes moment problem
- Stimulus-response modelStimulus-response modelThe stimulus–response model is a characterization of a statistical unit as a black box model, predicting a quantitative response to a quantitative stimulus, for example one administered by a researcher.-Fields of application:...
- StochasticStochasticStochastic refers to systems whose behaviour is intrinsically non-deterministic. A stochastic process is one whose behavior is non-deterministic, in that a system's subsequent state is determined both by the process's predictable actions and by a random element. However, according to M. Kac and E...
- Stochastic approximationStochastic approximationStochastic approximation methods are a family of iterative stochastic optimization algorithms that attempt to find zeroes or extrema of functions which cannot be computed directly, but only estimated via noisy observations....
- Stochastic calculusStochastic calculusStochastic calculus is a branch of mathematics that operates on stochastic processes. It allows a consistent theory of integration to be defined for integrals of stochastic processes with respect to stochastic processes...
- Stochastic convergence
- Stochastic differential equationStochastic differential equationA stochastic differential equation is a differential equation in which one or more of the terms is a stochastic process, thus resulting in a solution which is itself a stochastic process....
- Stochastic dominanceStochastic dominanceStochastic dominance is a form of stochastic ordering. The term is used in decision theory and decision analysis to refer to situations where one gamble can be ranked as superior to another gamble. It is based on preferences regarding outcomes...
- Stochastic driftStochastic driftIn probability theory, stochastic drift is the change of the average value of a stochastic process. A related term is the drift rate which is the rate at which the average changes. This is in contrast to the random fluctuations about this average value...
- Stochastic gradient descentStochastic gradient descentStochastic gradient descent is an optimization method for minimizing an objective function that is written as a sum of differentiable functions.- Background :...
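A toy sketch of stochastic gradient descent for least-squares linear regression; the learning rate, epoch count, and synthetic data are assumptions made for illustration:

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 2))
    y = X @ np.array([3.0, -1.0]) + 0.5 + rng.normal(scale=0.1, size=500)

    w, b, lr = np.zeros(2), 0.0, 0.01
    for epoch in range(20):
        for i in rng.permutation(len(y)):    # one randomly chosen example per update
            err = (X[i] @ w + b) - y[i]      # gradient of 0.5*err**2 w.r.t. the prediction
            w -= lr * err * X[i]
            b -= lr * err
    print(w, b)  # should approach [3, -1] and 0.5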
- Stochastic grammarStochastic grammarA stochastic grammar is a grammar framework with a probabilistic notion of grammaticality:*Stochastic context-free grammar*Statistical parsing*Data-oriented parsing*Hidden Markov model*Estimation theory...
- Stochastic kernel estimation
- Stochastic matrixStochastic matrixIn mathematics, a stochastic matrix is a matrix used to describe the transitions of a Markov chain. It has found use in probability theory, statistics and linear algebra, as well as computer science...
- Stochastic modelling (insurance)
- Stochastic optimizationStochastic optimizationStochastic optimization methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, which involve random objective functions or random constraints, for example. Stochastic...
- Stochastic orderingStochastic orderingIn probability theory and statistics, a stochastic order quantifies the concept of one random variable being "bigger" than another. These are usually partial orders, so that one random variable A may be neither stochastically greater than, less than nor equal to another random variable B...
- Stochastic processStochastic processIn probability theory, a stochastic process , or sometimes random process, is the counterpart to a deterministic process...
- Stochastic rounding
- Stochastic simulationStochastic simulationStochastic simulation algorithms and methods were initially developed to analyse chemical reactions involving large numbers of species with complex reaction kinetics. The first algorithm, the Gillespie algorithm was proposed by Dan Gillespie in 1977...
- Stopped processStopped processIn mathematics, a stopped process is a stochastic process that is forced to assume the same value after a prescribed time.-Definition: Let (\Omega, \mathcal{F}, \mathbb{P}) be a probability space;...
- Stopping time
- Stratified samplingStratified samplingIn statistics, stratified sampling is a method of sampling from a population.In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample each subpopulation independently. Stratification is the process of dividing members of the population into...
- Stratonovich integralStratonovich integralIn stochastic processes, the Stratonovich integral is a stochastic integral, the most common alternative to the Itō integral...
- Stress majorizationStress majorizationStress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of n m-dimensional data items, a configuration X of n points in r-dimensional space...
- Strong Law of Small NumbersStrong Law of Small Numbers"The Strong Law of Small Numbers" is a humorous paper by mathematician Richard K. Guy and also the so-called law that it proclaims: "There aren't enough small numbers to meet the many demands made of them." In other words, any given small number appears in far more contexts than may seem...
- Strong priorStrong priorA strong prior is a preceding assumption, theory, concept or idea upon which a current assumption, theory, concept or idea is founded. In Bayesian statistics, the term is used to contrast the case of a weak or uninformative prior probability...
- Structural breakStructural breakA structural break is a concept in econometrics. A structural break appears when we see an unexpected shift in a time series. This can lead to huge forecasting errors and unreliability of the model in general...
- Structural equation modelingStructural equation modelingStructural equation modeling is a statistical technique for testing and estimating causal relations using a combination of statistical data and qualitative causal assumptions...
- Structural estimationStructural estimationStructural estimation is a technique for estimating deep "structural" parameters of theoretical economic models. In this sense, "structural estimation" is contrasted with "reduced-form estimation," which generally provides evidence about partial equilibrium relationships in a regression...
- Structured data analysis (statistics)Structured data analysis (statistics)Structured data analysis is the statistical data analysis of structured data. This can arise either in the form of an a priori structure such as multiple-choice questionnaires or in situations with the need to search for structure that fits the given data, either exactly or approximately...
- Studentized range
- Studentized residualStudentized residualIn statistics, a studentized residual is the quotient resulting from the division of a residual by an estimate of its standard deviation. Typically the standard deviations of residuals in a sample vary greatly from one data point to another even when the errors all have the same standard...
- Student's t-distribution
- Student's t-statisticStudent's t-statisticIn statistics, the t-statistic is a ratio of the departure of an estimated parameter from its notional value and its standard error. It is used in hypothesis testing, for example in the Student's t-test, in the augmented Dickey–Fuller test, and in bootstrapping.-Definition: Let \hat\beta...
- Student's t-testStudent's t-testA t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is supported. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known...
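For illustration, the pooled-variance two-sample t statistic computed from scratch; equal variances are assumed, and a p-value would additionally need the t distribution's CDF (e.g. scipy.stats.t):

    def two_sample_t(xs, ys):
        nx, ny = len(xs), len(ys)
        mx, my = sum(xs) / nx, sum(ys) / ny
        vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
        vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
        sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)   # pooled variance
        t = (mx - my) / (sp2 * (1 / nx + 1 / ny)) ** 0.5
        return t, nx + ny - 2                                   # statistic and degrees of freedom

    print(two_sample_t([5.1, 4.9, 5.6, 5.2], [4.4, 4.7, 4.1, 4.5]))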
- Student’s t-test for Gaussian scale mixture distributions — redirects to Location testing for Gaussian scale mixture distributionsLocation testing for Gaussian scale mixture distributionsIn statistics, the topic of location testing for Gaussian scale mixture distributions arises in some particular types of situations where the more standard Student's t-test is inapplicable...
- StudentizationStudentizationIn statistics, Studentization, named after William Sealy Gosset, who wrote under the pseudonym Student, is the adjustment consisting of division of a first-degree statistic derived from a sample, by a sample-based estimate of a population standard deviation...
- Study designStudy designClinical study design is the formulation of trials and experiments in medical and epidemiological research, sometimes known as clinical trials. Many of the considerations here are shared under the more general topic of design of experiments but there can be others, in particular related to patient...
- Study heterogeneityStudy heterogeneityIn statistics, study heterogeneity is a problem that can arise when attempting to undertake a meta-analysis. Ideally, the studies whose results are being combined in the meta-analysis should all be undertaken in the same way and to the same experimental protocols: study heterogeneity is a term used...
- Subcontrary mean redirects to Harmonic meanHarmonic meanIn mathematics, the harmonic mean is one of several kinds of average. Typically, it is appropriate for situations when the average of rates is desired....
- Subgroup analysisSubgroup analysisSubgroup analysis, in the context of design and analysis of experiments, refers to looking for pattern in a subset of the subjects....
- Subindependence
- Substitution modelSubstitution modelIn biology, a substitution model describes the process by which a sequence of characters changes into another set of traits. For example, in cladistics, each position in the sequence might correspond to a property of a species which can either be present or absent. The alphabet could then consist...
- SUDAANSUDAANSUDAAN is a statistical software package for the analysis of correlated data, including correlated data encountered in complex sample surveys. SUDAAN originated in 1972.-Current version:...
– software - Sufficiency (statistics) — redirects to Sufficient statistic
- Sufficient dimension reductionSufficient dimension reductionIn statistics, sufficient dimension reduction is a paradigm for analyzing data that combines the ideas of dimension reduction with the concept of sufficiency.Dimension reduction has long been a primary goal of regression analysis...
- Sufficient statistic
- Sum of normally distributed random variablesSum of normally distributed random variablesIn probability theory, calculation of the sum of normally distributed random variables is an instance of the arithmetic of random variables, which can be quite complex based on the probability distributions of the random variables involved and their relationships.-Independent random variables:If X...
- Sum of squares — general disambiguation
- Sum of squares (statistics) — redirects to Partition of sums of squares
- Summary statistic
- SuperstatisticsSuperstatisticsSuperstatistics is a branch of statistical mechanics or statistical physics devoted to the study of non-linear and non-equilibrium systems. It is characterized by using the superposition of multiple differing statistical models to achieve the desired non-linearity...
- Support curveSupport curveSupport curve is a statistical term, coined by A. W. F. Edwards, to describe the graph of the natural logarithm of the likelihood function. The function being plotted is used in the computation of the score and Fisher information, and the graph has a direct interpretation in the context of maximum...
- Support vector machineSupport vector machineA support vector machine is a concept in statistics and computer science for a set of related supervised learning methods that analyze data and recognize patterns, used for classification and regression analysis...
- Surrogate modelSurrogate modelMost engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as functions of design variables. For example, in order to find the optimal airfoil shape for an aircraft wing, an engineer simulates the air flow around the wing for...
- Survey data collectionSurvey data collectionThe methods involved in survey data collection are any of a number of ways in which data can be collected for a statistical survey. These are methods that are used to collect information from a sample of individuals in a systematic way....
- Survey samplingSurvey samplingIn statistics, survey sampling describes the process of selecting a sample of elements from a target population in order to conduct a survey.A survey may refer to many different types or techniques of observation, but in the context of survey sampling it most often involves a questionnaire used to...
- Survey methodologySurvey MethodologySurvey Methodology is a peer-reviewed open access scientific journal that publishes papers related to the development and application of survey techniques...
- Survival analysisSurvival analysisSurvival analysis is a branch of statistics which deals with death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, and duration analysis or duration modeling in economics or sociology...
- Survival rateSurvival rateIn biostatistics, survival rate is a part of survival analysis, indicating the percentage of people in a study or treatment group who are alive for a given period of time after diagnosis...
- Survival functionSurvival functionThe survival function, also known as a survivor function or reliability function, is a property of any random variable that maps a set of events, usually associated with mortality or failure of some system, onto time. It captures the probability that the system will survive beyond a specified time...
- Survivorship biasSurvivorship biasSurvivorship bias is the logical error of concentrating on the people or things that "survived" some process and inadvertently overlooking those that didn't because of their lack of visibility. This can lead to false conclusions in several different ways...
- Symmetric designSymmetric designIn combinatorial mathematics, a symmetric design is a block design with equal numbers of points and blocks. Thus, it has the fewest possible blocks given the number of points . They are also known as projective designs....
- Symmetric mean absolute percentage errorSymmetric mean absolute percentage errorSymmetric mean absolute percentage error is an accuracy measure based on percentage errors. It is usually defined as follows: \text{SMAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\frac{|F_t - A_t|}{(|A_t| + |F_t|)/2}, where At is the actual value and Ft is the forecast value....
- SYSTATSYSTATSYSTAT is a statistics and statistical graphics software package, developed by Leland Wilkinson in the late 1970s, who was at the time an assistant professor of psychology at the University of Illinois at Chicago...
– software - System dynamicsSystem dynamicsSystem dynamics is an approach to understanding the behaviour of complex systems over time. It deals with internal feedback loops and time delays that affect the behaviour of the entire system. What makes using system dynamics different from other approaches to studying complex systems is the use...
- System identificationSystem identificationIn control engineering, the field of system identification uses statistical methods to build mathematical models of dynamical systems from measured data...
- Systematic errorSystematic errorSystematic errors are biases in measurement which lead to the situation where the mean of many separate measurements differs significantly from the actual value of the measured attribute. All measurements are prone to systematic errors, often of several different types...
(also see bias (statistics)Bias (statistics)A statistic is biased if it is calculated in such a way that it is systematically different from the population parameter of interest. The following lists some types of, or aspects of, bias which should not be considered mutually exclusive:...
and errors and residuals in statisticsErrors and residuals in statisticsIn statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...
) - Systematic reviewSystematic reviewA systematic review is a literature review focused on a research question that tries to identify, appraise, select and synthesize all high quality research evidence relevant to that question. Systematic reviews of high-quality randomized controlled trials are crucial to evidence-based medicine...
T
- t-distribution; see Student's t-distribution (includes table)
- T distribution — disambiguation
- t-statistic
- Tag cloudTag cloudA tag cloud is a visual representation for text data, typically used to depict keyword metadata on websites, or to visualize free form text. 'Tags' are usually single words, and the importance of each tag is shown with font size or color...
– graphical display of info - Taguchi loss functionTaguchi loss functionThe Taguchi Loss Function is a graphical depiction of loss developed by the Japanese business statistician Genichi Taguchi to describe a phenomenon affecting the value of products produced by a company. Praised by Dr. W...
- Taguchi methodsTaguchi methodsTaguchi methods are statistical methods developed by Genichi Taguchi to improve the quality of manufactured goods, and more recently also applied to, engineering, biotechnology, marketing and advertising...
- Tajima's DTajima's DTajima's D is a statistical test created by and named after the Japanese researcher Fumio Tajima. The purpose of the test is to distinguish between a DNA sequence evolving randomly and one evolving under a non-random process, including directional selection or balancing selection, demographic...
- Taleb distributionTaleb DistributionIn economics and finance, a Taleb distribution is a term coined by U.K. economists/journalists Martin Wolf and John Kay to describe a returns profile that appears at times deceptively low-risk with steady returns, but experiences periodically catastrophic drawdowns. It does not describe a...
- Tampering (quality control)Tampering (quality control)Tampering in the context of a controlled process is when adjustments to the process are made based on outcomes which are within the expected range of variability. The net result is to re-align the process so that an increased proportion of the output is out of specification. The term was introduced...
- Taylor expansions for the moments of functions of random variablesTaylor expansions for the moments of functions of random variablesIn probability theory, it is possible to approximate the moments of a function f of a random variable X using Taylor expansions, provided that f is sufficiently differentiable and that the moments of X are finite...
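As a reminder of the standard low-order results (stated here for orientation; the article gives the conditions on f and X):

    \operatorname{E}[f(X)] \approx f(\mu_X) + \tfrac{1}{2} f''(\mu_X)\,\sigma_X^2, \qquad \operatorname{Var}[f(X)] \approx \bigl(f'(\mu_X)\bigr)^2\,\sigma_X^2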
- Telegraph processTelegraph processIn probability theory, the telegraph process is a memoryless continuous-time stochastic process that shows two distinct values.If these are called a and b, the process can be described by the following master equations:...
- Test for structural changeTest for structural changeTest for structural change is an econometric test. It is used to verify the equality of coefficients in separate subsamples. See Chow test....
- Test-retest (disambiguation)
- Test scoreTest scoreA test score is a piece of information, usually a number, that conveys the performance of an examinee on a test. One formal definition is that it is "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being...
- Test setTest setA test set is a set of data used in various areas of information science to assess the strength and utility of a predictive relationship. Test sets are used in artificial intelligence, machine learning, genetic programming, intelligent systems, and statistics...
- Test statisticTest statisticIn statistical hypothesis testing, a hypothesis test is typically specified in terms of a test statistic, which is a function of the sample; it is considered as a numerical summary of a set of data that...
- TestimatorTestimatorA testimator is an estimator whose value depends on the result of a test for statistical significance. In the simplest case the value of the final estimator is that of the basic estimator if the test result is significant, and otherwise the value is zero...
- Testing hypotheses suggested by the dataTesting hypotheses suggested by the dataIn statistics, hypotheses suggested by the data, if tested using the data set that suggested them, are likely to be accepted even when they are not true...
- Text analyticsText analyticsThe term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation. The term is roughly synonymous with text mining;...
- The Long TailThe Long TailThe Long Tail or long tail refers to the statistical property that a larger share of population rests within the tail of a probability distribution than observed under a 'normal' or Gaussian distribution...
— possibly seminal magazine article - The UnscramblerThe UnscramblerThe Unscrambler is a commercial software product for multivariate data analysis, used primarily for calibration in the application of near infrared spectroscopy and development of predictive models for use in real-time spectroscopic analysis of materials. The software was originally developed in...
— software - Theil indexTheil indexThe Theil index is a statistic used to measure economic inequality. It has also been used to measure the lack of racial diversity. The basic Theil index TT is the same as redundancy in information theory which is the maximum possible entropy of the data minus the observed entropy. It is a special...
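A minimal sketch of the basic Theil T index, T = (1/N) Σ (xᵢ/μ) ln(xᵢ/μ), assuming strictly positive incomes; the income vector is made up for illustration:
    # Illustrative only: Theil T index for a small, made-up income vector.
    import math

    incomes = [20_000, 30_000, 30_000, 50_000, 120_000]
    n = len(incomes)
    mu = sum(incomes) / n

    theil_t = sum((x / mu) * math.log(x / mu) for x in incomes) / n
    print(theil_t)   # 0 would indicate perfect equality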
- Theil–Sen estimatorTheil–Sen estimatorIn non-parametric statistics, the Theil–Sen estimator, also known as Sen's slope estimator, slope selection, the single median method, or the Kendall robust line-fit method, is a method for robust linear regression that chooses the median slope among all lines through pairs of two-dimensional...
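A minimal sketch of the median-of-pairwise-slopes idea described above; the intercept is taken here as the median of yᵢ − m·xᵢ (one common convention), and the data are made up, with a deliberate outlier:
    # Illustrative only: Theil-Sen slope as the median of slopes over all
    # pairs of points with distinct x values; intercept via median residual.
    from itertools import combinations
    from statistics import median

    x = [1, 2, 3, 4, 5, 6]
    y = [1.1, 1.9, 3.2, 3.9, 5.2, 12.0]      # last point is an outlier

    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
              if x2 != x1]
    m = median(slopes)
    b = median(yi - m * xi for xi, yi in zip(x, y))
    print(m, b)   # slope stays near 1 despite the outlier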
- Theory of conjoint measurementTheory of conjoint measurementThe theory of conjoint measurement is a general, formal theory of continuous quantity. It was independently discovered by the French economist Gerard Debreu and by the American mathematical psychologist R...
- Therapeutic effectTherapeutic effectA therapeutic effect is a consequence of a medical treatment of any kind, the results of which are judged to be desirable and beneficial. This is true whether the result was expected, unexpected, or even an unintended consequence of the treatment...
- Three-point estimationThree-point estimationThe three-point estimation technique is used in management and information systems applications for the construction of an approximate probability distribution representing the outcome of future events, based on very limited information...
- Three-stage least squares
- Threshold modelThreshold modelIn mathematical or statistical modelling a threshold model is any model where a threshold value, or set of threshold values, is used to distinguish ranges of values where the behaviour predicted by the model differs in some important way...
- Thurstone scaleThurstone scaleIn psychology, the Thurstone scale was the first formal technique for measuring an attitude. It was developed by Louis Leon Thurstone in 1928, as a means of measuring attitudes towards religion. It is made up of statements about a particular issue, and each statement has a numerical value...
- Time-frequency analysisTime-frequency analysisIn signal processing, time–frequency analysis comprises those techniques that study a signal in both the time and frequency domains simultaneously, using various time–frequency representations...
- Time–frequency representation
- Time reversibilityTime reversibilityTime reversibility is an attribute of some stochastic processes and some deterministic processes.If a stochastic process is time reversible, then it is not possible to determine, given the states at a number of points in time after running the stochastic process, which state came first and which...
- Time seriesTime seriesIn statistics, signal processing, econometrics and mathematical finance, a time series is a sequence of data points, measured typically at successive times spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones index or the annual flow volume of the...
- Time-series regression
- Time use surveyTime use surveyA Time Use Survey is a statistical survey which aims to report data on how, on average, people spend their time.- Objectives :The objective is to identify, classify and quantify the main types of activity that people engage in during a definitive time period, e.g...
- Time-varying covariateTime-varying covariateA time-varying covariate is a term used in statistics, particularly in survival analyses. It reflects the phenomenon that a covariate is not necessarily constant through the whole study...
- Timeline of probability and statisticsTimeline of probability and statisticsA timeline of probability and statistics-Before 1600:* 9th Century - Al-Kindi was the first to use statistics to decipher encrypted messages and developed the first code breaking algorithm in the House of Wisdom in Baghdad, based on frequency analysis...
- TinkerPlotsTinkerPlotsTinkerPlots is exploratory data analysis software designed for use by students in grades 4-8. It was designed by Clifford Konold and Craig Miller at the University of Massachusetts Amherst and is published by Key Curriculum Press. It has some similarities with Fathom, and runs on Windows XP or...
— proprietary software for schools - Tobit modelTobit modelThe Tobit model is a statistical model proposed by James Tobin to describe the relationship between a non-negative dependent variable y_i and an independent variable x_i....
- Tolerance intervalTolerance intervalA tolerance interval is a statistical interval within which, with some confidence level, a specified proportion of a population falls.A tolerance interval can be seen as a statistical version of a probability interval. If we knew a population's exact parameters, we would be able to compute a range...
- Top-codedTop-codedIn econometrics and statistics, a top-coded data set is one in which values above a chosen upper bound are recorded only as exceeding that bound, so the upper extreme of the data is not known exactly. This is often done to preserve the anonymity of people participating in the survey.-Example: Top-coding of wealth:Jacob S...
- Topic modelTopic modelIn machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. An early topic model was probabilistic latent semantic indexing , created by Thomas Hofmann in 1999...
(statistical natural language processing) - Topological data analysisTopological data analysisTopological data analysis is a new area of study aimed at applications in areas such as data mining and computer vision. The main problems include how one infers high-dimensional structure from low-dimensional representations, and...
- Tornqvist indexTornqvist indexIn economics the Törnqvist index is a price or quantity index. Using price and quantity data, a Tornqvist index is a discrete approximation to a continuous Divisia index. A Divisia index is a weighted sum of the growth rates of the various components, where the weights are the component's shares in...
- Total correlationTotal correlationIn probability theory and in particular in information theory, total correlation is one of several generalizations of the mutual information. It is also known as the multivariate constraint or multiinformation...
- Total least squares
- Total sum of squaresTotal sum of squaresIn statistical data analysis the total sum of squares is a quantity that appears as part of a standard way of presenting results of such analyses...
- Total variation distance — a statistical distance measure
- TPL TablesTPL TablesTPL Tables is a cross tabulation system used to generate statistical tables for analysis or publication.- Background / History :TPL Tables has its roots in the Table Producing Language system, developed at the Bureau of Labor Statistics in the 1970s and early 1980s to run on IBM mainframes. It...
– software - Tracy–Widom distributionTracy–Widom distributionThe Tracy–Widom distribution, introduced by Tracy and Widom, is the probability distribution of the largest eigenvalue of a random Hermitian matrix in the edge scaling limit. It also appears in the distribution of the length of the longest increasing subsequence of random permutations and in current fluctuations...
- Traffic equationsTraffic equationsIn queueing theory, a discipline within the mathematical theory of probability, traffic equations are equations that describe the mean arrival rate of traffic, allowing the arrival rates at individual nodes to be determined...
- Training setTraining setA training set is a set of data used in various areas of information science to discover potentially predictive relationships. Training sets are used in artificial intelligence, machine learning, genetic programming, intelligent systems, and statistics...
- TransectTransectA transect is a path along which one records and counts occurrences of the phenomena of study .It requires an observer to move along a fixed path and to count occurrences along the path and, at the same time, obtain the distance of the object from the path...
- Transferable belief modelTransferable belief modelThe transferable belief model is an elaboration on the Dempster-Shafer theory of evidence.-Context:Consider the following classical problem of information fusion. A patient has an illness that can be caused by three different factors A, B and C...
- TransiogramTransiogramA transiogram is the accompanying spatial correlation measure of Markov chain random fields and an important part of Markov chain geostatistics. It is defined as a transition probability function over the distance lag; put simply, a transiogram is a transition probability diagram. Transiograms...
- Transmission risks and ratesTransmission risks and ratesTransmission of an infection requires three conditions:*an infectious individual*a susceptible individual*an effective contact between themAn effective contact is defined as any kind of contact between two individuals such that, if one individual is infectious and the other susceptible, then the...
- Treatment group
- Trend analysisTrend analysisTrend Analysis is the practice of collecting information and attempting to spot a pattern, or trend, in the information. In some fields of study, the term "trend analysis" has more formally-defined meanings....
- Trend estimationTrend estimationTrend estimation is a statistical technique to aid interpretation of data. When a series of measurements of a process are treated as a time series, trend estimation can be used to make and justify statements about tendencies in the data...
- Trend stationary
- Treynor ratioTreynor ratioThe Treynor ratio , named after Jack L. Treynor, is a measurement of the returns earned in excess of that which could have been earned on an investment that has no diversifiable risk , per each unit of market risk assumed.The Treynor ratio relates...
- Triangular distribution
- TrimeanTrimeanIn statistics the trimean , or Tukey's trimean, is a measure of a probability distribution's location defined as a weighted average of the distribution's median and its two quartiles:This is equivalent to the average of the median and the midhinge:...
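A minimal sketch of the weighted average described above, TM = (Q1 + 2·median + Q3)/4; the quartiles are computed with the standard library's "inclusive" interpolation rule, and other quartile conventions can give slightly different values:
    # Illustrative only: Tukey's trimean, and its equivalence to the
    # average of the median and the midhinge.
    from statistics import quantiles

    data = [3, 5, 7, 8, 9, 11, 13, 15, 40]
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")

    trimean = (q1 + 2 * q2 + q3) / 4
    midhinge = (q1 + q3) / 2
    print(trimean, (q2 + midhinge) / 2)      # the two expressions agree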
- Trimmed estimatorTrimmed estimatorGiven an estimator, a trimmed estimator is obtained by excluding some of the extreme values. This is generally done to obtain a more robust statistic: the extreme values are considered outliers....
- TrispectrumTrispectrumIn mathematics, in the area of statistical analysis, the trispectrum is a statistic used to search for nonlinear interactions. The Fourier transform of the second-order cumulant, i.e., the autocorrelation function, is the traditional power spectrum...
- True experimentTrue experimentA true experiment is a method of social research in which there are two kinds of variables. The independent variable is manipulated by the experimenter, and the dependent variable is measured...
- True variance
- Truncated distributionTruncated distributionIn statistics, a truncated distribution is a conditional distribution that results from restricting the domain of some other probability distribution. Truncated distributions arise in practical statistics in cases where the ability to record, or even to know about, occurrences is limited to values...
- Truncated meanTruncated meanA truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and median. It involves the calculation of the mean after discarding given parts of a probability distribution or sample at the high and low end, and typically discarding an equal amount of both.For...
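A minimal sketch, assuming a symmetric trim in which the same fixed fraction of observations is dropped from each end of the sorted sample; the 20% figure and the data are arbitrary:
    # Illustrative only: mean after discarding a fraction of observations
    # from each end of the sorted sample (here 20% from each tail).
    def truncated_mean(values, proportion_to_cut=0.2):
        data = sorted(values)
        k = int(len(data) * proportion_to_cut)      # observations cut per tail
        kept = data[k:len(data) - k] if k > 0 else data
        return sum(kept) / len(kept)

    sample = [2, 3, 3, 4, 5, 5, 6, 7, 8, 99]        # 99 is an outlier
    print(truncated_mean(sample))                   # 5.0; the outlier is trimmed away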
- Truncated normal distributionTruncated normal distributionIn probability and statistics, the truncated normal distribution is the probability distribution of a normally distributed random variable whose value is either bounded below or above . The truncated normal distribution has wide applications in statistics and econometrics...
- Truncated regression modelTruncated regression modelTruncated regression models arise in many applications of statistics, for example in econometrics, in cases where observations with values in the outcome variable below or above certain thresholds are systematically excluded from the sample...
- Truncation (statistics)Truncation (statistics)In statistics, truncation results in values that are limited above or below, resulting in a truncated sample. Truncation is similar to but distinct from the concept of statistical censoring. A truncated sample can be thought of as being equivalent to an underlying sample with all values outside the...
- Tsallis distributionTsallis distributionIn q-analog theory and statistical mechanics, a Tsallis distribution is a probability distribution derived from the maximization of the Tsallis entropy under appropriate constraints. There are several different families of Tsallis distributions, yet different sources may reference an individual...
- Tsallis statisticsTsallis statisticsThe term Tsallis statistics usually refers to the collection of q-analogs of mathematical functions and associated probability distributions that were originated by Constantino Tsallis. Using these tools, it is possible to derive Tsallis distributions from the optimization of the Tsallis entropic...
- Tschuprow's TTschuprow's TIn statistics, Tschuprow's T is a measure of association between two nominal variables, giving a value between 0 and 1 . It is closely related to Cramér's V, coinciding with it for square contingency tables....
- Tucker decompositionTucker decompositionIn mathematics, Tucker decomposition decomposes a tensor into a set of matrices and one small core tensor. It is named after Ledyard R. Tucker, although it goes back to Hitchcock in 1927....
- Tukey's range test — multiple comparisons
- Tukey's test of additivityTukey's test of additivityIn statistics, Tukey's test of additivity, named for John Tukey, is an approach used in two-way anova to assess whether the factor variables are additively related to the expected value of the response variable...
— interaction in two-way anova - Tukey–Kramer method
- Tukey lambda distribution
- Tweedie distributionsTweedie distributionsIn probability and statistics, the Tweedie distributions are a family of probability distributions which include continuous distributions such as the normal and gamma, the purely discrete scaled Poisson distribution, and the class of mixed compound Poisson-Gamma distributions which have positive...
- Twisting properties
- Two stage least squares — redirects to Instrumental variableInstrumental variableIn statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables is used to estimate causal relationships when controlled experiments are not feasible....
- Two-tailed testTwo-tailed testThe two-tailed test is a statistical test used in inference, in which a given statistical hypothesis, H0 , will be rejected when the value of the test statistic is either sufficiently small or sufficiently large...
- Type I and type II errorsType I and type II errorsIn statistical test theory the notion of statistical error is an integral part of hypothesis testing. The test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or...
- Type-1 Gumbel distribution
- Type-2 Gumbel distribution
- Tyranny of averagesTyranny of averagesThe tyranny of averages is a phrase used in applied statistics to describe the often overlooked fact that the mean does not provide any information about the distribution of a data set or skewness, and that decisions or analysis based on this value—as opposed to median and standard deviation—may be...
U
- u-chart
- U-quadratic distributionU-quadratic distributionIn probability theory and statistics, the U-quadratic distribution is a continuous probability distribution defined by a unique quadratic function with lower limit a and upper limit b.-Parameter relations:...
- U-statisticU-statisticIn statistical theory, a U-statistic is a class of statistics that is especially important in estimation theory. In elementary statistics, U-statistics arise naturally in producing minimum-variance unbiased estimators...
- U test
- Umbrella samplingUmbrella samplingUmbrella sampling is a technique in computational physics and chemistry, used to improve sampling of a system where ergodicity is hindered by the form of the system's energy landscape. It was first suggested by Torrie and Valleau in 1977...
- Unbiased estimator—see bias (statistics)Bias (statistics)A statistic is biased if it is calculated in such a way that it is systematically different from the population parameter of interest. The following lists some types of, or aspects of, bias which should not be considered mutually exclusive:...
- Unbiased estimation of standard deviationUnbiased estimation of standard deviationThe question of unbiased estimation of a standard deviation arises in statistics mainly as question in statistical theory. Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the...
- UncertaintyUncertaintyUncertainty is a term used in subtly different ways in a number of fields, including physics, philosophy, statistics, economics, finance, insurance, psychology, sociology, engineering, and information science...
- Uncertainty coefficientUncertainty coefficientIn statistics, the uncertainty coefficient, also called entropy coefficient or Theil's U, is a measure of nominal association. It was first introduced by Henri Theil and is based on the concept of information entropy. Suppose we have samples of two random variables, i and j...
- Uncertainty quantificationUncertainty quantificationUncertainty quantification is the science of quantitative characterization and reduction of uncertainties in applications. It tries to determine how likely certain outcomes are if some aspects of the system are not exactly known...
- Uncomfortable scienceUncomfortable scienceUncomfortable science is the term coined by statistician John Tukey for cases in which there is a need to draw an inference from a limited sample of data, where further samples influenced by the same cause system will not be available...
- UncorrelatedUncorrelatedIn probability theory and statistics, two real-valued random variables are said to be uncorrelated if their covariance is zero. Uncorrelatedness is by definition pairwise; i.e...
- Underdispersion — redirects to OverdispersionOverdispersionIn statistics, overdispersion is the presence of greater variability in a data set than would be expected based on a given simple statistical model....
- Unexplained variation — redirects to Explained variationExplained variationIn statistics, explained variation or explained randomness measures the proportion to which a mathematical model accounts for the variation of a given data set...
- Underprivileged area scoreUnderprivileged area scoreThe Underprivileged Area Score is an index for measuring socio-economic variation across small geographical areas. The score is an outcome of the need, identified in the Acheson Committee Report, to create an index to identify 'underprivileged areas' where there were high numbers of patients and...
- Uniform distribution (continuous)Uniform distribution (continuous)In probability theory and statistics, the continuous uniform distribution or rectangular distribution is a family of probability distributions such that for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by...
- Uniform distribution (discrete)
- Uniformly most powerful testUniformly most powerful testIn statistical hypothesis testing, a uniformly most powerful test is a hypothesis test which has the greatest power 1 − β among all possible tests of a given size α...
- Unimodal distribution redirects to Unimodal function (has some stats context)
- UnimodalityUnimodalityUnimodality is a term used in several contexts in mathematics. Originally, it relates to possessing a unique mode.- Unimodal probability distribution :...
- UnistatUnistatThe Unistat computer program is a statistical data analysis tool featuring two modes of operation: The stand-alone user interface is a complete workbench for data input, analysis and visualization while the Microsoft Excel add-in mode extends the features of the mainstream spreadsheet application...
– software - Unit (statistics)
- Unit of observationUnit of observationThe unit of observation is the unit on which one collects data. For example, a study may have a unit of observation at the individual level but may have the unit of analysis at the neighborhood level, drawing conclusions on neighborhood characteristics from data collected from individuals....
- Unit rootUnit rootIn time series models in econometrics , a unit root is a feature of processes that evolve through time that can cause problems in statistical inference if it is not adequately dealt with....
- Unit root testUnit root testIn statistics, a unit root test tests whether a time series variable is non-stationary using an autoregressive model. A well-known test that is valid in large samples is the augmented Dickey–Fuller test. The optimal finite sample tests for a unit root in autoregressive models were developed by John...
- Unit-weighted regressionUnit-weighted regressionIn statistics, unit-weighted regression is perhaps the easiest form of multiple regression analysis, a method in which two or more variables are used to predict the value of an outcome....
- Unitized risk
- UnivariateUnivariateIn mathematics, univariate refers to an expression, equation, function or polynomial of only one variable. Objects of any of these types but involving more than one variable may be called multivariate...
- Univariate analysisUnivariate analysisUnivariate analysis is the simplest form of quantitative analysis. The analysis is carried out with the description of a single variable and its attributes of the applicable unit of analysis...
- Univariate distributionUnivariate distributionIn statistics, a univariate distribution is a probability distribution of only one random variable. This is in contrast to a multivariate distribution, the probability distribution of a random vector.-Further reading:...
- Unmatched countUnmatched countIn psychology and social research, unmatched count, or item count, is a technique to improve through anonymity the number of true answers to possibly embarrassing or self-incriminating questions. It is very simple to use but yields only the number of people bearing the property of interest.- Method...
- Unsolved problems in statisticsUnsolved problems in statisticsThere are many longstanding unsolved problems in mathematics for which a solution has not yet been found. The unsolved problems in statistics are generally of a different flavor; according to John Tukey, "difficulties in identifying problems have delayed statistics far more than difficulties...
- Upper and lower probabilitiesUpper and lower probabilitiesUpper and lower probabilities are representations of imprecise probability. Whereas probability theory uses a single number, the probability, to describe how likely an event is to occur, this method uses two numbers: the upper probability of the event and the lower probability of the event.Because...
- Upside potential ratioUpside potential ratioThe Upside-Potential Ratio is a measure of a return of an investment asset relative to the minimal acceptable return. The measurement allows a firm or individual to choose investments which have had relatively good upside performance, per unit of downside risk....
– finance - Urn problemUrn problemIn probability and statistics, an urn problem is an idealized mental exercise in which some objects of real interest are represented as colored balls in an urn or other container....
- Ursell functionUrsell functionIn statistical mechanics, an Ursell function or connected correlation function, is a cumulant ofa random variable. It is also called a connected correlation function as it can often be obtained by summing over...
- Utility maximization problemUtility maximization problemIn microeconomics, the utility maximization problem is the problem consumers face: "how should I spend my money in order to maximize my utility?" It is a type of optimal decision problem.-Basic setup:...
- UtilizationUtilizationUtilization is a statistical concept as well as a primary business measure for the rental industry.-Queueing theory:In queueing theory, utilization is the proportion of the system's resources which is used by the traffic which arrives at it. It should be strictly less than one for the system to...
- Utilization distributionUtilization distributionA utilization distribution is a probability distribution constructed from data providing the location of an individual in space at different points in time....
V
- Validity (statistics)Validity (statistics)In science and statistics, validity has no single agreed definition but generally refers to the extent to which a concept, conclusion or measurement is well-founded and corresponds accurately to the real world. The word "valid" is derived from the Latin validus, meaning strong...
- Van der Waerden testVan der Waerden testNamed for the Dutch mathematician Bartel Leendert van der Waerden, the Van der Waerden test is a statistical test of the hypothesis that k population distribution functions are equal. The Van der Waerden test converts the ranks from a standard Kruskal-Wallis one-way analysis of variance to quantiles of the standard...
- Van Houtum distributionVan Houtum distributionIn probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after prof. Geert-Jan van Houtum. It can be characterized by saying that all values of a finite set of possible values are equally probable, except for the smallest and largest element of...
- Vapnik–Chervonenkis theory
- Varadhan's lemmaVaradhan's lemmaIn mathematics, Varadhan's lemma is a result in large deviations theory named after S. R. Srinivasa Varadhan. The result gives information on the asymptotic distribution of a statistic φ of a family of random variables Zε as ε becomes small in terms of a rate function for the variables.-Statement...
- VariableVariable (mathematics)In mathematics, a variable is a value that may change within the scope of a given problem or set of operations. In contrast, a constant is a value that remains unchanged, though often unknown or undetermined. The concepts of constants and variables are fundamental to many areas of mathematics and...
- Variable kernel density estimationVariable kernel density estimationIn statistics, adaptive or "variable-bandwidth" kernel density estimation is a form of kernel density estimation in which the size of the kernels used in the estimate are varied...
- Variable-order Bayesian networkVariable-order Bayesian networkVariable-order Bayesian network models provide an important extension of both the Bayesian network models and the variable-order Markov models...
- Variable-order Markov modelVariable-order Markov modelVariable-order Markov models are an important class of models that extend the well known Markov chain models. In contrast to the Markov chain models, where each random variable in a sequence with a Markov property depends on a fixed number of random variables, in VOM models this number of...
- Variable rules analysis
- VarianceVarianceIn probability theory and statistics, the variance is a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean . In particular, the variance is one of the moments of a distribution...
- Variance decompositionVariance decompositionVariance decomposition or forecast error variance decomposition indicates the amount of information each variable contributes to the other variables in a vector autoregression models...
- Variance gamma processVariance gamma processIn the theory of stochastic processes, a part of the mathematical theory of probability, the variance gamma process , also known as Laplace motion, is a Lévy process determined by a random time change. The process has finite moments distinguishing it from many Lévy processes. There is no diffusion...
- Variance inflation factorVariance inflation factorIn statistics, the variance inflation factor quantifies the severity of multicollinearity in an ordinary least squares regression analysis...
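A minimal sketch of the usual computation, VIFⱼ = 1/(1 − Rⱼ²), where Rⱼ² comes from an ordinary least squares regression of the j-th predictor on the remaining predictors plus an intercept; the design matrix is simulated, and NumPy is assumed to be available:
    # Illustrative only: variance inflation factors via least squares of
    # each predictor on the others (with an intercept term).
    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=200)
    x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)      # nearly collinear with x1
    x3 = rng.normal(size=200)
    X = np.column_stack([x1, x2, x3])

    def vif(X, j):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])   # add intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        return 1.0 / (1.0 - r2)

    print([round(vif(X, j), 2) for j in range(X.shape[1])])   # x1 and x2 are inflated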
- Variance-gamma distribution
- Variance reductionVariance reductionIn mathematics, more specifically in the theory of Monte Carlo methods, variance reduction is a procedure used to increase the precision of the estimates that can be obtained for a given number of iterations. Every output random variable from the simulation is associated with a variance which...
- Variance-stabilizing transformationVariance-stabilizing transformationIn applied statistics, a variance-stabilizing transformation is a data transformation that is specifically chosen either to simplify considerations in graphical exploratory data analysis or to allow the application of simple regression-based or analysis of variance techniques.The aim behind the...
- Variance-to-mean ratio
- Variation ratio
- Variational Bayesian methods
- Variational message passingVariational message passingVariational message passing is an approximate inference technique for continuous- or discrete-valued Bayesian networks, with conjugate-exponential parents, developed by John Winn...
- VariogramVariogramIn spatial statistics the theoretical variogram 2\gamma is a function describing the degree of spatial dependence of a spatial random field or stochastic process Z...
- Varimax rotationVarimax rotationIn statistics, a varimax rotation is a change of coordinates used in principal component analysis and factor analysis that maximizes the sum of the variances of the squared loadings...
- Vasicek modelVasicek modelIn finance, the Vasicek model is a mathematical model describing the evolution of interest rates. It is a type of "one-factor model" as it describes interest rate movements as driven by only one source of market risk...
- VC dimensionVC dimensionIn statistical learning theory, or sometimes computational learning theory, the VC dimension is a measure of the capacity of a statistical classification algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter...
- VC theory
- Vector autoregressionVector autoregressionVector autoregression is a statistical model used to capture the linear interdependencies among multiple time series. VAR models generalize the univariate autoregression models. All the variables in a VAR are treated symmetrically; each variable has an equation explaining its evolution based on...
- VEGAS algorithmVEGAS algorithmThe VEGAS algorithm, due to G. P. Lepage, is a method for reducing error in Monte Carlo simulations by using a known or approximate probability distribution function to concentrate the search in those areas of the graph that make the greatest contribution to the final integral.The VEGAS algorithm...
- Violin plotViolin plotViolin plots are a method of plotting numeric data. A violin plot is a combination of a box plot and a kernel density plot. Specifically, it starts with a box plot...
- ViSta - Software — redirects to ViSta, The Visual Statistics systemViSta, The Visual Statistics systemViSta, the Visual Statistics system, is a freeware statistical system developed by Forrest W. Young of the University of North Carolina. ViSta's current version is maintained by Pedro M. Valero-Mora of the University of Valencia and can be found at...
- Voigt profile
- Volatility (finance)Volatility (finance)In finance, volatility is a measure for variation of price of a financial instrument over time. Historic volatility is derived from time series of past market prices...
- Volcano plot (statistics)Volcano plot (statistics)In statistics, a volcano plot is a type of scatter-plot that is used to quickly identify changes in large datasets composed of replicate data . It plots significance versus fold-change on the y- and x-axes, respectively...
- Von Mises distribution
- Von Mises–Fisher distribution
- V-optimal histograms
- V-statisticV-statisticV-statistics are a class of statistics named for Richard von Mises who developed their asymptotic distribution theory in a fundamental paper in 1947. V-statistics are closely related to U-statistics introduced by Wassily Hoeffding in 1948...
- Vuong's closeness test
- Vysochanskiï–Petunin inequality
W
- Wald distribution redirects to Inverse Gaussian distributionInverse Gaussian distributionIn probability theory, the inverse Gaussian distribution (also known as the Wald distribution) is a two-parameter family of continuous probability distributions supported on the positive reals...
- Wald testWald testThe Wald test is a parametric statistical test named after Abraham Wald with a great variety of uses. Whenever a relationship within or between data items can be expressed as a statistical model with parameters to be estimated from a sample, the Wald test can be used to test the true value of the...
- Wald's decision theoryWald's decision theoryWald's decision theory was explicated in his last book, "Statistical decision functions"...
- Wald–Wolfowitz runs test
- Wallenius' noncentral hypergeometric distributionWallenius' noncentral hypergeometric distributionIn probability theory and statistics, Wallenius' noncentral hypergeometric distribution is a generalization of the hypergeometric distribution where items are sampled with bias....
- Wang and Landau algorithmWang and Landau algorithmThe Wang and Landau algorithm proposed by Fugao Wang and David P. Landau is an extension of Metropolis Monte Carlo sampling. It is designed to calculate the density of states of a computer-simulated system, such as an Ising model of spin glasses, or model atoms in a molecular force field...
- Watterson estimatorWatterson estimatorIn population genetics, the Watterson estimator is a method for estimating the population mutation rate, \theta = 4N_e\mu, where N_e is the effective population size and \mu is the per-generation mutation rate of the population of interest...
- Watts and Strogatz modelWatts and Strogatz modelThe Watts and Strogatz model is a random graph generation model that produces graphs with small-world properties, including short average path lengths and high clustering. It was proposed by Duncan J. Watts and Steven Strogatz in their joint 1998 Nature paper...
- Weibull chart — presently redirects to Weibull distribution
- Weibull distribution
- Weibull modulusWeibull modulusThe Weibull modulus is a dimensionless parameter of the Weibull distribution which is used to describe variability in measured material strength of brittle materials. For ceramics and other brittle materials, the maximum stress that a sample can be measured to withstand before failure may vary from...
- Weight functionWeight functionA weight function is a mathematical device used when performing a sum, integral, or average in order to give some elements more "weight" or influence on the result than other elements in the same set. They occur frequently in statistics and analysis, and are closely related to the concept of a...
- Weighted sample redirects to Sample mean and sample covarianceSample mean and sample covarianceThe sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables that is, each of whose elements is the average of the...
- Weighted covariance matrix redirects to Sample mean and sample covarianceSample mean and sample covarianceThe sample mean or empirical mean and the sample covariance are statistics computed from a collection of data on one or more random variables. The sample mean is a vector each of whose elements is the sample mean of one of the random variables that is, each of whose elements is the average of the...
- Weighted meanWeighted meanThe weighted mean is similar to an arithmetic mean , where instead of each of the data points contributing equally to the final average, some data points contribute more than others...
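A minimal sketch of the definition, Σwᵢxᵢ / Σwᵢ, with made-up class averages weighted by made-up class sizes; weights are assumed non-negative and not all zero:
    # Illustrative only: weighted mean of class averages, weighted by the
    # number of students in each class.
    scores = [70.0, 80.0, 90.0]
    weights = [20, 30, 10]                           # class sizes

    weighted_mean = sum(w * x for w, x in zip(weights, scores)) / sum(weights)
    print(weighted_mean)                             # 78.33..., not the plain average 80.0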
- Welch's t testWelch's t testIn statistics, Welch's t test is an adaptation of Student's t-test intended for use with two samples having possibly unequal variances. As such, it is an approximate solution to the Behrens–Fisher problem.-Formulas:...
- Welch–Satterthwaite equation
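A minimal sketch combining the two entries above: Welch's statistic t = (x̄₁ − x̄₂)/√(s₁²/n₁ + s₂²/n₂), with its approximate degrees of freedom from the Welch–Satterthwaite equation; the two samples are made up:
    # Illustrative only: Welch's t statistic and its Welch-Satterthwaite
    # degrees of freedom for two samples with unequal variances.
    from statistics import mean, variance

    a = [12.1, 11.8, 12.4, 12.9, 11.5, 12.2]
    b = [10.2, 13.5, 9.8, 14.1, 8.9, 12.7, 11.0]

    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)        # sample variances (n - 1 denominator)
    n1, n2 = len(a), len(b)

    se2 = v1 / n1 + v2 / n2
    t = (m1 - m2) / se2 ** 0.5
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    print(t, df)   # compare t against a t distribution with df degrees of freedom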
- Well-behaved statisticWell-behaved statisticA well-behaved statistic is a term sometimes used in the theory of statistics to describe part of a procedure. This usage is broadly similar to the use of well-behaved in more general mathematics...
- Wick productWick productIn probability theory, the Wick product\langle X_1,\dots,X_k \rangle\,named after physicist Gian-Carlo Wick, is a sort of product of the random variables, X1, ..., Xk, defined recursively as follows:\langle \rangle = 1\,...
- Wilks' lambda distributionWilks' lambda distributionIn statistics, Wilks' lambda distribution , is a probability distribution used in multivariate hypothesis testing, especially with regard to the likelihood-ratio test and Multivariate analysis of variance...
- Winsorized meanWinsorized meanA Winsorized mean is a Winsorized statistical measure of central tendency, much like the mean and median, and even more similar to the truncated mean...
- Whipple's indexWhipple's IndexSurvey or census respondents sometimes inaccurately report ages or dates of birth. Whipple's index , invented by the American demographer George Chandler Whipple , indicates the extent to which age data show systematic heaping on certain ages as a result of digit preference or rounding...
- White testWhite testIn statistics, the White test is a statistical test that establishes whether the residual variance of a variable in a regression model is constant: that is, it tests for homoscedasticity....
- White noiseWhite noiseWhite noise is a random signal with a flat power spectral density. In other words, the signal contains equal power within a fixed bandwidth at any center frequency...
- Wide and narrow dataWide and narrow dataWide and narrow are terms used to describe two different presentations for tabular data.- Wide :Wide, or unstacked data is presented with each different data variable in a separate column.- Narrow :...
- Wiener deconvolutionWiener deconvolutionIn mathematics, Wiener deconvolution is an application of the Wiener filter to the noise problems inherent in deconvolution. It works in the frequency domain, attempting to minimize the impact of deconvoluted noise at frequencies which have a poor signal-to-noise ratio.The Wiener deconvolution...
- Wiener filterWiener filterIn signal processing, the Wiener filter is a filter proposed by Norbert Wiener during the 1940s and published in 1949. Its purpose is to reduce the amount of noise present in a signal by comparison with an estimation of the desired noiseless signal. The discrete-time equivalent of Wiener's work was...
- Wiener processWiener processIn mathematics, the Wiener process is a continuous-time stochastic process named in honor of Norbert Wiener. It is often called standard Brownian motion, after Robert Brown...
- Wigner quasi-probability distributionWigner quasi-probability distributionThe Wigner quasi-probability distribution is a quasi-probability distribution. It was introduced by Eugene Wigner in 1932 to study quantum corrections to classical statistical mechanics...
- Wigner semicircle distribution
- Wike's law of low odd primesWike's law of low odd primesWike's law of low odd primes is a methodological principle to help design sound experiments in psychology. It is: "If the number of experimental treatments is a low odd prime number, then the experimental design is unbalanced and partially confounded" Wike's law of low odd primes is a...
- Wilcoxon signed-rank testWilcoxon signed-rank testThe Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used when comparing two related samples or repeated measurements on a single sample to assess whether their population mean ranks differ...
- Will Rogers phenomenonWill Rogers phenomenonThe Will Rogers phenomenon is obtained when moving an element from one set to another set raises the average values of both sets. It is based on the following quote, attributed to comedian Will Rogers:...
- WinBUGSWinBUGSWinBUGS is statistical software for Bayesian analysis using Markov chain Monte Carlo methods.It is based on the BUGS project started in 1989...
– software - Window functionWindow functionIn signal processing, a window function is a mathematical function that is zero-valued outside of some chosen interval. For instance, a function that is constant inside the interval and zero elsewhere is called a rectangular window, which describes the shape of its graphical representation...
- WinpepiWinpepiWinPepi is a freeware package of statistical programs for epidemiologists, comprising seven programs with over 120 modules. WinPepi is not a complete compendium of statistical routines for epidemiologists but it provides a very wide range of procedures, including those most commonly used and many...
– software - WinsorisingWinsorisingWinsorising or Winsorization is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. It is named after the engineer-turned-biostatistician Charles P. Winsor...
- Wishart distribution
- Wold's theorem
- WomblingWomblingIn statistics, Wombling is any of a number of techniques used for identifying zones of rapid change, typically in some quantity as it varies across some geographical or Euclidean space. It is named for statistician William H. Womble....
- World Programming SystemWorld Programming SystemThe World Programming System, also known as WPS, is a software product developed by a company called World Programming. WPS allows users to create, edit and run programs written in the language of SAS. The latest release of WPS covers a significant gap in the use of WPS: it now provides PROC REG and...
– software - Wrapped Cauchy distribution
- Wrapped distributionWrapped distributionIn probability theory and directional statistics, a wrapped probability distribution is a continuous probability distribution that describes data points that lie on a unit n-sphere...
- Wrapped exponential distribution
- Wrapped normal distributionWrapped normal distributionIn probability theory and directional statistics, a wrapped normal distribution is a wrapped probability distribution which results from the "wrapping" of the normal distribution around the unit circle. It finds application in the theory of Brownian motion and is a solution to the heat equation for...
- Wrapped Lévy distributionWrapped Lévy distributionIn probability theory and directional statistics, a wrapped Lévy distribution is a wrapped probability distribution that results from the "wrapping" of the Lévy distribution around the unit circle.- Description :The pdf of the wrapped Lévy distribution is...
- Writer invariantWriter invariantWriter invariant, also called authorial invariant or author's invariant, is a property of a text which is invariant of its author, that is, it will be similar in all texts of a given author and different in texts of different authors. It can be used to find plagiarism or discover who is real author...
X
- X-12-ARIMAX-12-ARIMAX-12-ARIMA is the U.S. Census Bureau's software package for seasonal adjustment. It can be used together with gretl, which provides a graphical user interface for X-12-ARIMA.X-12-ARIMA is the successor to X-11-ARIMA-See also:*AnSWR*ARIMA*CSPro...
- x̄ chartX-bar chartIn industrial statistics, the X-bar chart is a type of Shewhart control chart that is used to monitor the arithmetic means of successive samples of constant size, n. This type of control chart is used for characteristics that can be measured on a continuous scale, such as weight, temperature,...
- x̄ and R chart
- x̄ and s chart
- XLispStatXLispStatXLispStat is an open-source statistical scientific package based on the XLISP language.As from xlispstat startup: XLISP-PLUS version 3.04 Portions Copyright 1988, by David Betz. Modified by Thomas Almy and others....
– software - XLSTATXLSTATXLSTAT is a commercial statistical and multivariate analysis software. The software has been developed by Addinsoft and was introduced by Thierry Fahmy, the founder of Addinsoft, in 1993. It is a Microsoft Excel add-in...
– software - XploReXploReXploRe is the name of a commercial statistics software, developed by the German software company MD*Tech. XploRe is not sold anymore. The last version, 4.8, is available for download at no cost. The user interacts with the software via the XploRe programming language, which is derived from the C...
– software
Y
- Yamartino methodYamartino methodThe Yamartino method is an algorithm for calculating an approximation to the standard deviation σθ of wind direction θ during a single pass through the incoming data...
- Yates analysisYates AnalysisFull- and fractional-factorial designs are common in designed experiments for engineering and scientific applications. In these designs, each factor is assigned two levels. These are typically called the low and high levels. For computational purposes, the factors are scaled so that the low level...
- Yates's correction for continuity
- Youden's J statisticYouden's J statisticYouden's J statistic is a single statistic that captures the performance of a diagnostic test. The use of such a single index is "not generally to be recommended". It is equal to the risk difference for a dichotomous test ....
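A minimal sketch of J = sensitivity + specificity − 1 computed from a 2×2 diagnostic table; the counts are made up:
    # Illustrative only: Youden's J from a made-up 2x2 diagnostic table.
    tp, fn = 90, 10      # diseased subjects: test positive / test negative
    tn, fp = 80, 20      # healthy subjects:  test negative / test positive

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    j = sensitivity + specificity - 1
    print(j)             # 0.7; 0 means no better than chance, 1 means a perfect test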
- Yule–Simon distribution
- YxilonYxilonYxilon is a modular open-source statistical programming language, developed by Sigbert Klinke, Uwe Ziegenhagen, and Yuval Guri. It is a re-implementation of the XploRe language, with the intention of providing better performance by using compiled code instead of a language interpreter...
– statistical programming language
Z
- z-score
- z-factorZ-factorThe Z-factor is a measure of statistical effect size. It has been proposed for use in high-throughput screening to judge whether the response in a particular assay is large enough to warrant further attention.-Background:...
- z statisticZ statisticIn statistics, the Vuong closeness test is a likelihood-ratio-based test for model selection using the Kullback-Leibler information criterion. This statistic makes probabilistic statements about two models. It tests the null hypothesis that two models are equally close to the actual model, against the...
- Z-testZ-testA Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Due to the central limit theorem, many test statistics are approximately normally distributed for large samples...
- Zakai equationZakai equationIn filtering theory the Zakai equation is a linear recursive filtering equation for the un-normalized density of a hidden state. In contrast, the Kushner equation gives a non-linear recursive equation for the normalized density of the hidden state...
- Zelen's design
- Zero-one law (disambiguation)
- Zeta distribution
- Ziggurat algorithmZiggurat algorithmThe ziggurat algorithm is an algorithm for pseudo-random number sampling. Belonging to the class of rejection sampling algorithms, it relies on an underlying source of uniformly-distributed random numbers, typically from a pseudo-random number generator, as well as precomputed tables. The...
- Zipf–Mandelbrot law — a discrete distribution
- Zipf's law
See also
Supplementary lists
These lists include items that are related to statistics but are not included in this index:
- List of statisticians
- List of important publications in statistics
- List of scientific journals in statistics
Topic lists
- Outline of statistics
- List of probability topics
- Glossary of probability and statisticsGlossary of probability and statisticsThe following is a glossary of terms. It is not intended to be all-inclusive.- Concerned fields :*Probability theory*Algebra of random variables *Statistics*Measure theory*Estimation theory- Glossary :...
- Glossary of experimental designGlossary of experimental design- Glossary :* Alias: When the estimate of an effect also includes the influence of one or more other effects the effects are said to be aliased . For example, if the estimate of effect D in a four factor experiment actually estimates , then the main effect D is aliased with the 3-way interaction ABC...
- Notation in probability and statistics
- List of probability distributions
- List of graphical methods
- List of fields of application of statistics
- List of stochastic processes topics
- Lists of statistics topics
- List of statistical packages
External links
- ISI Glossary of Statistical Terms (multilingual), International Statistical Institute