Chauvenet's criterion
In statistical theory, Chauvenet's criterion (named for William Chauvenet) is a means of assessing whether one piece of experimental data, an outlier from a set of observations, is likely to be spurious.
To apply Chauvenet's criterion, first calculate the mean and standard deviation of the observed data. Based on how much the suspect datum differs from the mean, use the normal distribution function (or a table thereof) to determine the probability that a given data point will deviate from the mean by at least as much as the suspect data point does, in either direction. Multiply this probability by the number of data points taken, n. If the result is less than 0.5, the suspicious data point may be discarded; that is, a reading may be rejected if the probability of obtaining a deviation from the mean at least that large is less than 1/(2n).
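The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference implementation: the function name chauvenet_scores is hypothetical, the sample (n - 1) standard deviation is assumed, and the probability is taken as the two-tailed normal tail area, computed with math.erfc instead of a table.

```python
import math
import statistics


def chauvenet_scores(data):
    """Return (value, n * p, reject) for each reading.

    Hypothetical helper for illustration. p is the two-tailed normal
    probability of a deviation at least as large as the reading's,
    and a reading is flagged when n * p < 0.5, i.e. p < 1/(2n).
    Needs at least two readings (statistics.stdev raises otherwise).
    """
    n = len(data)
    mean = statistics.mean(data)
    # Assumption: sample standard deviation (divides by n - 1).
    sd = statistics.stdev(data)
    if sd == 0:
        # All readings identical: nothing deviates, so nothing is flagged.
        return [(x, float(n), False) for x in data]

    results = []
    for x in data:
        z = abs(x - mean) / sd               # deviation in standard deviations
        p = math.erfc(z / math.sqrt(2))      # P(|Z| >= z) for a standard normal
        results.append((x, n * p, n * p < 0.5))
    return results


if __name__ == "__main__":
    readings = [9, 10, 10, 10, 11, 50]
    for value, score, reject in chauvenet_scores(readings):
        print(value, round(score, 3), "reject" if reject else "keep")
```

Applied to the readings 9, 10, 10, 10, 11 and 50, the sketch flags only the value 50. Its product of n and p comes to roughly 0.25, because the tail probability is computed exactly instead of being rounded to 0.05.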
Example
For instance, suppose a value is measured experimentally in several trials as 9, 10, 10, 10, 11, and 50. The mean is 16.7 and the sample standard deviation is 16.34. The value 50 differs from 16.7 by 33.3, slightly more than two standard deviations. The probability of taking data more than two standard deviations from the mean, in either direction, is roughly 0.05. Six measurements were taken, so the statistic value (data size multiplied by the probability) is 0.05×6 = 0.3. Because 0.3 < 0.5, according to Chauvenet's criterion, the measured value of 50 should be discarded (leaving a new mean of 10, with standard deviation 0.7).
Peirce's criterion
Another method for eliminating spurious data is called Peirce's criterion. It was developed a few years before Chauvenet's criterion was published, and it is a more rigorous approach to the rational deletion of outlier data. See S. Ross reference below. Other methods such as Grubbs' test for outliers are mentioned under the listing for Outlier.