Winsorized mean
Encyclopedia
A Winsorized mean is a Winsorized
Winsorising
Winsorising or Winsorization is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. It is named after the engineer-turned-biostatistician Charles P. Winsor...

 statistical measure of central tendency, much like the mean
Mean
In statistics, mean has two related meanings:* the arithmetic mean .* the expected value of a random variable, which is also called the population mean....

 and median
Median
In probability theory and statistics, a median is described as the numerical value separating the higher half of a sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to...

, and even more similar to the truncated mean
Truncated mean
A truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and median. It involves the calculation of the mean after discarding given parts of a probability distribution or sample at the high and low end, and typically discarding an equal amount of both.For...

. It involves the calculation of the mean after replacing given parts of a probability distribution
Probability distribution
In probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values....

 or sample
Sampling (statistics)
In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population....

 at the high and low end with the most extreme remaining values, typically discarding an equal amount of both; often 10 to 25 percent of the ends are replaced.

Advantages

The Winsorized mean is a useful estimator because it is less sensitive to outlier
Outlier
In statistics, an outlier is an observation that is numerically distant from the rest of the data. Grubbs defined an outlier as: An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs....

s than the mean but will still give a reasonable estimate of central tendency or mean for almost all statistical models. In this regard it is referred to as a robust estimator.

Drawbacks

The Winsorized mean uses more information from the distribution or sample than the median
Median
In probability theory and statistics, a median is described as the numerical value separating the higher half of a sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to...

. However, unless the underlying distribution is symmetric, the truncated mean of a sample is unlikely to produce an unbiased estimator for either the mean or the median.

Example

  • For a sample of 10 numbers (from x1, the smallest, to x10 the largest) the 10% Winsorized mean is


The key is in the repetition of x2 and x9: the extras substitute for the original values x1 and x10 which have been discarded and replaced.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK