Wilcoxon signed-rank test
The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used when comparing two related samples or repeated measurements on a single sample to assess whether their population mean ranks differ (i.e. it is a paired difference test).

It can be used as an alternative to the paired Student's t-test when the population cannot be assumed to be normally distributed or the data are on the ordinal scale.

The test is named for Frank Wilcoxon (1892–1965) who, in a single paper, proposed both it and the rank-sum test for two independent samples (Wilcoxon, 1945). The test was popularized by Siegel (1956) in his influential textbook on non-parametric statistics. Siegel used the symbol T for the value defined below as S. In consequence, the test is sometimes referred to as the Wilcoxon T test, and the test statistic is reported as a value of T.

Setup

Suppose we collect 2n observations, two on each of n subjects. Let i index the subject, and let the first observation measured on subject i be denoted by Xi and the second by Yi. For each i, Xi and Yi are paired together.

Assumptions

Let Zi = Xi – Yi for i = 1, ... , n.
  1. The differences Zi are assumed to be independent.
  2. Each Zi comes from the same continuous population, whose distribution is symmetric about a common median θ.
  3. The values which Xi and Yi represent are ordered (at least the ordinal level of measurement), so the comparisons "greater than", "less than", and "equal to" are meaningful.

Test procedure

The null hypothesis tested is H0: θ = 0.
  1. Exclude observations with Zi = 0. Let m be the reduced sample size.
  2. Order the m non-zero absolute values |Zi| in ascending sequence, and let the rank of each be Ri (the smallest |Zi| gets the rank of 1, and tied values are each assigned the mean of the ranks they occupy).
  3. Denote the positive Zi values with φi = I(Zi > 0), where I(.) is an indicator function: φi = 1 for Zi > 0, otherwise φi = 0.
  4. The Wilcoxon signed-rank statistic W+ is defined as W+ = Σ φi Ri, where the sum runs over the m non-zero differences.
  5. Define W− similarly by summing the ranks of the negative differences Zi.
  6. Calculate S as the smaller of these two rank sums: S = min(W+, W−).
  7. Find the critical value for the sample size and the desired significance level. (When zero differences have been excluded, the reduced sample size m is normally the one used.)
    • For small samples the critical value is obtained from a table (which is calculated by considering all possible assignments of signs to the ranks to calculate p, the probability of attaining a value as small as S from a population of scores that is symmetrically distributed around the central point).
    • As the number of scores used, n, increases, the null distribution of the rank sums tends towards the normal distribution. So although exact probabilities would usually be calculated for n ≤ 20, the normal approximation is used for n > 20. The recommended cutoff varies from textbook to textbook; here we use 20, although some put it lower (10) or higher (25).
  8. Compare S to the critical value, and reject H0 if S is less than or is equal to the critical value.
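Steps 1 to 6 can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function names signed_rank_statistics and normal_approximation_z are hypothetical, and the null mean m(m + 1)/4 and variance m(m + 1)(2m + 1)/24 used for the normal approximation are the usual textbook values, which the article itself does not state. No tie or continuity correction is applied.

```python
from math import sqrt


def signed_rank_statistics(x, y):
    """Return (w_plus, w_minus, s, m) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # Step 1: differences Zi = Xi - Yi, with zero differences excluded.
    diffs = [d for d in (xi - yi for xi, yi in zip(x, y)) if d != 0]
    m = len(diffs)

    # Step 2: rank the absolute differences; tied values share their mean rank.
    order = sorted(range(m), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * m
    start = 0
    while start < m:
        end = start
        while end + 1 < m and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        # Positions start..end (0-based) hold ranks start+1..end+1.
        mean_rank = (start + end + 2) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1

    # Steps 3 to 5: sum the ranks of the positive and of the negative differences.
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

    # Step 6: S is the smaller of the two rank sums.
    return w_plus, w_minus, min(w_plus, w_minus), m


def normal_approximation_z(s, m):
    """z-score of S under H0 (assumed textbook mean and variance, no corrections)."""
    mean = m * (m + 1) / 4
    variance = m * (m + 1) * (2 * m + 1) / 24
    return (s - mean) / sqrt(variance)
```

Called with the Xi and Yi columns of the example in the next section, signed_rank_statistics returns W+ = 27, W− = 18, S = 18 and m = 9.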

Example

Subject (i)   Xi    Yi    Sign   Xi – Yi   |Xi – Yi|   Rank of |Xi – Yi|   Signed rank
1             125   110   +      15        15          7                   7
2             115   122   –      –7        7           3                   –3
3             130   125   +      5         5           1.5                 1.5
4             140   120   +      20        20          9                   9
5             140   140          0         0
6             115   124   –      –9        9           4                   –4
7             140   123   +      17        17          8                   8
8             125   137   –      –12       12          6                   –6
9             140   135   +      5         5           1.5                 1.5
10            135   145   –      –10       10          5                   –5
  1. The sign of Xi – Yi is denoted in the Sign column by either (+) or (–). If Xi and Yi are equal, the pair is excluded (subject 5 in this example), leaving 9 non-zero differences.
  2. The values of Xi – Yi and their absolute values are given in the next two columns.
  3. The last two columns are the ranks. The rank column has no signs, and the signed rank column gives each rank with the sign of its difference.
  4. The absolute differences are ranked from the smallest value to the largest value. In the case of a tie, the ranks that the tied values would occupy are added together and divided by the number of tied values. For example, in this data, there were two instances of the value 5. The ranks corresponding to 5 are 1 and 2. The sum of these ranks is 3. After dividing by the number of ties, you get a mean rank of 1.5, and this value is assigned to both instances of 5.
  5. The statistic W+ is the sum of the positive values in the Signed Rank column. The statistic W− is the sum of the absolute values of the negative entries in the Signed Rank column. For this example, W+ = 27 and W− = 18. The minimum of these is S = 18.
  6. Lastly, this test statistic is compared with a table of critical values. If the test statistic is less than or equal to the critical value for the sample size, then the null hypothesis is rejected in favour of the alternative hypothesis. Otherwise, the null hypothesis is not rejected.


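In this case the test statistic is S = 18 and the critical value is 8 for a two-tailed significance level of 0.05 with n = 10. The test statistic must be less than or equal to this to be significant at this level, so in this case the null hypothesis is not rejected. (Using the reduced sample size of 9 instead gives a smaller critical value, so the conclusion is the same.)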
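As a rough cross-check on the table lookup, the exact probability can be computed by brute force for a sample this small. The sketch below assumes that under H0 each rank is equally likely to carry a plus or a minus sign, and enumerates all 2^9 sign assignments for the ranks in the table above. The function name exact_two_sided_p is hypothetical.

```python
from itertools import product

# Ranks of the nine non-zero absolute differences, copied from the example table.
ranks = [7, 3, 1.5, 9, 4, 8, 6, 1.5, 5]
observed_s = 18  # min(W+, W-) from the example


def exact_two_sided_p(ranks, observed_s):
    """Fraction of sign assignments whose S = min(W+, W-) is <= observed_s."""
    total = sum(ranks)
    hits = 0
    count = 0
    # Each tuple marks which ranks are treated as positive differences.
    for signs in product((0, 1), repeat=len(ranks)):
        w_plus = sum(r for r, positive in zip(ranks, signs) if positive)
        s = min(w_plus, total - w_plus)
        hits += s <= observed_s
        count += 1
    return hits / count


p_value = exact_two_sided_p(ranks, observed_s)
print(f"exact two-sided p = {p_value:.3f}")
```

The resulting probability is well above 0.05, which agrees with the decision not to reject the null hypothesis. Enumeration doubles in cost with every added pair, which is why tables or the normal approximation are used for larger samples.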

See also

  • Mann-Whitney-Wilcoxon test (the variant for two independent samples)
  • Sign test (like the Wilcoxon test, but without the assumption of symmetric distribution of the differences around the median, and without using the magnitude of the difference)

External links


Implementations

  • ALGLIB includes an implementation of the Wilcoxon signed-rank test in C++, C#, Delphi, Visual Basic, etc.
  • The free statistical software R includes an implementation of the test as wilcox.test(x, y, paired=TRUE), where x and y are vectors of equal length.
  • GNU Octave implements various one-tailed and two-tailed versions of the test in the wilcoxon_test function.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 