ANOVA on ranks
In statistics, one purpose for the analysis of variance (ANOVA) is to analyze differences in means between groups. The test statistic, F, assumes independence of observations, homogeneous variances, and population normality. ANOVA on ranks is a statistic designed for situations when the normality assumption has been violated.

Logic of the F test on means

The F statistic is a ratio of a numerator to a denominator. Consider randomly selected subjects that are subsequently randomly assigned to groups A, B, and C. Under the truth of the null hypothesis, the variability (or sum of squares) of scores on some dependent variable will be the same within each group. When the within-group sums of squares are pooled and divided by the degrees of freedom (i.e., based on the number of subjects per group), the denominator of the F ratio is obtained.

Treat the mean for each group as a score, and compute the variability (again, the sum of squares) of those three scores. When divided by its degrees of freedom (i.e., based on the number of groups), the numerator of the F ratio is obtained.

Under the truth of the null hypothesis, the sampling distribution of the F ratio depends on the degrees of freedom for the numerator and the denominator.

Model a treatment applied to group A by increasing every score by X. (This model maintains the underlying assumption of homogeneous variances. In practice it is rare, if not impossible, for an increase of X in a group mean to occur via an increase of each member's score by X.) This will shift the distribution X units in the positive direction, but will not have any impact on the variability within the group. However, the variability between the three groups' mean scores will now increase. If the resulting F ratio is large enough to exceed the threshold of what constitutes a rare event under the null hypothesis (a threshold set by the alpha level), the ANOVA F test is said to reject the null hypothesis of equal means between the three groups, in favor of the alternative hypothesis that at least one of the groups has a different mean (which, in this example, is group A).
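The shift model described here can be checked numerically. The following is a minimal sketch in Python using only the standard library; the scores, the shift X = 3, and the function name f_ratio are made up for illustration and do not come from any cited study. It computes the F ratio as the between-group mean square divided by the within-group mean square.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA: return (F, SS_between, SS_within)."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of scores
    grand = mean(x for g in groups for x in g)
    # Between-group variability: group means treated as scores.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group variability: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)      # numerator of F
    ms_within = ss_within / (n_total - k)  # denominator of F
    return ms_between / ms_within, ss_between, ss_within

# Hypothetical scores with equal group means (the null hypothesis holds).
group_a = [3, 5, 7]
group_b = [4, 5, 6]
group_c = [2, 5, 8]

# Model a treatment by adding X to every score in group A.
X = 3
treated_a = [score + X for score in group_a]

f_null, ssb_null, ssw_null = f_ratio([group_a, group_b, group_c])
f_trt, ssb_trt, ssw_trt = f_ratio([treated_a, group_b, group_c])

print(f_null, ssb_null, ssw_null)
print(f_trt, ssb_trt, ssw_trt)
```

In this toy data set the within-group sum of squares is identical before and after the shift, while the between-group sum of squares, and therefore F, increases. Whether the larger F is significant still depends on the critical value for the chosen alpha level and degrees of freedom, which this sketch does not compute.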

Handling violation of population normality

Ranking is one of many procedures used to transform data that do not meet the assumptions of normality. Conover and Iman provided a review of the four main types of rank transformations (RT). One method replaces each original data value by its rank (from 1 for the smallest to N for the largest). This rank-based procedure has been recommended as being robust to non-normal errors, resistant to outliers, and highly efficient for many distributions. It may result in a known statistic (e.g., in the two independent samples layout, ranking results in the Wilcoxon rank-sum / Mann–Whitney U test), and provides the desired robustness and increased statistical power that is sought. For example, Monte Carlo studies have shown that the rank transformation in the two independent samples t-test layout can be successfully extended to the one-way independent samples ANOVA, as well as the two independent samples multivariate Hotelling's T² layouts. Commercial statistical software packages (e.g., SAS) followed with recommendations to data analysts to run their data sets through a ranking procedure (e.g., PROC RANK) prior to conducting standard analyses using parametric procedures.
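As a rough illustration of the rank transformation in the one-way layout, the sketch below pools all scores, replaces each by its rank from 1 to N, and then runs the usual F computation on the ranks. It is written in Python with the standard library only. The data are invented, the function names are hypothetical, and the use of average ranks (midranks) for tied values is an assumption of this sketch rather than something the text above specifies.

```python
from statistics import mean

def midranks(values):
    """Rank from 1 (smallest) to N (largest); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def rank_transform(groups):
    """Rank all scores jointly, then split the ranks back into groups."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    out, start = [], 0
    for g in groups:
        out.append(ranks[start:start + len(g)])
        start += len(g)
    return out

def f_ratio(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical scores; group A contains one extreme outlier (250.0).
groups = [
    [1.2, 3.4, 2.2, 250.0],
    [4.1, 5.0, 3.9, 6.3],
    [7.7, 6.8, 9.1, 8.4],
]

ranked = rank_transform(groups)
f_raw = f_ratio(groups)     # F on the original scores
f_ranks = f_ratio(ranked)   # F on the rank-transformed scores
print(ranked)
print(f_raw, f_ranks)
```

After ranking, the outlier 250.0 simply becomes rank 12, so it can no longer dominate the sums of squares. With only two groups, the sum of one group's ranks from rank_transform is the Wilcoxon rank-sum statistic.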

Failure of ranking in the factorial ANOVA and other complex layouts

ANOVA on ranks means that a standard analysis of variance is calculated on the rank-transformed data. Conducting factorial ANOVA on the ranks of original scores has also been suggested. However, Monte Carlo studies and subsequent asymptotic studies found that the rank transformation is inappropriate for testing interaction effects in a 4x3 and a 2x2x2 factorial design. As the number of non-null effects (i.e., main, interaction) increases, and as the magnitude of the non-null effects increases, there is an increase in Type I error, resulting in a complete failure of the statistic, with as high as a 100% probability of making a false positive decision. Similarly, it was found that the rank transformation increasingly fails in the two dependent samples layout as the correlation between pretest and posttest scores increases. It was also discovered that the Type I error rate problem was exacerbated in the context of analysis of covariance, particularly as the correlation between the covariate and the dependent variable increased.
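One intuition for the interaction problem is that ranking is a nonlinear transformation, so cell means that are perfectly additive on the original scale need not remain additive on the rank scale. The sketch below is a toy illustration of that point only. It does not reproduce the cited Monte Carlo or asymptotic results, and the 2x3 design, effect sizes, error values, and function names are all hypothetical choices made here.

```python
from statistics import mean

def midranks(values):
    """Rank from 1 (smallest) to N (largest); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

# Hypothetical 2x3 layout with main effects only (no interaction):
# each score = row effect + column effect + error.
row_effect = [0, 2]
col_effect = [0, 2, 4]
errors = [-1, 0, 1]  # the same three error values in every cell
cells = {
    (i, j): [row_effect[i] + col_effect[j] + e for e in errors]
    for i in range(2)
    for j in range(3)
}

# Rank all 18 scores jointly and put the ranks back into their cells.
keys = list(cells)
pooled = [x for key in keys for x in cells[key]]
ranks = midranks(pooled)
n = len(errors)
ranked_cells = {
    key: ranks[idx * n:(idx + 1) * n] for idx, key in enumerate(keys)
}

def interaction_contrast(c):
    """Contrast over rows 1-2 and columns 1-2; zero when cell means are additive."""
    m = {key: mean(v) for key, v in c.items()}
    return m[(0, 0)] - m[(0, 1)] - m[(1, 0)] + m[(1, 1)]

raw_contrast = interaction_contrast(cells)
rank_contrast = interaction_contrast(ranked_cells)
print(raw_contrast, rank_contrast)
```

On the original scores the interaction contrast is exactly zero, but on the ranks it is not, even though the data were built without any interaction. A test for interaction computed on the ranks is therefore responding in part to the main effects, which is consistent with the inflation of Type I error described above.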

Transforming ranks

A variant of the rank transformation is 'quantile normalization', in which a further transformation is applied to the ranks such that the resulting values have some defined distribution (often a normal distribution with a specified mean and variance). Further analyses of quantile-normalized data may then assume that distribution to compute significance values. However, two specific types of secondary transformations, the random normal scores and expected normal scores transformations, have been shown to greatly inflate Type I errors and severely reduce statistical power.
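To make the idea of a secondary transformation concrete, the sketch below maps ranks to approximate expected normal scores. It assumes Blom's approximation, which evaluates the standard normal quantile function at (r - 0.375) / (N + 0.25); that choice, and the function name, are assumptions of this example and are not taken from the studies mentioned above. It shows only what such a transformation computes and is not a recommendation to use it.

```python
from statistics import NormalDist

def approx_expected_normal_scores(ranks, n):
    """Map ranks 1..n to approximate expected normal scores (Blom's formula)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]

# Hypothetical ranks for a sample of five observations.
ranks = [1, 2, 3, 4, 5]
scores = approx_expected_normal_scores(ranks, n=len(ranks))
print(scores)
```

The resulting scores are symmetric about zero and increase with rank. To target a normal distribution with a specified mean and standard deviation, the scores can be rescaled as mean + sd * score.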

Violating homoscedasticity

The ANOVA on ranks has never been recommended when the underlying assumption of homogeneous variances has been violated, either by itself or in conjunction with a violation of the assumption of population normality. In general, rank-based statistics become nonrobust with respect to Type I errors for departures from homoscedasticity even more quickly than parametric counterparts that share the same assumption.

Further information

Kepner and Wackerly summarized the literature in noting "by the late 1980s, the volume of literature on RT methods was rapidly expanding as new insights, both positive and negative, were gained regarding the utility of the method. Concerned that RT methods would be misused, Sawilowsky et al. (1989, p. 255) cautioned practitioners to avoid the use of these tests 'except in those specific situations where the characteristics of the tests are well understood'." According to Hettmansperger and McKean, "Sawilowsky (1990) provides an excellent review of nonparametric approaches to testing for interaction" in ANOVA.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 