Level of measurement
The "levels of measurement", or scales of measure, are expressions that typically refer to the theory of scale types developed by the psychologist Stanley Smith Stevens. Stevens proposed his theory in a 1946 Science article titled "On the theory of scales of measurement". In that article, Stevens claimed that all measurement in science was conducted using four different types of scales that he called "nominal", "ordinal", "interval" and "ratio".
Among those who accept the classification scheme, there is also some controversy in the behavioural sciences over whether the mean is meaningful for ordinal measurement. In terms of measurement theory, it is not, because the arithmetic operations are not made on numbers that are measurements in units, and so the results of computations do not give numbers in units. However, many behavioural scientists use means for ordinal data anyway. This is often justified on the basis that ordinal scales in behavioural science are really somewhere between true ordinal and interval scales; although the interval difference between two ordinal ranks is not constant, it is often of the same order of magnitude. For example, applications of measurement models in educational contexts often indicate that total scores have a fairly linear relationship with measurements across a range of an assessment. Thus, some argue that so long as the unknown interval difference between ordinal scale ranks is not too variable, interval scale statistics such as means can meaningfully be used on ordinal scale variables. Statistical analysis software such as PSPP requires the user to select the appropriate measurement class for each variable. This is intended to keep the user from inadvertently performing meaningless analyses (for example, correlation analysis with a variable on a nominal level).
L. L. Thurstone made progress toward developing a justification for obtaining interval-level measurements based on the law of comparative judgment. For a common application of the law, see the Analytic Hierarchy Process. Further progress was made by Georg Rasch (1960), who developed the probabilistic Rasch model that provides a theoretical basis and justification for obtaining interval-level measurements from counts of observations such as total scores on assessments.
Another issue is derived from Nicholas R. Chrisman's article "Rethinking Levels of Measurement for Cartography", in which he introduces an expanded list of levels of measurement to account for various measurements that do not necessarily fit the traditional notion of levels of measurement. Measurements that are bound to a range and repeat (like degrees in a circle, time, etc.), graded membership categories, and other types of measurement do not fit Stevens' original work. This led to the introduction of six new levels of measurement, giving ten in all: (1) Nominal, (2) Graded membership, (3) Ordinal, (4) Interval, (5) Log-Interval, (6) Extensive Ratio, (7) Cyclical Ratio, (8) Derived Ratio, (9) Counts and finally (10) Absolute. The extended levels of measurement are rarely used outside of academic geography.
In 1932 the British Association for the Advancement of Science established a committee to investigate the possibility of genuine scientific measurement in the psychological and behavioral sciences. This committee, which became known as the Ferguson committee, published a Final Report (Ferguson, et al., 1940, p. 245) in which Stevens' sone scale (Stevens & Davis, 1938) was an object of criticism:
That is, if Stevens' sone scale was genuinely measuring the intensity of auditory sensations, then evidence that such sensations are quantitative attributes had to be produced. The evidence needed was the presence of additive structure, a concept comprehensively treated by the German mathematician Otto Hölder (Hölder, 1901). Given that the physicist and measurement theorist Norman Robert Campbell dominated the Ferguson committee's deliberations, the committee concluded that measurement in the social sciences was impossible due to the lack of concatenation operations. This conclusion was later rendered false by the discovery of the theory of conjoint measurement by Debreu (1960) and independently by Luce & Tukey (1964). However, Stevens' reaction was not to conduct experiments to test for the presence of additive structure in sensations, but instead to render the conclusions of the Ferguson committee null and void by proposing a new theory of measurement:
Stevens was greatly influenced by the ideas of another Harvard academic, the Nobel laureate physicist Percy Bridgman (1927), whose doctrine of operationism Stevens used to define measurement. In Stevens' definition, for example, it is the use of a tape measure that defines length (the object of measurement) as being measurable (and so by implication quantitative). Critics of operationism object that it confuses the relations between two objects or events for properties of one of those objects or events (Hardcastle, 1995; Michell, 1999; Moyer, 1981a,b; Rogers, 1989).
The Canadian measurement theorist William Rozeboom (1966) was an early and trenchant critic of Stevens' theory of scale types. But it was not until much later, with the work of mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b) and R. Duncan Luce (1986, 1987, 2001), that the concept of scale types received the mathematical rigour that it lacked at its inception. As Luce (1997, p. 395) bluntly stated:
The theory of scale types
Stevens (1946, 1951) proposed that measurements can be classified into four different types of scales. These are shown in the table below as: nominal, ordinal, interval, and ratio.
Scale Type | Permissible Statistics | Admissible Scale Transformation | Mathematical structure |
---|---|---|---|
nominal (also denoted as categorical) | mode, Chi-squared | One to One (equality (=)) | standard set structure (unordered) |
ordinal | median, percentile | Monotonic increasing (order (<)) | totally ordered set |
interval | mean, standard deviation, correlation, regression, analysis of variance | Positive linear (affine) | affine line |
ratio | All statistics permitted for interval scales plus the following: geometric mean, harmonic mean, coefficient of variation, logarithms | Positive similarities (multiplication) | One-dimensional vector space |
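The table above can be read as a cumulative hierarchy: each scale type permits the statistics of the types below it plus some new ones. The following is a minimal Python sketch of that reading, not a definitive implementation. The names `SCALE_ORDER`, `NEW_STATISTICS`, `permissible_statistics` and `is_permissible` are hypothetical, and the cumulative interpretation of the nominal, ordinal and interval rows is an assumption based on the sections that follow.

```python
# Scale types in increasing order of structure, following the table above.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

# Statistics that first become permissible at each level (taken from the table).
NEW_STATISTICS = {
    "nominal": ["mode", "chi-squared"],
    "ordinal": ["median", "percentile"],
    "interval": ["mean", "standard deviation", "correlation",
                 "regression", "analysis of variance"],
    "ratio": ["geometric mean", "harmonic mean",
              "coefficient of variation", "logarithms"],
}


def permissible_statistics(scale):
    """Return every statistic permitted at `scale`, including those
    inherited from the lower levels (assumed cumulative reading)."""
    level = SCALE_ORDER.index(scale)
    allowed = []
    for name in SCALE_ORDER[: level + 1]:
        allowed.extend(NEW_STATISTICS[name])
    return allowed


def is_permissible(statistic, scale):
    """Check whether `statistic` is permitted for a variable at `scale`."""
    return statistic in permissible_statistics(scale)


if __name__ == "__main__":
    for scale in SCALE_ORDER:
        print(scale, "->", ", ".join(permissible_statistics(scale)))
```

A lookup of this kind is roughly what a statistics package would need in order to warn about, say, a mean computed on an ordinal variable.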
Nominal scale
At the nominal scale, i.e., for a nominal category, one uses labels; for example, rocks can be generally categorized as igneous, sedimentary and metamorphic. For this scale, some valid operations are equivalence and set membership. Nominal measures offer names or labels for certain characteristics.
Variables assessed on a nominal scale are called categorical variables; see also categorical data.
Stevens (1946, p. 679) must have known that claiming nominal scales to measure obviously non-quantitative things would have attracted criticism, so he invoked his theory of measurement to justify nominal scales as measurement:
The central tendency of a nominal attribute is given by its mode; neither the mean nor the median can be defined.
We can use a simple example of a nominal category: first names. Looking at nearby people, we might find one or more of them named Aamir. Aamir is their label, and the set of all first names is a nominal scale. We can only check whether two people have the same name (equivalence) or whether a given name is on a certain list of names (set membership), but it is impossible to say which name is greater or less than another (comparison) or to measure the difference between two names. Given a set of people, we can describe the set by its most common name (the mode), but cannot provide an "average name" or even the "middle name" among all the names. However, if we decide to sort our names alphabetically (or to sort them by length, or by how many times they appear in the US Census), we will begin to turn this nominal scale into an ordinal scale.
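The operations available on a nominal scale can be sketched in a few lines of Python. The list of names below is hypothetical sample data chosen for illustration; only equality tests, membership tests and counting are used, because those are the only operations the scale supports.

```python
from collections import Counter

# Hypothetical sample of first names (illustrative data, not from any source).
names = ["Aamir", "Beth", "Aamir", "Chen", "Beth", "Aamir"]

# Equivalence: two labels are either the same or different.
same_name = names[0] == names[2]

# Set membership: is a given name on a certain list of names?
on_list = "Chen" in {"Chen", "Dana"}

# Mode: the most common label is the only available central tendency.
mode_name, mode_count = Counter(names).most_common(1)[0]

print(same_name, on_list, mode_name, mode_count)
```

Python would happily compare the strings alphabetically, but that ordering is a property of the encoding chosen for the labels, not of the nominal attribute itself.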
Ordinal scale
Rank-ordering data simply puts the data on an ordinal scale. Ordinal measurements describe order, but not relative size or degree of difference between the items measured. In this scale type, the numbers assigned to objects or events represent the rank order (1st, 2nd, 3rd, etc.) of the entities assessed. An example of an ordinal scale is the result of a horse race, which says only which horses arrived first, second, or third but includes no information about race times. Another is the Mohs scale of mineral hardness, which characterizes the hardness of various minerals through the ability of a harder material to scratch a softer one, saying nothing about the actual hardness of any of them. Yet another example is military ranks; they have an order, but no well-defined numerical difference between ranks.
When using an ordinal scale, the central tendency of a group of items can be described by using the group's mode (or most common item) or its median (the middle-ranked item), but the mean (or average) cannot be defined.
In 1946, Stevens observed that psychological measurement usually operates on ordinal scales, and that ordinary statistics like means and standard deviations do not have valid interpretations. Nevertheless, such statistics can often be used to generate fruitful information, with the caveat that caution should be taken in drawing conclusions from such statistical data.
Psychometricians like to theorize that psychometric tests produce interval scale measures of cognitive abilities (e.g. Lord & Novick, 1968; von Eye, 2005) but there is little prima facie evidence to suggest that such attributes are anything more than ordinal for most psychological data (Cliff, 1996; Cliff & Keats, 2003; Michell, 2008). In particular, IQ scores reflect an ordinal scale, in which all scores are only meaningful for comparison, rather than an interval scale, in which a given number of IQ "points" corresponds to a unit of intelligence. Thus it is an error to write that an IQ of 160 is just as different from an IQ of 130 as an IQ of 100 is different from an IQ of 70.
In mathematical order theory, an ordinal scale defines a total preorder of objects (in essence, a way of sorting all the objects, in which some may be tied). The scale values themselves (such as labels like "great", "good", and "bad"; 1st, 2nd, and 3rd) have a total order, where they may be sorted into a single line with no ambiguities. If numbers are used to define the scale, they remain correct even if they are transformed by any monotonically increasing function. This property is known as order isomorphism. A simple example follows:
Attribute | Judge's score (x) | Score minus 8 (x - 8) | Tripled score (3x) | Cubed score (x³) |
---|---|---|---|---|
Alice's cooking ability | 10 | 2 | 30 | 1000 |
Bob's cooking ability | 9 | 1 | 27 | 729 |
Claire's cooking ability | 8.5 | 0.5 | 25.5 | 614.125 |
Dana's cooking ability | 8 | 0 | 24 | 512 |
Edgar's cooking ability | 5 | -3 | 15 | 125 |
Since x - 8, 3x, and x³ are all monotonically increasing functions, replacing the ordinal judge's score by any of these alternate scores does not affect the relative ranking of the five people's cooking abilities. Each column of numbers is an equally legitimate ordinal scale for describing their abilities. However, the numerical (additive) difference between the various ordinal scores has no particular meaning.
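The claim about the table above can be checked numerically. The sketch below is illustrative only and uses hypothetical helper names (`ranking`, `results`). It applies the three transformations to the judge's scores, confirms that the ranking is unchanged, and records whether the median and the mean "commute" with each transformation, that is, whether transforming the summary gives the same value as summarizing the transformed scores.

```python
from statistics import mean, median

# Judge's scores from the table above.
scores = {"Alice": 10, "Bob": 9, "Claire": 8.5, "Dana": 8, "Edgar": 5}

# The three monotonically increasing transformations from the table.
transforms = {
    "x - 8": lambda x: x - 8,
    "3x": lambda x: 3 * x,
    "x^3": lambda x: x ** 3,
}


def ranking(values):
    """Return the names sorted from highest to lowest score."""
    return sorted(values, key=values.get, reverse=True)


original_ranking = ranking(scores)
results = {}

for label, f in transforms.items():
    transformed = {name: f(x) for name, x in scores.items()}
    # The rank order survives every monotone transformation.
    assert ranking(transformed) == original_ranking
    # Does the median of the transformed scores equal the transformed median?
    median_commutes = median(transformed.values()) == f(median(scores.values()))
    # Same question for the mean, with a small floating-point tolerance.
    mean_commutes = abs(mean(transformed.values()) - f(mean(scores.values()))) < 1e-9
    results[label] = (median_commutes, mean_commutes)
    print(label, median_commutes, mean_commutes)
```

With an odd number of untied scores the median is always one of the observed values, which is why it follows the data through any monotone transformation. The mean does so only for the two affine columns, which is one way to see why it is not considered meaningful at the ordinal level.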
See also Strict weak ordering.
Interval scale
Quantitative attributes are all measurable on interval scales, as any difference between the levels of an attribute can be multiplied by any real number to exceed or equal another difference. A highly familiar example of interval scale measurement is temperature with the Celsius scale. In this particular scale, the unit of measurement is 1/100 of the temperature difference between the freezing and boiling points of water under a pressure of 1 atmosphere. The "zero point" on an interval scale is arbitrary, and negative values can be used. The formal mathematical term is an affine space (in this case an affine line). Variables measured at the interval level are called "interval variables" or sometimes "scaled variables" as they have units of measurement.
Ratios between numbers on the scale are not meaningful, so operations such as multiplication and division cannot be carried out directly. But ratios of differences can be expressed; for example, one difference can be twice another.
The central tendency of a variable measured at the interval level can be represented by its mode, its median, or its arithmetic mean. Statistical dispersion can be measured in most of the usual ways, which involve only differences or averaging, such as range, interquartile range, and standard deviation. Since one cannot divide, one cannot define measures that require a ratio, such as studentized range or coefficient of variation. More subtly, while one can define moments about the origin, only central moments are useful, since the choice of origin is arbitrary and not meaningful. One can define standardized moments, since ratios of differences are meaningful, but one cannot define the coefficient of variation, since the mean is a moment about the origin, unlike the standard deviation, which is (the square root of) a central moment.
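These points can be illustrated with a short Python sketch that expresses the same temperatures in Celsius and in Fahrenheit, two interval scales related by an affine transformation. The readings are hypothetical values chosen so that the arithmetic is exact, and the helper name `c_to_f` is an assumption made for this example.

```python
from statistics import mean, pstdev


def c_to_f(c):
    """Convert Celsius to Fahrenheit: an affine map with a shifted zero."""
    return c * 9 / 5 + 32


# Hypothetical temperature readings in degrees Celsius.
celsius = [10.0, 20.0, 40.0]
fahrenheit = [c_to_f(c) for c in celsius]

# A ratio of scale values depends on the arbitrary zero point.
ratio_c = celsius[1] / celsius[0]
ratio_f = fahrenheit[1] / fahrenheit[0]

# A ratio of differences is the same on both scales.
diff_ratio_c = (celsius[2] - celsius[1]) / (celsius[1] - celsius[0])
diff_ratio_f = (fahrenheit[2] - fahrenheit[1]) / (fahrenheit[1] - fahrenheit[0])

# The coefficient of variation (standard deviation / mean) is not invariant.
cv_c = pstdev(celsius) / mean(celsius)
cv_f = pstdev(fahrenheit) / mean(fahrenheit)

print(ratio_c, ratio_f)            # the two ratios disagree
print(diff_ratio_c, diff_ratio_f)  # the two ratios of differences agree
print(cv_c, cv_f)                  # the two coefficients of variation disagree
```

The statement "20 degrees is twice as hot as 10 degrees" holds for the Celsius numbers but not for the same temperatures in Fahrenheit, whereas "the second gap is twice the first" holds on both scales.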
Ratio measurement
Most measurement in the physical sciences and engineering is done on ratio scales. Mass, length, time, plane angle, energy and electric charge are examples of physical measures that are ratio scales. The scale type takes its name from the fact that measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind (Michell, 1997, 1999). Informally, the distinguishing feature of a ratio scale is the possession of a non-arbitrary zero value. For example, the Kelvin temperature scale has a non-arbitrary zero point of absolute zero, which is denoted 0 K and is equal to -273.15 degrees Celsius. This zero point is non-arbitrary because, in the classical description, the particles that compose matter at this temperature have zero kinetic energy.
Examples of ratio scale measurement in the behavioral sciences are all but non-existent. Luce (2000) argues that an example of ratio scale measurement in psychology can be found in rank and sign dependent expected utility theory.
All statistical measures can be used for a variable measured at the ratio level, as all necessary mathematical operations are defined. The central tendency of a variable measured at the ratio level can be represented by its mode, its median, or its arithmetic mean, and additionally by its geometric mean or harmonic mean. In addition to the measures of statistical dispersion defined for interval variables, such as range and standard deviation, for ratio variables one can also define measures that require a ratio, such as studentized range or coefficient of variation.
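As a rough illustration of why these extra statistics need a ratio scale, the following Python sketch computes them for a small set of made-up values. The data and the helper name `coefficient_of_variation` are assumptions introduced only for this example. It shows that the coefficient of variation survives a change of unit but not a shift of the zero point.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the arithmetic mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical masses in kilograms (ratio level: 0 kg means "no mass").
masses_kg = [2.0, 4.0, 8.0]

geometric = statistics.geometric_mean(masses_kg)  # cube root of 2 * 4 * 8
harmonic = statistics.harmonic_mean(masses_kg)    # 3 / (1/2 + 1/4 + 1/8)
cv_kg = coefficient_of_variation(masses_kg)

# Changing the unit (kg to g) multiplies every value by a constant.
# The coefficient of variation is unaffected by such a rescaling.
masses_g = [1000.0 * m for m in masses_kg]
cv_g = coefficient_of_variation(masses_g)

# Shifting the zero point (Celsius to kelvin) changes the coefficient of
# variation, so it is not meaningful for interval data such as Celsius.
celsius = [10.0, 20.0, 30.0]
kelvin = [c + 273.15 for c in celsius]
cv_celsius = coefficient_of_variation(celsius)
cv_kelvin = coefficient_of_variation(kelvin)

print(geometric, harmonic, cv_kg, cv_g, cv_celsius, cv_kelvin)
```

The same temperatures give two different coefficients of variation depending on where zero is placed, which is why the statistic is reserved for scales with a non-arbitrary zero.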
Debate on classification scheme
There has been, and continues to be, debate about the merits of the classifications, particularly in the cases of the nominal and ordinal classifications (Michell, 1986). Thus, while Stevens' classification is widely adopted, it is by no means universally accepted. Duncan (1986) observed that Stevens' classification of nominal measurement is contrary to his own definition of measurement. Stevens (1975) said of his own definition of measurement that "the assignment can be any consistent rule. The only rule not allowed would be random assignment, for randomness amounts in effect to a nonrule". However, so-called nominal measurement involves arbitrary assignment, and the "permissible transformation" is any number for any other. This is one of the points made in Lord's (1953) satirical paper "On the Statistical Treatment of Football Numbers".
Among those who accept the classification scheme, there is also some controversy in behavioural sciences over whether the mean is meaningful for ordinal measurement. In terms of measurement theory, it is not, because the arithmetic operations are not made on numbers that are measurements in units, and so the results of computations do not give numbers in units. However, many behavioural scientists use means for ordinal data anyway. This is often justified on the basis that ordinal scales in behavioural science are really somewhere between true ordinal and interval scales; although the interval difference between two ordinal ranks is not constant, it is often of the same order of magnitude. For example, applications of measurement models in educational contexts often indicate that total scores have a fairly linear relationship with measurements across a range of an assessment. Thus, some argue that so long as the unknown interval difference between ordinal scale ranks is not too variable, interval scale statistics such as means can meaningfully be used on ordinal scale variables. Statistical analysis software such as PSPP requires the user to select the appropriate measurement class for each variable. This is intended to keep users from inadvertently performing meaningless analyses (for example, correlation analysis with a variable on a nominal level).
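The idea of tagging each variable with a measurement class can be sketched in a few lines of Python. This is a minimal illustration that takes the strict reading of Stevens' scheme. It is not PSPP's actual interface, and the names `Level`, `REQUIRED_LEVEL` and `summarize` are hypothetical.

```python
import statistics
from enum import IntEnum


class Level(IntEnum):
    """Stevens' four scale types, ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Minimum level each statistic requires under the strict reading of the scheme.
REQUIRED_LEVEL = {
    "mode": Level.NOMINAL,
    "median": Level.ORDINAL,
    "mean": Level.INTERVAL,
    "geometric_mean": Level.RATIO,
}

FUNCTIONS = {
    "mode": statistics.mode,
    # median_low always returns an observed value, so no arithmetic is
    # performed on ordinal data.
    "median": statistics.median_low,
    "mean": statistics.mean,
    "geometric_mean": statistics.geometric_mean,
}


def summarize(values, level, statistic):
    """Compute a statistic only if the declared level permits it."""
    required = REQUIRED_LEVEL[statistic]
    if level < required:
        raise ValueError(
            f"{statistic} requires at least {required.name} data, "
            f"got {level.name}"
        )
    return FUNCTIONS[statistic](values)


rocks = ["igneous", "sedimentary", "igneous", "metamorphic"]
print(summarize(rocks, Level.NOMINAL, "mode"))  # the most common label
```

Requesting `summarize(rocks, Level.NOMINAL, "mean")` raises a `ValueError` instead of returning a number. A tool that follows the more permissive view described above could lower the requirement for `"mean"` to `Level.ORDINAL`.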
L. L. Thurstone made progress toward developing a justification for obtaining interval-level measurements based on the law of comparative judgment. For a common application of the law, see the Analytic Hierarchy Process. Further progress was made by Georg Rasch (1960), who developed the probabilistic Rasch model, which provides a theoretical basis and justification for obtaining interval-level measurements from counts of observations such as total scores on assessments.
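The text above does not give the model's formula. As a hedged sketch, the dichotomous Rasch model is commonly written as a logistic function of the difference between a person's ability and an item's difficulty. The function names and the item difficulties below are assumptions made for illustration.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def expected_total_score(ability, difficulties):
    """Expected number of correct responses on a set of items."""
    return sum(rasch_probability(ability, b) for b in difficulties)


# Hypothetical item difficulties for a five-item assessment.
items = [-2.0, -1.0, 0.0, 1.0, 2.0]

for ability in (-1.0, 0.0, 1.0):
    print(ability, round(expected_total_score(ability, items), 3))
```

Because the response probability depends only on `ability - difficulty`, equal differences on the logit scale have the same meaning anywhere along it. That is the sense in which the model is said to yield interval-level measurements from total scores.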
Another issue is raised in Nicholas R. Chrisman's article "Rethinking Levels of Measurement for Cartography", in which he introduces an expanded list of levels of measurement to account for various measurements that do not necessarily fit the traditional notion of levels of measurement. Measurements that are bound to a range and repeat (such as degrees in a circle or time of day), graded membership categories, and other types of measurement do not fit Stevens' original work. This led to the introduction of six new levels of measurement, giving ten in total: (1) Nominal, (2) Graded membership, (3) Ordinal, (4) Interval, (5) Log-Interval, (6) Extensive Ratio, (7) Cyclical Ratio, (8) Derived Ratio, (9) Counts and finally (10) Absolute. The extended levels of measurement are rarely used outside of academic geography.
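To see why a cyclical measurement sits awkwardly among the original four levels, consider averaging two compass bearings. The Python sketch below is an illustration only. The circular mean it uses is a standard technique from directional statistics, not something taken from Chrisman's article, and the function names and bearings are assumptions.

```python
import math


def arithmetic_mean(values):
    """Ordinary mean, which treats the scale as a straight line."""
    return sum(values) / len(values)


def circular_mean_degrees(angles):
    """Mean direction of angles in degrees, returned in the range [0, 360)."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Two hypothetical bearings, both pointing roughly north.
bearings = [350.0, 20.0]

naive = arithmetic_mean(bearings)           # points roughly south
circular = circular_mean_degrees(bearings)  # points roughly north

print(naive, circular)
```

The arithmetic mean lands on the opposite side of the circle from both observations, while the circular mean falls between them. The zero of such a scale is a convention and the values wrap around, which the four-level scheme does not describe.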
Scale types and Stevens' "operational theory of measurement"
The theory of scale types is the intellectual handmaiden to Stevens' "operational theory of measurement", which was to become definitive within psychology and the behavioral sciences, despite Michell's characterization of it as being quite at odds with measurement as understood in the natural sciences (Michell, 1999). Essentially, the operational theory of measurement was a reaction to the conclusions of a committee established in 1932 by the British Association for the Advancement of Science to investigate the possibility of genuine scientific measurement in the psychological and behavioral sciences. This committee, which became known as the Ferguson committee, published a Final Report (Ferguson, et al., 1940, p. 245) in which Stevens' sone scale (Stevens & Davis, 1938) was an object of criticism:
That is, if Stevens' sone scale was genuinely measuring the intensity of auditory sensations, then evidence that such sensations are quantitative attributes had to be produced. The evidence needed was the presence of additive structure, a concept comprehensively treated by the German mathematician Otto Hölder (Hölder, 1901). Because the physicist and measurement theorist Norman Robert Campbell dominated the Ferguson committee's deliberations, the committee concluded that measurement in the social sciences was impossible due to the lack of concatenation operations. This conclusion was later shown to be false by the discovery of the theory of conjoint measurement by Debreu (1960) and independently by Luce & Tukey (1964). However, Stevens' reaction was not to conduct experiments to test for the presence of additive structure in sensations, but instead to render the conclusions of the Ferguson committee null and void by proposing a new theory of measurement:
Stevens was greatly influenced by the ideas of another Harvard academic, the Nobel laureate physicist Percy Bridgman (1927), whose doctrine of operationism Stevens used to define measurement. In Stevens' definition, for example, it is the use of a tape measure that defines length (the object of measurement) as being measurable (and so by implication quantitative). Critics of operationism object that it confuses the relations between two objects or events with properties of one of those objects or events (Hardcastle, 1995; Michell, 1999; Moyer, 1981a,b; Rogers, 1989).
The Canadian measurement theorist William Rozeboom (1966) was an early and trenchant critic of Stevens' theory of scale types. But it was not until much later, with the work of mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b) and R. Duncan Luce (1986, 1987, 2001), that the concept of scale types received the mathematical rigour that it lacked at its inception. As Luce (1997, p. 395) bluntly stated:
See also
- Measure (mathematics)
- Inter-rater reliability
- Cohen's kappa
- Category theory
- Quantitative data
- Qualitative data
- Ramsey–Lewis method
- 3DLevelScanner
External links
- Hyperstat — Measurement Scales
- Measurement theory: Frequently asked questions (ftp://ftp.sas.com/pub/neural/measurement.html)