Regression discontinuity
In statistics, econometrics, epidemiology and related disciplines, a regression discontinuity design (RDD) is a design that estimates the causal effects of interventions by exploiting a given exogenous threshold that determines assignment to treatment. By comparing observations lying close to either side of the threshold, it is possible to estimate the local treatment effect in settings in which randomization is infeasible. First applied by Donald Thistlethwaite and Donald Campbell to the evaluation of scholarship programs, the RDD has become increasingly popular in recent years.
Example
The intuition behind the RDD is well illustrated by the evaluation of merit-based scholarships. The main problem with estimating the causal effect of such an intervention is the endogeneity of assignment to treatment (e.g. the scholarship award): since high-performing students are more likely to be awarded the merit scholarship and also more likely to continue performing well, comparing the outcomes of awardees and non-recipients would bias the estimates upward. Even if the scholarship did not improve marks at all, awardees would have performed better than non-recipients, simply because scholarships were given to students who were performing well ex ante.
Despite the absence of an experimental design, an RDD can exploit exogenous characteristics of the intervention to identify causal effects. If all students above a given mark - for example 50% - are given the scholarship, it is possible to estimate the local treatment effect by comparing students around the 50% cut-off. The intuition is that a student scoring 49% is likely to be very similar to a student scoring 51%; given the pre-defined threshold of 50%, however, one student will receive the scholarship while the other will not. Comparing the outcome of the awardee (treatment group) to the counterfactual outcome of the non-recipient (control group) will hence deliver the local treatment effect.
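The scholarship example can be sketched in a small simulation. Everything below is an illustrative assumption rather than part of the original design: the data are simulated, the helper names `fit_line` and `jump_at_cutoff` are hypothetical, the true effect is set to 5 marks, students at or above the cut-off are treated as awardees, and the local comparison is done by fitting one straight line on each side of the cut-off (a common estimation choice, not the only one).

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def jump_at_cutoff(marks, outcomes, cutoff, bandwidth):
    """Fit one line on each side of the cutoff; return the gap between them at the cutoff."""
    # Centre the marks at the cutoff so that each intercept is the fitted value AT the cutoff.
    below = [(m - cutoff, y) for m, y in zip(marks, outcomes)
             if cutoff - bandwidth <= m < cutoff]
    above = [(m - cutoff, y) for m, y in zip(marks, outcomes)
             if cutoff <= m <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_above - intercept_below


rng = random.Random(0)
CUTOFF = 50.0
TRUE_EFFECT = 5.0  # assumed effect of the scholarship on later marks

# Simulated students: later marks depend on the earlier mark AND on the award.
marks = [rng.uniform(0, 100) for _ in range(40000)]
awarded = [m >= CUTOFF for m in marks]  # sharp rule: everyone at or above 50% gets it
later_marks = [0.6 * m + TRUE_EFFECT * a + rng.gauss(0, 5)
               for m, a in zip(marks, awarded)]

# Naive comparison: all awardees versus all non-recipients (biased upward).
naive = (mean(y for y, a in zip(later_marks, awarded) if a)
         - mean(y for y, a in zip(later_marks, awarded) if not a))

# RDD comparison: only students within 10 marks of the cut-off.
local = jump_at_cutoff(marks, later_marks, CUTOFF, bandwidth=10.0)

print(f"naive difference in means: {naive:.2f}")
print(f"estimated jump at cutoff:  {local:.2f} (true effect {TRUE_EFFECT})")
```

In this simulated setting the naive difference is several times larger than the true effect, because awardees started out with higher marks, whereas the jump estimated at the cut-off should land close to the assumed value of 5.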
Other examples
- Developmental education in higher education, when remediation is determined by a placement test.
- Policies in which treatment is determined by an age eligibility criterion (e.g. pensions).
- Elections in which one politician wins by a narrow margin.
Advantages
- When properly implemented and analyzed, the RDD yields an unbiased estimate of the local treatment effect.
- RDD, as a quasi-experiment, does not require ex ante randomization and circumvents ethical issues of random assignment.
Disadvantages
- The statistical power is considerably lower than that of a randomized experiment of the same sample size, increasing the risk of failing to detect a real treatment effect (Type II error).
- The estimated effects are only unbiased if the functional form of the relationship between the assignment variable (e.g. the mark) and the outcome is correctly modelled. The most common pitfall is a non-linear relationship that is mistaken for a discontinuity.
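The functional-form caveat can be illustrated with a simulation in which there is no treatment effect at all. This is a hedged sketch under stated assumptions: the data are simulated, the cubic relationship and all numbers are invented for illustration, and `fit_line` and `jump_at_cutoff` are hypothetical helper names. A straight line fitted over the whole range on each side misreads the curvature as a jump, while a narrow window around the cut-off does not.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def jump_at_cutoff(marks, outcomes, cutoff, bandwidth):
    """Fit one line on each side of the cutoff; return the gap between them at the cutoff."""
    below = [(m - cutoff, y) for m, y in zip(marks, outcomes)
             if cutoff - bandwidth <= m < cutoff]
    above = [(m - cutoff, y) for m, y in zip(marks, outcomes)
             if cutoff <= m <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_above - intercept_below


rng = random.Random(1)
CUTOFF = 50.0

marks = [rng.uniform(0, 100) for _ in range(40000)]
# Smooth cubic relationship with NO discontinuity: the true effect is zero.
outcomes = [0.0004 * (m - CUTOFF) ** 3 + rng.gauss(0, 2) for m in marks]

# Straight lines over the full range on each side: misspecified functional form.
wide = jump_at_cutoff(marks, outcomes, CUTOFF, bandwidth=50.0)
# Straight lines within 5 marks of the cutoff: curvature is negligible there.
narrow = jump_at_cutoff(marks, outcomes, CUTOFF, bandwidth=5.0)

print(f"apparent jump, full-range linear fit: {wide:.2f}")
print(f"apparent jump, narrow window:         {narrow:.2f}")
```

The narrow window removes most of the spurious jump here, but it also uses far fewer observations, which is the power trade-off noted in the first bullet.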
Extensions
The identification of causal effects hinges on the crucial assumption that there is indeed a sharp cut-off, at which the probability of assignment jumps from 0 to 1. In reality, however, cut-offs are often not strictly implemented (e.g. discretion exercised in favour of students who just fell short of the threshold), and the estimates will hence be biased.
In contrast to the sharp regression discontinuity design, a fuzzy regression discontinuity design (FRDD) does not require the probability of assignment to jump from 0 to 1; it is applicable as long as that probability changes discontinuously at the threshold. The intuition behind it is related to the instrumental variable strategy and to intention-to-treat analysis.
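One way to make the instrumental-variable intuition concrete is the ratio of two jumps at the threshold: the jump in the outcome (the intention-to-treat effect of crossing the cut-off) divided by the jump in the probability of treatment. The sketch below is illustrative only: the data are simulated, the award probabilities of 0.2 and 0.8 and the true effect of 5 are assumptions, and `fit_line` and `jump_at_cutoff` are hypothetical helper names.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def jump_at_cutoff(marks, values, cutoff, bandwidth):
    """Fit one line on each side of the cutoff; return the gap between them at the cutoff."""
    below = [(m - cutoff, v) for m, v in zip(marks, values)
             if cutoff - bandwidth <= m < cutoff]
    above = [(m - cutoff, v) for m, v in zip(marks, values)
             if cutoff <= m <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_above - intercept_below


rng = random.Random(2)
CUTOFF = 50.0
TRUE_EFFECT = 5.0  # assumed effect of actually receiving the scholarship

marks = [rng.uniform(0, 100) for _ in range(80000)]
# Fuzzy rule: crossing the cutoff raises the award probability from 0.2 to 0.8,
# e.g. because of discretion, rather than from 0 to 1.
awarded = [1.0 if rng.random() < (0.8 if m >= CUTOFF else 0.2) else 0.0
           for m in marks]
later_marks = [0.6 * m + TRUE_EFFECT * a + rng.gauss(0, 3)
               for m, a in zip(marks, awarded)]

# Jump in the outcome at the cutoff (intention-to-treat effect of crossing it).
jump_outcome = jump_at_cutoff(marks, later_marks, CUTOFF, bandwidth=10.0)
# Jump in the share of students actually awarded the scholarship.
jump_award = jump_at_cutoff(marks, awarded, CUTOFF, bandwidth=10.0)
# Ratio of the two jumps: the fuzzy RDD estimate of the treatment effect.
fuzzy_estimate = jump_outcome / jump_award

print(f"jump in outcome:           {jump_outcome:.2f}")
print(f"jump in award probability: {jump_award:.2f}")
print(f"fuzzy RDD estimate:        {fuzzy_estimate:.2f} (true effect {TRUE_EFFECT})")
```

Treating this data as a sharp design would report only the jump in the outcome, which understates the assumed effect because not everyone above the cut-off is treated and some students below it are; dividing by the jump in the award probability rescales it.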