Structural equation modeling
Structural equation modeling (SEM) is a statistical technique for testing and estimating causal relations using a combination of statistical data and qualitative causal assumptions. This definition of SEM was articulated by the geneticist Sewall Wright (1921), the economist Trygve Haavelmo (1943) and the cognitive scientist Herbert Simon (1953), and formally defined by Judea Pearl (2000) using a calculus of counterfactuals.
Structural equation models (SEM) allow both confirmatory and exploratory modeling, meaning they are suited to both theory testing and theory development. Confirmatory modeling usually starts out with a hypothesis that gets represented in a causal model. The concepts used in the model must then be operationalized to allow testing of the relationships between the concepts in the model. The model is tested against the obtained measurement data to determine how well the model fits the data. The causal assumptions embedded in the model often have falsifiable implications which can be tested against the data.
With an initial theory, SEM can be used inductively by specifying a corresponding model and using data to estimate the values of free parameters. Often the initial hypothesis requires adjustment in light of model evidence. When SEM is used purely for exploration, it is usually in the context of exploratory factor analysis, as in psychometric design.
Among the strengths of SEM is the ability to construct latent variables: variables which are not measured directly, but are estimated in the model from several measured variables, each of which is predicted to 'tap into' the latent variables. This allows the modeler to explicitly capture the unreliability of measurement in the model, which in theory allows the structural relations between latent variables to be accurately estimated. Factor analysis, path analysis and regression all represent special cases of SEM.
In SEM, the qualitative causal assumptions are represented by the missing variables in each equation, as well as vanishing covariances among some error terms. These assumptions are testable in experimental studies and must be confirmed judgmentally in observational studies.
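As a hedged illustration of how such assumptions appear in the equations, consider a three-variable chain in which x affects y only through m. The variables x, m, y and the coefficients a, b are hypothetical and chosen only for this sketch; they do not come from any particular study.

```latex
\begin{aligned}
m &= a\,x + \varepsilon_m, \\
y &= b\,m + \varepsilon_y, \\
\operatorname{Cov}(\varepsilon_m, \varepsilon_y) &= 0, \qquad
\operatorname{Cov}(x, \varepsilon_m) = \operatorname{Cov}(x, \varepsilon_y) = 0 .
\end{aligned}
```

The absence of x from the equation for y encodes the assumption that x has no direct effect on y, and the zero covariance between the error terms encodes the assumption that no omitted common cause links m and y. Under these assumptions Cov(x, m) = a Var(x), Cov(m, y) = b Var(m) and Cov(x, y) = a b Var(x), so the model implies the testable constraint

```latex
\operatorname{Cov}(x, y)\,\operatorname{Var}(m) = \operatorname{Cov}(x, m)\,\operatorname{Cov}(m, y),
```

which is equivalent to a vanishing partial covariance between x and y given m.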
Model specification
When SEM is used as a confirmatory technique, the model must be specified correctly based on the type of analysis that the researcher is attempting to confirm. When building the model, the researcher uses two different kinds of variables, namely exogenous and endogenous variables. The distinction between these two types of variables is whether the variable regresses on another variable or not. As in regression, the dependent variable (DV) regresses on the independent variable (IV), meaning that the DV is being predicted by the IV. In SEM terminology, other variables regress on exogenous variables. Exogenous variables can be recognized in a graphical version of the model as the variables sending out arrows, which point to the variables they predict. A variable that regresses on another variable is always an endogenous variable, even if this same variable is also used as a variable to be regressed on. Endogenous variables are recognized as the receivers of an arrowhead in the model.
It is important to note that SEM is more general than regression. In particular a variable can act as both independent and dependent variable.
Two main components of models are distinguished in SEM: the structural model showing potential causal dependencies between endogenous and exogenous variables, and the measurement model showing the relations between latent variables and their indicators. Exploratory and confirmatory factor analysis models, for example, contain only the measurement part, while path diagrams can be viewed as an SEM that only has the structural part.
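To make the measurement part concrete, the following minimal sketch computes the covariance matrix that a single-factor measurement model implies for its indicators. It assumes each indicator equals its loading times the factor plus an error term, with errors uncorrelated with each other and with the factor. The function name `implied_covariance` and the numeric values are hypothetical and serve only as an illustration.

```python
def implied_covariance(loadings, factor_variance, error_variances):
    """Model-implied covariance matrix for the indicators of one factor.

    Assumes x_i = loadings[i] * factor + error_i, where the errors are
    mutually uncorrelated and uncorrelated with the factor.
    """
    p = len(loadings)
    if len(error_variances) != p:
        raise ValueError("need exactly one error variance per indicator")
    # Common part: loading_i * loading_j * Var(factor) for every pair (i, j).
    sigma = [
        [loadings[i] * loadings[j] * factor_variance for j in range(p)]
        for i in range(p)
    ]
    # Unique part: each indicator's error variance adds to its own variance.
    for i in range(p):
        sigma[i][i] += error_variances[i]
    return sigma


# Two indicators; the first loading is fixed to 1.0 to set the factor's scale.
sigma = implied_covariance([1.0, 0.5], 2.0, [0.3, 0.4])
```

Estimation then amounts to choosing loadings and variances so that this implied matrix is as close as possible to the observed covariance matrix.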
In specifying pathways in a model, the modeler can posit two types of relationships: (1) free pathways, in which hypothesized causal (in fact counterfactual) relationships between variables are tested, and therefore are left 'free' to vary, and (2) relationships between variables that already have an estimated relationship, usually based on previous studies, which are 'fixed' in the model.
A modeler will often specify a set of theoretically plausible models in order to assess whether the model proposed is the best of the set of possible models. Not only must the modeler account for the theoretical reasons for building the model as it is, but the modeler must also take into account the number of data points and the number of parameters that the model must estimate to identify the model. An identified model is a model where a specific parameter value uniquely identifies the model, and no other equivalent formulation can be given by a different parameter value. A data point is a variable with observed scores, like a variable containing the scores on a question or the number of times respondents buy a car. The parameter is the value of interest, which might be a regression coefficient between the exogenous and the endogenous variable or the factor loading (regression coefficient between an indicator and its factor). If there are fewer data points than the number of estimated parameters, the resulting model is "unidentified", since there are too few reference points to account for all the variance in the model. One solution is to constrain one of the paths to zero, which means that it is no longer part of the model.
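The counting argument can be sketched in a few lines. In covariance-based SEM the count is usually taken over the distinct variances and covariances of the observed variables, p(p + 1)/2 for p observed variables, rather than over the variables themselves. The sketch below follows that convention. The function names are hypothetical, and the rule is only a necessary condition: a model that passes it can still fail to be identified.

```python
def count_data_points(n_observed):
    """Distinct variances and covariances among n_observed variables."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed, n_free_parameters):
    """Data points minus free parameters; negative means too few data points."""
    return count_data_points(n_observed) - n_free_parameters


def passes_counting_rule(n_observed, n_free_parameters):
    """Necessary (not sufficient) condition for identification."""
    return degrees_of_freedom(n_observed, n_free_parameters) >= 0


# One factor with three indicators, first loading fixed to 1:
# 2 free loadings + 1 factor variance + 3 error variances = 6 parameters.
three_indicator_df = degrees_of_freedom(3, 6)

# One factor with two indicators, first loading fixed to 1:
# 1 free loading + 1 factor variance + 2 error variances = 4 parameters.
two_indicator_ok = passes_counting_rule(2, 4)
```

In the three-indicator case the data points exactly match the parameters (zero degrees of freedom), while the two-indicator case has fewer data points than parameters and needs an additional constraint.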
Estimation of free parameters
Parameter estimation is done by comparing the actual covariance matrices representing the relationships between variables and the estimated covariance matrices of the best fitting model. This is obtained through numerical maximization of a fit criterion as provided by maximum likelihood estimation, weighted least squares or asymptotically distribution-free methods. This is often accomplished by using a specialized SEM analysis program, of which several exist.
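For maximum likelihood, the fit criterion is commonly written as a discrepancy between the sample covariance matrix and the model-implied one. The form below is the usual one under multivariate normality. The notation is an assumption of this sketch rather than taken from the text above: S is the sample covariance matrix, Σ(θ) the covariance matrix implied by the parameter vector θ, and p the number of observed variables.

```latex
F_{\mathrm{ML}}(\theta) = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
  - \ln\lvert S\rvert - p
```

Maximizing the likelihood is equivalent to minimizing this function, which equals zero when Σ(θ) reproduces S exactly. With sample size N, the quantity (N - 1) times the minimized value is the statistic usually referred to a chi-squared distribution when testing model fit.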
Assessment of fit
Assessment of fit is a basic task in SEM, forming the basis for accepting or rejecting models and, more usually, accepting one competing model over another. The output of SEM programs includes matrices of the estimated relationships between variables in the model. Assessment of fit essentially calculates how similar the predicted data are to matrices containing the relationships in the actual data. Formal statistical tests and fit indices have been developed for these purposes. Individual parameters of the model can also be examined within the estimated model in order to see how well the proposed model fits the driving theory. Most, though not all, estimation methods make such tests of the model possible.
As in all statistical hypothesis tests, SEM model tests are based on the assumption that the correct and complete relevant data have been modeled. In the SEM literature, discussion of fit has led to a variety of different recommendations on the precise application of the various fit indices and hypothesis tests.
Measures of fit differ in several ways. They range from traditional approaches that start from a null hypothesis, rewarding more parsimonious models (i.e. those with fewer free parameters), to others such as AIC that focus on how little the fitted values deviate from a saturated model (i.e. how well they reproduce the measured values), taking into account the number of free parameters used. Because different measures of fit capture different elements of the fit of the model, it is appropriate to report a selection of different fit measures.
Some of the more commonly used measures of fit include:
- Chi-squared: A fundamental measure of fit used in the calculation of many other fit measures. Conceptually it is a function of the sample size and the difference between the observed covariance matrix and the model covariance matrix.
- Akaike information criterion (AIC): A test of relative model fit. The preferred model is the one with the lowest AIC value. AIC = 2k - 2 ln(L), where k is the number of parameters in the statistical model, and L is the maximized value of the likelihood of the model.
- Root Mean Square Error of Approximation (RMSEA): Another test of model fit. Good models are considered to have an RMSEA of .05 or less. Models whose RMSEA is .1 or more have a poor fit.
- Standardized Root Mean Square Residual (SRMR): A popular absolute fit indicator. A good model should have an SRMR smaller than .05.
- Comparative Fit Index (CFI): In examining baseline comparisons, the CFI depends in large part on the average size of the correlations in the data. If the average correlation between variables is not high, then the CFI will not be very high.
For each measure of fit, a decision as to what represents a good-enough fit between the model and the data must reflect other contextual factors such as sample size (very large samples make the Chi-squared test overly sensitive, for instance), the ratio of indicators to factors, and the overall complexity of the model.
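Two of the measures above are simple enough to compute by hand from the output of an SEM program. The sketch below is illustrative only: the function names are hypothetical, `aic` takes the maximized log-likelihood ln(L) rather than L itself, and `rmsea` uses one common textbook formula with N - 1 in the denominator (some programs use N instead, which is an assumption worth checking against the software in use).

```python
import math


def aic(n_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L); lower is preferred."""
    return 2 * n_parameters - 2 * log_likelihood


def rmsea(chi_square, df, sample_size):
    """Root mean square error of approximation from a model chi-squared.

    Uses sqrt(max(chi_square - df, 0) / (df * (N - 1))).
    """
    if df <= 0:
        raise ValueError("RMSEA needs positive degrees of freedom")
    if sample_size <= 1:
        raise ValueError("sample size must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (sample_size - 1)))


# Hypothetical model output: chi-squared of 30 on 20 df with N = 201.
example_rmsea = rmsea(30.0, 20, 201)
example_aic = aic(3, -100.0)
```

Because the chi-squared statistic grows with sample size while the RMSEA divides by it, the same chi-squared value yields a smaller RMSEA in a larger sample, which is one reason a single cut-off cannot be applied mechanically.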
Model modification
The model may need to be modified in order to improve the fit, thereby estimating the most likely relationships between variables. Many programs provide modification indices which report the improvement in fit that results from adding an additional path to the model. Modifications that improve model fit are then flagged as potential changes that can be made to the model. In addition to improvements in model fit, it is important that the modifications also make theoretical sense.
Sample size
Where the proposed SEM is the basis for a research hypothesis, ad hoc rules of thumb requiring 10 observations per indicator as a lower bound for the adequacy of sample sizes have been widely used since their original articulation by Nunnally (1967). Being linear in model constructs, these are easy to compute, but have been found to result in sample sizes that are too small. One study found that sample sizes in a particular stream of SEM literature averaged only 50% of the minimum needed to draw the conclusions the studies claimed. Overall, 80% of the research articles in the study drew conclusions from insufficient samples. Complexities which increase information demands in structural model estimation increase with the number of potential combinations of latent variables, while the information supplied for estimation increases with the number of measured parameters times the number of observations in the sample; both are non-linear. Sample size in SEM can be computed through two methods: the first as a function of the ratio of indicator variables to latent variables, and the second as a function of minimum effect, power and significance. Software and methods for computing both have been developed by Westland (2010).
Interpretation and communication
The model is then interpreted so that claims about the constructs can be made, based on the best fitting model.
Caution should always be taken when making claims of causality even when experimentation or time-ordered studies have been done. The term causal model must be understood to mean: "a model that conveys causal assumptions," not necessarily a model that produces validated causal conclusions. Collecting data at multiple time points and using an experimental or quasi-experimental design can help rule out certain rival hypotheses but even a randomized experiment cannot rule out all such threats to causal inference. Good fit by a model consistent with one causal hypothesis invariably entails equally good fit by another model consistent with an opposing causal hypothesis. No research design, no matter how clever, can help distinguish such rival hypotheses, save for interventional experiments.
As in any science, subsequent replication and perhaps modification will proceed from the initial finding.
Advanced uses
- Invariance
- Multiple group comparison: This is a technique for assessing whether certain aspects of a structural equation model or confirmatory factor analysis are the same across groups (e.g., gender, different cultures, test forms written in different languages, etc.).
- Latent growth modeling
- Relations to other types of advanced models (hierarchical/multilevel models; item response theory models)
- Mixture model (latent class) SEM
- Alternative estimation and testing techniques
- Robust inference
- Interface with survey estimation
- Multi-method multi-trait models
See also
- List of publications in statistics
- List of statistical topics
- List of statisticians
- Multivariate statistics
- Misuse of statistics
- Partial least squares regression
- Regression analysis
Further reading
- Bartholomew, D J, and Knott, M (1999) Latent Variable Models and Factor Analysis Kendall's Library of Statistics, vol. 7. Arnold publishers, ISBN 0-340-69243-X
- Bentler, P.M. & Bonett, D.G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
- Bollen, K A (1989). Structural Equations with Latent Variables. Wiley, ISBN 0-471-01171-1
- Byrne, B. M. (2001) Structural Equation Modeling with AMOS - Basic Concepts, Applications, and Programming. LEA, ISBN 0-8058-4104-0
- Goldberger, A. S. (1972). Structural equation models in the social sciences. Econometrica 40, 979-1001.
- Haavelmo, T. (1943) "The statistical implications of a system of simultaneous equations," Econometrica 11:1–2. Reprinted in D.F. Hendry and M.S. Morgan (Eds.), The Foundations of Econometric Analysis, Cambridge University Press, 477–490, 1995.
- Hoyle, R H (ed) (1995) Structural Equation Modeling: Concepts, Issues, and Applications. SAGE, ISBN 0-8039-5318-6
- Kaplan, D (2000) Structural Equation Modeling: Foundations and Extensions. SAGE, Advanced Quantitative Techniques in the Social Sciences series, vol. 10, ISBN 0-7619-1407-2
- Kline, R. B. (2010) Principles and Practice of Structural Equation Modeling (3rd Edition). The Guilford Press, ISBN 978-1-60623-877-6
- Jöreskog, K. and F. Yang (1996). Non-linear structural equation models: The Kenny-Judd model with interaction effects. In G. Marcoulides and R. Schumacker (eds.), Advanced structural equation modeling: Concepts, issues, and applications. Thousand Oaks, CA: Sage Publications.
SEM-specific software
- SEPATH in STATISTICA (Electronic Statistics Textbook)
- LISREL
- Mplus
- Packages in R
- The GLLAMM package for Stata
- EQS
- Procedures in SAS (software)
  - CALIS
  - TCALIS
- AMOS in SPSS
External links
- NEUSREL homepage of the PLS-type software, that introduces new exploratory features.
- SEMNET, the main mailing list
- Structural equation modeling page under David Garson's StatNotes, NCSU
- Issues and Opinion on Structural Equation Modeling, SEM in IS Research
- Academic SEM-Workshops in Europe
- The causal interpretation of structural equations (or SEM survival kit) by Judea Pearl 2000.
- Structural Equation Modeling Reference List by Jason Newsom: journal articles and book chapters on structural equation models
- Ed Rigdon's Structural Equation Modeling Page: people, software and sites
- Path Analysis in AFNI: The open source (GPL) AFNI package contains SEM code