10.4 A Structural Equation Modeling Approach to Random- and Mixed-Effects Models
COMBINING AND COMPARING EFFECT SIZES
10.4.1 Estimating Random-Effects Models
The SEM representation of random-effects meta-analysis (Cheung, 2008)
parallels the fixed-effects model I described in Chapter 9 (see Figure 9.2) but
models the path from the intercept to the effect size as a random slope (see, e.g.,
Bauer, 2003; Curran, 2003; Mehta & Neale, 2005; Muthén, 1994). In other
words, this path varies across studies, which captures the between-study
variance of a random-effects meta-analysis. Importantly, this SEM representation can be estimated only with software that performs random-slope
analyses.³
One⁴ path diagram convention for denoting randomly varying slopes is
shown in Figure 10.2. This path diagram contains the same representation
of regressing the transformed effect size onto the transformed intercept as
does the fixed-effects model of Chapter 9 (see Figure 9.1). However, there is
a small circle on this path, which indicates that this path can vary randomly
across cases (studies). The label u next to this circle denotes that the newly
added piece of the path diagram, the latent construct labeled u, represents
the random effect. The regression path (b0) from the constant (i.e., the triangle with “1” in the middle) to this construct captures the random-effects
mean. The variance of this construct (m, using Cheung’s 2008 notation) is
the estimated between-study variance of the effect size (what I had previously
called τ²).

[Path diagram: the transformed intercept (Intercept*) predicts the transformed effect size (Zr*) through a randomly varying path labeled u; the constant 1 predicts the latent construct u through path b0; the variance of u is labeled m; Zr* has its intercept fixed at 0 and its variance fixed at 1.0.]

Mplus syntax:
TITLE:    Random-effects analysis
DATA:     File is Table10_1.txt;
VARIABLE: NAMES N Age r Zr W interc;
          USEVARIABLES ARE Zr interc;
DEFINE:   w2 = SQRT(W);
          Zr = w2 * Zr;
          interc = w2 * interc;
ANALYSIS: TYPE=RANDOM;       !Specifies random slopes analysis
MODEL:
          [Zr@0.0];          !Fixes intercept at 0
          Zr@1.0;            !Fixes variance at 1
          u | Zr ON interc;  !U as random effect
          [u*];              !Specifies estimation of random-effects mean
          u*;                !Specifies estimation of variance of random effect
OUTPUT:

FIGURE 10.2. Path diagram and Mplus syntax to estimate random-effects model.
Fixed-, Random-, and Mixed-Effects Models
To illustrate, I fit the data from 22 studies shown in Table 10.1 under
an SEM representation of a random-effects model. As I described in Chapter 9, the effect sizes (Zr) and intercepts (the constant 1) of each study are
transformed by multiplying these values by the square root of the study’s
weight (Equation 9.7). This allows each study to be represented as an equally
weighted case in the analysis, as the weighting is accomplished through these
transformations.
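The logic of this transformation can be checked with a short Python sketch. The three effect sizes and weights below are hypothetical values chosen for illustration (they are not taken from Table 10.1), and the variable names are my own. The sketch regresses the transformed effect size on the transformed intercept, with no additional intercept term, and compares the resulting slope with the conventional weighted mean.

```python
import math

# Hypothetical effect sizes (Fisher's Zr) and inverse-variance weights;
# these values are illustrative only and are not from Table 10.1.
zr = [0.20, 0.35, 0.50]
w = [47.0, 97.0, 22.0]

# Transform each study: multiply the effect size and the constant 1
# by the square root of the study's weight.
zr_star = [math.sqrt(wi) * zi for wi, zi in zip(w, zr)]
int_star = [math.sqrt(wi) * 1.0 for wi in w]

# Ordinary least squares slope for a regression through the origin:
# b = sum(x * y) / sum(x * x)
b0 = sum(x * y for x, y in zip(int_star, zr_star)) / sum(x * x for x in int_star)

# Conventional weighted mean effect size, for comparison.
weighted_mean = sum(wi * zi for wi, zi in zip(w, zr)) / sum(w)

print(round(b0, 4), round(weighted_mean, 4))
```

The two printed values agree, which is why each transformed study can enter the analysis as an equally weighted case.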
The Mplus syntax shown in Figure 10.2 specifies a random-slopes analysis through the “TYPE=RANDOM” command and defines U as the random effect, with an estimated mean and variance. The
mean of U is the random-effects mean of this meta-analysis; here, the value
was estimated to be .369 with a standard error of .049. This indicates that
the random-effects mean Zr is .369 (equivalent r = .353) and statistically significant (Z = .369/.049 = 7.53, p < .01; alternatively, I could compute confidence intervals). The between-study variance (τ²) is estimated as the variance
of U; here, the value is .047.
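The arithmetic in this paragraph can be reproduced with a few lines of Python. This is a minimal sketch that takes the reported mean and standard error as inputs; the 95% confidence interval uses the conventional 1.96 multiplier, which is my assumption about how such an interval would be formed rather than something stated in the text.

```python
import math

mean_zr = 0.369   # random-effects mean reported in the text
se = 0.049        # its standard error, as reported

z = mean_zr / se
# Two-tailed p value from the standard normal distribution.
p = math.erfc(abs(z) / math.sqrt(2))

# 95% confidence interval in the Zr metric (assumed 1.96 multiplier),
# then back-transformed to the r metric with the hyperbolic tangent.
lower_zr = mean_zr - 1.96 * se
upper_zr = mean_zr + 1.96 * se
r_mean, r_lower, r_upper = (math.tanh(val) for val in (mean_zr, lower_zr, upper_zr))

print(f"Z = {z:.2f}, p = {p:.2g}")
print(f"r = {r_mean:.3f}, 95% CI [{r_lower:.3f}, {r_upper:.3f}]")
```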
The random-effects mean and estimated between-study variance obtained
using this SEM representation are similar to those I reported earlier (Section
10.2). However, they are not identical (and the differences are not due solely
to rounding imprecision). The differences in these values are due to the difference in estimation methods used by these two approaches; the previously
described version used least squares criteria, whereas the SEM representation used maximum likelihood (the most common estimation criterion for
SEM). To my knowledge, there has been no comprehensive comparison of
which estimation method is preferable for meta-analysis (or—more likely—
under what conditions one estimator is preferable to the other). Although I
encourage you to watch for future research on this topic, it seems reasonable
to conclude for now that results should be similar, though not identical, for
either approach.
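To make the contrast between estimators concrete, the following Python sketch computes the between-study variance two ways on a small set of hypothetical studies (the values are invented for illustration and are not the Table 10.1 data). The first function is a common noniterative method-of-moments estimator, which I use here as a stand-in for the earlier least squares approach; the second is a simple fixed-point iteration for the maximum likelihood estimate. Both are sketches under these assumptions, not the routines used by any particular program.

```python
# Hypothetical Fisher's Zr values and sampling variances (1 / (N - 3));
# illustrative only, not the Table 10.1 data.
zr = [0.10, 0.25, 0.30, 0.45, 0.60, 0.75]
v = [1 / 47, 1 / 97, 1 / 22, 1 / 147, 1 / 57, 1 / 37]


def weighted_mean(y, weights):
    return sum(wi * yi for wi, yi in zip(weights, y)) / sum(weights)


def tau2_moments(y, var):
    """Noniterative method-of-moments estimate of between-study variance."""
    w = [1 / vi for vi in var]
    mean_fixed = weighted_mean(y, w)
    q = sum(wi * (yi - mean_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(y) - 1)) / c)


def tau2_ml(y, var, n_iter=200):
    """Maximum likelihood estimate via a simple fixed-point iteration."""
    tau2 = tau2_moments(y, var)  # starting value
    for _ in range(n_iter):
        w = [1 / (vi + tau2) for vi in var]
        mu = weighted_mean(y, w)
        num = sum(wi ** 2 * ((yi - mu) ** 2 - vi) for wi, yi, vi in zip(w, y, var))
        tau2 = max(0.0, num / sum(wi ** 2 for wi in w))
    return tau2


def random_effects_mean(y, var, tau2):
    return weighted_mean(y, [1 / (vi + tau2) for vi in var])


for name, est in (("moments", tau2_moments(zr, v)), ("ML", tau2_ml(zr, v))):
    print(name, round(est, 4), round(random_effects_mean(zr, v, est), 4))
```

The two estimates of the between-study variance, and therefore the two random-effects means, need not coincide even though they are computed from the same data.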
10.4.2 Estimating Mixed-Effects Models
As you might anticipate if you have followed the material so far, this SEM approach can be extended rather easily to estimate mixed-effects models,
in which fixed-effects moderators are evaluated in the context of random
between-study heterogeneity. To evaluate mixed-effects models in an SEM
framework, you simply build on the random-effects model (in which the
slope of the transformed effect size on the transformed intercept varies randomly across studies) by adding transformed study characteristics (moderators)
as fixed predictors of the effect size.
I demonstrate this analysis using the 22 studies from Table 10.1, in
which I evaluate moderation by sample age while also modeling between-study variance (paralleling analyses in Section 10.3). This model is shown graphically in Figure 10.3, with accompanying Mplus syntax. As a reminder,
the effect size and all predictors (e.g., age and intercept) are transformed for
each study by multiplying by the square root of the study weight (Equation
9.7). To evaluate the moderator, you evaluate the predictive path between
the coded study characteristic (age) and the effect size. In this example, the
value was estimated as b1 = .013, with a standard error of .012, so it was not
statistically significant (Z = .013/.012 = 1.06, p = .29). These results are similar to those obtained using the iterative matrix algebra approach I described
in Section 10.3, though they will not necessarily be identical given different
estimator criteria.
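The fixed part of a mixed-effects model can be sketched in Python as a weighted regression. The example below is not the SEM estimation itself: it assumes a value for the conditional between-study variance rather than estimating it, and the effect sizes, sampling variances, and ages are hypothetical values invented for illustration. Given those assumptions, it solves the weighted normal equations for an intercept and an age slope and forms the Z test for the moderator.

```python
# Hypothetical data (not Table 10.1): Fisher's Zr, sampling variance, mean age.
zr = [0.10, 0.25, 0.30, 0.45, 0.60, 0.75]
v = [1 / 47, 1 / 97, 1 / 22, 1 / 147, 1 / 57, 1 / 37]
age = [6.0, 8.0, 9.0, 11.0, 13.0, 15.0]
tau2 = 0.02  # assumed conditional between-study variance, for illustration

# Random-effects weights: inverse of sampling plus between-study variance.
w = [1 / (vi + tau2) for vi in v]

# Weighted sums needed for the normal equations of Zr = b0 + b1 * age.
s_w = sum(w)
s_wx = sum(wi * xi for wi, xi in zip(w, age))
s_wxx = sum(wi * xi * xi for wi, xi in zip(w, age))
s_wy = sum(wi * yi for wi, yi in zip(w, zr))
s_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, age, zr))

# Solve the 2 x 2 system directly.
det = s_w * s_wxx - s_wx ** 2
b0 = (s_wxx * s_wy - s_wx * s_wxy) / det
b1 = (s_w * s_wxy - s_wx * s_wy) / det

# With inverse-variance weights, the sampling variance of b1 is the
# corresponding diagonal element of the inverse of X'WX.
se_b1 = (s_w / det) ** 0.5
z_b1 = b1 / se_b1

print(round(b0, 3), round(b1, 3), round(se_b1, 3), round(z_b1, 2))
```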
10.4.3 Conclusions Regarding SEM Representations
As with fixed-effects moderator analyses, the major advantage of estimating
mixed-effects meta-analytic models in the SEM framework (Cheung, 2008) is
the ability to retain studies with missing predictors (i.e., coded study characteristics) in the analyses. If you are fluent with SEM, you may even find
it easier to estimate models within this framework than using the other
approaches.
You should, however, keep in mind several cautions that arise from the
novelty of this approach. First, it is likely that few (if any) readers of your meta-analysis will be familiar with this approach, so the burden falls on you to
describe it to the reader. Second, the novelty of this approach also means that
some fundamental issues have yet to be evaluated in quantitative research. For
instance, the relative advantages of maximum likelihood versus least squares
criteria, as well as modifications that may be needed under certain conditions (e.g., restricted maximum likelihood or other estimators with small
numbers of studies), represent fundamental statistical underpinnings of this
approach that have not been fully explored (see Cheung, 2008). Nevertheless,
this representation of meta-analysis within SEM has the potential to merge
two analytic approaches with long histories, and there are many opportunities to apply the extensive tools from the SEM field in your meta-analyses.
For these reasons, I view the SEM representation as a valuable approach to
consider, and I encourage you to watch the literature for further advances in
this approach.

[Path diagram: as in Figure 10.2, with the transformed moderator (Age*) added as a fixed predictor of the transformed effect size (Zr*) through path b1; the path from Intercept* to Zr* varies randomly (u), the constant 1 predicts u through path b0, and the variance of u is labeled m.]

Mplus syntax:
TITLE:    Mixed-effects analysis
DATA:     File is Table10_1.txt;
VARIABLE: NAMES N Age r Zr W interc;
          USEVARIABLES ARE Age Zr interc;
DEFINE:   w2 = SQRT(W);
          Zr = w2 * Zr;
          interc = w2 * interc;
          Age = w2 * Age;
ANALYSIS: TYPE=RANDOM;       !Specifies random slope analysis
MODEL:
          [Zr@0.0];          !Fixes intercept at 0
          Zr@1.0;            !Fixes variance at 1
          u | Zr ON interc;  !U as random effect
          Zr ON Age;         !Age as fixed-effect predictor
          [u*];              !Specifies estimation of random-effects mean
          u*;                !Specifies estimation of variance of random effect
OUTPUT:

FIGURE 10.3. Path diagram and Mplus syntax to estimate mixed-effects model.
10.5 Practical Matters: Which Model Should I Use?
In Sections 10.1 and 10.2, I have presented the random-effects model for
estimating mean effect sizes, which can be contrasted with the fixed-effects
model I described in Chapter 8. I have also described (Section 10.3) mixed-effects models, in which (fixed) moderators are evaluated in the context of
conditional random heterogeneity; this section can be contrasted with the
fixed-effects moderator analyses of Chapter 9. An important question to ask
now is which of these models you should use in a particular meta-analysis.
At least five considerations are relevant: the types of conclusions you wish
to draw, the presence of unexplained heterogeneity among the effect sizes in
your meta-analysis, statistical power, the presence of outliers, and the complexity of performing these analyses. I have arranged these in order from
most to least important, and I elaborate on each consideration next. I conclude this section by describing the consequences of using an inappropriate
model; these consequences serve as a further set of considerations in selecting a model.
Perhaps the most important consideration in deciding between a fixed- versus random-effects model, or between a fixed-effects model with moderators versus a mixed-effects model, is the type of conclusions you wish to
draw. As I described earlier, conclusions from fixed-effects models are limited to only the sample of studies included in your meta-analysis (i.e., “these
studies show . . . ” type conclusions), whereas random- and mixed-effects
models allow more generalizable conclusions (i.e., “the research shows . . . ”
or “there is . . . ” type conclusions). Given that the latter type of conclusion is more satisfying (because it is more generalizable), this consideration typically favors the random- or mixed-effects models. Regardless of
which type of model you select, however, it is important that you frame your
conclusions in a way consistent with your model.
A second consideration is based on the empirical evidence of unexplained
heterogeneity. By unexplained heterogeneity, I mean two things. First, in the
absence of moderator analysis (i.e., if just estimating the mean effect size),
finding a significant heterogeneity (Q) test (see Chapter 8) indicates that the
heterogeneity among effect sizes cannot be explained by sampling fluctuation
alone. Second, if you are conducting fixed-effects moderator analysis, you
should examine the within-group heterogeneity (Qwithin; for ANOVA analog tests) or residual heterogeneity (Qresidual; for regression analog tests).
If these are significant, you conclude that there exists heterogeneity among
effect sizes not systematically explained by the moderators.⁵ In both situations, you might use the absence versus presence of unexplained heterogeneity to inform your choice between fixed- versus random- or mixed-effects
models (respectively). Many meta-analysts take this approach. However, I
urge you not to make this your only consideration because the heterogeneity (i.e., Q) test is an inferential test that can vary in statistical power. In
meta-analyses with many studies that have large sample sizes, you might find
a significant residual heterogeneity that is trivial, whereas a meta-analysis
with few studies having small sample sizes might fail to detect potentially
meaningful heterogeneity. For this reason, I recommend against basing your
model decision only on empirical findings of unexplained heterogeneity.
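A small Python sketch illustrates how the Q test depends on sample size. Both sets of effect sizes are hypothetical values invented for illustration, and the sketch assumes that every study in a set has the same sample size, so that each Zr has sampling variance 1 / (N - 3) and the weighted mean equals the simple mean. The critical values are the standard chi-square values at alpha = .05 for the relevant degrees of freedom.

```python
def q_statistic(effect_sizes, n_per_study):
    """Heterogeneity Q when every study has the same sample size."""
    weight = n_per_study - 3  # inverse of the Zr sampling variance 1 / (N - 3)
    mean = sum(effect_sizes) / len(effect_sizes)  # equal weights -> simple mean
    return weight * sum((es - mean) ** 2 for es in effect_sizes)


# Chi-square critical values at alpha = .05, keyed by df = k - 1.
CRITICAL_05 = {4: 9.49, 9: 16.92}

# Ten large studies whose effect sizes barely differ (hypothetical).
tight = [0.28, 0.29, 0.29, 0.30, 0.30, 0.30, 0.30, 0.31, 0.31, 0.32]
# Five small studies whose effect sizes differ widely (hypothetical).
spread = [0.10, 0.20, 0.30, 0.40, 0.50]

for label, effects, n in (("tight, N = 20000", tight, 20000),
                          ("spread, N = 20", spread, 20)):
    q = q_statistic(effects, n)
    df = len(effects) - 1
    verdict = "significant" if q > CRITICAL_05[df] else "not significant"
    print(f"{label}: Q({df}) = {q:.2f}, {verdict}")
```

In this illustration the nearly identical effect sizes yield a significant Q because the studies are very large, whereas the widely varying effect sizes do not because the studies are few and small.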
A third consideration is the relative statistical power of fixed- versus
random-effects models (or fixed-effects models with moderators versus mixed-effects models). The statistical power of a meta-analysis depends on many
factors: the number of studies, the sample sizes of studies, the degree to which effect
sizes must be corrected for artifacts, the magnitude of population variance in
effect size, and of course the true mean population effect size. Therefore, it is not
a straightforward computation (see, e.g., Cohn & Becker, 2003; Field, 2001;
Hedges & Pigott, 2001, 2004). However, to illustrate this difference in power
between fixed- and random-effects models, I have graphed some results of
a simulation by Field (2001), shown in Figure 10.4. These plots make clear
the greater statistical power of fixed-effects versus random-effects models.
More generally, fixed-effects analyses will always provide statistical power that is as high as (when τ²
= 0) or higher than (when τ² > 0) that of random-effects models.
This makes sense in light of my earlier observation that the random-effects
weights are always smaller than the fixed-effects weights; therefore, the sum
of weights is smaller and the standard error of the average effect size is larger
for random- than for fixed-effects models. Similarly, analysis of moderators
in fixed-effects models will provide statistical power as high as or higher than that of
mixed-effects models. For these reasons, it may seem that this consideration
would always favor fixed-effects models. However, this conclusion must be
tempered by the inappropriate precision associated with high statistical
power when a fixed-effects model is used inappropriately in the presence
of substantial variance in population effect sizes (see below). Nevertheless,
statistical power is one important consideration in deciding among models:
If you have questionable statistical power (small number of studies and/or
small sample sizes) to detect the effects you are interested in, and you are
comfortable with the other considerations, then you might choose a fixed-effects model.
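The link between weights and standard errors described in this paragraph can be verified with a brief Python sketch. The sampling variances and the between-study variance are hypothetical values chosen for illustration; the formulas are the usual inverse-variance weights and the standard error of a weighted mean.

```python
import math

# Hypothetical sampling variances for five studies and an assumed
# between-study variance (illustrative values only).
v = [0.010, 0.020, 0.015, 0.030, 0.025]
tau2 = 0.04

# Fixed-effects weights use sampling variance only; random-effects
# weights add the between-study variance to each study's variance.
w_fixed = [1 / vi for vi in v]
w_random = [1 / (vi + tau2) for vi in v]

# Standard error of the mean effect size under each model.
se_fixed = math.sqrt(1 / sum(w_fixed))
se_random = math.sqrt(1 / sum(w_random))

print(round(se_fixed, 4), round(se_random, 4))
```

Because every random-effects weight is smaller than its fixed-effects counterpart when the between-study variance is positive, the random-effects standard error is larger and the corresponding test has less power.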
The presence of studies that are outliers in terms of either their effect
sizes or their standard errors (e.g., sample sizes) is better managed in ran-