6 Example 1: Using SAS and SPSS to Conduct Two-Level Multivariate Analysis
Multivariate Multilevel Modeling
predictor. We include the school level in the analysis model in section 14.7. The first
analysis for our two-level model will include the outcomes, but will have no predictors. The goal of this analysis is to obtain some descriptive statistics and the model
deviance, which will be used to test the effect of the treatment. In the second analysis,
a coded treatment variable will be included as an explanatory variable for each outcome. The goal of this analysis is to determine if students in the intervention group
score significantly higher, on average, than those in the control group for any of the
outcomes. If a significant overall treatment effect is present, then the treatment effect
for each outcome will be estimated and tested for significance. In the third analysis, we
presume that the outcome scores are measured on the same scale (which they are for
the simulated data set). We then test whether the impact of the treatment is the same
for each outcome.
14.6.1 Estimating the Empty or Null Model
The first analysis of the data does not include any explanatory variables and is called
an unconditional or empty model. For this model, Equation 1 is the level-1 model. At
the second level of the model, the parameters π1j and π2j, which represent the variables
intention (Y1) and knowledge (Y2), are allowed to vary across students. This level-2
model is
$$\pi_{1j} = \beta_{10} + r_{1j} \tag{2}$$
$$\pi_{2j} = \beta_{20} + r_{2j}, \tag{3}$$
where β10 and β20 represent the means for intention and knowledge, respectively. The
residual terms (r1j and r2j) are assumed to follow a bivariate normal distribution, with
expected means of zero and freely estimated variances and covariance.
The five parameters to be estimated for this empty model are two fixed effects (i.e., β10,
β20), which are the means of Y1 and Y2, the variance of r1j, or Y1 (τπ11), the variance of
r2j, or Y2 (τπ22), and their covariance (τπ12). Note that 1,600 cases are being used in the
analysis, because each of the 800 students contributes one record per outcome. The SAS
and SPSS commands for estimating these parameters are given in
Table 14.5 and selected results are presented in Table 14.6.
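The one-record-per-outcome layout that the Index1 variable implies can be sketched as follows. This is only a minimal Python illustration of the data structure, not the SAS or SPSS commands of Table 14.5; the field names (`student`, `y1`, `y2`, `response`) and the two example students are hypothetical, and the columns `a1` and `a2` are assumed to play the role of the level-1 outcome indicators.

```python
def stack_outcomes(wide_rows):
    """Turn one record per student into one record per student-outcome pair."""
    long_rows = []
    for row in wide_rows:
        for index1, outcome in ((1, "y1"), (2, "y2")):
            long_rows.append({
                "student": row["student"],
                "index1": index1,                 # 1 = intention, 2 = knowledge
                "a1": 1 if index1 == 1 else 0,    # indicator for outcome 1
                "a2": 1 if index1 == 2 else 0,    # indicator for outcome 2
                "response": row[outcome],
            })
    return long_rows


# Hypothetical scores for two students (not taken from the chapter's data set).
wide = [
    {"student": 1, "y1": 48.0, "y2": 52.5},
    {"student": 2, "y1": 55.0, "y2": 47.0},
]
long = stack_outcomes(wide)
for record in long:
    print(record)
```

With 800 students, the same stacking would yield the 1,600 records mentioned above.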
In Table 14.6, the SAS and SPSS outputs show that the means for intention (Y1) and
knowledge (Y2), as shown in the tables of fixed effects, are respectively, 50.05 and
50.29, with variances 119.95 and 127.90, and covariance 60.87, as shown in the covariance parameter tables and also in the R or residual covariance matrices. Given the
covariance matrix, the standard deviations for intention and knowledge are, respectively, 10.95 and 11.31, and the correlation of the residuals is .491, indicating that
intention and knowledge are positively and moderately correlated. The model deviance
for the five parameters is 12,030.2, as shown in the outputs. This deviance value, as
previously indicated in this text, reflects the fit of the model. This model deviance (i.e.,
−2LL) will be compared to the deviance obtained when the treatment variable is added
to the model to determine whether the fit of the model improves with the addition of the
treatment indicator variable. Note that the full maximum likelihood estimation procedure (as implemented throughout the chapter), not restricted maximum likelihood,
needs to be used when one wishes to test the effects of explanatory variables using
deviances.
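The standard deviations and residual correlation quoted above for the empty model follow directly from the estimated R matrix. A minimal Python sketch of that arithmetic is given below; the function name `cov_to_corr` is our own, and the inputs are the rounded SAS estimates from Table 14.6.

```python
import math


def cov_to_corr(var1, var2, cov12):
    """Return both standard deviations and the correlation implied by a 2 x 2 covariance matrix."""
    sd1 = math.sqrt(var1)
    sd2 = math.sqrt(var2)
    return sd1, sd2, cov12 / (sd1 * sd2)


# Variances and covariance for intention and knowledge (empty model).
sd_intention, sd_knowledge, corr = cov_to_corr(119.95, 127.90, 60.8689)
print(round(sd_intention, 2), round(sd_knowledge, 2), round(corr, 3))
```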
Note that although we do not place much focus on testing variances in this chapter,
each variance and covariance element can be tested for significance with a z test, as
described in Chapter 13. This test is provided by SAS and SPSS through the COVTEST and TESTCOV options, respectively, and is shown in the covariance parameters
tables in Table 14.6. However, the z test for variances should be used only as a rough guide
for determining statistical significance because the sampling distribution of a variance is
only approximately normal. A chi-square test using model deviances is preferred for testing
variances, with such a test illustrated in section 14.7.3.
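The z values reported in the covariance parameter tables are simply each estimate divided by its standard error. The short Python sketch below reproduces that ratio from the SPSS estimates in Table 14.6; the helper name `wald_z` is ours, and no p value is computed here because the reference distribution is, as noted above, only a rough guide.

```python
def wald_z(estimate, std_error):
    """Wald z statistic: a parameter estimate divided by its standard error."""
    return estimate / std_error


# (label, estimate, standard error) taken from the SPSS covariance parameter table.
cov_params = [
    ("UN(1,1)", 119.95174, 5.997587),
    ("UN(2,1)", 60.868913, 4.879436),
    ("UN(2,2)", 127.902203, 6.39511),
]
z_values = {label: wald_z(est, se) for label, est, se in cov_params}
for label, z in z_values.items():
    print(label, round(z, 3))
```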
14.6.2 Including an Explanatory Variable in the Model
We now include the treatment variable in this multivariate analysis. The goal of the
analysis is to determine if there are treatment effects for any outcome, which will be
accomplished by a global test of the null hypothesis that no treatment effects are present for any outcome. If treatment effects are present, then the effects and statistical test
results for the treatment will be examined for each outcome. For this two-level model,
Equation 1 is the level-1 model. At the student level, Equations 2 and 3 are modified so
that the treatment variable (Treat) is added to each of the equations. Although dummy
coding can be used for the treatment variable, the coding employed here uses values
of −.5 and .5 representing membership in the control and treatment conditions, respectively. The level-2 model that is used now is
$$\pi_{1j} = \beta_{10} + \beta_{11} X_j + r_{1j} \tag{4}$$
$$\pi_{2j} = \beta_{20} + \beta_{21} X_j + r_{2j}, \tag{5}$$
where Xj represents the treatment variable. Due to the coding used for the treatment
variable, β10 and β20 represent the mean for intention and knowledge, respectively.
More importantly, β11 and β21 represent the mean difference between students in the
experimental and control conditions for intention and knowledge, respectively. The
residual terms are assumed to follow a bivariate normal distribution, with an equal
variance-covariance matrix across treatment groups. Note that the multivariate null
hypothesis for the test of the treatment is H0 : β11 = β21 = 0, which will be tested by
using deviances from this and the empty model of Equations 1–3.
Table 14.6: Selected Output for the Two-Level Empty Model

SAS

Estimated R Matrix for Student 1
Row    Col1       Col2
1      119.95     60.8689
2      60.8689    127.9

Fit Statistics
-2 Log Likelihood            12030.2
AIC (smaller is better)      12040.2
AICC (smaller is better)     12040.2
BIC (smaller is better)      12063.6

Solution for Fixed Effects
Effect    Index1    Estimate    Standard Error    DF     t Value    Pr > |t|
Index1    1         50.0548     0.3872            800    129.27     <.0001
Index1    2         50.2909     0.3998            800    125.78     <.0001

Covariance Parameter Estimates
Cov Parm    Subject    Estimate    Standard Error    Z Value    Pr Z
UN(1,1)     Student    119.95      5.9976            20         <.0001
UN(2,1)     Student    60.8689     4.8794            12.47      <.0001
UN(2,2)     Student    127.9       6.3951            20         <.0001

SPSS

Estimates of Fixed Effects(a)
                                                            95% Confidence Interval
Parameter     Estimate     Std. Error    df     t          Lower Bound    Upper Bound
[Index1=1]    50.054814    0.38722       800    129.267    49.294726      50.814902
[Index1=2]    50.290916    0.399847      800    125.775    49.506042      51.075789
a. Dependent Variable: Response.

Estimates of Covariance Parameters(a)
                                                                              95% Confidence Interval
Parameter                     Estimate      Std. Error    Wald Z    Sig.    Lower Bound    Upper Bound
Repeated Measures UN (1,1)    119.95174     5.997587      20        0       108.754309     132.302067
                  UN (2,1)    60.868913     4.879436      12.475    0       51.305394      70.432431
                  UN (2,2)    127.902203    6.39511       20        0       115.962601     141.071116
a. Dependent Variable: Response.

Information Criteria(a)
-2 Log Likelihood                       12030.164
Akaike’s Information Criterion (AIC)    12040.164
Hurvich and Tsai’s Criterion (AICC)     12040.202
Bozdogan’s Criterion (CAIC)             12072.053
Schwarz’s Bayesian Criterion (BIC)      12067.053
The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Response.

Residual Covariance (R) Matrix(a)
                [Index1 = 1]    [Index1 = 2]
[Index1 = 1]    119.95174       60.868913
[Index1 = 2]    60.868913       127.902203
Unstructured
a. Dependent Variable: Response.

For this model, four fixed effects (the four βs in Equations 4 and 5) and the three
variance-covariance elements are to be estimated. Note that to include an explanatory
variable in the model so that separate effects of that variable are estimated for each
outcome, SAS and SPSS require that a given explanatory variable be multiplied by the
Index1 variable. So, to estimate β11 and β21 in SAS, the term TREAT*INDEX1 must be
added to the MODEL statement shown in Table 14.5. In SPSS, we add the statements
WITH TREAT to the MIXED command and TREAT*INDEX1 to the FIXED subcommand. The complete SAS and SPSS control lines for estimating all models in this
chapter are shown in section 14.9. Table 14.7 shows selected results for this model.
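To see why multiplying the explanatory variable by Index1 yields a separate treatment effect for each outcome, it may help to write out the fixed-effects columns that such a product term implies for a stacked record. The Python sketch below is only an illustration under that assumption; `design_row` is a hypothetical helper and does not reproduce how SAS or SPSS build their design matrices internally.

```python
def design_row(index1, treat):
    """Fixed-effects columns for one stacked record: [a1, a2, a1*treat, a2*treat].

    The first two columns carry the outcome-specific intercepts (beta10, beta20);
    the last two carry the outcome-specific treatment effects (beta11, beta21).
    """
    a1 = 1.0 if index1 == 1 else 0.0
    a2 = 1.0 if index1 == 2 else 0.0
    return [a1, a2, a1 * treat, a2 * treat]


# A student in the experimental group (treat = .5) contributes two records.
intention_row = design_row(1, 0.5)   # [1.0, 0.0, 0.5, 0.0]
knowledge_row = design_row(2, 0.5)   # [0.0, 1.0, 0.0, 0.5]
print(intention_row)
print(knowledge_row)
```

Because the treatment code appears in a different column for each outcome, each outcome receives its own coefficient.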
Table 14.7: Selected Output for the Two-Level Model With Treatment Effects

SAS

Solution for Fixed Effects
Effect          Index1    Estimate    Standard Error    DF     t Value    Pr > |t|
Index1          1         50.0548     0.3552            800    140.92     <.0001
Index1          2         50.2909     0.3696            800    136.06     <.0001
Treat*Index1    1         8.7233      0.7104            800    12.28      <.0001
Treat*Index1    2         8.6257      0.7393            800    11.67      <.0001

Estimated R Matrix for Student 1
Row    Col1       Col2
1      100.93     42.0579
2      42.0579    109.3

Covariance Parameter Estimates
Cov Parm    Subject    Estimate    Standard Error    Z Value    Pr Z
UN(1,1)     Student    100.93      5.0464            20         <.0001
UN(2,1)     Student    42.0579     4.0001            10.51      <.0001
UN(2,2)     Student    109.3       5.4651            20         <.0001

Fit Statistics
-2 Log Likelihood            11847.6
AIC (smaller is better)      11861.6
AICC (smaller is better)     11861.7
BIC (smaller is better)      11894.4

SPSS

Estimates of Fixed Effects(a)
                                                                    95% Confidence Interval
Parameter             Estimate     Std. Error    df     t          Lower Bound    Upper Bound
[Index1=1]            50.054814    0.35519       800    140.924    49.3576        50.752029
[Index1=2]            50.290916    0.369631      800    136.057    49.565355      51.016477
[Index1=1] * Treat    8.723254     0.71038       800    12.28      7.328825       10.117683
[Index1=2] * Treat    8.625689     0.739262      800    11.668     7.174567       10.07681
a. Dependent Variable: Response.

Estimates of Covariance Parameters(a)
                                                                      95% Confidence Interval
Parameter                     Estimate      Std. Error    Wald Z    Lower Bound    Upper Bound
Repeated Measures UN (1,1)    100.92795     5.046398      20        91.50638       111.319573
                  UN (2,1)    42.057895     4.00007       10.514    34.217901      49.897889
                  UN (2,2)    109.301577    5.465079      20        99.098334      120.555355
a. Dependent Variable: Response.

Information Criteria(a)
-2 Log Likelihood                       11847.606
Akaike’s Information Criterion (AIC)    11861.606
Hurvich and Tsai’s Criterion (AICC)     11861.676
Bozdogan’s Criterion (CAIC)             11906.25
Schwarz’s Bayesian Criterion (BIC)      11899.25
The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Response.

Residual Covariance (R) Matrix(a)
                [Index1 = 1]    [Index1 = 2]
[Index1 = 1]    100.92795       42.057895
[Index1 = 2]    42.057895       109.301577
Unstructured
a. Dependent Variable: Response.

As shown in the outputs, the deviance for this current model is 11,847.6, with seven
parameters estimated. Recall that the deviance for the empty model was 12,030.2 with
five parameters estimated. The null hypothesis that no treatment
effects are present for any of the outcomes (H0: β11 = β21 = 0) can be tested by computing the difference in these deviances, which is distributed as a chi-square value having
degrees of freedom equal to the difference in the number of parameters estimated for
these models, i.e., 7 − 5 = 2. This deviance test can be used here because the models are
nested: the empty model can be obtained from this current model by constraining the treatment effects to
be zero. Computing the difference in model deviances results in a chi-square value of
12,030.2 − 11,847.6 = 182.6, which is statistically significant, as this value exceeds the
chi-square critical value of 5.99 (α = .05, df = 2).
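The deviance comparison just described can be checked with a few lines of Python. The sketch below is an illustration rather than output from SAS or SPSS; it relies on the fact that, for exactly two degrees of freedom, the upper-tail chi-square probability has the closed form exp(−x/2), so no statistics library is needed. The function name is our own.

```python
import math


def deviance_test_df2(deviance_reduced, deviance_full):
    """Chi-square difference test for two nested models that differ by 2 parameters."""
    chi_square = deviance_reduced - deviance_full
    p_value = math.exp(-chi_square / 2.0)  # upper-tail probability, valid for df = 2 only
    return chi_square, p_value


# Empty model (5 parameters) versus treatment model (7 parameters).
chi_square, p_value = deviance_test_df2(12030.2, 11847.6)

# Critical value at alpha = .05 for df = 2, obtained by inverting the same closed form.
critical_value = -2.0 * math.log(0.05)

print(round(chi_square, 1), round(critical_value, 2), p_value < 0.05)
```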
Since rejection of the overall multivariate null hypothesis suggests that treatment
effects are present for at least one of the outcomes, we now consider the estimates
and statistical test results of the treatment effect for each outcome. Here, the two
null hypotheses being tested are H0 : β11 = 0 and H0 : β21 = 0. As shown in the
outputs, the treatment effects are 8.72 (SE = .710) for intention and 8.63 (SE = .739)
for knowledge. The t ratios of about 12.28 (p < .05) and 11.67 (p < .05), respectively, for intention and knowledge, suggest that treatment effects are present
for each outcome in the population. To obtain the group means for each outcome,
values of −.5 and .5 for the control and experimental group can be inserted into
Equations 4 and 5, along with the parameter estimates. Thus, for intention, the control group mean is 50.055 − .5(8.723) = 45.693, and the experimental group mean is
50.055 + .5(8.723) = 54.417. For knowledge, you can confirm that the control group
mean is 45.978 and the experimental group mean is 54.604. The residual variances
are also shown in the outputs, and they are 100.93 (SD = 10.05) for intention and
109.30 (SD = 10.45) for knowledge. The correlation between the residuals can be
calculated in the usual manner, as the covariance divided by the product of the
standard deviations, and is 42.06 / (10.05 × 10.45) = .400.
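The group means above come from substituting the −.5 and .5 codes into Equations 4 and 5. A minimal Python sketch of that substitution is shown here; the function name `group_means` is ours, and the inputs are the rounded estimates quoted in the text.

```python
def group_means(intercept, treat_effect, codes=(-0.5, 0.5)):
    """Predicted (control, experimental) means under -.5/.5 treatment coding."""
    control_code, experimental_code = codes
    return (intercept + control_code * treat_effect,
            intercept + experimental_code * treat_effect)


# Intention: intercept 50.055, treatment effect 8.723.
control_intention, experimental_intention = group_means(50.055, 8.723)
# Knowledge: intercept 50.291, treatment effect 8.626.
control_knowledge, experimental_knowledge = group_means(50.291, 8.626)

print(control_intention, experimental_intention)
print(control_knowledge, experimental_knowledge)
```

The difference between the two predicted means for an outcome always equals that outcome's treatment coefficient, which is what makes this coding convenient.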
14.6.3 Comparison to Traditional MANOVA Results
For comparison purposes, we provide and briefly discuss selected SPSS results from a
traditional multivariate analysis of these same data with the treatment as the explanatory variable. Table 14.8 shows that the p value associated with Wilks’ lambda is quite
small, leading to the decision to reject the overall multivariate null hypothesis for the
treatment effects, the same decision as obtained with MVMM. In the parameter estimates table in Table 14.8, SPSS automatically dummy codes the treatment variable,
coding the value for the experimental and control groups, in this example, as 0 and
1, respectively. Thus, in that table, the intercept represents the experimental group
average for a given outcome, and the treatment effect is computed by subtracting the
experimental mean from the control mean (thus obtaining negative differences). Other
than that, the SPSS results in this table are virtually the same as those obtained with
the MVMM approach, with the difference in means estimated to be 8.72 (SE = .711)
for intentions and 8.63 (SE = .740) for knowledge. Thus, if desired, MVMM can be
used in place of traditional multivariate analysis. The remaining analyses in this chapter illustrate some extensions of the traditional MANOVA approach that can be more
effectively handled by MVMM.
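The link between the dummy-coded MANOVA estimates and the −.5/.5 coded MVMM estimates can be verified by hand. The Python sketch below assumes, as described above, that the experimental group is the reference category; `dummy_to_effect` is a hypothetical helper, and the inputs are the intention (Y1) estimates from Table 14.8.

```python
def dummy_to_effect(reference_intercept, dummy_slope):
    """Convert dummy-coded estimates (experimental group as reference) to -.5/.5 coding.

    Returns the unweighted mean of the two group means and the
    experimental-minus-control difference.
    """
    experimental_mean = reference_intercept
    control_mean = reference_intercept + dummy_slope
    overall_mean = (experimental_mean + control_mean) / 2.0
    treat_effect = experimental_mean - control_mean
    return overall_mean, treat_effect


overall_mean, treat_effect = dummy_to_effect(54.416, -8.723)
print(round(overall_mean, 2), round(treat_effect, 3))
```

Up to rounding, these values match the MVMM intercept and treatment effect for intention reported earlier.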
14.6.4 Testing Whether the Effect of a Predictor Differs Across Outcomes
The final analysis conducted with the two-level example tests whether the effect of the
treatment is of the same magnitude for each outcome. Given that the outcomes are measured on or placed on the same scale, investigators may wish to learn if a new intervention
has stronger effects for some outcomes than others. This can be done by first constraining
the fixed effects (in this case, treatment effects) to be equal, and then testing the difference in fit using the deviances between this constrained model and one where the effects
are freely estimated. In Equations 4 and 5, the effects of the treatment are freely estimated (i.e., without constraints) for intention (β11) and knowledge (β21). In this analysis,
we test whether these treatment effects are the same or different for the two outcomes.
The model used now is essentially the same as with the previous analysis except that an
assumed common treatment effect will be estimated. As such, the number of parameters
estimated is now six, consisting of three fixed effects (i.e., only one treatment effect estimate) and the three elements in the variance-covariance matrix.
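Written out, the constraint replaces the two treatment coefficients of Equations 4 and 5 with a single coefficient. The symbol β1 below is an illustrative label for the common effect introduced here, not notation taken from the chapter:

```latex
\pi_{1j} = \beta_{10} + \beta_{1} X_j + r_{1j}, \qquad
\pi_{2j} = \beta_{20} + \beta_{1} X_j + r_{2j}, \qquad
H_0 : \beta_{11} = \beta_{21} \; (= \beta_{1}).
```

Counting parameters in this form gives β10, β20, and β1 plus the three variance-covariance elements, for the six parameters noted above.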
Table 14.8: Selected Output From a Traditional MANOVA

Multivariate Tests(b)
Effect       Statistic             Value     F               Hypothesis df    Error df    Sig.
Intercept    Pillai’s Trace        .972      13654.003       2.000            797.000     .000
             Wilks’ Lambda         .028      13654.003(a)    2.000            797.000     .000
             Hotelling’s Trace     34.263    13654.003(a)    2.000            797.000     .000
             Roy’s Largest Root    34.263    13654.003(a)    2.000            797.000     .000
Treat        Pillai’s Trace        .204      102.149(a)      2.000            797.000     .000
             Wilks’ Lambda         .796      102.149(a)      2.000            797.000     .000
             Hotelling’s Trace     .256      102.149(a)      2.000            797.000     .000
             Roy’s Largest Root    .256      102.149(a)      2.000            797.000     .000
a. Exact statistic
b. Design: Intercept + Treat

Parameter Estimates
                                                                               95% Confidence Interval
Dependent Variable    Parameter       B         Std. Error    t          Sig.    Lower Bound    Upper Bound
Y1                    Intercept       54.416    .503          108.196    .000    53.429         55.404
                      [Treat=-.50]    -8.723    .711          -12.264    .000    -10.119        -7.327
                      [Treat=.50]     0(a)      .             .          .       .              .
Y2                    Intercept       54.604    .523          104.327    .000    53.576         55.631
                      [Treat=-.50]    -8.626    .740          -11.653    .000    -10.079        -7.173
                      [Treat=.50]     0(a)      .             .          .       .              .
a. This parameter is set to zero because it is redundant.
To estimate this model in SAS and SPSS, we replace the TREAT*INDEX1 term in the
previous model with TREAT. Complete control lines for this model can be found in
section 14.9. Selected outputs are presented in Table 14.9.
As shown in the outputs, the deviance associated with this constrained treatment-effects
model is 11,847.621, which is only slightly larger (i.e., reflecting slightly worse fit) than
that of the previous model, which provided separate estimates of treatment effects. Specifically, the difference in model deviances, which is distributed as a chi-square value, is
11,847.621 − 11,847.606 = 0.015, which does not exceed the chi-square critical value
of 3.84 (α = .05, df = 1). Thus, this test does not suggest that these two models
differ in fit. As such, the data are consistent with the hypothesis that the effect of the
intervention is similar for intention and knowledge. Note that in the SAS and SPSS
outputs, shown in the fixed effects tables of Table 14.9, the common treatment effect is
estimated to be 8.678 (SE = .606) and is statistically significant (p < .05).
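For this one-degree-of-freedom comparison, the p value can also be obtained without a statistics library, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The Python sketch below uses that identity through the complementary error function; the function name is our own, and the approximate p value in the comment is our calculation rather than a figure from the outputs.

```python
import math


def chi_square_p_df1(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Constrained model (6 parameters) versus freely estimated model (7 parameters).
chi_square_equal = 11847.621 - 11847.606
p_equal = chi_square_p_df1(chi_square_equal)   # roughly .90, far above .05
print(round(chi_square_equal, 3), round(p_equal, 2))
```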
Table 14.9: Selected Output for the Two-Level Model With Treatment Effects Constrained to Be Equal

SAS

Solution for Fixed Effects
Effect    Index1    Estimate    Standard Error    DF     t Value    Pr > |t|
Index1    1         50.0548     0.3552            799    140.92     <.0001
Index1    2         50.2909     0.3696            799    136.06     <.0001
Treat               8.6777      0.606             799    14.32      <.0001

Estimated R Matrix for Student 1
Row    Col1       Col2
1      100.93     42.0573
2      42.0573    109.3

Covariance Parameter Estimates
Cov Parm    Subject    Estimate    Standard Error    Z Value    Pr Z
UN(1,1)     Student    100.93      5.0464            20         <.0001
UN(2,1)     Student    42.0573     4.0001            10.51      <.0001
UN(2,2)     Student    109.3       5.4651            20         <.0001

Fit Statistics
-2 Log Likelihood            11847.6
AIC (smaller is better)      11859.6
AICC (smaller is better)     11859.7
BIC (smaller is better)      11887.7

SPSS

Estimates of Fixed Effects(a)
                                                                95% Confidence Interval
Parameter     Estimate     Std. Error    df         t          Lower Bound    Upper Bound
[Index1=1]    50.054814    0.355191      799.994    140.924    49.357598      50.75203
[Index1=2]    50.290916    0.369632      799.993    136.057    49.565353      51.016479
Treat         8.67771      0.606001      800        14.32      7.488171       9.867249
a. Dependent Variable: Response.

Estimates of Covariance Parameters(a)
                                                                      95% Confidence Interval
Parameter                     Estimate      Std. Error    Wald Z    Lower Bound    Upper Bound
Repeated Measures UN (1,1)    100.928469    5.046442      20        91.506817      111.320185
                  UN (2,1)    42.057303     4.000082      10.514    34.217285      49.89732
                  UN (2,2)    109.302254    5.465135      20        99.098907      120.556151
a. Dependent Variable: Response.

Information Criteria(a)
-2 Log Likelihood                       11847.621
Akaike’s Information Criterion (AIC)    11859.621
Hurvich and Tsai’s Criterion (AICC)     11859.673
Bozdogan’s Criterion (CAIC)             11897.887
Schwarz’s Bayesian Criterion (BIC)      11891.887
The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Response.

Residual Covariance (R) Matrix(a)
                [Index1 = 1]    [Index1 = 2]
[Index1 = 1]    100.928469      42.057303
[Index1 = 2]    42.057303       109.302254
Unstructured
a. Dependent Variable: Response.
14.7 EXAMPLE 2: USING SAS AND SPSS TO CONDUCT THREE-LEVEL MULTIVARIATE ANALYSES
The previous examples illustrated two-level multivariate analyses, but did not
include the school level in the statistical model. In the research design used in this
chapter, students are nested in one of 40 schools. Often, such nesting needs to be
accounted for in the analysis because the responses of students within treatment
groups are not independent, as is assumed in the previous analysis. Instead, these
responses are likely related because the students share a similar environment. As discussed previously, such dependence, if not accounted for statistically, can increase
type I error rates associated with fixed effects and may lead to false claims of, in
this case, the presence of treatment effects. The analyses in this section take the
within-school dependence into account by adding a third level—the school level—to
the multilevel model. Further, instead of including only the treatment variable in the
model, we include other explanatory variables, including student gender, student
pretest knowledge, a school average of these pretest scores, and a treatment-gender
product term.
There is one primary hypothesis underlying these analyses. That is, while treatment
effects are expected to be present for intention and knowledge for both boys and girls,
boys are expected to derive greater benefit from the computer-based instruction. The
reason for this extra impact of the intervention, we assume, is that fifth-grade boys will
enjoy playing the instructional video game more than girls. As a result, the impact that
the experimental program has for intention and knowledge will be greater for boys
than girls. Thus, the investigators hypothesize the presence of a treatment-by-gender
interaction for both outcomes, where the intervention will have stronger effects on
intention and knowledge for boys than for girls.
In addition, because the cluster randomized trial with this limited number of schools (i.e.,
40) does not generally provide for great statistical power, knowledge pretest scores were
collected from all students. These scores are expected to be fairly strongly associated with
both outcomes. Further, because associations may be stronger at the school level than at the
student level, the researchers computed school averages of the knowledge pretest scores
and plan to include this variable in the model to provide for increased power.
Three MVMM analyses are illustrated next. The first analysis includes the treatment variable as the sole explanatory variable. The purpose of this analysis is to obtain a preliminary
estimate of the treatment effect for each outcome. The second analysis includes all of the
explanatory variables as well as the treatment-by-gender interaction. The primary purpose
of this analysis is to test the hypothesized interactions. If the multivariate test for the interaction is significant, the analysis will focus on examining the treatment-by-gender interaction for each outcome, and if significant, describing the nature of any interactions obtained.
The third analysis will illustrate a multivariate test for multiple variance and covariance
elements. Often, in practice, it is not clear if, for example, the association between a student
explanatory variable and outcome is the same or varies across schools. Researchers may
then rely on empirical evidence (e.g., a statistical test result) to address this issue.
14.7.1 A Three-Level Model for Treatment Effects
For this first analysis, Equation 1, which had previously been the level-1 model, needs
to be modified slightly in order to acknowledge the inclusion of the school level. The
level-1 model now is:
$$Y_{ijk} = \pi_{1jk} a_{1jk} + \pi_{2jk} a_{2jk}, \tag{6}$$
which is identical to Equation 1 except that subscript k has been added. Thus, π1jk and
π2jk represent the intention and knowledge posttest scores, respectively, for a given