# 6 Example 1: Using SAS and SPSS to Conduct Two-Level Multivariate Analysis


Multivariate Multilevel Modeling

predictor. We include the school level in the analysis model in section 14.7. The first
analysis for our two-level model will include the outcomes, but will have no predictors. The goal of this analysis is to obtain some descriptive statistics and the model
deviance, which will be used to test the effect of the treatment. In the second analysis,
a coded treatment variable will be included as an explanatory variable for each outcome. The goal of this analysis is to determine if students in the intervention group
score significantly greater, on average, than those in the control group for any of the
outcomes. If a significant overall treatment effect is present, then the treatment effect
for each outcome will be estimated and tested for significance. In the third analysis, we
presume that the outcome scores are measured on the same scale (which they are for
the simulated data set). We then test whether the impact of the treatment is the same
for each outcome.
14.6.1 Estimating the Empty or Null Model
The first analysis of the data does not include any explanatory variables and is called
an unconditional or empty model. For this model, Equation 1 is the level-1 model. At
the second level of the model, the parameters $\pi_{1j}$ and $\pi_{2j}$, which represent the variables
intention ($Y_1$) and knowledge ($Y_2$), are allowed to vary across students. This level-2
model is

$$\pi_{1j} = \beta_{10} + r_{1j} \tag{2}$$

$$\pi_{2j} = \beta_{20} + r_{2j}, \tag{3}$$

where $\beta_{10}$ and $\beta_{20}$ represent the mean for intention and knowledge, respectively. The
residual terms ($r_{1j}$ and $r_{2j}$) are assumed to follow a bivariate normal distribution with
means of zero and an unrestricted variance-covariance matrix.
The five parameters to be estimated for this empty model are two fixed effects (i.e., $\beta_{10}$,
$\beta_{20}$), which are the means of $Y_1$ and $Y_2$, the variance of $r_{1j}$, or $Y_1$ ($\tau_{\pi 11}$), the variance of
$r_{2j}$, or $Y_2$ ($\tau_{\pi 22}$), and their covariance ($\tau_{\pi 12}$). Note that 1,600 cases are being used in the
analysis. The SAS and SPSS commands for estimating these parameters are given in
Table 14.5 and selected results are presented in Table 14.6.
In Table 14.6, the SAS and SPSS outputs show that the means for intention ($Y_1$) and
knowledge ($Y_2$), as shown in the tables of fixed effects, are respectively, 50.05 and
50.29, with variances 119.95 and 127.90, and covariance 60.87, as shown in the covariance parameter tables and also in the R or residual covariance matrices. Given the
covariance matrix, the standard deviations for intention and knowledge are, respectively, 10.95 and 11.31, and the correlation of the residuals is .491, indicating that
intention and knowledge are positively and moderately correlated. The model deviance
for the five parameters is 12,030.2, as shown in the outputs. This deviance value, as
previously indicated in this text, reflects the fit of the model. This model deviance (i.e.,
−2LL) will be compared to the deviance obtained when the treatment variable is added

Chapter 14

to the model to determine whether the fit of the model improves with the addition of the
treatment indicator variable. Note that the full maximum likelihood estimation procedure (as implemented throughout the chapter), not restricted maximum likelihood,
needs to be used when one wishes to test the effects of explanatory variables using
deviances.
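
The standard deviations and the residual correlation quoted above follow directly from the residual covariance matrix. A minimal Python sketch of that arithmetic is given here; the matrix entries are copied from the SPSS output in Table 14.6, while the function and variable names are illustrative only and are not part of the SAS or SPSS analysis.

```python
import math

# Residual covariance matrix for the empty model (SPSS output, Table 14.6).
cov = [[119.95174, 60.868913],
       [60.868913, 127.902203]]


def sds_and_correlation(matrix):
    """Return (sd1, sd2, r) for a 2 x 2 covariance matrix."""
    sd1 = math.sqrt(matrix[0][0])       # SD is the square root of the variance
    sd2 = math.sqrt(matrix[1][1])
    r = matrix[0][1] / (sd1 * sd2)      # correlation = covariance / (sd1 * sd2)
    return sd1, sd2, r


sd_intention, sd_knowledge, r = sds_and_correlation(cov)
print(round(sd_intention, 2), round(sd_knowledge, 2), round(r, 3))
```

Rounded to the precision used in the text, this reproduces 10.95, 11.31, and .491.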
Note that although we do not place much focus on testing variances in this chapter,
each variance and covariance element can be tested for significance with a z test, as
described in Chapter 13. This test is provided by SAS and SPSS through the COVTEST and TESTCOV options, respectively, and is shown in the covariance parameters
tables in Table 14.6. However, the z test for variances should be used only as a rough guide
for determining statistical significance because the sampling distribution of a variance estimate is
only approximately normal. A chi-square test using model deviances is preferred for testing
variances, with such a test illustrated in section 14.7.3.
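
As a rough illustration of how the z statistics in the covariance parameter tables arise, each one is simply the estimate divided by its standard error. The sketch below uses the SPSS estimates and standard errors from Table 14.6; the dictionary and function names are hypothetical.

```python
# (estimate, standard error) pairs from the SPSS covariance parameter table
# for the empty model (Table 14.6).
cov_params = {
    "UN(1,1)": (119.95174, 5.997587),
    "UN(2,1)": (60.868913, 4.879436),
    "UN(2,2)": (127.902203, 6.39511),
}


def wald_z(estimate, std_error):
    """Wald z statistic: the estimate divided by its standard error."""
    return estimate / std_error


z_values = {name: wald_z(est, se) for name, (est, se) in cov_params.items()}
for name, z in z_values.items():
    print(name, round(z, 3))
```

The ratios agree with the Z Value and Wald Z columns of the output (20, 12.475, and 20).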
14.6.2 Including an Explanatory Variable in the Model
We now include the treatment variable in this multivariate analysis. The goal of the
analysis is to determine if there are treatment effects for any outcome, which will be
accomplished by a global test of the null hypothesis that no treatment effects are present for any outcome. If treatment effects are present, then the effects and statistical test
results for the treatment will be examined for each outcome. For this two-level model,
Equation 1 is the level-1 model. At the student level, Equations 2 and 3 are modified so
that the treatment variable (Treat) is added to each of the equations. Although dummy
coding can be used for the treatment variable, the coding employed here uses values
of −.5 and .5 representing membership in the control and treatment conditions, respectively. The level-2 model that is used now is

$$\pi_{1j} = \beta_{10} + \beta_{11} X_j + r_{1j} \tag{4}$$

$$\pi_{2j} = \beta_{20} + \beta_{21} X_j + r_{2j}, \tag{5}$$

where $X_j$ represents the treatment variable. Due to the coding used for the treatment
variable, $\beta_{10}$ and $\beta_{20}$ represent the mean for intention and knowledge, respectively.
More importantly, $\beta_{11}$ and $\beta_{21}$ represent the mean difference between students in the
experimental and control conditions for intention and knowledge, respectively. The
residual terms are assumed to follow a bivariate normal distribution, with an equal
variance-covariance matrix across treatment groups. Note that the multivariate null
hypothesis for the test of the treatment is $H_0\colon \beta_{11} = \beta_{21} = 0$, which will be tested by
using deviances from this and the empty model of Equations 1–3.
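
The interpretation of the coefficients under the −.5/.5 coding can be checked by substituting the two codes into Equation 4 and taking expectations. The short derivation below is a worked sketch for intention; the same steps apply to knowledge with $\beta_{20}$ and $\beta_{21}$.

$$E(\pi_{1j} \mid X_j = .5) = \beta_{10} + .5\,\beta_{11}, \qquad E(\pi_{1j} \mid X_j = -.5) = \beta_{10} - .5\,\beta_{11}$$

$$E(\pi_{1j} \mid X_j = .5) - E(\pi_{1j} \mid X_j = -.5) = \beta_{11}$$

$$\tfrac{1}{2}\left[E(\pi_{1j} \mid X_j = .5) + E(\pi_{1j} \mid X_j = -.5)\right] = \beta_{10}$$

So $\beta_{11}$ is the experimental-minus-control mean difference, and $\beta_{10}$ is the simple average of the two group means, which coincides with the overall mean when the two groups are the same size.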
For this model, four fixed effects (the four $\beta$s in Equations 4 and 5) and the three
variance-covariance elements are to be estimated. Note that to include an explanatory
variable in the model so that separate effects of that variable are estimated for each
outcome, SAS and SPSS require that a given explanatory variable be multiplied by the


Table 14.6: Selected Output for the Two-Level Empty Model

SAS

Estimated R Matrix for Student 1

| Row | Col1 | Col2 |
|---|---|---|
| 1 | 119.95 | 60.8689 |
| 2 | 60.8689 | 127.9 |

Fit Statistics

| Statistic | Value |
|---|---|
| -2 Log Likelihood | 12030.2 |
| AIC (smaller is better) | 12040.2 |
| AICC (smaller is better) | 12040.2 |
| BIC (smaller is better) | 12063.6 |

Solution for Fixed Effects

| Effect | Index1 | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
|---|---|---|---|---|---|---|
| Index1 | 1 | 50.0548 | 0.3872 | 800 | 129.27 | <.0001 |
| Index1 | 2 | 50.2909 | 0.3998 | 800 | 125.78 | <.0001 |

Covariance Parameter Estimates

| Cov Parm | Subject | Estimate | Standard Error | Z Value | Pr Z |
|---|---|---|---|---|---|
| UN(1,1) | Student | 119.95 | 5.9976 | 20 | <.0001 |
| UN(2,1) | Student | 60.8689 | 4.8794 | 12.47 | <.0001 |
| UN(2,2) | Student | 127.9 | 6.3951 | 20 | <.0001 |

SPSS

Estimates of Fixed Effects (a)

| Parameter | Estimate | Std. Error | df | t | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| [Index1=1] | 50.054814 | 0.38722 | 800 | 129.267 | | 49.294726 | 50.814902 |
| [Index1=2] | 50.290916 | 0.399847 | 800 | 125.775 | | 49.506042 | 51.075789 |

a. Dependent Variable: Response.

Estimates of Covariance Parameters (a)

| Parameter | | Estimate | Std. Error | Wald Z | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Repeated Measures | UN (1,1) | 119.95174 | 5.997587 | 20 | 0 | 108.754309 | 132.302067 |
| | UN (2,1) | 60.868913 | 4.879436 | 12.475 | 0 | 51.305394 | 70.432431 |
| | UN (2,2) | 127.902203 | 6.39511 | 20 | 0 | 115.962601 | 141.071116 |

a. Dependent Variable: Response.

Information Criteria (a)

| Criterion | Value |
|---|---|
| -2 Log Likelihood | 12030.164 |
| Akaike’s Information Criterion (AIC) | 12040.164 |
| Hurvich and Tsai’s Criterion (AICC) | 12040.202 |
| Bozdogan’s Criterion (CAIC) | 12072.053 |
| Schwarz’s Bayesian Criterion (BIC) | 12067.053 |

The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Response.

Residual Covariance (R) Matrix (a)

| | [Index1 = 1] | [Index1 = 2] |
|---|---|---|
| [Index1 = 1] | 119.95174 | 60.868913 |
| [Index1 = 2] | 60.868913 | 127.902203 |

Unstructured
a. Dependent Variable: Response.

Index1 variable. So, to estimate $\beta_{11}$ and $\beta_{21}$ in SAS, the term TREAT*INDEX1 must be
added to the MODEL statement shown in Table 14.5. In SPSS, we add the statements
WITH TREAT to the MIXED command and TREAT*INDEX1 to the FIXED subcommand. The complete SAS and SPSS control lines for estimating all models in this
chapter are shown in section 14.9. Table 14.7 shows selected results for this model.
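
To make the role of the TREAT*INDEX1 term more concrete, the following Python sketch builds the stacked (long-format) rows that such a term implies: one row per outcome per student, an indicator for each outcome, and the product of the treatment code with each indicator. This is only an illustration of the data layout, not the SAS or SPSS syntax itself; the function name, column names, and the scores for the example student are all hypothetical.

```python
def stack_student(student_id, intention, knowledge, treat):
    """Turn one student's two outcomes into two long-format rows.

    a1 and a2 flag the intention and knowledge rows (the role played by
    Index1), and treat_x_a1 / treat_x_a2 are the treatment-by-indicator
    products that carry the outcome-specific treatment effects.
    """
    rows = []
    for index1, response in ((1, intention), (2, knowledge)):
        a1 = 1 if index1 == 1 else 0  # flags the intention row
        a2 = 1 if index1 == 2 else 0  # flags the knowledge row
        rows.append({
            "student": student_id,
            "index1": index1,
            "response": response,
            "a1": a1,
            "a2": a2,
            "treat_x_a1": treat * a1,
            "treat_x_a2": treat * a2,
        })
    return rows


# Made-up scores for one student in the treatment condition (treat = .5).
example_rows = stack_student(student_id=1, intention=52.0, knowledge=48.0, treat=0.5)
for row in example_rows:
    print(row)
```

In each row only one of the two product columns is nonzero, which is why the model can estimate a separate treatment coefficient for each outcome.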
As shown in the outputs, the deviance for this current model is 11,847.6, with seven
parameters estimated. Recall that the deviance for the empty model was 12,030.2 with
five parameters estimated. The global null hypothesis that no treatment
effects are present for any of the outcomes ($H_0\colon \beta_{11} = \beta_{21} = 0$) can be tested by computing the difference in these deviances, which is distributed as a chi-square value having
degrees of freedom equal to the difference in the number of parameters estimated for


Table 14.7: Selected Output for the Two-Level Model With Treatment Effects

SAS

Solution for Fixed Effects

| Effect | Index1 | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
|---|---|---|---|---|---|---|
| Index1 | 1 | 50.0548 | 0.3552 | 800 | 140.92 | <.0001 |
| Index1 | 2 | 50.2909 | 0.3696 | 800 | 136.06 | <.0001 |
| Treat*Index1 | 1 | 8.7233 | 0.7104 | 800 | 12.28 | <.0001 |
| Treat*Index1 | 2 | 8.6257 | 0.7393 | 800 | 11.67 | <.0001 |

Estimated R Matrix for Student 1

| Row | Col1 | Col2 |
|---|---|---|
| 1 | 100.93 | 42.0579 |
| 2 | 42.0579 | 109.3 |

Covariance Parameter Estimates

| Cov Parm | Subject | Estimate | Standard Error | Z Value | Pr Z |
|---|---|---|---|---|---|
| UN(1,1) | Student | 100.93 | 5.0464 | 20 | <.0001 |
| UN(2,1) | Student | 42.0579 | 4.0001 | 10.51 | <.0001 |
| UN(2,2) | Student | 109.3 | 5.4651 | 20 | <.0001 |

Fit Statistics

| Statistic | Value |
|---|---|
| -2 Log Likelihood | 11847.6 |
| AIC (smaller is better) | 11861.6 |
| AICC (smaller is better) | 11861.7 |
| BIC (smaller is better) | 11894.4 |

SPSS

Estimates of Fixed Effects (a)

| Parameter | Estimate | Std. Error | df | t | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| [Index1=1] | 50.054814 | 0.35519 | 800 | 140.924 | | 49.3576 | 50.752029 |
| [Index1=2] | 50.290916 | 0.369631 | 800 | 136.057 | | 49.565355 | 51.016477 |
| [Index1=1] * Treat | 8.723254 | 0.71038 | 800 | 12.28 | | 7.328825 | 10.117683 |
| [Index1=2] * Treat | 8.625689 | 0.739262 | 800 | 11.668 | | 7.174567 | 10.07681 |

a. Dependent Variable: Response.

Estimates of Covariance Parameters (a)

| Parameter | | Estimate | Std. Error | Wald Z | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Repeated Measures | UN (1,1) | 100.92795 | 5.046398 | 20 | | 91.50638 | 111.319573 |
| | UN (2,1) | 42.057895 | 4.00007 | 10.514 | | 34.217901 | 49.897889 |
| | UN (2,2) | 109.301577 | 5.465079 | 20 | | 99.098334 | 120.555355 |

a. Dependent Variable: Response.

Information Criteria (a)

| Criterion | Value |
|---|---|
| -2 Log Likelihood | 11847.606 |
| Akaike’s Information Criterion (AIC) | 11861.606 |
| Hurvich and Tsai’s Criterion (AICC) | 11861.676 |
| Bozdogan’s Criterion (CAIC) | 11906.25 |
| Schwarz’s Bayesian Criterion (BIC) | 11899.25 |

The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Response.

Residual Covariance (R) Matrix (a)

| | [Index1 = 1] | [Index1 = 2] |
|---|---|---|
| [Index1 = 1] | 100.92795 | 42.057895 |
| [Index1 = 2] | 42.057895 | 109.301577 |

Unstructured
a. Dependent Variable: Response.

these models, i.e., 7 − 5 = 2. This deviance test can be used here because the empty
model can be obtained from this current model by constraining the treatment effects to
be zero. Computing the difference in model deviances results in a chi-square value of
12,030.2 − 11,847.6 = 182.6, which is statistically significant, as this value exceeds the
chi-square critical value of 5.99 (α = .05, df = 2).
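
The deviance comparison just described can be sketched in a few lines of Python. The deviances and parameter counts come from Tables 14.6 and 14.7; the helper names are hypothetical. To keep the example free of external libraries, it relies on the closed form of the chi-square distribution with 2 degrees of freedom (upper-tail probability $e^{-x/2}$), so it is only valid for this particular df.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


def chi2_critical_df2(alpha):
    """Critical value of the chi-square distribution with 2 df."""
    return -2.0 * math.log(alpha)


deviance_empty = 12030.164       # empty model, 5 parameters (Table 14.6)
deviance_treatment = 11847.606   # treatment model, 7 parameters (Table 14.7)

lr_chi_square = deviance_empty - deviance_treatment
df = 7 - 5
critical = chi2_critical_df2(0.05)
p_value = chi2_sf_df2(lr_chi_square)

print(round(lr_chi_square, 3), df, round(critical, 2), lr_chi_square > critical)
```

The difference of about 182.6 is far beyond the critical value of 5.99, in line with the conclusion in the text.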
Since rejection of the overall multivariate null hypothesis suggests that treatment
effects are present for at least one of the outcomes, we now consider the estimates
and statistical test results of the treatment effect for each outcome. Here, the two
null hypotheses being tested are $H_0\colon \beta_{11} = 0$ and $H_0\colon \beta_{21} = 0$. As shown in the
outputs, the treatment effects are 8.72 (SE = .710) for intention and 8.63 (SE =
.739) for knowledge. The t ratios of about 12.28 (p < .05) and 11.67 (p < .05), respectively, for intention and knowledge, suggest that treatment effects are present


for each outcome in the population. To obtain the group means for each outcome,
values of −.5 and .5 for the control and experimental group can be inserted into
Equations 4 and 5, along with the parameter estimates. Thus, for intention, the control group mean is 50.055 − .5(8.723) = 45.693, and the experimental group mean is
50.055 + .5(8.723) = 54.417. For knowledge, you can confirm that the control group
mean is 45.978 and the experimental group mean is 54.604. The residual variances
are also shown in the outputs, and they are 100.93 (SD = 10.05) for intention and
109.30 (SD = 10.45) for knowledge. The correlation between the residuals can be
calculated in the usual manner and is .400.
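
The group means and the residual correlation reported above can be reproduced with a short calculation. The sketch below uses the unrounded SPSS estimates from Table 14.7; the function and variable names are illustrative assumptions.

```python
import math

# (intercept, treatment effect) for each outcome, from the SPSS fixed effects
# table in Table 14.7.
estimates = {
    "intention": (50.054814, 8.723254),
    "knowledge": (50.290916, 8.625689),
}


def group_means(intercept, effect, control_code=-0.5, treatment_code=0.5):
    """Insert the treatment codes into the level-2 equation for one outcome."""
    control = intercept + control_code * effect
    experimental = intercept + treatment_code * effect
    return control, experimental


means = {name: group_means(b0, b1) for name, (b0, b1) in estimates.items()}

# Residual correlation from the residual covariance matrix in Table 14.7.
residual_r = 42.057895 / math.sqrt(100.92795 * 109.301577)

for name, (control, experimental) in means.items():
    print(name, round(control, 3), round(experimental, 3))
print(round(residual_r, 3))
```

With the unrounded estimates, the experimental mean for intention comes out as 54.416, the value shown in the MANOVA parameter estimates of Table 14.8; the 54.417 in the text reflects the use of rounded inputs.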
14.6.3 Comparison to Traditional MANOVA Results
For comparison purposes, we provide and briefly discuss selected SPSS results from a
traditional multivariate analysis of these same data with the treatment as the explanatory variable. Table 14.8 shows that the p value associated with Wilks’ lambda is quite
small, leading to the decision to reject the overall multivariate null hypothesis for the
treatment effects, the same decision as obtained with MVMM. In the parameter estimates table in Table 14.8, SPSS automatically dummy codes the treatment variable,
coding the value for the experimental and control groups, in this example, as 0 and
1, respectively. Thus, in that table, the intercept represents the experimental group
average for a given outcome, and the treatment effect is computed by subtracting the
experimental mean from the control mean (thus obtaining negative differences). Other
than that, the SPSS results in this table are virtually the same as those obtained with
the MVMM approach, with the difference in means estimated to be 8.72 (SE = .711)
for intentions and 8.63 (SE = .740) for knowledge. Thus, if desired, MVMM can be
used in place of traditional multivariate analysis. The remaining analyses in this chapter illustrate some extensions of the traditional MANOVA approach that can be more
effectively handled by MVMM.
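
One way to see the agreement between the two approaches numerically is to recover Wilks' lambda from the MVMM residual covariance matrices. The sketch below rests on a standard result rather than on anything stated in the SPSS output: with full maximum likelihood and an unstructured covariance matrix, the deviance difference between the empty and treatment models equals $N \ln(|\hat{\Sigma}_0| / |\hat{\Sigma}_1|)$, and the ratio $|\hat{\Sigma}_1| / |\hat{\Sigma}_0|$ is Wilks' lambda. The number of students (800) is an assumption inferred from the 1,600 stacked cases and the degrees of freedom in the outputs, and all names in the code are hypothetical.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


n_students = 800  # assumption: 1,600 stacked cases / 2 outcomes per student

# Residual covariance matrices from the SPSS output.
cov_empty = [[119.95174, 60.868913],        # Table 14.6, no predictors
             [60.868913, 127.902203]]
cov_treatment = [[100.92795, 42.057895],    # Table 14.7, treatment included
                 [42.057895, 109.301577]]

wilks_lambda = det2(cov_treatment) / det2(cov_empty)
lr_chi_square = -n_students * math.log(wilks_lambda)

print(round(wilks_lambda, 3), round(lr_chi_square, 2))
```

Under these assumptions, the ratio rounds to the Wilks' lambda of .796 in Table 14.8, and the implied chi-square is close to the deviance difference of 182.6 obtained from the two MVMM fits.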
14.6.4 Testing Whether the Effect of a Predictor Differs Across Outcomes
The final analysis conducted with the two-level example tests whether the effect of the
treatment is of the same magnitude for each outcome. Given that the outcomes are measured on or placed on the same scale, investigators may wish to learn if a new intervention
has stronger effects for some outcomes than others. This can be done by first constraining
the fixed effects—in this case treatment effects—to be equal, and then testing the difference in fit using the deviances between this constrained model and one where the effects
are freely estimated. In Equations 4 and 5, the effects of the treatment are freely estimated (i.e., without constraints) for intention ($\beta_{11}$) and knowledge ($\beta_{21}$). In this analysis,
we test whether these treatment effects are the same or different for the two outcomes.
The model used now is essentially the same as with the previous analysis except that an
assumed common treatment effect will be estimated. As such, the number of parameters
estimated is now six, consisting of three fixed effects (i.e., only one treatment effect estimate) and the three elements in the variance-covariance matrix.


Table 14.8: Selected Output From a Traditional MANOVA

Multivariate Tests (b)

| Effect | Statistic | Value | F | Hypothesis df | Error df | Sig. |
|---|---|---|---|---|---|---|
| Intercept | Pillai’s Trace | .972 | 13654.003 (a) | 2.000 | 797.000 | .000 |
| | Wilks’ Lambda | .028 | 13654.003 (a) | 2.000 | 797.000 | .000 |
| | Hotelling’s Trace | 34.263 | 13654.003 (a) | 2.000 | 797.000 | .000 |
| | Roy’s Largest Root | 34.263 | 13654.003 (a) | 2.000 | 797.000 | .000 |
| Treat | Pillai’s Trace | .204 | 102.149 (a) | 2.000 | 797.000 | .000 |
| | Wilks’ Lambda | .796 | 102.149 (a) | 2.000 | 797.000 | .000 |
| | Hotelling’s Trace | .256 | 102.149 (a) | 2.000 | 797.000 | .000 |
| | Roy’s Largest Root | .256 | 102.149 (a) | 2.000 | 797.000 | .000 |

a. Exact statistic
b. Design: Intercept + Treat

Parameter Estimates

| Dependent Variable | Parameter | B | Std. Error | t | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Y1 | Intercept | 54.416 | .503 | 108.196 | .000 | 53.429 | 55.404 |
| | [Treat=-.50] | -8.723 | .711 | -12.264 | .000 | -10.119 | -7.327 |
| | [Treat=.50] | 0 (a) | . | . | . | . | . |
| Y2 | Intercept | 54.604 | .523 | 104.327 | .000 | 53.576 | 55.631 |
| | [Treat=-.50] | -8.626 | .740 | -11.653 | .000 | -10.079 | -7.173 |
| | [Treat=.50] | 0 (a) | . | . | . | . | . |

a. This parameter is set to zero because it is redundant.

To estimate this model in SAS and SPSS, we replace the TREAT*INDEX1 term in the
previous model with TREAT. Complete control lines for this model can be found in
section 14.9. Selected outputs are presented in Table 14.9.
As shown in the outputs, the deviance associated with this constrained treatment-effects
model is 11,847.621, which is only slightly larger (i.e., reflecting slightly worse fit) than
that of the previous model, which provided separate estimates of treatment effects. Specifically, the difference in model deviances, which is distributed as a chi-square value, is
11,847.621 − 11,847.606 = 0.015, which does not exceed the chi-square critical value
of 3.84 (α = .05, df = 1). Thus, this test does not suggest that these two models have
different fit. As such, the data are consistent with the hypothesis that the effect of the
intervention is similar for intention and knowledge. Note that in the SAS and SPSS
outputs, shown in the fixed effects tables of Table 14.9, the common treatment effect is
estimated to be 8.678 (SE = .606) and is statistically significant (p < .05).
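
For readers who want a p value rather than a comparison with the critical value, the one-degree-of-freedom test of equal treatment effects can be sketched as follows. The deviances are those reported in Tables 14.7 and 14.9; the function name is hypothetical. The example uses the identity that a chi-square variable with 1 df is a squared standard normal, so its upper-tail probability is `erfc(sqrt(x / 2))`, and it therefore applies only when df = 1.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


deviance_constrained = 11847.621  # common treatment effect, 6 parameters
deviance_free = 11847.606         # separate treatment effects, 7 parameters

lr_chi_square = deviance_constrained - deviance_free
df = 7 - 6
p_value = chi2_sf_df1(lr_chi_square)

print(round(lr_chi_square, 3), df, round(p_value, 2))
```

The resulting p value is roughly .90, so the constrained model is not rejected, in line with the comparison against the critical value of 3.84.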


Table 14.9: Selected Output for the Two-Level Model With Treatment Effects Constrained to Be Equal

SAS

Solution for Fixed Effects

| Effect | Index1 | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
|---|---|---|---|---|---|---|
| Index1 | 1 | 50.0548 | 0.3552 | 799 | 140.92 | <.0001 |
| Index1 | 2 | 50.2909 | 0.3696 | 799 | 136.06 | <.0001 |
| Treat | | 8.6777 | 0.606 | 799 | 14.32 | <.0001 |

Estimated R Matrix for Student 1

| Row | Col1 | Col2 |
|---|---|---|
| 1 | 100.93 | 42.0573 |
| 2 | 42.0573 | 109.3 |

Covariance Parameter Estimates

| Cov Parm | Subject | Estimate | Standard Error | Z Value | Pr Z |
|---|---|---|---|---|---|
| UN(1,1) | Student | 100.93 | 5.0464 | 20 | <.0001 |
| UN(2,1) | Student | 42.0573 | 4.0001 | 10.51 | <.0001 |
| UN(2,2) | Student | 109.3 | 5.4651 | 20 | <.0001 |

Fit Statistics

| Statistic | Value |
|---|---|
| -2 Log Likelihood | 11847.6 |
| AIC (smaller is better) | 11859.6 |
| AICC (smaller is better) | 11859.7 |
| BIC (smaller is better) | 11887.7 |

SPSS

Estimates of Fixed Effects (a)

| Parameter | Estimate | Std. Error | df | t | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| [Index1=1] | 50.054814 | 0.355191 | 799.994 | 140.924 | | 49.357598 | 50.75203 |
| [Index1=2] | 50.290916 | 0.369632 | 799.993 | 136.057 | | 49.565353 | 51.016479 |
| Treat | 8.67771 | 0.606001 | 800 | 14.32 | | 7.488171 | 9.867249 |

a. Dependent Variable: Response.

Estimates of Covariance Parameters (a)

| Parameter | | Estimate | Std. Error | Wald Z | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Repeated Measures | UN (1,1) | 100.928469 | 5.046442 | 20 | | 91.506817 | 111.320185 |
| | UN (2,1) | 42.057303 | 4.000082 | 10.514 | | 34.217285 | 49.89732 |
| | UN (2,2) | 109.302254 | 5.465135 | 20 | | 99.098907 | 120.556151 |

a. Dependent Variable: Response.

Information Criteria (a)

| Criterion | Value |
|---|---|
| -2 Log Likelihood | 11847.621 |
| Akaike’s Information Criterion (AIC) | 11859.621 |
| Hurvich and Tsai’s Criterion (AICC) | 11859.673 |
| Bozdogan’s Criterion (CAIC) | 11897.887 |
| Schwarz’s Bayesian Criterion (BIC) | 11891.887 |

The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Response.

Residual Covariance (R) Matrix (a)

| | [Index1 = 1] | [Index1 = 2] |
|---|---|---|
| [Index1 = 1] | 100.928469 | 42.057303 |
| [Index1 = 2] | 42.057303 | 109.302254 |

Unstructured
a. Dependent Variable: Response.

14.7 EXAMPLE 2: USING SAS AND SPSS TO CONDUCT THREE-LEVEL MULTIVARIATE ANALYSES
The previous examples illustrated two-level multivariate analyses, but did not
include the school level in the statistical model. In the research design used in this
chapter, students are nested in one of 40 schools. Often, such nesting needs to be
accounted for in the analysis because the responses of students within treatment
groups are not independent, as is assumed in the previous analysis. Instead, these
responses are likely related because the students share a similar environment. As discussed previously, such dependence, if not accounted for statistically, can increase
type I error rates associated with fixed effects and may lead to false claims of, in
this case, the presence of treatment effects. The analyses in this section take the


within-school dependence into account by adding a third level—the school level—to
the multilevel model. Further, instead of including only the treatment variable in the
model, we include other explanatory variables, including student gender, student
pretest knowledge, a school average of these pretest scores, and a treatment-gender
product term.
There is one primary hypothesis underlying these analyses. That is, while treatment
effects are expected to be present for intention and knowledge for both boys and girls,
boys are expected to derive greater benefit from the computer-based instruction. The
reason for this extra impact of the intervention, we assume, is that fifth-grade boys will
enjoy playing the instructional video game more than girls. As a result, the impact that
the experimental program has for intention and knowledge will be greater for boys
than girls. Thus, the investigators hypothesize the presence of a treatment-by-gender
interaction for both outcomes, where the intervention will have stronger effects on
intention and knowledge for boys than for girls.
In addition, because the cluster randomized trial with this limited number of schools (i.e.,
40) does not generally provide for great statistical power, knowledge pretest scores were
collected from all students. These scores are expected to be fairly strongly associated with
both outcomes. Further, because associations may be stronger at the school level than at the
student level, the researchers computed school averages of the knowledge pretest scores
and plan to include this variable in the model to provide for increased power.
Three MVMM analyses are illustrated next. The first analysis includes the treatment variable as the sole explanatory variable. The purpose of this analysis is to obtain a preliminary
estimate of the treatment effect for each outcome. The second analysis includes all of the
explanatory variables as well as the treatment-by-gender interaction. The primary purpose
of this analysis is to test the hypothesized interactions. If the multivariate test for the interaction is significant, the analysis will focus on examining the treatment-by-gender interaction for each outcome, and if significant, describing the nature of any interactions obtained.
The third analysis will illustrate a multivariate test for multiple variance and covariance
elements. Often, in practice, it is not clear if, for example, the association between a student
explanatory variable and outcome is the same or varies across schools. Researchers may
then rely on empirical evidence (e.g., a statistical test result) to address this issue.
14.7.1 A Three-Level Model for Treatment Effects
For this first analysis, Equation 1, which had previously been the level-1 model, needs
to be modified slightly in order to acknowledge the inclusion of the school level. The
level-1 model now is:

$$Y_{ijk} = \pi_{1jk} a_{1jk} + \pi_{2jk} a_{2jk}, \tag{6}$$

which is identical to Equation 1 except that subscript $k$ has been added. Thus, $\pi_{1jk}$ and
$\pi_{2jk}$ represent the intention and knowledge posttest scores, respectively, for a given