12.5 EXAMPLE: BALANCED ONE-WAY TREATMENT STRUCTURE
Analysis of Messy Data, Volume III: Analysis of Covariance
Table 12.3 contains the PROC MIXED code to fit the model in Equation 12.1
where the covariance between the slopes and intercepts is specified to be zero. The
statement
random variety n2*variety/solution;
specifies that there are variance components for the intercepts (variety) and for the
slopes (n2*variety), but does not specify a covariance between the
slopes and intercepts, so the covariance is set to zero. The REML estimates of the
variance components are $\hat{\sigma}_a^2 = 5.8518$, $\hat{\sigma}_b^2 = 0.1785$, and $\hat{\sigma}_\varepsilon^2 = 0.6944$. The mixed
model estimates of the parameters of the model are $\hat{\alpha} = 10.4821$ and $\hat{\beta} = 1.9379$.
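In matrix terms, the random statement above corresponds to a diagonal covariance matrix for the random intercept and slope deviations of each variety. The display below is a sketch of that structure, not a reproduction of Equation 12.1; the symbols $a_i$, $b_i$, and $\varepsilon_{ij}$ are assumed names for the random intercept deviation, the random slope deviation, and the residual, chosen to match the predictors $\hat{a}_i$ and $\hat{b}_i$ used later in this example.

```latex
\operatorname{Var}\begin{pmatrix} a_i \\ b_i \end{pmatrix}
  = \begin{pmatrix} \sigma_a^2 & 0 \\ 0 & \sigma_b^2 \end{pmatrix},
\qquad
\operatorname{Var}(\varepsilon_{ij}) = \sigma_\varepsilon^2 .
```

The zero off-diagonal entries are what the statement leaves unspecified and the procedure therefore fixes at zero.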
Table 12.4 contains the PROC MIXED code to fit the model in Equation 12.1
where the random statement allows for the slopes and intercepts to be correlated.
The statement
random Int N2/subject=variety type=un solution;
specifies that the intercepts (Int) and slopes (N2) within each level of variety (subject =
variety) have an unstructured covariance structure (type = un). When the random
statement involves a “/ ”, the interpretation of the terms on the left side of the “/ ”
includes the term or terms on the right side of the “/ ”. For this statement, Int means
variety and N2 means N2*variety. The REML estimates of the parameters are $\hat{\sigma}_a^2 = 5.8518$
from UN(1,1), $\hat{\sigma}_b^2 = 0.1785$ from UN(2,2), $\hat{\sigma}_{ab} = 0.7829$ from UN(2,1),
and $\hat{\sigma}_\varepsilon^2 = 0.6944$. The mixed models estimates of the means of the intercepts and
slopes are $\hat{\alpha} = 10.4821$ and $\hat{\beta} = 1.9379$, so the estimate of the population mean yield
at a given level of nitrogen is $\hat{\mu}_{y|N2} = 10.4821 + 1.9379\,N2$. Since the model is
balanced, all parameters in Table 12.4 have the same estimates as in Table 12.3, except
that $\sigma_{ab}$ is fixed at 0 in Table 12.3. The solutions and standard errors for the fixed effects are identical, but the
solutions and standard errors for the random effects are different. The information
criteria indicate that the model with the non-zero covariance fits the data better than
the one with the covariance set to zero (the smaller the value of an information criterion, the better
the covariance structure). Table 12.5 contains the solution for the individual intercepts
and slopes for each of the varieties in the study based on the model with the
non-zero covariance between the slopes and intercepts. These two sets of solutions
satisfy the sum-to-zero restriction, i.e., the intercepts sum to zero and the slopes
sum to zero. These are the estimated best linear unbiased predictors of the intercepts
and slopes (Littell et al., 1996). The solutions for the random effects based on the
model with a zero covariance between the slopes and intercepts are somewhat
different from the ones in Table 12.5 (results not shown here). The main difference
is that the estimated standard errors for the independent slope and intercept model are
smaller than those for the correlated slope and intercept model. The predicted values of
the slope and intercept for the ith variety are obtained by $\hat{a}_{i(p)} = \hat{\alpha} + \hat{a}_i$ and $\hat{b}_{i(p)} = \hat{\beta} + \hat{b}_i$,
where $\hat{\alpha}$ and $\hat{\beta}$ are the estimates of the population intercept and slope and $\hat{a}_i$
and $\hat{b}_i$ are the predicted values of the intercepts and slopes obtained from the solution
for the random effects. Estimate statements in Table 12.6 are used to obtain the
predicted slopes and intercepts for each of the varieties. The statements
estimate 'slope 1' N2 1 | N2 1/subject 1;
estimate 'slope 2' N2 1 | N2 1/subject 0 1;
© 2002 by CRC Press LLC
Random Effects Models with Covariates
TABLE 12.4
PROC MIXED Code to Fit the Random Coefficient Model with the
Correlated Slopes and Intercepts Covariance Structure and Results

Proc mixed data=ex_12_5 cl covtest ic;
class variety;
model yield=N2/solution ddfm=kr;
random Int N2/subject=variety type=un solution;

CovParm    Subject   Estimate   StdErr   ZValue   ProbZ    Alpha   Lower     Upper
UN(1,1)    Variety   5.8518     3.4794   1.68     0.0463   0.05    2.3813    30.3058
UN(2,1)    Variety   0.7829     0.5624   1.39     0.1639   0.05    -0.3194   1.8852
UN(2,2)    Variety   0.1785     0.1238   1.44     0.0747   0.05    0.0650    1.3886
Residual             0.6944     0.2624   2.65     0.0041   0.05    0.3722    1.7270

Neg2LogLike   Parameters   AIC      AICC     HQIC     BIC      CAIC
100.86        4            108.86   110.76   106.19   108.64   112.64

Effect      Estimate   StdErr   DF   tValue   Probt
Intercept   10.4821    0.9278   6    11.30    0.0000
N2          1.9379     0.1745   6    11.10    0.0000

Effect   NumDF   DenDF   FValue   ProbF
N2       1       6       123.26   0.0000
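The information criteria in Table 12.4 can be reproduced from the value of Neg2LogLike and the number of covariance parameters. The Python sketch below uses the usual smaller-is-better definitions. The effective sample sizes are assumptions that are not stated in the text: 7 subjects (one per variety) for HQIC, BIC, and CAIC, and 26 for AICC, taken as 28 observations minus 2 fixed-effect parameters (28 is inferred from the 14 error degrees of freedom in Table 12.7 plus the 14 variety-specific intercepts and slopes). The function and variable names are illustrative only.

```python
import math

# Values read from Table 12.4.
neg2_loglik = 100.86  # Neg2LogLike
d = 4                 # Parameters (number of covariance parameters)

# Assumptions, not stated in the text:
n_subjects = 7        # one subject per variety
n_star = 28 - 2       # 28 observations minus 2 fixed-effect parameters (REML)


def information_criteria(neg2ll, d, s, n_star):
    """Return smaller-is-better information criteria as a dict."""
    log_s = math.log(s)
    return {
        "AIC": neg2ll + 2 * d,
        "AICC": neg2ll + 2 * d * n_star / (n_star - d - 1),
        "HQIC": neg2ll + 2 * d * math.log(log_s),
        "BIC": neg2ll + d * log_s,
        "CAIC": neg2ll + d * (log_s + 1),
    }


ic = information_criteria(neg2_loglik, d, n_subjects, n_star)
for name, value in ic.items():
    print(f"{name:5s} {value:7.2f}")
```

Under these assumptions the computed values agree with the tabled ones to two decimal places, which supports, but does not prove, that these are the sample sizes the procedure used.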
TABLE 12.5
Solution for the Random Effect Slopes and Intercepts
for Each Variety

Effect      Variety   Estimate   StdErrPred   df    tValue   Probt
Intercept   1         1.1497     0.9935       7.6   1.16     0.2823
N2          1         0.3635     0.2406       9.6   1.51     0.1632
Intercept   2         0.3980     0.9935       7.6   0.40     0.6997
N2          2         -0.1311    0.2406       9.6   -0.54    0.5983
Intercept   3         2.4327     0.9935       7.6   2.45     0.0416
N2          3         0.6020     0.2406       9.6   2.50     0.0323
Intercept   4         -0.0397    0.9935       7.6   -0.04    0.9692
N2          4         -0.1125    0.2406       9.6   -0.47    0.6507
Intercept   5         -0.6342    0.9935       7.6   -0.64    0.5420
N2          5         -0.3895    0.2406       9.6   -1.62    0.1379
Intercept   6         1.5760     0.9935       7.6   1.59     0.1533
N2          6         0.1471     0.2406       9.6   0.61     0.5552
Intercept   7         -4.8826    0.9935       7.6   -4.91    0.0014
N2          7         -0.4796    0.2406       9.6   -1.99    0.0755
provide predicted values for the slopes of varieties 1 and 2 (similar statements were
used for the other varieties). The fixed effects are on the left hand side of “ | ” and
the random effects are on the right hand side. On the left hand side using N2 1
TABLE 12.6
Estimate Statements Used to Obtain Predicted Values for Each of the
Varieties at N2 = -3 and N2 = 3 and the Predicted Slopes and Intercepts

estimate 'Var 1 at N2=-3' intercept 1 N2 -3 | Int 1 N2 -3/subject 1;
estimate 'Var 2 at N2=-3' intercept 1 N2 -3 | Int 1 N2 -3/subject 0 1;
estimate 'Var 1 at N2= 3' intercept 1 N2 3 | Int 1 N2 3/subject 1;
estimate 'Var 2 at N2= 3' intercept 1 N2 3 | Int 1 N2 3/subject 0 1;
estimate 'slope 1' N2 1 | N2 1/subject 1;
estimate 'slope 2' N2 1 | N2 1/subject 0 1;
estimate 'Int 1' Int 1 | Int 1/subject 1;
estimate 'Int 2' Int 1 | Int 1/subject 0 1;

                                                Predicted values     Predicted values
           Slopes             Intercepts        at N2 = -3           at N2 = 3
Variety    Value    Stderr    Value     Stderr  Value    Stderr      Value     Stderr
1          2.3014   0.1933    11.6319   0.4194  4.7278   0.7229      18.5360   0.7084
2          1.8067   0.1933    10.8802   0.4194  5.4600   0.7229      16.3004   0.7084
3          2.5399   0.1933    12.9149   0.4194  5.2952   0.7229      20.5345   0.7084
4          1.8254   0.1933    10.4425   0.4194  4.9662   0.7229      15.9187   0.7084
5          1.5483   0.1933    9.8479    0.4194  5.2029   0.7229      14.4929   0.7084
6          2.0850   0.1933    12.0582   0.4194  5.8032   0.7229      18.3131   0.7084
7          1.4583   0.1933    5.5995    0.4194  1.2246   0.7229      9.9744    0.7084
selects the value of $\hat{\beta}$ and N2 1/subject 1 selects the value of $\hat{b}_1$. Since the random
statement includes “/subject = variety”, the “/subject 1” on the estimate statement
selects the first level of variety. Using “/subject 0 0 0 1” would
select the fourth level of variety. The statements
estimate 'Int 1' Int 1 | Int 1/subject 1;
estimate 'Int 2' Int 1 | Int 1/subject 0 1;
provide predicted values for the intercepts of varieties 1 and 2 (similar statements
were used for the other varieties). Using Int 1 selects the value of $\hat{\alpha}$ and Int 1/subject
1 selects the value of $\hat{a}_1$.
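The same predicted intercepts and slopes can be checked by hand, since each one is the fixed-effect estimate plus the corresponding random-effect solution. The Python sketch below is only an arithmetic illustration: it takes the fixed-effect estimates from Table 12.4 and the solutions from Table 12.5, and the variable names are hypothetical. Results can differ from Table 12.6 in the fourth decimal place because the inputs are rounded.

```python
# Fixed-effect estimates (Table 12.4).
alpha_hat = 10.4821   # mean intercept
beta_hat = 1.9379     # mean slope

# Random-effect solutions for varieties 1 to 7 (Table 12.5).
a_hat = [1.1497, 0.3980, 2.4327, -0.0397, -0.6342, 1.5760, -4.8826]
b_hat = [0.3635, -0.1311, 0.6020, -0.1125, -0.3895, 0.1471, -0.4796]

# Predicted intercept and slope for each variety:
#   a_i(p) = alpha_hat + a_i,   b_i(p) = beta_hat + b_i
pred_intercepts = [alpha_hat + a for a in a_hat]
pred_slopes = [beta_hat + b for b in b_hat]

# The solutions satisfy the sum-to-zero restriction up to rounding.
sum_a = sum(a_hat)
sum_b = sum(b_hat)

for variety, (a_p, b_p) in enumerate(zip(pred_intercepts, pred_slopes), start=1):
    print(f"variety {variety}: intercept {a_p:.4f}, slope {b_p:.4f}")
print(f"sum of intercept solutions: {sum_a:.4f}")
print(f"sum of slope solutions:     {sum_b:.4f}")
```

This reproduces the point predictions only. The standard errors in Table 12.6 depend on the joint covariance of the fixed-effect estimates and the predictors, which this sketch does not attempt to compute.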
The predicted values of the slopes and intercepts and the corresponding estimated
standard errors for each of the varieties are displayed in Table 12.6. The variances
of these predicted slopes and of these predicted intercepts are proportional to the
estimates of the variance components $\sigma_a^2$ and $\sigma_b^2$. The estimate statements
estimate 'Var 1 at N2=-3' intercept 1 N2 -3 | Int 1 N2 -3/subject 1;
estimate 'Var 2 at N2=-3' intercept 1 N2 -3 | Int 1 N2 -3/subject 0 1;
estimate 'Var 1 at N2= 3' intercept 1 N2 3 | Int 1 N2 3/subject 1;
estimate 'Var 2 at N2= 3' intercept 1 N2 3 | Int 1 N2 3/subject 0 1;
[Figure: Random Coefficient Models for Varieties. Vertical axis: Yield Per Plot (lbs), 0 to 25; horizontal axis: Coded N2 Level, -4 to 4; legend: lines 1 to 8.]
FIGURE 12.1 Graph of the predicted models for each variety where variety 8 corresponds
to the mean model.
provide predicted values for varieties 1 and 2 at N2 = -3 and N2 = 3. These are
predicted values from the regression equations using the intercepts and slopes for
each variety. The set of statements can be extended to include all of the varieties.
The predicted values at N2 = -3 and 3 for all of the varieties are included in
Table 12.6. Those predicted values were used to construct a graph of the set of
regression lines for these varieties, which is displayed in Figure 12.1. The method
of moments estimates of the variance components, computed by the process described in Section 12.3,
are summarized in Table 12.7. Since the model is balanced, the method of moments
estimates of the variance components are identical to the REML estimates displayed
in Table 12.4.
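The predicted values at N2 = -3 and N2 = 3 are each variety's predicted intercept plus its predicted slope times the coded nitrogen level. A minimal Python sketch of that arithmetic follows, using the predicted intercepts and slopes listed in Table 12.6. The function name `predict` and the variable names are hypothetical, and agreement with the table is only to about the fourth decimal place because the inputs are rounded.

```python
# Predicted intercepts and slopes for varieties 1 to 7 (Table 12.6).
intercepts = [11.6319, 10.8802, 12.9149, 10.4425, 9.8479, 12.0582, 5.5995]
slopes = [2.3014, 1.8067, 2.5399, 1.8254, 1.5483, 2.0850, 1.4583]

# Mean model (plotted as "variety 8" in Figure 12.1).
alpha_hat, beta_hat = 10.4821, 1.9379


def predict(intercept, slope, n2):
    """Value of the line intercept + slope * n2 at coded nitrogen level n2."""
    return intercept + slope * n2


pred_low = [predict(a, b, -3) for a, b in zip(intercepts, slopes)]
pred_high = [predict(a, b, 3) for a, b in zip(intercepts, slopes)]
mean_low = predict(alpha_hat, beta_hat, -3)
mean_high = predict(alpha_hat, beta_hat, 3)

for variety, (lo, hi) in enumerate(zip(pred_low, pred_high), start=1):
    print(f"variety {variety}: {lo:.4f} at N2=-3, {hi:.4f} at N2=3")
print(f"mean model: {mean_low:.4f} at N2=-3, {mean_high:.4f} at N2=3")
```

Two points per line are enough to draw each regression line, which is how the endpoints at -3 and 3 can be used to construct a graph such as Figure 12.1.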
TABLE 12.7
Sums of Squares, the Cross Product and Their Expectations
for Computing the Method of Moments Estimates of the
Variance and Covariance Components for Example 12.5

Sum of Squares                 Expected Sum of Squares
SS(INTERCEPTS) = 578.434286    $24\sigma_\varepsilon^2 + 96\sigma_I^2$
SS(SLOPES) = 511.817143        $120\sigma_\varepsilon^2 + 2400\sigma_S^2$
SSCP = 375.788571              $480\sigma_{IS}$
SS(ERROR) = 9.721000           $14\sigma_\varepsilon^2$

$\hat{\sigma}_\varepsilon^2 = 0.6943571$     $\hat{\sigma}_I^2 = 5.8517679$
$\hat{\sigma}_S^2 = 0.1785393$     $\hat{\sigma}_{IS} = 0.7828929$
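The estimates at the bottom of Table 12.7 follow from equating each sum of squares to its expectation and solving, starting with the error line. The Python sketch below carries out that arithmetic with the coefficients exactly as listed in the table; the variable names are illustrative and not taken from the text.

```python
# Sums of squares and cross product (Table 12.7).
ss_intercepts = 578.434286
ss_slopes = 511.817143
sscp = 375.788571
ss_error = 9.721000

# Solve the expectation equations in turn:
#   SS(ERROR)      = 14 * sigma2_eps
#   SS(INTERCEPTS) = 24 * sigma2_eps + 96 * sigma2_I
#   SS(SLOPES)     = 120 * sigma2_eps + 2400 * sigma2_S
#   SSCP           = 480 * sigma_IS
sigma2_eps = ss_error / 14
sigma2_I = (ss_intercepts - 24 * sigma2_eps) / 96
sigma2_S = (ss_slopes - 120 * sigma2_eps) / 2400
sigma_IS = sscp / 480

print(f"residual variance:          {sigma2_eps:.7f}")
print(f"intercept variance:         {sigma2_I:.7f}")
print(f"slope variance:             {sigma2_S:.7f}")
print(f"intercept-slope covariance: {sigma_IS:.7f}")
```

The four values agree with the REML estimates in Table 12.4 after rounding to four decimal places, as the text states for this balanced design.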
FIGURE 12.2 Fit model screen to fit the random coefficient model with independent slopes
and intercepts.
FIGURE 12.3 REML estimates of the variance components and tests for the fixed effects
and the random effects.
The JMP® software can fit the independent slope and intercept model to the data
set, but it cannot fit the model with non-zero covariance. Figure 12.2 contains the fit
model screen where yield is specified to be the response variable and the model
contains N2, Variety, and N2*Variety, where the last two terms are specified as being
random by using the attributes menu. Figure 12.3 contains the REML estimates of the
variance components as well as tests for the fixed and random effects. The tests indicate
that the variance components are significantly different from zero, i.e., they are
important components in describing the variability in the model. The custom test menu
can be used to provide predicted values for each variety’s slope, intercept, or model
evaluated at some value for N2. Those results were not included in this example. PROC
MIXED can perform similar tests for the independent slope and intercept model by
specifying method = type1, or type2, or type3 (an example is included in Section 12.7).
12.6 EXAMPLE: UNBALANCED ONE-WAY TREATMENT STRUCTURE
The data in Table 12.8 are from randomly selected cities and then randomly selected
school districts within each city, where the response variable is the amount of income
TABLE 12.8
Data for Example in Section 12.6 with One-Way Random Effects Treatment
Structure (Cities) Where the Response Variable is the Amount Spent on
Vocational Training and the Covariate is the Percent Unemployment

City   Variable   Values (one column per school district)
1      Spent      27.9   43.0   26.0   34.4   43.4   42.4   27.9   25.9   23.3   34.3
       Unemp       4.9    9.4    4.2    6.6    9.5    8.8    4.6    4.6    4.2    6.8
2      Spent      29.6   12.7   19.9   12.3   23.8   28.8   21.0   13.9   13.2
       Unemp       8.1    0.9    4.6    1.9    5.9    8.0    4.4    2.1    1.3
3      Spent      17.5   21.2   19.0   14.5   25.2   15.0   24.6
       Unemp       3.1    6.4    4.0    2.0    8.4    2.4    8.7
4      Spent      28.0   16.0   16.8   26.5   23.1   18.0   17.8   11.6   24.2
       Unemp       9.6    3.3    4.2    8.6    6.2    3.1    3.9    0.5    7.4
5      Spent      22.0   22.6   12.8   11.8   23.5   26.0   24.5   18.8   11.2
       Unemp       6.7    6.9    2.2    1.4    8.4    9.3    8.0    5.7    1.8
6      Spent      28.0   18.2   29.2   25.0   22.6   34.9   39.3   37.3   30.5
       Unemp       6.3    2.6    6.7    5.4    3.8    8.4    9.8    9.1    6.8
7      Spent      37.6   34.7   38.6   33.8   10.1   29.8   15.1   35.0
       Unemp       9.9    8.8    9.9    8.2    0.1    7.0    1.8    8.5
8      Spent      10.6   21.4   32.1   28.3   38.7   19.8   40.0   36.3   29.0
       Unemp       0.0    3.4    7.0    5.8    9.2    2.7    9.7    8.2    5.9
9      Spent      10.9   12.0   14.1   15.4   11.2   17.9   26.2   14.2   21.4
       Unemp       2.9    2.8    3.6    5.2    1.4    5.6   10.0    3.6    7.2
10     Spent       8.6    9.1   13.6   13.7   13.4   15.3   23.1   22.2
       Unemp       0.8    0.7    3.4    3.1    3.6    4.4    7.8    7.4
to be spent on vocational training (y) and the possible covariate is the level of
unemployment (x). The PROC MIXED code to fit Model 12.1 is in Table 12.9 where
the estimates of the parameters are $\hat{\sigma}_a^2 = 2.5001$ from UN(1,1), $\hat{\sigma}_b^2 = 0.4187$ from
UN(2,2), $\hat{\sigma}_{ab} = 0.3368$ from UN(2,1), and $\hat{\sigma}_\varepsilon^2 = 0.6832$. The mixed models estimates
of the means of the intercepts and slopes are $\hat{\alpha} = 9.6299$ and $\hat{\beta} = 2.3956$, so the estimate
of the population mean income spent on vocational training at a given level of
unemployment is $\hat{\mu}_{y|EMP} = 9.6299 + 2.3956\,UNEMP$. The solutions for the random
effects that satisfy the sum-to-zero restriction within the intercepts and within the
slopes are in Table 12.10. The estimate statements in Table 12.11 provide predicted
values (estimated BLUPs) for each of the selected cities evaluated at 10% unemployment.
Table 12.11 contains the predicted values for each of the cities evaluated
at 0, 8, and 10%. Tables 12.12, 12.13, and 12.14 contain the PROC MIXED code
and results for fitting Model 12.1 where the values of the covariate have been altered
by subtracting 2, the mean (5.4195402), and 8 from the percent unemployment.
The estimate of the variance component for the intercepts depends on the amount
subtracted from the covariate as described in Section 12.4. When X0 is subtracted