10.7 EXAMPLE: BALANCED INCOMPLETE BLOCK DESIGN STRUCTURE WITH FOUR TREATMENTS USING JMP®
Analysis of Messy Data, Volume III: Analysis of Covariance
TABLE 10.21
Estimate Statements Used to Provide Predicted Values for Treatment 1
for Each of the 16 Blocks
estimate '11' intercept 1 trt 1 0 0 0 x*trt 100|block 1;
estimate '12' intercept 1 trt 1 0 0 0 x*trt 100|block 0 1;
estimate '13' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 1;
estimate '14' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 1;
estimate '15' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 1;
estimate '16' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 1;
estimate '17' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 1;
estimate '18' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 1;
estimate '19' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 1;
estimate '110' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 1;
estimate '111' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 0 1;
estimate '112' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 0 0 1;
estimate '113' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 0 0 0 1;
estimate '114' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 0 0 0 0 1;
estimate '115' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1;
estimate '116' intercept 1 trt 1 0 0 0 x*trt 100|block 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1;
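The statements in Table 10.21 follow a fixed pattern, so they can be generated rather than typed. The following is a minimal Python sketch that writes the statement text only; the function name `estimate_statement` and its arguments are hypothetical, and the pattern used for Treatments 2 through 4 (moving the 1 in the `trt` coefficients and the 100 in the `x*trt` coefficients to the matching position) is an assumption, since the table lists Treatment 1 only. Plain ASCII quotes are used for the labels.

```python
def estimate_statement(trt, block, n_trt=4, x0=100):
    """Return one ESTIMATE statement (as text) for a treatment-by-block prediction.

    trt and block are 1-based. Coefficients before '|' apply to the fixed
    effects (intercept, trt, x*trt); those after it apply to the random
    block effects. Trailing zeros of the x*trt and block coefficient lists
    are left off, as in Table 10.21.
    """
    trt_coefs = " ".join("1" if t == trt else "0" for t in range(1, n_trt + 1))
    slope_coefs = " ".join(["0"] * (trt - 1) + [str(x0)])
    block_coefs = " ".join(["0"] * (block - 1) + ["1"])
    return (f"estimate '{trt}{block}' intercept 1 trt {trt_coefs} "
            f"x*trt {slope_coefs}|block {block_coefs};")


# The 16 statements for Treatment 1, one per block.
statements = [estimate_statement(1, block) for block in range(1, 17)]
for line in statements:
    print(line)
```

The printed lines can be pasted into the PROC MIXED step; changing the first argument would produce the corresponding statements for another treatment under the assumption stated above.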
        Estimate
Block   Treatment 1   Treatment 2   Treatment 3   Treatment 4
1       108.1873      109.5241      108.5245      112.8872
2       106.6044      107.9412      106.9416      111.3043
3       111.8443      113.1811      112.1815      116.5442
4       111.3143      112.6511      111.6515      116.0142
5       109.8052      111.1420      110.1424      114.5051
6       112.3372      113.6740      112.6743      117.0370
7       108.0669      109.4037      108.4041      112.7668
8       108.1085      109.4453      108.4456      112.8083
9       107.9095      109.2463      108.2466      112.6093
10      103.9985      105.3353      104.3357      108.6984
11      113.2101      114.5469      113.5472      117.9099
12      109.8421      111.1789      110.1793      114.5420
13      114.0777      115.4145      114.4148      118.7775
14      113.4230      114.7598      113.7602      118.1229
15      107.5628      108.8996      107.9000      112.2627
16      112.3774      113.7142      112.7146      117.0773
Means   109.9168      111.2536      110.2540      114.6167
the model specification menu to enable the fitted model to match the models fit by
PROC MIXED. Click on the Run Model button to carry out the analysis. The
parameter estimates are in Figure 10.3 where the ones of interest correspond to
Intercept, TRT[1], TRT[2], TRT[3], X, TRT[1]*X, TRT[2]*X, and TRT[3]*X. The
estimates of the slopes and intercepts can be constructed as described in Chapter 3.
© 2002 by CRC Press LLC
More Than Two Treatments in a Blocked Design Structure
FIGURE 10.1 JMP® data screen for the incomplete block example.
FIGURE 10.2 Fit model table where block is declared as a random effect and no center is
selected from model specification.
The estimates of the variance components and tests for the fixed effects are in
Figure 10.4. The estimates of the variance components are the same as obtained by
PROC MIXED, displayed in Table 10.19, as is the test for equal intercepts, TRT.
Since the model included X and TRT*X, a test of the slopes equal to zero hypothesis
is not obtained, but a test of the equal slopes hypothesis is provided through the
TRT*X term. The significance level is 0.0897. The custom test screen needs to be
used to provide the test of the slopes equal to zero hypothesis. Such a screen is in
Figure 10.5 where there are four columns used to construct the estimates of the
slopes. The value line provides the estimates of the slopes, which are identical to
those from PROC MIXED; thus JMP® is providing the combined estimates of the
model parameters (Table 10.19). The F Ratio provides the test of the slopes equal
to zero hypothesis and is the same as obtained by PROC MIXED in Table 10.19.
Finally, the custom test screen is used to provide estimates of the models for each
of the treatments evaluated at X = 100, as displayed in Figure 10.6. The results of
the custom test to provide the least squares means are in Figure 10.7. These are the
same least squares means as obtained by PROC MIXED and have the same
interpretation (Table 10.20). Each least squares mean is the mean of the predicted values
at X = 100 from all of the blocks, whether or not the treatment occurred in all blocks.
FIGURE 10.3 Parameter estimates from JMP®.
FIGURE 10.4 Estimates of the variance components and tests for the effects.
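The statement that a least squares mean is the average of the block-specific predicted values can be checked directly against the table of estimates for the 16 blocks. The sketch below, in Python, copies the Treatment 1 and Treatment 4 columns from that table and averages them; the variable names are illustrative only.

```python
from statistics import mean

# Predicted values at X = 100 for blocks 1-16, copied from the table of
# block-by-block estimates for Treatment 1 and Treatment 4.
pred_trt1 = [108.1873, 106.6044, 111.8443, 111.3143, 109.8052, 112.3372,
             108.0669, 108.1085, 107.9095, 103.9985, 113.2101, 109.8421,
             114.0777, 113.4230, 107.5628, 112.3774]
pred_trt4 = [112.8872, 111.3043, 116.5442, 116.0142, 114.5051, 117.0370,
             112.7668, 112.8083, 112.6093, 108.6984, 117.9099, 114.5420,
             118.7775, 118.1229, 112.2627, 117.0773]

mean_trt1 = mean(pred_trt1)
mean_trt4 = mean(pred_trt4)

# Compare with the "Means" row of the table (109.9168 and 114.6167).
print(round(mean_trt1, 4), round(mean_trt4, 4))
```

The averages agree with the Means row of the table to the four decimals reported there, even though each treatment was observed in only some of the blocks.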
FIGURE 10.5 Custom tests screen used to provide estimates of the slopes and a test of the
slopes equal to zero hypothesis.
FIGURE 10.6 Coefficients used to provide least squares means evaluated at X = 100.
FIGURE 10.7 Least squares means evaluated at X = 100 for each of the treatments.
10.8 SUMMARY
When the design structure involves blocking, information about slopes and intercepts
can be extracted from within block comparisons and between block comparisons.
The type of information available depends on the type of blocking. If the designs
are connected, the same within block information is available from all designs, but
the between block information depends on the type of blocking used. RCB designs
have different information than incomplete block designs and these differences are
evident when the within block and between block estimates are combined. The Type I
SS BLOCKS from the within block analysis contains variation due to covariates;
thus biased estimates of variance components occur from the within block
comparisons. Most of the difficulties occurring because of blocking are avoided when a
mixed model approach is used to provide the analysis. The details of the mixed
model approach are discussed in Chapter 13.
The final topic discussed was that of using an incomplete block design structure.
The example in Section 10.6 indicates that the least squares means are to be interpreted
as the means of the predicted values of the treatments from all blocks, whether the
treatments were observed in all blocks or not. This is another reason that the choice
of the blocking factor is very important. The blocking factor must be chosen so that
it does not interact with the treatments; otherwise the least squares means are likely
not meaningful.
REFERENCES
Kenward, M. G. and Roger, J. H. (1997). Small sample inference for fixed effects from
restricted maximum likelihood, Biometrics 53:983–997.
Milliken, G. A. and Johnson, D. E. (1992). Analysis of Messy Data, Volume I: Designed
Experiments, London, Chapman & Hall.
EXERCISES
EXERCISE 10.1: For the data set in Section 10.6, obtain the within block and the
between block estimates of the model’s parameters. The purpose is to show that there
is between block information about the intercepts when an incomplete block design
structure is used, whereas no such information exists when a complete block design
structure is used, as for the data set in Section 10.5.
EXERCISE 10.2: For the data in Section 10.6, use a common slope model and
obtain the within block and between block estimates of the parameters.
EXERCISE 10.3: The data in the following table consist of eight incomplete blocks
where Treatments 1 and 2 occur in blocks 1 through 4 and Treatments 3 and 4 occur
in blocks 5 through 8. Use a common slopes model to describe the data for each
treatment. Obtain the within block estimates and the between block estimates of the
parameters. Since the design is not connected, care must be taken in constructing
the parameters for each of the models. Use a mixed models code such as PROC
MIXED to obtain predicted values for each of the treatments at each of the blocks
evaluated at X = 40. Show that the means of these predicted values are the same as
the least squares means evaluated at X = 40.
Data for Exercise 10.3
block   treat   x      y      treat   x      y
1       1       40.7   62.1   2       37.5   63.1
2       1       41.9   61.1   2       40.2   62.9
3       1       41.2   62.6   2       39.0   60.6
4       1       38.3   57.7   2       40.0   63.3
5       3       40.7   69.3   4       41.0   70.6
6       3       39.3   70.2   4       39.0   71.7
7       3       39.5   66.0   4       39.1   63.5
8       3       39.0   64.0   4       40.8   68.5
11 Covariate Measured on the Block in RCB and Incomplete Block Design Structures
11.1 INTRODUCTION
Measuring the value of the covariate on the block, i.e., setting up blocks where all
experimental units within a block have the same value of the covariate, is often used
as a method of constructing blocks of experimental units. Animals are grouped by
age, weight, or stage of life. Students are grouped by class, age, or by IQ. The
grouping of the experimental units by the value of some covariate forms more
homogeneous groups on which to compare the treatments than by not blocking.
However, one must be much more concerned that there is no interaction between
the levels of the treatments and the levels of the factor used to construct the blocks.
An assumption of the RCB (randomized complete block), or as a matter of fact any
blocked design structure, is that there is no interaction between the factors in the
treatment structure and the factors in the design structure (Milliken and Johnson,
1992). It is not unusual to see researchers construct blocks by using a factor such
as initial age or weight or current thickness. However, some thought should be given
to the possibility that such a factor interacts with the levels of the treatments.
The usual approach to the analysis of such data sets is to remove the block to block
variation by the analysis of variance and not consider doing an analysis of covariance,
i.e., ignore the fact that a covariate was measured. That strategy is appropriate if the
slopes of the treatments’ regression lines are equal, but if the slopes of the lines are
unequal, a model with a covariate is required in order to extract the necessary
information from the data. A block total or mean model must be used to make
decisions about the slopes of the model before the analysis can continue.
There are some changes in the form of the analysis when the covariate is
measured on the block rather than being measured on the experimental units within
each block. The data in Table 11.1 are from a blocked experiment with two treatments
where the covariate is measured on the block, i.e., the covariate has the same value
for each of the two experimental units within the block. The within block analysis
does not provide information about the magnitudes of the slopes. The sum of squares
due to the slopes being zero after the block effects have been removed does not test
the hypothesis that the slopes are zero; in fact, the sum of squares is equal to zero
and so provides no test. If the slopes are assumed to be equal, then removing the block
effects removes all information about the slopes. However, the information about the
slopes can be extracted by constructing and analyzing the proper models. The proper
models are the within block model and the between block model. The above described
example with two treatments is analyzed in detail in Sections 11.2 to 11.7. Two
additional examples involving more than two treatments, one in a RCB design
structure and the other in a balanced incomplete block design structure, are included
to demonstrate additional complications in the analyses.
TABLE 11.1
Data for Example in Section 11.7
BLK   X      y1     y2     y_sum   y_dif
1     23.2   60.4   76.0   136.4   -15.6
2     26.9   59.9   76.3   136.2   -16.4
3     29.4   64.4   77.8   142.2   -13.4
4     22.7   63.5   75.6   139.1   -12.1
5     30.6   80.6   94.6   175.2   -14.0
6     36.9   75.9   96.1   172.0   -20.2
7     17.6   53.7   62.3   116.0   -8.6
8     28.5   66.3   81.6   147.9   -15.3
Note: Involves two treatments in a one-way treatment structure in a RCB design
structure where y1 is the response for Treatment 1 and y2 is the response for
Treatment 2; X is the covariate; BLK is the block; y_sum and y_dif are the sum and
difference, respectively, of the observations within each block.
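The derived columns of Table 11.1 are the raw material for the within block and between block models, so it is worth confirming how they are built. The following Python sketch recomputes y_sum and y_dif from y1 and y2; the list names simply mirror the column labels of the table.

```python
# Data copied from Table 11.1 (eight blocks, covariate measured on the block).
x  = [23.2, 26.9, 29.4, 22.7, 30.6, 36.9, 17.6, 28.5]
y1 = [60.4, 59.9, 64.4, 63.5, 80.6, 75.9, 53.7, 66.3]
y2 = [76.0, 76.3, 77.8, 75.6, 94.6, 96.1, 62.3, 81.6]

# Block totals (between block information) and block differences
# (within block information); rounding removes floating-point noise.
y_sum = [round(a + b, 1) for a, b in zip(y1, y2)]
y_dif = [round(a - b, 1) for a, b in zip(y1, y2)]

for blk, (xj, s, d) in enumerate(zip(x, y_sum, y_dif), start=1):
    print(blk, xj, s, d)
```

Each block contributes one total and one difference, both paired with the single covariate value recorded for that block.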
11.2 THE WITHIN BLOCK MODEL
The basic model to describe the data in Table 11.1 is

$$y_{ij} = \alpha_i + \beta_i x_j + b_j + \varepsilon_{ij}, \quad i = 1, 2, \quad j = 1, 2, \ldots, 8 \tag{11.1}$$

where $b_j \sim \text{iid } N(0, \sigma_b^2)$, $\varepsilon_{ij} \sim \text{iid } N(0, \sigma_\varepsilon^2)$, and the value of the covariate is the same
for both experimental units in each block, denoted by $x_j$ (only one subscript on $x$).
The within block model is constructed by subtracting the values of the two
observations within each block of Model 11.1, say the observation for Treatment 1
minus the observation for Treatment 2, yielding the model:

$$\begin{aligned} y_{1j} - y_{2j} &= \alpha_1 - \alpha_2 + (\beta_1 - \beta_2) x_j + \varepsilon_{1j} - \varepsilon_{2j} \\ &= \alpha_d + \beta_d x_j + \varepsilon_{dj}, \quad \text{where } \varepsilon_{dj} \sim N(0, 2\sigma_\varepsilon^2). \end{aligned} \tag{11.2}$$

This is a simple linear regression model where $y_{1j} - y_{2j}$ is the dependent variable
and $x_j$ is the independent variable with intercept $\alpha_1 - \alpha_2$ and slope $\beta_1 - \beta_2$. Fitting
the model to the data provides estimates of $\alpha_1 - \alpha_2$, $\beta_1 - \beta_2$, and $\sigma_\varepsilon^2$ (the residual
mean square of this regression estimates $2\sigma_\varepsilon^2$). If $\beta_1 = \beta_2$, the
appropriate analysis for comparing the treatments is a paired t-test or a RCB analysis
of variance with two treatments. The within block analysis provides statistics to test
$H_0\colon \alpha_1 - \alpha_2 = 0$ and $H_0\colon \beta_1 - \beta_2 = 0$, but it does not enable the individual parameters ($\alpha_1$,
$\alpha_2$, $\beta_1$, and $\beta_2$) to be estimated.
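The within block fit is an ordinary simple linear regression of the block differences on the covariate. As a minimal sketch, the Python code below fits that regression to the y_dif column of Table 11.1 using closed-form least squares; the helper `simple_ols` is written here for illustration and is not part of any package. Its output can be compared with the within block estimates used in Section 11.4.

```python
import math

# Covariate and within block differences (y1 - y2) from Table 11.1.
x = [23.2, 26.9, 29.4, 22.7, 30.6, 36.9, 17.6, 28.5]
y_dif = [-15.6, -16.4, -13.4, -12.1, -14.0, -20.2, -8.6, -15.3]


def simple_ols(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, root mean square error)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, math.sqrt(sse / (n - 2))


alpha_d, beta_d, rmse_d = simple_ols(x, y_dif)

# The residual mean square estimates 2 * sigma_eps^2 (Model 11.2).
sigma2_eps_hat = rmse_d ** 2 / 2

print(alpha_d, beta_d, rmse_d, sigma2_eps_hat)
```

The slope estimates the difference of the two slopes and the intercept the difference of the two intercepts; nothing in this fit separates the individual parameters.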
11.3 THE BETWEEN BLOCK MODEL
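The between block or block sum or block total model is:

$$\begin{aligned} y_{1j} + y_{2j} &= \alpha_1 + \alpha_2 + (\beta_1 + \beta_2) x_j + 2 b_j + \varepsilon_{1j} + \varepsilon_{2j} \\ &= \alpha_s + \beta_s x_j + e^*_{sj}, \quad \text{where } e^*_{sj} \sim N\left(0, 2(\sigma_\varepsilon^2 + 2\sigma_b^2)\right). \end{aligned} \tag{11.3}$$

Model 11.3 is a simple linear regression model where $y_{1j} + y_{2j}$ is the dependent
variable and $x_j$ is the independent variable with intercept $\alpha_1 + \alpha_2$ and slope $\beta_1 + \beta_2$.
Fitting this model to the data provides estimates of $\alpha_1 + \alpha_2$, $\beta_1 + \beta_2$, and $\sigma_\varepsilon^2 + 2\sigma_b^2 = \sigma_{e^*}^2$.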
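The between block fit has the same form as the within block fit, with the block totals as the response. The Python sketch below regresses the y_sum column of Table 11.1 on the covariate; as before, `simple_ols` is an illustrative helper rather than a library routine, and the halving of the residual mean square follows from the variance stated in Model 11.3.

```python
import math

# Covariate and block totals (y1 + y2) from Table 11.1.
x = [23.2, 26.9, 29.4, 22.7, 30.6, 36.9, 17.6, 28.5]
y_sum = [136.4, 136.2, 142.2, 139.1, 175.2, 172.0, 116.0, 147.9]


def simple_ols(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, root mean square error)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, math.sqrt(sse / (n - 2))


alpha_s, beta_s, rmse_s = simple_ols(x, y_sum)

# The residual mean square estimates 2 * (sigma_eps^2 + 2 * sigma_b^2),
# so half of it estimates sigma_e*^2 = sigma_eps^2 + 2 * sigma_b^2.
sigma2_estar_hat = rmse_s ** 2 / 2

print(alpha_s, beta_s, rmse_s, sigma2_estar_hat)
```

The much larger residual standard deviation here, compared with the within block fit, reflects the block variance component carried by the block totals.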
11.4 COMBINING WITHIN BLOCK AND BETWEEN BLOCK INFORMATION
By combining the estimates from the within block model and the between block
model, estimates of all of the parameters can be obtained. The estimates are

$$\begin{aligned} \hat{\alpha}_1 &= (\hat{\alpha}_s + \hat{\alpha}_d)/2, & \hat{\alpha}_2 &= (\hat{\alpha}_s - \hat{\alpha}_d)/2, \\ \hat{\beta}_1 &= (\hat{\beta}_s + \hat{\beta}_d)/2, & \hat{\beta}_2 &= (\hat{\beta}_s - \hat{\beta}_d)/2, \\ \hat{\sigma}_b^2 &= (\hat{\sigma}_{e^*}^2 - \hat{\sigma}_\varepsilon^2)/2. \end{aligned} \tag{11.4}$$

The estimates of $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ involve both between block information and
within block information. Assuming the data are normally distributed, the estimators
from within blocks are distributed independently of the estimators from between
blocks.
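The relations in Equation 11.4 amount to half-sums and half-differences of the two regression fits. The Python sketch below applies them to the data of Table 11.1; the helper `simple_ols` is illustrative, and the variance component estimates use the halved residual mean squares implied by Models 11.2 and 11.3.

```python
# Data from Table 11.1.
x  = [23.2, 26.9, 29.4, 22.7, 30.6, 36.9, 17.6, 28.5]
y1 = [60.4, 59.9, 64.4, 63.5, 80.6, 75.9, 53.7, 66.3]
y2 = [76.0, 76.3, 77.8, 75.6, 94.6, 96.1, 62.3, 81.6]


def simple_ols(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, residual mean square)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse / (n - 2)


# Within block (differences) and between block (sums) regressions.
alpha_d, beta_d, mse_d = simple_ols(x, [a - b for a, b in zip(y1, y2)])
alpha_s, beta_s, mse_s = simple_ols(x, [a + b for a, b in zip(y1, y2)])

# Equation 11.4: recover the individual intercepts and slopes.
alpha1 = (alpha_s + alpha_d) / 2
alpha2 = (alpha_s - alpha_d) / 2
beta1 = (beta_s + beta_d) / 2
beta2 = (beta_s - beta_d) / 2

# Variance components: mse_d estimates 2*sigma_eps^2 and
# mse_s estimates 2*(sigma_eps^2 + 2*sigma_b^2).
sigma2_eps = mse_d / 2
sigma2_b = (mse_s / 2 - sigma2_eps) / 2

print(alpha1, alpha2, beta1, beta2, sigma2_eps, sigma2_b)
```

The four regression parameters printed here can be compared with the combined estimator reported later in this section.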
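As in Chapters 9 and 10, the beta-hat model can be used to combine the within
block and the between block information to obtain the estimators of the model’s
parameters and the corresponding covariance matrix. Let $\theta_1 = [(\alpha_1 - \alpha_2), (\beta_1 - \beta_2)]'$,
$\theta_2 = [(\alpha_1 + \alpha_2), (\beta_1 + \beta_2)]'$, $\mathrm{Var}(\hat{\theta}_1) = \Sigma_w$, $\mathrm{Var}(\hat{\theta}_2) = \Sigma_b$, and $\theta = (\alpha_1, \alpha_2, \beta_1, \beta_2)'$. The
beta-hat models relating $\hat{\theta}_1$ and $\hat{\theta}_2$ to $\theta$ are

$$\hat{\theta}_1 = H_1 \theta + e_1, \quad e_1 \sim N(0, \Sigma_w) \quad \text{and} \quad \hat{\theta}_2 = H_2 \theta + e_2, \quad e_2 \sim N(0, \Sigma_b) \tag{11.5}$$

where

$$H_1 = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix} \quad \text{and} \quad H_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

The combined estimator of $\theta$ is

$$\hat{\theta}_c = \left[ H_1' \hat{\Sigma}_w^{-1} H_1 + H_2' \hat{\Sigma}_b^{-1} H_2 \right]^{-1} \left[ H_1' \hat{\Sigma}_w^{-1} \hat{\theta}_1 + H_2' \hat{\Sigma}_b^{-1} \hat{\theta}_2 \right]$$

with estimated approximate covariance matrix

$$\hat{\Sigma}_\theta = \left[ H_1' \hat{\Sigma}_w^{-1} H_1 + H_2' \hat{\Sigma}_b^{-1} H_2 \right]^{-1}. \tag{11.6}$$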
To demonstrate the above described process, the data in Table 11.1 are used
where the within block analysis is in Table 11.5 and the between block analysis is
in Table 11.6. From Table 11.5, the within block information is
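$$\hat{\theta}_1 = \begin{bmatrix} -1.585302 \\ -0.476912 \end{bmatrix}, \quad \hat{\Sigma}_w = (2.05389)^2 \begin{bmatrix} 3.153408 & -0.112267 \\ -0.112267 & 0.00416 \end{bmatrix}$$

and from Table 11.6 the between block information is

$$\hat{\theta}_2 = \begin{bmatrix} 66.442387 \\ 2.935407 \end{bmatrix}, \quad \hat{\Sigma}_b = (10.09140)^2 \begin{bmatrix} 3.153408 & -0.112267 \\ -0.112267 & 0.00416 \end{bmatrix}.$$

The combined estimator of θ and the approximate covariance matrix are

$$\hat{\theta}_c = \begin{bmatrix} 32.4285 \\ 34.0138 \\ 1.2292 \\ 1.7062 \end{bmatrix}, \quad \hat{\Sigma}_\theta = \begin{bmatrix} 83.608516 & 76.957248 & -2.976619 & -2.739822 \\ 76.957248 & 83.608516 & -2.739822 & -2.976619 \\ -2.976619 & -2.739822 & 0.110347 & 0.101569 \\ -2.739822 & -2.976619 & 0.101569 & 0.110347 \end{bmatrix}.$$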
Matrix manipulation software such as PROC IML, described in Chapter 9, can be
used to carry out the above computations. PROC MIXED was used to provide the
computations of the combined estimator, which are displayed in the fourth part of
Table 11.7. To test equality of slopes, test $H_0\colon \beta_1 - \beta_2 = 0$ vs. $H_a\colon$ (not $H_0$). Let
$a = [0 \;\; 0 \;\; 1 \;\; -1]'$, then

$$\hat{\beta}_1 - \hat{\beta}_2 = a' \hat{\theta}_c = -0.4769$$

with variance $\widehat{\mathrm{Var}}(\hat{\beta}_1 - \hat{\beta}_2) = a' \hat{\Sigma}_\theta a = (0.132502)^2$. The test statistic is
$t_c = -0.4769/0.132502 = -3.5993$, and when compared to a t-distribution with 6 d.f., the
significance level is 0.0114. The contrast statement was used in Table 11.7 to provide
the test for equal slopes, labeled as b1 = b2.
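The beta-hat combination and the test of equal slopes can be carried out with any matrix tool. The sketch below uses plain Python lists and a small Gauss-Jordan inverse as a stand-in for the matrix software mentioned in the text; all helper names (`matmul`, `inverse`, `ols`, `weighted`) are illustrative, and the within and between block covariance matrices are taken to be the usual least squares ones computed from Table 11.1.

```python
import math

x  = [23.2, 26.9, 29.4, 22.7, 30.6, 36.9, 17.6, 28.5]
y1 = [60.4, 59.9, 64.4, 63.5, 80.6, 75.9, 53.7, 66.3]
y2 = [76.0, 76.3, 77.8, 75.6, 94.6, 96.1, 62.3, 81.6]
n = len(x)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    m = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(m)]
           for i, row in enumerate(a)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def ols(y):
    """Fit y = a + b*x; return (2x1 estimates, 2x2 estimated covariance matrix)."""
    X = [[1.0, xi] for xi in x]
    xtx_inv = inverse(matmul(transpose(X), X))
    coef = matmul(xtx_inv, matmul(transpose(X), [[yi] for yi in y]))
    resid = [yi - coef[0][0] - coef[1][0] * xi for xi, yi in zip(x, y)]
    mse = sum(r * r for r in resid) / (n - 2)
    return coef, [[mse * v for v in row] for row in xtx_inv]


theta1, sigma_w = ols([a - b for a, b in zip(y1, y2)])   # within block
theta2, sigma_b = ols([a + b for a, b in zip(y1, y2)])   # between block

H1 = [[1, -1, 0, 0], [0, 0, 1, -1]]
H2 = [[1, 1, 0, 0], [0, 0, 1, 1]]


def weighted(H, sigma, rhs):
    """Compute H' * inverse(sigma) * rhs."""
    return matmul(matmul(transpose(H), inverse(sigma)), rhs)


# Equation 11.6: combined estimator and its approximate covariance matrix.
sigma_theta = inverse(add(weighted(H1, sigma_w, H1), weighted(H2, sigma_b, H2)))
theta_c = matmul(sigma_theta,
                 add(weighted(H1, sigma_w, theta1), weighted(H2, sigma_b, theta2)))

# Test of equal slopes with a = [0 0 1 -1]'.
a = [[0], [0], [1], [-1]]
slope_diff = matmul(transpose(a), theta_c)[0][0]
se_diff = math.sqrt(matmul(matmul(transpose(a), sigma_theta), a)[0][0])
t_c = slope_diff / se_diff

print([row[0] for row in theta_c], slope_diff, se_diff, t_c)
```

Because the four elements of the two beta-hat vectors determine the four parameters exactly, the combined estimates coincide with the half-sums and half-differences of Equation 11.4; the weighting matters only for the covariance matrix.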
11.5 COMMON SLOPE MODEL
When the slopes are equal, Model 11.1 becomes

$$y_{ij} = \alpha_i + \beta x_j + b_j + \varepsilon_{ij}. \tag{11.7}$$

The within block model is

$$y_{1j} - y_{2j} = \alpha_1 - \alpha_2 + \varepsilon_{1j} - \varepsilon_{2j}, \tag{11.8}$$

a model with intercept $\alpha_1 - \alpha_2$ that does not contain any information about the
covariate. The between block model is

$$y_{1j} + y_{2j} = (\alpha_1 + \alpha_2) + 2\beta x_j + 2 b_j + \varepsilon_{1j} + \varepsilon_{2j}, \tag{11.9}$$

a simple linear regression model with intercept $\alpha_1 + \alpha_2$ and slope $2\beta$. All information
about the covariate is contained in the between block model.
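Under the common slope model the two fits separate cleanly: the within block model is an intercept-only model for the differences, and the slope comes entirely from the block totals. The Python sketch below illustrates this with the data of Table 11.1, purely as an illustration of Models 11.8 and 11.9; whether a common slope is reasonable for these data is a separate question, and the variable names are illustrative.

```python
# Data from Table 11.1.
x  = [23.2, 26.9, 29.4, 22.7, 30.6, 36.9, 17.6, 28.5]
y1 = [60.4, 59.9, 64.4, 63.5, 80.6, 75.9, 53.7, 66.3]
y2 = [76.0, 76.3, 77.8, 75.6, 94.6, 96.1, 62.3, 81.6]
n = len(x)

y_dif = [a - b for a, b in zip(y1, y2)]
y_sum = [a + b for a, b in zip(y1, y2)]

# Model 11.8: the differences have mean alpha1 - alpha2 and carry no
# information about the covariate, so their average is the estimate.
alpha_diff_hat = sum(y_dif) / n

# Model 11.9: regressing the sums on x gives intercept alpha1 + alpha2
# and slope 2*beta, so the common slope is half the fitted slope.
xbar = sum(x) / n
sbar = sum(y_sum) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (si - sbar) for xi, si in zip(x, y_sum))
slope_sum = sxy / sxx
alpha_sum_hat = sbar - slope_sum * xbar
beta_hat = slope_sum / 2

print(alpha_diff_hat, alpha_sum_hat, beta_hat)
```

The average difference is the quantity examined by a paired t-test, in line with the earlier remark that a paired t-test is appropriate when the slopes are equal.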
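In order to compute adjusted means or LSMEANS, the estimate of $\theta = [\alpha_1, \alpha_2, \beta]'$
needs to be computed. Let $\theta_1 = (\alpha_1 - \alpha_2)$, $\theta_2 = (\alpha_1 + \alpha_2, \beta)'$, $\mathrm{Var}(\hat{\theta}_1) = \Sigma_w$, and $\mathrm{Var}(\hat{\theta}_2) =
\Sigma_b$; then the beta-hat models relating $\hat{\theta}_1$ and $\hat{\theta}_2$ to $\theta$ are

$$\hat{\theta}_1 = H_1 \theta + e_1, \quad e_1 \sim N(0, \Sigma_w) \quad \text{and} \quad \hat{\theta}_2 = H_2 \theta + e_2, \quad e_2 \sim N(0, \Sigma_b)$$

where

$$H_1 = \begin{bmatrix} 1 & -1 & 0 \end{bmatrix} \quad \text{and} \quad H_2 = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

The weighted least squares estimate of $\theta$ is

$$\hat{\theta}_c = \left[ H_1' \hat{\Sigma}_w^{-1} H_1 + H_2' \hat{\Sigma}_b^{-1} H_2 \right]^{-1} \left[ H_1' \hat{\Sigma}_w^{-1} \hat{\theta}_1 + H_2' \hat{\Sigma}_b^{-1} \hat{\theta}_2 \right]$$

with estimated approximate covariance matrix

$$\hat{\Sigma}_\theta = \left[ H_1' \hat{\Sigma}_w^{-1} H_1 + H_2' \hat{\Sigma}_b^{-1} H_2 \right]^{-1}.$$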