10.3 INCOMPLETE BLOCK DESIGN STRUCTURE: WITHIN AND BETWEEN BLOCK INFORMATION
More Than Two Treatments in a Blocked Design Structure
block design provides the same within block information as the RCB, i.e., estimates of $\alpha_1 - \bar{\alpha}_{\cdot}, \alpha_2 - \bar{\alpha}_{\cdot}, \ldots, \alpha_t - \bar{\alpha}_{\cdot}, \beta_1, \beta_2, \ldots, \beta_t$.
The between block information from the incomplete block design is different
than that of the RCB. The reason is that not all treatments occur in each block; thus
the block total models are different. Let Rs = {indices of treatments in block s}.
Then the block total model for block s is
$$\sum_{i \in R_s} y_{is} = \sum_{i \in R_s} \alpha_i + \sum_{i \in R_s} \beta_i X_{is} + r_s b_s + \sum_{i \in R_s} \varepsilon_{is} \qquad (10.3)$$
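where $r_s$ is the number of treatments in block s. The variance of a block total is
$$\mathrm{Var}\Big(\sum_{i \in R_s} y_{is}\Big) = r_s \sigma_\varepsilon^2 + r_s^2 \sigma_b^2 = r_s\left(\sigma_\varepsilon^2 + r_s \sigma_b^2\right).$$
For simplicity, assume the $r_s$ are all equal to r (equal block sizes); then
$$\mathrm{Var}\Big(\sum_{i \in R_s} y_{is}\Big) = r \sigma_\varepsilon^2 + r^2 \sigma_b^2 = r\left(\sigma_\varepsilon^2 + r \sigma_b^2\right). \qquad (10.4)$$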
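As a quick check on the block total variance, the sketch below sums the entries of the covariance matrix of the r observations in one block, assuming (as in the model above) a shared block effect with variance σb² and independent errors with variance σε². The function names and the numeric values are illustrative assumptions, not quantities from the text.

```python
def block_total_variance(r, sigma2_eps, sigma2_b):
    """Closed form: Var(block total) = r * (sigma2_eps + r * sigma2_b)."""
    return r * (sigma2_eps + r * sigma2_b)


def block_total_variance_from_cov(r, sigma2_eps, sigma2_b):
    """Sum every entry of the r x r covariance matrix of one block."""
    total = 0.0
    for i in range(r):
        for k in range(r):
            # Every pair of plots shares the block effect b_s;
            # only the diagonal also carries the error variance.
            total += sigma2_b + (sigma2_eps if i == k else 0.0)
    return total


# Hypothetical values: block size 2, sigma2_eps = 1, sigma2_b = 3.
closed_form = block_total_variance(2, 1.0, 3.0)
from_cov = block_total_variance_from_cov(2, 1.0, 3.0)
```

Both routes give the same number, which is the point of writing the variance as r(σε² + rσb²): the block effect enters r² times, the error only r times.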
If there is a sufficient number of blocks (more than 2t), then between block
information provides estimates of α1, α2, …, αt , β1, β2, …, βt . Table 10.1 contains
an incomplete block design with three treatments in eight blocks. Let Tj denote the
total for block j; then the between block or block total model is
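$$
\begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \\ T_5 \\ T_6 \\ T_7 \\ T_8 \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & X_{11} & X_{21} & 0 \\
0 & 1 & 1 & 0 & X_{22} & X_{32} \\
1 & 0 & 1 & X_{13} & 0 & X_{33} \\
0 & 1 & 1 & 0 & X_{24} & X_{34} \\
1 & 1 & 0 & X_{15} & X_{25} & 0 \\
1 & 1 & 0 & X_{16} & X_{26} & 0 \\
0 & 1 & 1 & 0 & X_{27} & X_{37} \\
0 & 1 & 1 & 0 & X_{28} & X_{38}
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}
+ e
$$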
The model is full rank, thus all parameters are estimable and therefore the between
block model provides estimates of α1, α2, α3, β1, β2, and β3. The between block
information needs to be combined with the within block information to obtain better
estimators. There are some incomplete block designs that do not allow all intercepts
to be estimated from the between block model. Those types of designs are not
considered here.
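The full rank claim can be checked numerically. The sketch below builds the block total design matrix from the arrangement in Table 10.1 and computes its rank by exact Gaussian elimination. The covariate values in `X` are hypothetical (the text gives none for this design), and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

T = 3  # number of treatments
# Treatments appearing in each of the eight blocks (Table 10.1).
BLOCKS = [(1, 2), (2, 3), (3, 1), (3, 2), (2, 1), (1, 2), (2, 3), (3, 2)]
# Hypothetical covariate values, keyed by treatment, one dict per block.
X = [{1: 1, 2: 2}, {2: 3, 3: 1}, {3: 5, 1: 4}, {3: 2, 2: 1},
     {2: 1, 1: 2}, {1: 3, 2: 5}, {2: 2, 3: 3}, {3: 4, 2: 6}]


def design_row(treatments, x):
    """Treatment indicators followed by the covariates of the treatments present."""
    indicators = [1 if i in treatments else 0 for i in range(1, T + 1)]
    covariates = [x[i] if i in treatments else 0 for i in range(1, T + 1)]
    return indicators + covariates


def matrix_rank(matrix):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


W = [design_row(trts, x) for trts, x in zip(BLOCKS, X)]
rank_w = matrix_rank(W)  # equals 6 (the number of parameters) for these covariates
```

With eight blocks and six parameters, a rank of six means every intercept and slope is estimable from the block totals. A different arrangement or degenerate covariate values could give a smaller rank.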
© 2002 by CRC Press LLC
Analysis of Messy Data, Volume III: Analysis of Covariance
TABLE 10.1
An Incomplete Block Arrangement with Three Treatments in Eight Blocks

Block   Treatments
1       1   2
2       2   3
3       3   1
4       3   2
5       2   1
6       1   2
7       2   3
8       3   2
10.4 COMBINING BETWEEN BLOCK AND WITHIN
BLOCK INFORMATION
When combining between and within block information, the functions of the parameters need to be consistent for both models. For example, the RCB provides within block estimates of $\alpha_i - \bar{\alpha}_{\cdot}$ and $\beta_i$ and between block estimates of $\bar{\alpha}_{\cdot}$ and $\beta_i$. The goal is to obtain a combined estimate of the vector of parameters $\theta = (\bar{\alpha}_{\cdot}, \alpha_1 - \bar{\alpha}_{\cdot}, \alpha_2 - \bar{\alpha}_{\cdot}, \ldots, \alpha_t - \bar{\alpha}_{\cdot}, \beta_1, \beta_2, \ldots, \beta_t)'$.
An additional complication occurs when the solution to the normal equations
does not yield estimates of the desired functions of the parameters. That is the case
for PROC GLM of the SAS® system where the within block information provides
estimates of
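$$\alpha_1 - \alpha_t,\ \alpha_2 - \alpha_t,\ \ldots,\ \alpha_t - \alpha_t = 0,\ \beta_1, \beta_2, \ldots, \beta_t.$$
Let $\alpha_i^* = \alpha_i - \alpha_t$; then
$$\alpha_i^* - \bar{\alpha}_{\cdot}^* = \alpha_i - \bar{\alpha}_{\cdot}.$$
Thus, before continuing with the combining process, the estimates of estimable functions of the $\alpha_i$'s need to be transformed into estimates of $\alpha_i - \bar{\alpha}_{\cdot}$.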
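The identity relating the starred and unstarred intercepts follows in one line from averaging the α*i over the t treatments. Written out, using only the definitions above:

```latex
\bar{\alpha}^*_{\cdot} = \frac{1}{t}\sum_{j=1}^{t}(\alpha_j - \alpha_t) = \bar{\alpha}_{\cdot} - \alpha_t,
\qquad
\alpha_i^* - \bar{\alpha}^*_{\cdot}
  = (\alpha_i - \alpha_t) - (\bar{\alpha}_{\cdot} - \alpha_t)
  = \alpha_i - \bar{\alpha}_{\cdot}.
```

So centering the set-to-zero solutions about their own mean recovers the deviations from the mean intercept, whichever treatment the software uses as the reference level.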
For three treatments in a RCB design structure, the within block estimates are
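$$\hat{\theta}_1 = \left(\widehat{\alpha_1 - \bar{\alpha}_{\cdot}},\ \widehat{\alpha_2 - \bar{\alpha}_{\cdot}},\ \widehat{\alpha_3 - \bar{\alpha}_{\cdot}},\ \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3\right)' \qquad (10.5)$$
and the between block estimates are
$$\hat{\theta}_2 = \left(\hat{\bar{\alpha}}_{\cdot},\ \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3\right)'. \qquad (10.6)$$
The within block estimates can be expressed as the beta-hat model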
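$$
\hat{\theta}_1 = H_1 \theta + \varepsilon_1 \quad \text{where} \quad
H_1 = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}, \qquad (10.7)
$$
$\varepsilon_1 \sim N(0, \Sigma_1)$, $\Sigma_1 = \sigma_\varepsilon^2 G_1$,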
G1 is a matrix of constants from the respective partition of the inverse of X′X, and
X is the design matrix including treatment effects, block effects, and covariates.
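The between block information from the RCB can be expressed as the beta-hat model
$$
\hat{\theta}_2 = H_2 \theta + \varepsilon_2 \quad \text{where} \quad
H_2 = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}, \qquad (10.8)
$$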
$\varepsilon_2 \sim N(0, \Sigma_2)$, $\Sigma_2 = \sigma_B^2 G_2$, $\sigma_B^2 = r(\sigma_\varepsilon^2 + r\sigma_b^2)$, and $G_2$ is the inverse of $W'W$, where
W is the between block design matrix.
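The between block estimates and the within block estimates are distributed independently (under normality and the independence of the b's and the ε's); thus the joint within/between beta-hat model is
$$
\begin{bmatrix} \hat{\theta}_1 \\ \hat{\theta}_2 \end{bmatrix}
= \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \theta
+ \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}
\quad \text{where} \quad
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}
\sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} \right).
$$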
The best linear unbiased estimator of θ (assuming $\sigma_\varepsilon^2$ and $\sigma_b^2$ are known) is
$$\hat{\theta} = \left(H_1' \Sigma_1^{-} H_1 + H_2' \Sigma_2^{-} H_2\right)^{-1} \left(H_1' \Sigma_1^{-} \hat{\theta}_1 + H_2' \Sigma_2^{-} \hat{\theta}_2\right).$$
The weighted least squares estimate of θ, or combined between/within block estimate of θ (assuming $\sigma_\varepsilon^2$ and $\sigma_b^2$ are unknown), is
$$\hat{\theta}_c = \left(H_1' \hat{\Sigma}_1^{-} H_1 + H_2' \hat{\Sigma}_2^{-} H_2\right)^{-1} \left(H_1' \hat{\Sigma}_1^{-} \hat{\theta}_1 + H_2' \hat{\Sigma}_2^{-} \hat{\theta}_2\right) \qquad (10.9)$$
where $\hat{\Sigma}_1 = \hat{\sigma}_\varepsilon^2 G_1$ and $\hat{\Sigma}_2 = r(\hat{\sigma}_\varepsilon^2 + r\hat{\sigma}_b^2) G_2$, $\hat{\sigma}_\varepsilon^2$ is the within block residual mean square, and $r(\hat{\sigma}_\varepsilon^2 + r\hat{\sigma}_b^2)$ is the between block residual mean square. (If the block sizes are not equal, then the function of $\sigma_\varepsilon^2$ and $\sigma_b^2$ cannot be factored out of $\Sigma_2$, and the block totals will have unequal variances. See Chapter 13 for the case of unequal block sizes.) The estimated approximate variance of $\hat{\theta}_c$ is
$$\mathrm{Var}(\hat{\theta}_c) = \left(H_1' \hat{\Sigma}_1^{-} H_1 + H_2' \hat{\Sigma}_2^{-} H_2\right)^{-1} \qquad (10.10)$$
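To make the weighting in Equations 10.9 and 10.10 concrete, the sketch below treats only the scalar special case: a single parameter estimated once from within block information and once from between block information, so that H1 = H2 = 1 and the generalized inverses reduce to reciprocals. The function name and the numbers are illustrative assumptions; the general matrix case needs a linear algebra routine.

```python
def combine_scalar(est_within, var_within, est_between, var_between):
    """Inverse-variance weighted combination of two independent estimates.

    Scalar analogue of Equation 10.9 (estimate) and Equation 10.10 (variance).
    """
    w1 = 1.0 / var_within    # plays the role of H1' Sigma1^- H1
    w2 = 1.0 / var_between   # plays the role of H2' Sigma2^- H2
    combined_var = 1.0 / (w1 + w2)
    combined_est = combined_var * (w1 * est_within + w2 * est_between)
    return combined_est, combined_var


# Hypothetical inputs: the within block estimate is three times as precise.
est_c, var_c = combine_scalar(1.0, 1.0, 3.0, 3.0)
```

The combined estimate lies between the two inputs, closer to the more precise one, and its variance is smaller than either input variance.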
and the variance of an estimable linear combination of θ, say $a'\hat{\theta}_c$, is $\mathrm{Var}(a'\hat{\theta}_c) = a'\,\mathrm{Var}(\hat{\theta}_c)\,a$. The number of degrees of freedom associated with this estimated variance needs to be approximated. The Satterthwaite or Kenward and Roger approximation or a weighted average of t-values can be used. An approximate (1 – α)100% LSD value can be computed (using a weighted t-value similar to Chapter 24 of Milliken and Johnson, 1992) by replacing $\hat{\Sigma}_1$ by $(t_{\alpha/2,df_1})\hat{\Sigma}_1$ and $\hat{\Sigma}_2$ by $(t_{\alpha/2,df_2})\hat{\Sigma}_2$ in $\mathrm{Var}(\hat{\theta}_c)$, where $df_1$ is the degrees of freedom of the residual mean square from the within block analysis and $df_2$ is the degrees of freedom of the residual mean square from the between block analysis. The resulting value of $\mathrm{Var}(\hat{\theta}_c)$ is the approximate LSD value. The approximate t-value used in this LSD computation is
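$$
t^*_{\alpha/2} = \frac{a' \left[ H_1' \left(t_{\alpha/2,df_1} \hat{\Sigma}_1\right)^{-} H_1 + H_2' \left(t_{\alpha/2,df_2} \hat{\Sigma}_2\right)^{-} H_2 \right]^{-1} a}{a'\,\mathrm{Var}(\hat{\theta}_c)\,a}. \qquad (10.11)
$$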
Approximate degrees of freedom can be computed by matching $t^*_{\alpha/2}$ to the $t_{\alpha/2,df}$ values in the t-table. A Satterthwaite approximation and a Kenward-Roger (Kenward and Roger, 1997) approximation to the degrees of freedom are available as options in PROC MIXED. The statistic to test the parallelism hypothesis can be computed by constructing a beta-hat model for the slopes. Let $\hat{\beta}_c = W\hat{\theta}_c$ where W = [0, 0, I]; then the asymptotic sampling distribution of $\hat{\beta}_c$ is $N(\beta, \hat{\Sigma}_\beta)$ where $\hat{\Sigma}_\beta = W\,\mathrm{Var}(\hat{\theta}_c)\,W'$.
The beta-hat model under the equal slope hypothesis is
$$\hat{\beta}_c = j\beta + \varepsilon^*$$
where j is a t × 1 vector of ones. The residual mean square from the beta-hat model is the test statistic, i.e.,
$$
u_c = \frac{\hat{\beta}_c' \left[ \hat{\Sigma}_\beta^{-1} - \hat{\Sigma}_\beta^{-1} j \left(j' \hat{\Sigma}_\beta^{-1} j\right)^{-1} j' \hat{\Sigma}_\beta^{-1} \right] \hat{\beta}_c}{t - 1} \qquad (10.12)
$$
which has an approximate small sample size distribution of F(t – 1, df1).
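A minimal sketch of the statistic in Equation 10.12 follows, assuming the slope estimates and their estimated covariance matrix are already in hand. The inputs are hypothetical, the function names are illustrative, and the small Gauss-Jordan inverse stands in for a proper linear algebra library.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == k else 0.0 for k in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def equal_slopes_statistic(beta_hat, sigma_beta):
    """u_c: weighted residual mean square of the slopes about a common slope."""
    t = len(beta_hat)
    s_inv = invert(sigma_beta)
    s_inv_beta = [sum(s_inv[i][k] * beta_hat[k] for k in range(t)) for i in range(t)]
    quad = sum(b * v for b, v in zip(beta_hat, s_inv_beta))  # beta' S^-1 beta
    j_s_beta = sum(s_inv_beta)                               # j' S^-1 beta
    j_s_j = sum(sum(row) for row in s_inv)                   # j' S^-1 j
    # S^-1 is symmetric, so beta' S^-1 j (j' S^-1 j)^-1 j' S^-1 beta = j_s_beta**2 / j_s_j
    return (quad - j_s_beta ** 2 / j_s_j) / (t - 1)


# Hypothetical inputs: three unequal slopes with identity covariance,
# and two equal slopes with correlated estimates.
u_unequal = equal_slopes_statistic([1.0, 2.0, 3.0], [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
u_equal = equal_slopes_statistic([1.0, 1.0], [[2, 1], [1, 2]])
```

When the slope estimates are all equal, the statistic is zero whatever the covariance matrix; with an identity covariance it reduces to the ordinary sample variance of the slopes.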
Another small sample approximation can be obtained by recomputing $u_c$ where $\hat{\Sigma}_1$ and $\hat{\Sigma}_2$ are replaced with $(F_{\alpha,(t-1),df_1})\hat{\Sigma}_1$ and $(F_{\alpha,(t-1),df_2})\hat{\Sigma}_2$, respectively. Denote this value of $u_c$ by $u_c^*$. An approximate F value is
$$F^*_{\alpha,t-1,df_c} = u_c / u_c^*. \qquad (10.13)$$
The approximate degrees of freedom $df_c$ are determined by matching $F^*_{\alpha,t-1,df_c}$ to an F-table with t – 1 degrees of freedom in the numerator and significance level α. The approximate small sample distribution for $u_c$ is $F(t - 1, df_c)$.
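One way to see that Equation 10.13 behaves sensibly is the special case, assumed here only for illustration, in which df1 = df2 = df, so both covariance matrices are multiplied by the same constant F = Fα,t–1,df. The common factor cancels in Equation 10.9, so the combined estimates are unchanged, while the estimated variance in Equation 10.10 and hence the slope covariance matrix are multiplied by F:

```latex
\hat{\Sigma}_\beta \;\longrightarrow\; F\,\hat{\Sigma}_\beta
\quad\Longrightarrow\quad
u_c^* = \frac{u_c}{F}
\quad\Longrightarrow\quad
F^*_{\alpha,t-1,df_c} = \frac{u_c}{u_c^*} = F_{\alpha,t-1,df}.
```

In that case the matched degrees of freedom are dfc = df, as one would expect. When df1 and df2 differ, the approximate F value falls between the two tabled values, weighted by how much each source of information contributes.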
For the three-treatment incomplete block design structure in Table 10.1, the
within block estimates and corresponding beta-hat model are the same as above for
the RCB (Equations 10.5 through 10.7). The between block estimates are
$$\hat{\theta}_2 = \left(\hat{\alpha}_1, \hat{\alpha}_2, \hat{\alpha}_3, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3\right)'.$$
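Next transform the intercepts to $\hat{\theta}_3$, where $\hat{\theta}_3 = T\hat{\theta}_2$,
$$\hat{\theta}_3 = \left(\hat{\bar{\alpha}}_{\cdot},\ \widehat{\alpha_1 - \bar{\alpha}_{\cdot}},\ \widehat{\alpha_2 - \bar{\alpha}_{\cdot}},\ \widehat{\alpha_3 - \bar{\alpha}_{\cdot}},\ \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3\right)'$$
and
$$
T = \begin{bmatrix}
1/3 & 1/3 & 1/3 & 0 & 0 & 0 \\
2/3 & -1/3 & -1/3 & 0 & 0 & 0 \\
-1/3 & 2/3 & -1/3 & 0 & 0 & 0 \\
-1/3 & -1/3 & 2/3 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$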
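The beta-hat model for $\hat{\theta}_3$ is
$$\hat{\theta}_3 = H_3 \theta + \varepsilon_3 \qquad (10.14)$$
where $H_3 = I_7$ and $\mathrm{Var}(\varepsilon_3) = \Sigma_3 = T \Sigma_2 T'$.
The combined estimate of θ can be computed from Equation 10.9 where $\hat{\theta}_2$, $H_2$, and $\hat{\Sigma}_2$ are replaced by $\hat{\theta}_3$, $H_3$, and $\hat{\Sigma}_3$, respectively. Two examples are used to demonstrate the above ideas.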
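The transformation can be sketched as follows: the code builds T for t treatments, applies it to a vector of between block estimates, and forms TΣ2T′. The estimates in `theta2` are hypothetical and the function names are assumptions; exact fractions are used so that the thirds in T are not rounded.

```python
from fractions import Fraction


def intercept_transform(t):
    """Return the (2t + 1) x 2t matrix T for t treatments."""
    one_over_t = Fraction(1, t)
    zeros = [Fraction(0)] * t
    rows = [[one_over_t] * t + zeros]                    # mean intercept
    for i in range(t):                                   # alpha_i minus the mean
        rows.append([(Fraction(1) if i == k else Fraction(0)) - one_over_t
                     for k in range(t)] + zeros)
    for i in range(t):                                   # slopes pass through unchanged
        rows.append(zeros + [Fraction(1) if i == k else Fraction(0) for k in range(t)])
    return rows


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transform_covariance(T, sigma):
    """Sigma_3 = T Sigma_2 T'."""
    n_rows, n_cols = len(T), len(sigma)
    t_sigma = [[sum(T[i][k] * sigma[k][j] for k in range(n_cols)) for j in range(n_cols)]
               for i in range(n_rows)]
    return [[sum(t_sigma[i][k] * T[j][k] for k in range(n_cols)) for j in range(n_rows)]
            for i in range(n_rows)]


T3 = intercept_transform(3)
# Hypothetical between block estimates (alpha_1, alpha_2, alpha_3, beta_1, beta_2, beta_3).
theta2 = [Fraction(3), Fraction(6), Fraction(9),
          Fraction(1, 2), Fraction(7, 10), Fraction(9, 10)]
theta3 = mat_vec(T3, theta2)
```

For these inputs the first element of `theta3` is the mean intercept 6, the next three are the deviations −3, 0, and 3 (which sum to zero), and the slopes are carried over unchanged. Because the three deviations always sum to zero, Σ3 is singular, which is why generalized inverses appear in Equation 10.9.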
TABLE 10.2
Data for the Example in Section 10.5 Involving Five Treatments in a RCB Design Structure Where Y is the Response Variable and X is the Covariate

        Treatment 1     Treatment 2     Treatment 3     Treatment 4     Treatment 5
BLOCK   Y1      X1      Y2      X2      Y3      X3      Y4      X4      Y5      X5
1       55.1    14.0    59.6    11.4    63.5    14.4    66.1    12.2    80.5    21.1
2       50.7    15.4    59.1    20.6    65.7    20.3    59.9    11.9    78.6    24.3
3       58.4    13.5    65.5    21.2    66.1    15.6    77.8    21.6    76.4    16.1
4       60.5    19.0    57.9     9.0    72.7    20.8    73.5    17.0    78.7    17.5
5       60.6    12.0    65.6    16.9    70.6    16.5    91.1    31.4    82.2    19.0
6       60.4    24.1    64.8    21.3    62.4    14.2    69.7    16.2    67.0    12.5
7       62.1    28.9    64.7    23.6    66.0    17.8    76.2    22.9    86.8    26.9
8       59.5    18.7    64.0    16.0    76.1    24.8    78.3    19.1    81.2    19.8
9       68.1    25.6    64.9    13.2    80.5    27.2    79.8    19.4    96.4    28.6
10      66.7    22.7    75.9    28.5    82.8    28.5    84.0    22.4    79.0    16.7
11      68.7    25.3    68.5    14.8    75.3    19.2    87.8    26.1    76.1    14.9
12      64.9    22.6    63.6    11.1    75.5    18.4    80.2    19.8    88.5    22.4
10.5 EXAMPLE: FIVE TREATMENTS IN RCB
DESIGN STRUCTURE
The data in Table 10.2 are yields (Y) (kilograms per hectare) of winter wheat where
the treatments are herbicides applied during the spring and the covariate is the depth
of adequate moisture (centimeters) measured on each plot. The design structure is
a RCB.
Table 10.3 contains the SAS® system code used to extract the within block
information from the data set. The estimate of $\sigma_\varepsilon^2$ is 0.9965, and the parameter estimates provide estimates of the slopes (denoted by x*trt 1, …, x*trt 5) and estimates of $\alpha_i - \alpha_5$, i = 1, 2, …, 5 (denoted by trt 1, …, trt 5). Estimate statements have been included in Table 10.4 to provide estimates of $\alpha_i - \bar{\alpha}_{\cdot}$, quantities that are needed in the combined estimator process. Table 10.5 contains the PROC GLM code to provide the statistic to test the equal slopes hypothesis using the within block information. The value of the F statistic is 32.23 with a significance level less than 0.0001, indicating there is sufficient within block information to conclude the slopes are not all equal. Using the expected mean squares for BLOCKS and ERROR, the estimate of the block variance component is $\hat{\sigma}_b^2$ = (59.8718 – 0.9965)/4.5455 = 12.9524. The block totals for Y are listed in Table 10.6 and are denoted by SY, where SX1, …, SX5 are the sums of the covariates within each block. The PROC GLM code to fit the between block model is in Table 10.7. Also included are estimate statements to provide estimates of $\bar{\alpha}_{\cdot}$ and the five slopes. The contrast statement provides a between block test of the equal slopes hypothesis. The resulting F statistic is 5.50 with significance level 0.0330; again there is sufficient between block information to conclude the slopes are not all equal. The between block analysis of
TABLE 10.3
PROC GLM Code to Fit the Within Block Model with Parameter Estimates

PROC GLM DATA=LONG10; CLASSES TRT BLOCK;
MODEL Y=BLOCK TRT X*TRT/SOLUTION INVERSE;
ESTIMATE "A1-Abar" TRT .8 -.2 -.2 -.2 -.2;
ESTIMATE "A2-Abar" TRT -.2 .8 -.2 -.2 -.2;
ESTIMATE "A3-Abar" TRT -.2 -.2 .8 -.2 -.2;
ESTIMATE "A4-Abar" TRT -.2 -.2 -.2 .8 -.2;
ESTIMATE "A5-Abar" TRT -.2 -.2 -.2 -.2 .8;

Source        df    SS           MS          FValue    ProbF
Model         20    5724.9054    286.2453    287.25    0.0000
Error         39      38.8640      0.9965
Corr Total    59    5763.7693

Source        df    SS(III)      MS          FValue    ProbF
BLOCK         11     658.5902     59.8718     60.08    0.0000
trt            4       6.5423      1.6356      1.64    0.1834
x*trt          5    1040.1444    208.0289    208.76    0.0000

Parameter     Estimate    Biased    StdErr    tValue    Probt
trt 1         -1.3249     1         1.8508    -0.72     0.4783
trt 2          0.5019     1         1.7831     0.28     0.7798
trt 3          0.9823     1         1.9378     0.51     0.6151
trt 4          3.0808     1         1.9275     1.60     0.1180
trt 5          0.0000     1
x*trt 1        0.5100     0         0.0613     8.32     0.0000
x*trt 2        0.6734     0         0.0579    11.63     0.0000
x*trt 3        0.9135     0         0.0708    12.90     0.0000
x*trt 4        1.0798     0         0.0611    17.67     0.0000
x*trt 5        1.4309     0         0.0693    20.66     0.0000
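The block variance component quoted in Section 10.5 can be reproduced from the mean squares in Table 10.3. The sketch below assumes, as the arithmetic in the text implies, that the expected BLOCK mean square is σε² + 4.5455σb²; the coefficient 4.5455 is taken from the text, and the function and variable names are illustrative.

```python
MS_BLOCK = 59.8718    # Type III mean square for BLOCK (Table 10.3)
MS_ERROR = 0.9965     # residual mean square (Table 10.3)
BLOCK_COEF = 4.5455   # coefficient of sigma_b^2 in E(MS BLOCK), as quoted in the text


def block_variance_component(ms_block, ms_error, coef):
    """Method-of-moments estimate: solve E(MS BLOCK) = sigma_e^2 + coef * sigma_b^2."""
    return (ms_block - ms_error) / coef


sigma2_b_hat = block_variance_component(MS_BLOCK, MS_ERROR, BLOCK_COEF)
print(round(sigma2_b_hat, 4))
```

The result agrees with the value 12.9524 reported in the text to four decimal places.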
TABLE 10.4
PROC GLM Code for the Within Block Model Parameter Estimates from Estimate Statements

ESTIMATE "A1-Abar" TRT .8 -.2 -.2 -.2 -.2;
ESTIMATE "A2-Abar" TRT -.2 .8 -.2 -.2 -.2;
ESTIMATE "A3-Abar" TRT -.2 -.2 .8 -.2 -.2;
ESTIMATE "A4-Abar" TRT -.2 -.2 -.2 .8 -.2;
ESTIMATE "A5-Abar" TRT -.2 -.2 -.2 -.2 .8;

Parameter    Estimate    StdErr    tValue    Probt
A1-Abar      -1.9730     1.0980    -1.80     0.0801
A2-Abar      -0.1461     0.9878    -0.15     0.8832
A3-Abar       0.3343     1.2054     0.28     0.7830
A4-Abar       2.4328     1.1397     2.13     0.0391
A5-Abar      -0.6480     1.2343    -0.53     0.6026
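The estimates in Table 10.4 are the set-to-zero solutions from Table 10.3 centered about their own mean, which is what the coefficients .8 and -.2 in the ESTIMATE statements compute. A short sketch of that arithmetic, with an illustrative function name:

```python
# Set-to-zero solutions for trt 1..5 from Table 10.3 (estimates of alpha_i - alpha_5).
TRT_SOLUTION = [-1.3249, 0.5019, 0.9823, 3.0808, 0.0000]
# Estimates of alpha_i - alpha-bar reported in Table 10.4.
TABLE_10_4 = [-1.9730, -0.1461, 0.3343, 2.4328, -0.6480]


def center(values):
    """Subtract the mean, i.e., apply weights (1 - 1/t) and -1/t as in the ESTIMATE statements."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


centered = center(TRT_SOLUTION)
max_gap = max(abs(c - e) for c, e in zip(centered, TABLE_10_4))
```

The centered values match Table 10.4 to within rounding of the four-decimal inputs (`max_gap` is below 0.0002). The standard errors cannot be recovered this way, because they depend on the covariances among the solutions.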