8.6 EXAMPLE: TWO-WAY TREATMENT STRUCTURE WITH ONE COVARIATE




Analysis of Messy Data, Volume III: Analysis of Covariance



TABLE 8.9
SAS® System Code to Perform Preliminary Computations to Compare WOOD*STOVE Models

*Fit one model to all data;
PROC REG; MODEL ENERGY = MOISTURE;

*Fit a model to each level of wood;
PROC SORT; BY WOOD;
PROC REG; BY WOOD; MODEL ENERGY = MOISTURE;

*Fit a model to each level of stove type;
PROC SORT; BY STOVE;
PROC REG; BY STOVE; MODEL ENERGY = MOISTURE;

*Fit a model to each combination of WOOD*STOVE;
PROC SORT; BY STOVE WOOD;
PROC REG; BY STOVE WOOD; MODEL ENERGY = MOISTURE;
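The four groups of fits in Table 8.9 all follow one pattern: split the data by a grouping rule, fit a straight line within each group, and pool the residual sums of squares and degrees of freedom. The following is a minimal Python sketch of that pattern, not a translation of the SAS code. The function names, the record layout, and the small data set are hypothetical and are used only for illustration; they are not the wood and stove data of the example.

```python
from collections import defaultdict

def ssres_line(xs, ys):
    """Residual sum of squares and df for the least squares line y = a + b*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    syy = sum((y - ybar) ** 2 for y in ys)
    return syy - sxy ** 2 / sxx, n - 2

def pooled_ssres(rows, key):
    """Fit one line per group (group label = key(row)); pool SSRES and df."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    total_ss, total_df = 0.0, 0
    for members in groups.values():
        ss, df = ssres_line([r["moisture"] for r in members],
                            [r["energy"] for r in members])
        total_ss += ss
        total_df += df
    return total_ss, total_df

# Hypothetical data: each wood-by-stove cell lies exactly on its own line.
rows = []
for wood, stove, a, b in [("Red Oak", "A", 1.0, 0.5), ("Red Oak", "B", 2.0, 0.4),
                          ("White Pine", "A", 0.5, 0.7), ("White Pine", "B", 1.5, 0.6)]:
    for m in (8.0, 12.0, 16.0):
        rows.append({"wood": wood, "stove": stove,
                     "moisture": m, "energy": a + b * m})

combined = pooled_ssres(rows, key=lambda r: "all")          # one model
by_wood = pooled_ssres(rows, key=lambda r: r["wood"])       # one per wood
by_stove = pooled_ssres(rows, key=lambda r: r["stove"])     # one per stove
by_cell = pooled_ssres(rows, key=lambda r: (r["stove"], r["wood"]))

for label, (ss, df) in [("COMBINED", combined), ("WOOD", by_wood),
                        ("STOVE", by_stove), ("POOLED", by_cell)]:
    print(f"SSRES({label}) = {ss:.5f}, d.f. = {df}")
```

Because each grouping refines the previous one, the pooled residual sum of squares cannot increase as the grouping becomes finer, which is what makes the differences of these quantities usable as sums of squares for the model comparisons.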



TABLE 8.10
Residual Sums of Squares and Degrees of Freedom for One Model for All Data, One Model for Each Wood, and One Model for Each Stove, Used to Compute SSRES(COMBINED), SSRES(WOOD), and SSRES(STOVE)

(1) One model     SSRES(COMBINED) = 326.64870, d.f. = 86

(2) WOOD          SSRes(WOOD_i)    df
Black Walnut      19.05508         20
Osage Orange       8.70268         22
Red Oak            9.66393         18
White Pine        10.71815         20
SSRES(WOOD)       48.13984         80

(3) STOVE         SSRes(STOVE_j)   df
Type A            173.27729        28
Type B             70.59879        23
Type C             82.32894        31
SSRES(STOVE)      326.20502        82



for the levels of wood, the equality of models for the levels of stove, and testing for wood by stove interaction effects on the models. By summing the model over the levels of stove for each level of wood, a wood level model can be obtained as

\[ y_{i \cdot k} = \alpha_{i \cdot} + \beta_{i \cdot} M_{i \cdot k} + \varepsilon_{i \cdot k} \]

© 2002 by CRC Press LLC



Comparing Models for Several Treatments






TABLE 8.11
Residual Sums of Squares and Degrees of Freedom for Various Wood*Stove Combinations Used to Compute SSRES(POOLED)

STOVE     WOOD            SSRes(W_i, S_j)   df
Type A    Black Walnut    0.10304            4
Type A    Osage Orange    0.34459            7
Type A    Red Oak         0.77695            3
Type A    White Pine      0.66661            8
Type B    Black Walnut    0.54988            6
Type B    Osage Orange    1.00689            4
Type B    Red Oak         0.25559            4
Type B    White Pine      1.17348            3
Type C    Black Walnut    1.02172            6
Type C    Osage Orange    1.47882            7
Type C    Red Oak         1.29265            7
Type C    White Pine      0.48916            5
SSRES(POOLED)             9.15941           64



To compute the sum of squares for the test of equality of the wood models, fit a model to the data from each level of wood, using the code in the second part of Table 8.9. The residual sums of squares for each level of wood are in part (2) of Table 8.10, where SSRES(WOOD) = 48.13984 is based on 80 degrees of freedom. The sum of squares due to deviations from the equal models for the levels of wood hypothesis is SS_WOOD = 326.64870 – 48.13984 = 278.50886 and is based on 86 – 80 = 6 degrees of freedom. The value of the F statistic is 324.34 as displayed in Table 8.12. The significance level is very small, providing sufficient evidence to conclude the levels of wood models are not all identical. Recall that this is a test of a Type II hypothesis (Milliken and Johnson, 1992).

TABLE 8.12
Sums of Squares, Degrees of Freedom and F Statistics for Wood, Stove and Wood*Stove

σ̂² = 0.143116

SS_WOOD = 326.64870 – 48.13984 = 278.50886, d.f. = 6
F_cw = 46.41814 / 0.143116 = 324.34

SS_STOVE = 326.64870 – 326.20502 = 0.44368, d.f. = 4
F_cs = 0.11092 / 0.143116 = 0.78

SS_WOOD×STOVE = 48.13984 + 326.20502 – 326.64870 – 9.15941 = 38.53675, d.f. = 80 + 82 – 86 – 64 = 12
F_cws = 3.21140 / 0.143116 = 22.4391

Next investigate the effect of the levels of stove on the regression models. This

is accomplished by testing the equality of the models for the levels of stove. By

summing the model over the levels of wood for each level of stove, a stove level

model can be obtained as

\[ y_{\cdot jk} = \alpha_{\cdot j} + \beta_{\cdot j} M_{\cdot jk} + \varepsilon_{\cdot jk} \]

To compute the sum of squares due to test for equality of stove models, fit a model

to the data from each level of stove. The code in the third part of Table 8.9 is used

to fit a model to each level of stove. The sums of squares residuals for each level

of stove are in part (3) of Table 8.10 where SSRES(STOVE) = 326.20502 and is

based on 82 degrees of freedom. The sum of squares due to deviations from the

equal models for the levels of stove hypothesis is SS_STOVE = 326.64870 –

326.20502 = 0.44368 and is based on 86 – 82 = 4 degrees of freedom. The value

of the F statistic is 0.78 as displayed in Table 8.12. The significance level is very

large, so there is no evidence the levels of stove models are not all identical.

The final step is to investigate the possibility of wood by stove interaction effect

on the regression models. The no interaction effect on the regression models hypothesis is equivalent to the hypothesis that all 2 × 2 table differences (Milliken and

Johnson (1992)) of the vectors of parameters are equal to zero. This no interaction

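hypothesis can be stated as

\[
H_{0WS}: \begin{pmatrix} \alpha_{ij} \\ \beta_{ij} \end{pmatrix} - \begin{pmatrix} \alpha_{is} \\ \beta_{is} \end{pmatrix} - \begin{pmatrix} \alpha_{rj} \\ \beta_{rj} \end{pmatrix} + \begin{pmatrix} \alpha_{rs} \\ \beta_{rs} \end{pmatrix} = 0 \ \text{for all } i, j, s, r \quad \text{vs.} \quad H_a: (\text{not } H_{0WS})
\]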

The sum of squares due to deviations from the above hypothesis is SS_WOOD × STOVE = 48.13984 + 326.20502 – 326.64870 – 9.15941 = 38.53675 and is based on 80 + 82 – 86 – 64 = 12 degrees of freedom. The value of the F statistic is 22.44

as shown in Table 8.12. The significance level is <0.0001, indicating that there is

sufficient evidence to conclude there is an interaction effect on the wood by stove

models. Table 8.13 contains the analysis of variance table summarizing the above

factorial effects on the models. Remember the test statistics are comparing the

models not just an individual parameter of the models. So the numbers of degrees

of freedom for the main effects and interaction in Table 8.13 are the usual numbers

of degrees of freedom times the number of parameters in a treatment combination

model. Remember, these are Type II sums of squares.
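The arithmetic behind the three model comparison tests can be reproduced directly from the residual sums of squares in Tables 8.10 and 8.11. The short Python sketch below does this; the variable and function names are illustrative choices, and only the numerical inputs come from the tables.

```python
# Residual sums of squares and degrees of freedom from Tables 8.10 and 8.11.
ss_combined, df_combined = 326.64870, 86   # one model for all data
ss_wood, df_wood = 48.13984, 80            # one model per wood
ss_stove, df_stove = 326.20502, 82         # one model per stove
ss_pooled, df_pooled = 9.15941, 64         # one model per wood*stove cell

# Estimate of the error variance from the full (cell) models.
sigma2_hat = ss_pooled / df_pooled

def f_statistic(ss_effect, df_effect):
    """Mean square for the effect divided by the pooled error mean square."""
    return (ss_effect / df_effect) / sigma2_hat

# Equal wood models: combined model versus one model per wood.
ss_w, df_w = ss_combined - ss_wood, df_combined - df_wood
# Equal stove models: combined model versus one model per stove.
ss_s, df_s = ss_combined - ss_stove, df_combined - df_stove
# No wood-by-stove interaction effect on the models.
ss_ws = ss_wood + ss_stove - ss_combined - ss_pooled
df_ws = df_wood + df_stove - df_combined - df_pooled

f_wood = f_statistic(ss_w, df_w)
f_stove = f_statistic(ss_s, df_s)
f_ws = f_statistic(ss_ws, df_ws)

for name, ss, df, f in [("WOOD", ss_w, df_w, f_wood),
                        ("STOVE", ss_s, df_s, f_stove),
                        ("WOOD*STOVE", ss_ws, df_ws, f_ws)]:
    print(f"{name:<11} df = {df:>2}  SS = {ss:10.5f}  F = {f:7.2f}")
```

A useful check on the bookkeeping is that the three effect sums of squares and SSRES(POOLED) add back to SSRES(COMBINED), and that the corresponding degrees of freedom add to 86.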



TABLE 8.13
Analysis of Variance Table to Summarize the Factorial Effects of Wood and Stove on the Regression Models

Source         df    SS(II)       F
WOOD            6    278.50886    324.34
STOVE           4      0.44368      0.78
WOOD*STOVE     12     38.53675     22.44
ERROR          64      9.15941

8.7 DISCUSSION

A very useful method is presented to test identical model hypotheses. The method could be used to perform prior tests of equality of models before examining the individual parameters in order to control the Type I error rate. That is, one would only continue with the analysis of covariance process of simplifying the form of the model if there were adequate evidence to believe the models are not identical. The process is easily implemented with a multiple regression computer package, i.e., these comparisons can be accomplished without a general linear models computer package.



REFERENCES

Hinds, M. A. and Milliken, G. A. (1987). Statistical methods to use nonlinear models to compare silage treatments, Biometrical Journal 29(6), 825–834.

Milliken, G. A. and Johnson, D. E. (1992). Analysis of Messy Data, Volume I: Designed Experiments, London: Chapman & Hall.

Schaff, D. A., Milliken, G. A., and Clayberg, C. D. (1988). A method for analyzing nonlinear models when the data are from a split-plot or repeated measures design, Biometrical Journal 30(2), 139–146.



EXERCISES

EXERCISE 8.1: Compare the equality of the models for the data in Section 3.4.

EXERCISE 8.2: Compare the equality of the models for the data in Section 4.5.

EXERCISE 8.3: Compare the equality of the models for the data in Section 5.4.

EXERCISE 8.4: Compare the equality of the models for the data in Section 5.7.

EXERCISE 8.5: For the data in Section 5.5, use contrast statements to obtain tests

for equal wood models, equal stove models, and the no interaction hypothesis and

compare the results to those in Section 8.6. The contrast statements will provide

tests of Type III hypotheses.






9 Two Treatments in a Randomized Complete Block Design Structure



9.1 INTRODUCTION

The introduction of blocking into the analysis of covariance models presents another

dimension in the analysis. That dimension involves obtaining information about the

slopes of the lines from the block means or totals, a process called the recovery of

interblock information, as well as obtaining information about the slopes from the

within block comparisons. The recovery of interblock information about treatment

effects is used in the analysis of incomplete block designs, but there is no interblock

information about treatment effects in complete block designs. This chapter develops

the general methodology for analyzing analysis of covariance models when the data

are collected in complete blocks. The next step is to consider the analysis of covariance in incomplete block designs which include split-plot and repeated measures

designs (discussed in later chapters). A simple experiment involving two treatments

in six blocks is used throughout this chapter to demonstrate the various concepts.

The last section gives an example with equal slopes in 20 blocks.



9.2 COMPLETE BLOCK DESIGNS

An experiment was conducted to evaluate the effect of two herbicides on soybean

yield. The experimental design consists of a one-way treatment structure in a randomized complete block design structure. The herbicides were preemergence herbicides and were incorporated into the soil. The activity of the herbicides was thought

to be influenced by the amount of organic matter in the soil, so the organic matter

content of each plot was determined and was used as a possible covariate. For

purposes of discussion, it is assumed that organic matter affects the yield of the

plots linearly. As in any modeling process, this assumption must be substantiated

before the analysis can continue. Since the herbicides were of differing chemical

compositions, there was the possibility of unequal slopes.

A model that can be used to describe the yield of the soybeans grown on a plot

in the jth block treated by the ith herbicide is

\[ y_{ij} = \alpha_i + \beta_i x_{ij} + b_j + \varepsilon_{ij}, \quad i = 1, 2, \quad j = 1, 2, \ldots, 6, \tag{9.1} \]



where

$y_{ij}$ denotes the observed yield per plot in bushels per acre,

$\alpha_i$ denotes the mean response of the ith herbicide when the value of the covariate (organic matter) is zero,

$x_{ij}$ denotes the value of the covariate (organic matter) measured on the experimental unit in the jth block receiving the ith herbicide,

$\beta_i$ is the slope of the regression line for the ith herbicide,

$b_j$ is the random block effect associated with the jth block, where the block effects are assumed to be distributed iid $N(0, \sigma_b^2)$, and

$\varepsilon_{ij}$ denotes the random error, where the errors are assumed to be distributed iid $N(0, \sigma_\varepsilon^2)$.

Model 9.1 is a mixed model in that its parameters include two components of

variance (Milliken and Johnson, 1992, and Littell et al., 1996). To help understand

how the mixed models analysis operates, the within block analysis, the between

block analysis, and the combined within and between block analysis are discussed.

Before the development of mixed models software, the analysis of Model 9.1 could

be carried out in two parts: the Within Block Analysis (that is done by most computer

codes) and the Between Block Analysis (which is not done by most computer codes).



9.3 WITHIN BLOCK ANALYSIS

The within block analysis provides estimates of the slopes which are based on the

within block information, i.e., the estimates of the slopes are based on contrasts of

the observations computed within each block. Within block information is free of

block effects and the variance of a within block estimate is a scalar multiple of $\sigma_\varepsilon^2$.

To demonstrate this idea, consider the data in Table 9.1 which represent the yield of

soybeans in bushels per acre where the treatments are two herbicides, the covariate

is the percent organic matter, and there are six blocks.

The within block analysis of Model 9.1 can be carried out by taking contrasts

within the blocks and analyzing the corresponding models. Since there are only two

treatments per block, the only contrast within each block is the difference dj = y1j

– y2j. The within block model is constructed by taking the difference of the models,

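i.e., the model for dj is

\[
\begin{aligned}
d_j &= (\alpha_1 + \beta_1 x_{1j} + b_j + \varepsilon_{1j}) - (\alpha_2 + \beta_2 x_{2j} + b_j + \varepsilon_{2j}) \\
&= \alpha_1 + \beta_1 x_{1j} + \varepsilon_{1j} - \alpha_2 - \beta_2 x_{2j} - \varepsilon_{2j} \\
&= \alpha_1 - \alpha_2 + \beta_1 x_{1j} - \beta_2 x_{2j} + \varepsilon_{1j} - \varepsilon_{2j}
\end{aligned}
\]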

The difference model is free of the block effects, and the variance of each difference is $2\sigma_\varepsilon^2$. By letting αd = α1 – α2 and ej = ε1j – ε2j, the model for the differences (called a within block model) can be expressed as the two independent variable multiple regression model

\[ d_j = \alpha_d + \beta_1 x_{1j} - \beta_2 x_{2j} + e_j . \]

TABLE 9.1
Yield of Soybeans (in bu/acre) for Herbicide Treatments with Percent Organic Matter (OM) as a Covariate in RCB Design Structure

          Herbicide 1       Herbicide 2
Block     Yield    OM       Yield    OM
1         26.6     0.91     30.2     1.02
2         31.1     1.22     29.2     0.89
3         34.7     1.43     32.1     1.39
4         34.4     1.45     31.9     1.47
5         32.1     1.33     30.2     1.27
6         28.5     1.10     31.0     1.12

By fitting the multiple regression model to the data in Table 9.1, one obtains estimates of the estimable functions of the parameters of the original model, i.e., α1 – α2, β1, β2, and $\sigma_\varepsilon^2$. The matrix form of the model for the differences computed from the data in Table 9.1 is

\[
\begin{bmatrix} -3.60 \\ 1.90 \\ 2.60 \\ 2.50 \\ 1.90 \\ -2.50 \end{bmatrix}
=
\begin{bmatrix}
1 & 0.91 & -1.02 \\
1 & 1.22 & -0.89 \\
1 & 1.43 & -1.39 \\
1 & 1.45 & -1.47 \\
1 & 1.33 & -1.27 \\
1 & 1.10 & -1.12
\end{bmatrix}
\begin{bmatrix} \alpha_d \\ \beta_1 \\ \beta_2 \end{bmatrix} + e
\]

or $d = Z\eta + e$, where $\eta' = [\alpha_d, \beta_1, \beta_2]$.

The least squares estimates of the parameters of the model for the differences are

\[
\begin{bmatrix} \hat{\alpha}_d \\ \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix}
= (Z'Z)^{-1} Z'd =
\begin{bmatrix} -13.82302281 \\ 16.95662402 \\ 5.64513210 \end{bmatrix} . \tag{9.2}
\]
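The within block fit can be reproduced with any multiple regression routine. As a minimal sketch in plain Python, the code below forms the differences from Table 9.1, builds the design matrix with columns 1, x1j and –x2j, and solves the normal equations by Gaussian elimination. The helper names `solve` and `least_squares` are illustrative and are not part of any package referred to in the text.

```python
# Table 9.1: yield (bu/acre) and percent organic matter for the six blocks.
yield1 = [26.6, 31.1, 34.7, 34.4, 32.1, 28.5]
om1 = [0.91, 1.22, 1.43, 1.45, 1.33, 1.10]
yield2 = [30.2, 29.2, 32.1, 31.9, 30.2, 31.0]
om2 = [1.02, 0.89, 1.39, 1.47, 1.27, 1.12]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def least_squares(design, response):
    """Return the least squares coefficients and the residual sum of squares."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, response)) for i in range(p)]
    coef = solve(xtx, xty)
    rss = sum((y - sum(c * v for c, v in zip(coef, row))) ** 2
              for row, y in zip(design, response))
    return coef, rss

# Within block contrasts d_j = y_1j - y_2j and design columns 1, x_1j, -x_2j.
d = [y1 - y2 for y1, y2 in zip(yield1, yield2)]
Z = [[1.0, x1, -x2] for x1, x2 in zip(om1, om2)]

eta_hat, rss_within = least_squares(Z, d)
mse_within = rss_within / (len(d) - len(eta_hat))   # estimates 2 * sigma_e^2

print("alpha_d, beta_1, beta_2 =", [round(v, 5) for v in eta_hat])
print("residual SS =", round(rss_within, 4), " mean square =", round(mse_within, 4))
```

Negating the organic matter column for herbicide 2 is a design choice that makes the third coefficient estimate β2 itself rather than –β2, which matches the parameterization used in Equation 9.2.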






The residual sum of squares for the difference model is 1.44, which is based on 3 degrees of freedom. The residual mean square, 1.44/3 = 0.48, is an estimate of $2\sigma_\varepsilon^2$, the variance of the $d_j$. The estimated covariance matrix of the estimated parameter vector is

\[
2\hat{\sigma}_\varepsilon^2 (Z'Z)^{-1} =
\begin{bmatrix}
3.633765 & -2.0436818 & 0.8542064 \\
-2.0436818 & 5.1684106 & 3.6579537 \\
0.8542064 & 3.6579537 & 4.5168137
\end{bmatrix}
= \hat{\Sigma}_w . \tag{9.3}
\]



The estimated covariance matrix corresponding to the within block analysis estimates of the two slopes is the lower 2 × 2 partition of $\hat{\Sigma}_w$, or

\[
\hat{\Sigma}_{\beta_w} =
\begin{bmatrix}
5.1684106 & 3.6579537 \\
3.6579537 & 4.5168137
\end{bmatrix} .
\]



The estimates of the standard errors of the parameter estimates are obtained by

taking the square root of the diagonal elements of the covariance matrix. These

standard errors can be used to construct confidence intervals about the respective

parameters. The above estimates of the parameters are what would be obtained from

a computer code that performs the within block analysis, but there is additional

information about the slopes and treatment effects from the between block analysis.



9.4 BETWEEN BLOCK ANALYSIS

The between block analysis utilizes the model of the block totals and provides information about the parameters contained in those block totals. Let tj = y1j + y2j denote the total of the two observations in the jth block. The block total model is constructed by taking the totals of the corresponding models, i.e.,

\[
\begin{aligned}
t_j &= \alpha_1 + \beta_1 x_{1j} + b_j + \varepsilon_{1j} + \alpha_2 + \beta_2 x_{2j} + b_j + \varepsilon_{2j} \\
&= \alpha_1 + \alpha_2 + \beta_1 x_{1j} + \beta_2 x_{2j} + 2 b_j + \varepsilon_{1j} + \varepsilon_{2j}
\end{aligned}
\]

which can be expressed as $t_j = \alpha_t + \beta_1 x_{1j} + \beta_2 x_{2j} + r_j$, where $\alpha_t = \alpha_1 + \alpha_2$ and $r_j = 2b_j + \varepsilon_{1j} + \varepsilon_{2j}$. The variance of a block total is $2(\sigma_\varepsilon^2 + 2\sigma_b^2)$. A multiple regression program can be used to fit the above model to the block totals and obtain the between block estimates of $\alpha_t$, $\beta_1$, $\beta_2$, and $2(\sigma_\varepsilon^2 + 2\sigma_b^2)$.
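The two variances used in the within block and between block analyses follow from the independence of the block effect and the two errors in Model 9.1. As a short worked derivation, using only the assumptions stated with Model 9.1:

```latex
\begin{aligned}
\operatorname{Var}(d_j) &= \operatorname{Var}(\varepsilon_{1j} - \varepsilon_{2j})
  = \sigma_\varepsilon^2 + \sigma_\varepsilon^2 = 2\sigma_\varepsilon^2 , \\
\operatorname{Var}(t_j) &= \operatorname{Var}(2 b_j + \varepsilon_{1j} + \varepsilon_{2j})
  = 4\sigma_b^2 + \sigma_\varepsilon^2 + \sigma_\varepsilon^2
  = 2(\sigma_\varepsilon^2 + 2\sigma_b^2) .
\end{aligned}
```

The block effect cancels in the difference but is doubled in the total, which is why only the between block residual mean square carries information about $\sigma_b^2$.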

The matrix form of the block total model for the data in Table 9.1 is

\[
\begin{bmatrix} 56.8 \\ 60.3 \\ 66.8 \\ 66.3 \\ 62.3 \\ 59.5 \end{bmatrix}
=
\begin{bmatrix}
1 & 0.91 & 1.02 \\
1 & 1.22 & 0.89 \\
1 & 1.43 & 1.39 \\
1 & 1.45 & 1.47 \\
1 & 1.33 & 1.27 \\
1 & 1.10 & 1.12
\end{bmatrix}
\begin{bmatrix} \alpha_t \\ \beta_1 \\ \beta_2 \end{bmatrix} + r
\]

or $t = M\tau + r$, where $\tau' = [\alpha_t, \beta_1, \beta_2]$.

The between block estimates are

\[
\begin{bmatrix} \hat{\alpha}_t \\ \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix}
= (M'M)^{-1} M't =
\begin{bmatrix} 38.48854914 \\ 13.83912245 \\ 5.32201593 \end{bmatrix} . \tag{9.4}
\]



The residual sum of squares from the between block model is 3.25 and is based on 3 degrees of freedom. The between block residual mean square is 3.25/3, which is an estimate of $2(\sigma_\varepsilon^2 + 2\sigma_b^2)$. The estimated covariance matrix of the between block estimates is

\[
(3.25/3)(M'M)^{-1} =
\begin{bmatrix}
8.218374 & -4.622627 & -1.932139 \\
-4.622627 & 11.690547 & -8.274010 \\
-1.932139 & -8.274010 & 10.216685
\end{bmatrix}
= \hat{\Sigma}_b . \tag{9.5}
\]



The estimated covariance matrix corresponding to the between block analysis estimates of the two slopes is the lower 2 × 2 partition of $\hat{\Sigma}_b$, or

\[
\hat{\Sigma}_{\beta_b} =
\begin{bmatrix}
11.690547 & -8.274010 \\
-8.274010 & 10.216685
\end{bmatrix} .
\]
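For a regression with an intercept and two predictors, the between block slope estimates and their covariance matrix can be written in closed form from the corrected sums of squares and cross products. The Python sketch below applies those formulas to the block totals of Table 9.1. The function and variable names are illustrative, and the closed form is simply an alternative to the matrix computation described in the text.

```python
# Table 9.1: yield (bu/acre) and percent organic matter for the six blocks.
yield1 = [26.6, 31.1, 34.7, 34.4, 32.1, 28.5]
om1 = [0.91, 1.22, 1.43, 1.45, 1.33, 1.10]
yield2 = [30.2, 29.2, 32.1, 31.9, 30.2, 31.0]
om2 = [1.02, 0.89, 1.39, 1.47, 1.27, 1.12]

def mean(values):
    return sum(values) / len(values)

def corrected_cross(u, v):
    """Corrected sum of cross products: sum of (u - ubar)(v - vbar)."""
    ubar, vbar = mean(u), mean(v)
    return sum((a - ubar) * (b - vbar) for a, b in zip(u, v))

# Block totals t_j = y_1j + y_2j.
t = [y1 + y2 for y1, y2 in zip(yield1, yield2)]
n = len(t)

s11 = corrected_cross(om1, om1)
s22 = corrected_cross(om2, om2)
s12 = corrected_cross(om1, om2)
s1t = corrected_cross(om1, t)
s2t = corrected_cross(om2, t)
det = s11 * s22 - s12 ** 2

# Between block estimates of the slopes and of alpha_t = alpha_1 + alpha_2.
beta1_b = (s22 * s1t - s12 * s2t) / det
beta2_b = (s11 * s2t - s12 * s1t) / det
alpha_t = mean(t) - beta1_b * mean(om1) - beta2_b * mean(om2)

# Residual mean square on n - 3 df; it estimates 2(sigma_e^2 + 2 sigma_b^2).
rss_between = sum((tj - alpha_t - beta1_b * x1 - beta2_b * x2) ** 2
                  for tj, x1, x2 in zip(t, om1, om2))
mse_between = rss_between / (n - 3)

# Estimated covariance matrix of the two between block slope estimates.
cov_slopes_b = [[mse_between * s22 / det, -mse_between * s12 / det],
                [-mse_between * s12 / det, mse_between * s11 / det]]

print("alpha_t, beta_1, beta_2 =",
      round(alpha_t, 5), round(beta1_b, 5), round(beta2_b, 5))
print("residual SS =", round(rss_between, 4))
print("slope covariance =", [[round(v, 4) for v in row] for row in cov_slopes_b])
```

Comparing the diagonal of this matrix with the corresponding within block partition shows how much less precise the between block slope estimates are for these six blocks.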



9.5 COMBINING WITHIN BLOCK AND BETWEEN BLOCK INFORMATION

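The vector of parameters for Model 9.1 is $\theta = [\alpha_1, \alpha_2, \beta_1, \beta_2]'$. The within block model provides estimates of $\theta_1 = [\alpha_1 - \alpha_2, \beta_1, \beta_2]'$ and the between block model provides estimates of $\theta_2 = [\alpha_1 + \alpha_2, \beta_1, \beta_2]'$. $\theta_1$ and $\theta_2$ are linear transforms of $\theta$ expressed as $\theta_1 = H_1\theta$ and $\theta_2 = H_2\theta$, where

\[
H_1 = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad \text{and} \quad
H_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} .
\]

The estimators can be expressed as beta-hat models (Chapter 6)

\[ \hat{\theta}_1 = H_1\theta + e_1 \quad \text{where } e_1 \sim N(0, \Sigma_w) \]

and

\[ \hat{\theta}_2 = H_2\theta + e_2 \quad \text{where } e_2 \sim N(0, \Sigma_b) . \]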






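The two estimators are independently distributed; thus the joint model is

\[
\begin{bmatrix} \hat{\theta}_1 \\ \hat{\theta}_2 \end{bmatrix}
= \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \theta
+ \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
\quad \text{where} \quad
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
\sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\begin{bmatrix} \Sigma_w & 0 \\ 0 & \Sigma_b \end{bmatrix} \right) .
\]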



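If the variances are known, then the BLUE (Best Linear Unbiased Estimator) of θ is

\[
\hat{\theta}_B = \left[ H_1' \Sigma_w^{-1} H_1 + H_2' \Sigma_b^{-1} H_2 \right]^{-1}
\left[ H_1' \Sigma_w^{-1} \hat{\theta}_1 + H_2' \Sigma_b^{-1} \hat{\theta}_2 \right]
\]

with sampling distribution

\[
\hat{\theta}_B \sim N\left( \theta, \left[ H_1' \Sigma_w^{-1} H_1 + H_2' \Sigma_b^{-1} H_2 \right]^{-1} \right)
\]

or $\hat{\theta}_B \sim N(\theta, \Sigma_\theta)$.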

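The variance components are most likely unknown; thus the weighted least squares combined estimator of θ is

\[
\hat{\theta} = \left[ H_1' \hat{\Sigma}_w^{-1} H_1 + H_2' \hat{\Sigma}_b^{-1} H_2 \right]^{-1}
\left[ H_1' \hat{\Sigma}_w^{-1} \hat{\theta}_1 + H_2' \hat{\Sigma}_b^{-1} \hat{\theta}_2 \right]
\]

with approximate estimated covariance matrix

\[
\operatorname{Var}(\hat{\theta}) = \left[ H_1' \hat{\Sigma}_w^{-1} H_1 + H_2' \hat{\Sigma}_b^{-1} H_2 \right]^{-1} = \hat{\Sigma}_\theta .
\]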



θˆ is the mixed models estimate of the parameters obtained when the above estimates

of the variance components are used in place of the actual variance components

(Littell et al., 1996). The combined estimate of θ for the data in Table 9.1 is

\[
\hat{\theta} = \begin{bmatrix} \hat{\alpha}_1 \\ \hat{\alpha}_2 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix}
= \begin{bmatrix} 11.94177 \\ 25.41262 \\ 15.55772 \\ 4.48663 \end{bmatrix}
\]

with estimated covariance matrix

\[
\hat{\Sigma}_\theta =
\begin{bmatrix}
2.6379 & 0.7031 & -2.0748 & -0.5681 \\
0.7031 & 2.1475 & -0.5467 & -1.7450 \\
-2.0748 & -0.5467 & 1.6733 & 0.4581 \\
-0.5681 & -1.7450 & 0.4581 & 1.4623
\end{bmatrix}
\]

where $\hat{\theta}_1$, $\hat{\Sigma}_w$, $\hat{\theta}_2$, and $\hat{\Sigma}_b$ are from Equations 9.2, 9.3, 9.4, and 9.5, respectively.
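The combining step is a small weighted least squares computation. The sketch below implements the combined estimator and its approximate covariance matrix in plain Python, with inputs typed in from Equations 9.2 through 9.5 using symmetric covariance matrices. The helper names are illustrative, the inputs are rounded, and the printed values are meant to be compared with the combined estimate reported in the text rather than taken as authoritative.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

H1 = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
H2 = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def combine(theta1, sigma_w, theta2, sigma_b):
    """Weighted least squares combination of within and between estimates."""
    parts = []
    for h, sigma, est in ((H1, sigma_w, theta1), (H2, sigma_b, theta2)):
        weight = matmul(transpose(h), inverse(sigma))          # H' Sigma^{-1}
        parts.append((matmul(weight, h), matmul(weight, [[v] for v in est])))
    info = [[x + y for x, y in zip(r1, r2)]
            for r1, r2 in zip(parts[0][0], parts[1][0])]
    rhs = [[x[0] + y[0]] for x, y in zip(parts[0][1], parts[1][1])]
    cov = inverse(info)                                        # Sigma_theta
    theta = [row[0] for row in matmul(cov, rhs)]
    return theta, cov

# Inputs from Equations 9.2 to 9.5 (rounded as printed).
theta1_hat = [-13.82302281, 16.95662402, 5.64513210]
sigma_w = [[3.633765, -2.0436818, 0.8542064],
           [-2.0436818, 5.1684106, 3.6579537],
           [0.8542064, 3.6579537, 4.5168137]]
theta2_hat = [38.48854914, 13.83912245, 5.32201593]
sigma_b = [[8.218374, -4.622627, -1.932139],
           [-4.622627, 11.690547, -8.274010],
           [-1.932139, -8.274010, 10.216685]]

theta_hat, sigma_theta = combine(theta1_hat, sigma_w, theta2_hat, sigma_b)
print("combined estimate:", [round(v, 5) for v in theta_hat])
print("variances:", [round(sigma_theta[i][i], 4) for i in range(4)])
```

Working with the information matrices $H'\Sigma^{-1}H$ rather than with the separate estimates is what allows α1 and α2 to be recovered individually, since neither the within block nor the between block model identifies them alone.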




The combined estimate of the slopes should be used when the variance of the

combined estimate is smaller than the variance of the within block estimates of the

slopes (Ash, 1982). When the number of blocks is small, the between block information may not be useful. There needs to be one more block than treatments before

a between block estimate can be computed and there needs to be more blocks in

order to obtain a within block residual mean square with adequate degrees of freedom.

The within block model was used to obtain estimates of α1 – α2, β1, and β2 and

the between block model was used to obtain estimates of α1 + α2, β1, and β2. Neither

the within block model nor the between block model provides estimates of α1 and

α2, but the combined estimator does provide estimates of all of the parameters, α1,

α2, β1, and β2, where α1 = (αt + αd)/2 and α2 = (αt – αd)/2 . For further discussion

on intra-block models (within block), interblock models (between block), and the

process of combining estimators, see John (1971) and Fergen (1997).

Once the estimates of the parameters of θ have been obtained and the covariance matrix has been estimated, it is generally of interest to estimate linear combinations of θ, such as $a'\theta$. Some choices for a are to (1) provide estimates of the regression models evaluated at X = X0 by letting $a_1' = (1, 0, X_0, 0)$ and $a_2' = (0, 1, 0, X_0)$ or (2) to provide estimates of the differences of the regression models evaluated at X = X0 by letting $a' = (1, -1, X_0, -X_0)$. The approximate sampling distribution of a linear combination of $\hat{\theta}$ is $a'\hat{\theta} \sim N(a'\theta, a'\Sigma_\theta a)$.
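As a brief illustration of these linear combinations, the Python sketch below evaluates $a'\hat{\theta}$ and $a'\hat{\Sigma}_\theta a$ for the three choices of a, using the combined estimate and covariance matrix reported for the data in Table 9.1. The value X0 = 1.0 percent organic matter is a hypothetical choice made only for the example, and the function name is illustrative.

```python
import math

# Combined estimate and estimated covariance matrix reported for Table 9.1.
theta_hat = [11.94177, 25.41262, 15.55772, 4.48663]
sigma_theta = [[2.6379, 0.7031, -2.0748, -0.5681],
               [0.7031, 2.1475, -0.5467, -1.7450],
               [-2.0748, -0.5467, 1.6733, 0.4581],
               [-0.5681, -1.7450, 0.4581, 1.4623]]

def linear_combination(a, theta, sigma):
    """Return a'theta and its approximate variance a' Sigma a."""
    estimate = sum(ai * ti for ai, ti in zip(a, theta))
    variance = sum(ai * sij * aj
                   for ai, row in zip(a, sigma)
                   for sij, aj in zip(row, a))
    return estimate, variance

x0 = 1.0  # hypothetical organic matter value chosen for illustration

a_herb1 = [1.0, 0.0, x0, 0.0]    # regression model for herbicide 1 at x0
a_herb2 = [0.0, 1.0, 0.0, x0]    # regression model for herbicide 2 at x0
a_diff = [1.0, -1.0, x0, -x0]    # difference of the two models at x0

est1, var1 = linear_combination(a_herb1, theta_hat, sigma_theta)
est2, var2 = linear_combination(a_herb2, theta_hat, sigma_theta)
est_diff, var_diff = linear_combination(a_diff, theta_hat, sigma_theta)

for label, est, var in [("herbicide 1", est1, var1),
                        ("herbicide 2", est2, var2),
                        ("difference", est_diff, var_diff)]:
    print(f"{label:<12} estimate = {est:8.4f}  std. error = {math.sqrt(var):.4f}")
```

The standard errors obtained this way can be combined with the approximate normal sampling distribution to form confidence intervals at any organic matter value within the range of the data.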



9.6 DETERMINING THE FORM OF THE MODEL

Thus far the discussion about the model in Equation 9.1 assumes that the slopes are

unequal. The next step in the analysis, after determining that straight lines are

adequate to describe the data for each treatment, is to test the equality of slopes.

Generally there is sufficient within block information to test the equal slopes hypothesis without considering the between block information, although, if there were

many blocks, a test based on the combined estimate could be quite a bit more

powerful. The model comparison method can be used to construct a statistic based

on the within block information to test the equal slope hypothesis

\[ H_0: \beta_1 = \beta_2 = \beta_0 \quad \text{vs.} \quad H_a: (\text{not } H_0) \]

where $\beta_0$ is unspecified.

The model under the conditions of the null hypothesis is

\[ y_{ij} = \alpha_i + \beta_0 x_{ij} + b_j + \varepsilon_{ij}, \quad i = 1, 2, \quad j = 1, 2, \ldots, a. \tag{9.6} \]

Let RSS(H0) denote the residual sum of squares for the model under the conditions of H0, which is based on (2 – 1)(a – 1) – 1 = dfRSS(H0) degrees of freedom, where "a" is the number of blocks in the experiment. Let RSS denote the residual sum of squares for the unrestricted Model 9.1, which is based on (2 – 1)(a – 1) – 2 = dfRSS degrees of freedom. The sum of squares due to deviations from the equal slope
