7 EXAMPLE: COMPARING RESPONSE SURFACE MODELS
Multiple Covariants on One-Way Treatment Structure
TABLE 4.16
Strength of Pills Made by Two Formulations
with Concentrations of Two Binders

CONC1    CONC2    U3Y Force    X2Z Force
 1.00     1.00      14.14        16.02
 0.00     1.41      10.64        14.07
–1.00     1.00       1.47         7.20
–1.41     0.00       2.28         3.85
–1.00    –1.00       8.53        10.30
 0.00    –1.41       6.57        12.78
 1.00    –1.00       3.14         8.14
 1.41     0.00       8.16         9.45
 0.00     0.00      13.58        13.98
 0.00     0.00      11.73        15.66
 0.00     0.00      13.07        16.83
 0.00     0.00      12.24        14.41
TABLE 4.17
Analysis of Variance Table to Provide the Estimate of Pure
Error for the Response Surface Models Used to Compare
Pill Strength

PROC GLM DATA=RESPSUR; CLASS FORMULA CONC1 CONC2;
MODEL FORCE = FORMULA*CONC1*CONC2;

Source                 df    SS        MS       FValue   ProbF
Model                  17    458.69    26.98    23.01    0.0004
Error                   6      7.04     1.17
Corrected Total        23    465.72

Source                 df    SS (Type III)    MS       FValue   ProbF
Formula*Conc1*Conc2    17    458.69           26.98    23.01    0.0004

where
i = 1, 2 for the two formulations,
j = 1, 2, …, 5 for the levels of x1 (CONC1),
k = 1, 2, …, 5 for the levels of x2 (CONC2),
m = 1 or 4 for the replications per formulation.
Since there are four replications of the center point for each formulation, the
variation between these observations within a formulation provides an estimate of
pure error (or a model-free estimate of the variance). The estimate of the pure error
is $\hat{\sigma}^2_{PE} = 1.17$, which is the Mean Square Error from the analysis of variance in
Table 4.17. To compute the estimate of $\sigma^2_{PE}$ using PROC GLM of the SAS® system,
© 2002 by CRC Press LLC
Analysis of Messy Data, Volume III: Analysis of Covariance
TABLE 4.18
Code and Analysis of Variance Table to Test Each Set of the
Slopes are Equal to Zero

PROC GLM DATA=RESPSUR; CLASS FORMULA;
MODEL FORCE=FORMULA CONC1*FORMULA CONC2*FORMULA
C11*FORMULA C22*FORMULA C12*FORMULA;

Source             df    SS        MS       FValue   ProbF
Model              11    456.55    41.50    54.29    0.0000
Error              12      9.17     0.76
Corrected Total    23    465.72

Source             df    SS (Type III)    MS        FValue    ProbF
FORMULA             1     13.16            13.16     17.21    0.0013
CONC1*FORMULA       2     56.97            28.49     37.26    0.0000
CONC2*FORMULA       2     17.20             8.60     11.25    0.0018
C11*FORMULA         2    201.94           100.97    132.08    0.0000
C22*FORMULA         2     30.97            15.49     20.26    0.0001
C12*FORMULA         2    111.68            55.84     73.04    0.0000
include Formula, Conc1, and Conc2 in the class statement and then use
Formula*Conc1*Conc2 in the model statement (see Table 4.17). This process constructs
a means-type model in which there is one mean for each combination of Formula by
Conc1 by Conc2. For this example there are nine combinations for each formula.
The resulting Error Sum of Squares measures the variability of those experimental
units treated alike, which in this case are those with levels (0,0) of (Conc1,Conc2)
for each formula. There are four center points for each formulation; thus there are
six degrees of freedom for pure error, with three degrees of freedom coming from
each formula. Next, Model 4.13 is fit to the data to (1) determine whether the quadratic
response surface model adequately describes the data from each formulation, (2)
obtain estimates of the parameters of the models, and (3) test the hypotheses that
the sets of slopes are equal to zero, i.e., H0s: βs1 = βs2 = 0 vs. Has: (not H0s), s = 1,
2, …, 5. The SAS® system code to fit Model 4.13 is in Table 4.18 along with the
results. The terms C11, C22, and C12 denote CONC1*CONC1, CONC2*CONC2,
and CONC1*CONC2, respectively. The sum of squares due to lack of fit is computed
by subtracting the sum of squares for pure error (see Table 4.17) from the current
model error sum of squares (Table 4.18). The computations are SS Lack of Fit =
9.17 – 7.04 = 2.13, based on 12 – 6 = 6 degrees of freedom. The value of the statistic
for testing lack of fit of the model to the cell means is F = (2.13/6)/(7.04/6) =
0.303, indicating there is no evidence of lack of fit. The F statistics in Table 4.18
provide tests of the following hypotheses: CONC1*FORMULA tests H0: β11 = β12 =
0 vs. Ha: (not H0) with significance level 0.0000, CONC2*FORMULA tests H0:
β21 = β22 = 0 vs. Ha: (not H0) with significance level 0.0018, C11*FORMULA tests
H0: β31 = β32 = 0 vs. Ha: (not H0) with significance level 0.0000, C22*FORMULA
tests H0: β41 = β42 = 0 vs. Ha: (not H0) with significance level 0.0001, and
C12*FORMULA tests H0: β51 = β52 = 0 vs. Ha: (not H0) with significance level 0.0000. Each
TABLE 4.19
Code and Analysis of Variance Table for Testing
the Parallelism Hypothesis for Each Covariate

PROC GLM DATA=RESPSUR; CLASS FORMULA;
MODEL FORCE = CONC1 CONC2 C11 C22 C12
FORMULA CONC1*FORMULA CONC2*FORMULA C11*FORMULA
C22*FORMULA C12*FORMULA;

Source             df    SS        MS       FValue   ProbF
Model              11    456.55    41.50    54.29    0.0000
Error              12      9.17     0.76
Corrected Total    23    465.72

Source             df    SS (Type III)    MS        FValue    ProbF
CONC1               1     56.91            56.91     74.44    0.0000
CONC2               1     16.61            16.61     21.72    0.0006
C11                 1    201.29           201.29    263.31    0.0000
C22                 1     26.01            26.01     34.03    0.0001
C12                 1    105.42           105.42    137.89    0.0000
FORMULA             1     13.16            13.16     17.21    0.0013
CONC1*FORMULA       1      0.06             0.06      0.08    0.7764
CONC2*FORMULA       1      0.60             0.60      0.78    0.3941
C11*FORMULA         1      0.65             0.65      0.85    0.3754
C22*FORMULA         1      4.96             4.96      6.49    0.0256
C12*FORMULA         1      6.27             6.27      8.20    0.0143
of these sums of squares is based on two degrees of freedom since the hypothesis
corresponds to specifying that two parameters are equal to zero. The results of these
F statistics indicate the respective sets of slopes are not all equal to zero.
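The pure-error and lack-of-fit arithmetic described above can be checked directly from the center-point strengths in Table 4.16 and the error lines of Tables 4.17 and 4.18. The following is a minimal sketch in Python rather than the SAS® system used in the text; the function and variable names are hypothetical and chosen only for illustration.

```python
# Center-point (CONC1 = CONC2 = 0) strengths copied from Table 4.16.
center_points = {
    "U3Y": [13.58, 11.73, 13.07, 12.24],
    "X2Z": [13.98, 15.66, 16.83, 14.41],
}

def within_ss(values):
    """Sum of squared deviations of replicates about their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

# Pure error pools the within-formulation variation of the replicated cells.
ss_pure = sum(within_ss(v) for v in center_points.values())
df_pure = sum(len(v) - 1 for v in center_points.values())
ms_pure = ss_pure / df_pure

def lack_of_fit_f(ss_error_model, df_error_model, ss_pe, df_pe):
    """F ratio of the lack-of-fit mean square to the pure-error mean square."""
    ss_lof = ss_error_model - ss_pe
    df_lof = df_error_model - df_pe
    return (ss_lof / df_lof) / (ss_pe / df_pe)

# Error line of Table 4.18 (9.17 on 12 df) against Table 4.17 (7.04 on 6 df).
f_lof = lack_of_fit_f(9.17, 12, 7.04, 6)
print(round(ss_pure, 2), df_pure, round(ms_pure, 2), round(f_lof, 3))
```

The pooled within-cell sum of squares agrees with the Error line of Table 4.17 to two decimals, which shows that only the replicated center points contribute to pure error in this design.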
The next step in the analysis is to check for parallelism in the direction of each
covariate. The SAS® system code for testing the parallelism hypotheses for each
of the covariates in the model is in Table 4.19. The model includes each of the
individual terms as well as the interactions, e.g., CONC1 and CONC1*FORMULA.
When both terms are included in the model, the resulting F statistics provide tests
that the slopes for the two levels of formula are equal. In this case,
CONC1*FORMULA tests H0: β11 = β12 vs. Ha: (not H0) with significance level 0.7764,
CONC2*FORMULA tests H0: β21 = β22 vs. Ha: (not H0) with significance level
0.3941, C11*FORMULA tests H0: β31 = β32 vs. Ha: (not H0) with significance level
0.3754, C22*FORMULA tests H0: β41 = β42 vs. Ha: (not H0) with significance level
0.0256, and C12*FORMULA tests H0: β51 = β52 vs. Ha: (not H0) with significance
level 0.0143. Each of these sums of squares is based on one degree of freedom since
it corresponds to specifying that the difference between two parameters is equal to
zero. The information in Table 4.19 indicates that (1) β41 ≠ β42 (C22*FORMULA)
and (2) β51 ≠ β52 (C12*FORMULA).
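One way to see why the same interaction line tests different hypotheses in Tables 4.18 and 4.19 is to write the slope for covariate s in formulation i as a common part plus a formulation-specific part. The symbols γ and δ below are introduced here only for illustration and are not part of the models in the text; the sketch is shown for CONC1 (s = 1).

```latex
\begin{align*}
\beta_{si} &= \gamma_s + \delta_{si}, \qquad i = 1, 2, \quad s = 1, \ldots, 5, \\
% Table 4.19: CONC1 carries \gamma_1 and CONC1*FORMULA carries \delta_{1i}
H_0 &: \delta_{11} = \delta_{12} \iff \beta_{11} = \beta_{12}
      \qquad \text{(one constraint, 1 df)}, \\
% Table 4.18: CONC1*FORMULA alone carries the whole slopes \beta_{1i}
H_0 &: \beta_{11} = \beta_{12} = 0
      \qquad \text{(two constraints, 2 df)}.
\end{align*}
```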
The sum of squares corresponding to each specific line in Table 4.19 provides
a conditional sum of squares due to that effect given the other terms in the model.
TABLE 4.20
Code and Analysis of Variance Table for the Final Model Used
to Compare the Response Surfaces for the Pill Data

PROC GLM DATA=RESPSUR; CLASS FORMULA;
MODEL FORCE = CONC1 CONC2 C11 FORMULA C22*FORMULA
C12*FORMULA/SOLUTION NOINT P;

Source               df    SS         MS        FValue    ProbF
Model                 9    3022.87    335.87    480.56    0.0000
Error                15      10.48      0.70
Uncorrected Total    24    3033.35

Source               df    SS (Type III)    MS        FValue     ProbF
CONC1                 1      56.91           56.91      81.42    0.0000
CONC2                 1      16.61           16.61      23.76    0.0002
C11                   1     201.29          201.29     288.01    0.0000
FORMULA               2    1570.24          785.12    1123.34    0.0000
C22*FORMULA           2      31.95           15.98      22.86    0.0000
C12*FORMULA           2     111.68           55.84      79.90    0.0000
Thus one needs to be careful when deleting more than one term from the model at
a time. If several terms are deleted at once, the fit of the resulting model needs to
be checked, e.g., by making sure that the residual mean square does not increase
appreciably. For this particular problem, a stepwise deletion process was used to
simplify the model, which resulted in
$$ y_{ijkm} = \mu_i + \beta_1 x_{1j} + \beta_2 x_{2k} + \beta_3 x_{1j}^2 + \beta_{4i} x_{2k}^2 + \beta_{5i} x_{1j} x_{2k} + \varepsilon_{ijkm} \tag{4.14} $$
where
i = 1, 2 for the two formulations,
j = 1, 2, …, 5 for the levels of x1 (CONC1),
k = 1, 2, …, 5 for the levels of x2 (CONC2),
m = 1 or 4 for the replications per formulation.
The SAS® system code for fitting Model 4.14 to the pill data is in Table 4.20,
along with the analysis of variance table; the mean square error is
0.70, based on 15 degrees of freedom. The estimates of the parameters of
the model are in Table 4.21. Table 4.22 contains SAS® system code for computing
a test for lack of fit for Model 4.14. The process is to define two new variables, C1 =
CONC1 and C2 = CONC2. Include C1 and C2 in the model in place of CONC1
and CONC2, but include CONC1 and CONC2 in the Class statement. Finally, include
CONC1*CONC2*FORMULA in the model. The Type III sum of squares for
CONC1*CONC2*FORMULA (which is the same here as the Type I sum of squares
since CONC1*CONC2*FORMULA is the last term in the model) is the sum of
squares due to deviations of the model from the cell means given that all of the other
TABLE 4.21
Parameter Estimates for the Final Model

Parameter            Estimate    StdErr    tValue    Probt
CONC1                  1.886     0.209       9.02    0.0000
CONC2                  1.019     0.209       4.87    0.0002
C11                   –3.966     0.234     –16.97    0.0000
FORMULA U3Y           12.835     0.374      34.33    0.0000
FORMULA X2Z           15.040     0.374      40.23    0.0000
C22*FORMULA U3Y       –2.093     0.327      –6.40    0.0000
C22*FORMULA X2Z       –0.758     0.327      –2.32    0.0350
C12*FORMULA U3Y        4.515     0.418      10.80    0.0000
C12*FORMULA X2Z        2.745     0.418       6.57    0.0000
TABLE 4.22
Code and Analysis of Variance Table to Provide a Test of Lack
of Fit for the Final Model

PROC GLM DATA=RESPSUR; CLASS FORMULA CONC1 CONC2;
MODEL FORCE = C1 C2 C11 FORMULA C22*FORMULA C12*FORMULA
FORMULA*CONC1*CONC2;

Source                 df    SS        MS       FValue   ProbF
Model                  17    458.69    26.98    23.01    0.0004
Error                   6      7.04     1.17
Corrected Total        23    465.72

Source                 df    SS (Type III)    MS      FValue   ProbF
C1                      0    0.00
C2                      0    0.00
C11                     0    0.00
FORMULA                 1    8.41             8.41    7.17     0.0366
C22*FORMULA             0    0.00
C12*FORMULA             0    0.00
FORMULA*CONC1*CONC2     9    3.45             0.38    0.33     0.9356
terms are in the model. Thus, the sum of squares corresponding to
CONC1*CONC2*FORMULA is the sum of squares due to lack of fit of the model
to the cell means. The lack-of-fit sum of squares is 3.45 on 9 degrees of freedom,
and the value of the F statistic is 0.33 with significance level 0.9356, indicating
there is no evidence that the model fails to fit the data.
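As a check on the arithmetic, the same lack-of-fit statistic can be rebuilt from the Error lines of Tables 4.20 and 4.22. The subtraction gives 3.44 rather than the tabled 3.45, presumably because the tabled sums of squares are rounded to two decimals.

```latex
\begin{align*}
SS_{\text{LOF}} &= 10.48 - 7.04 = 3.44 \approx 3.45, \qquad
df_{\text{LOF}} = 15 - 6 = 9, \\
F &= \frac{3.45 / 9}{7.04 / 6} = \frac{0.383}{1.173} \approx 0.33 .
\end{align*}
```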
Using the estimates of the parameters in Table 4.21, the equations for estimating
the two response surfaces are
$$ \hat{y}_{U3Y} = 12.835 + 1.886 X_1 + 1.019 X_2 - 3.966 X_1^2 - 2.093 X_2^2 + 4.515 X_1 X_2 $$
and
$$ \hat{y}_{X2Z} = 15.040 + 1.886 X_1 + 1.019 X_2 - 3.966 X_1^2 - 0.758 X_2^2 + 2.745 X_1 X_2 . $$
These response surfaces for the two formulations have common slopes in the X1,
X2, and X1² directions and different slopes in the X2² and X1X2 directions, where X1 =
CONC1 and X2 = CONC2.
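The two fitted surfaces can be evaluated at the design points to see how the predicted values in Table 4.23 arise. The sketch below is in Python rather than the SAS® system; the names `predict_force`, `COMMON`, `BY_FORMULA`, and `DESIGN` are hypothetical, and the axial level tabled as 1.41 is assumed to be the rounded value of the square root of 2.

```python
import math

# Estimates from Table 4.21; the first three slopes are shared by both formulations.
COMMON = {"x1": 1.886, "x2": 1.019, "x1sq": -3.966}
BY_FORMULA = {
    "U3Y": {"intercept": 12.835, "x2sq": -2.093, "x1x2": 4.515},
    "X2Z": {"intercept": 15.040, "x2sq": -0.758, "x1x2": 2.745},
}

def predict_force(formula, x1, x2):
    """Estimated mean strength from the fitted Model 4.14 for one formulation."""
    p = BY_FORMULA[formula]
    return (p["intercept"] + COMMON["x1"] * x1 + COMMON["x2"] * x2
            + COMMON["x1sq"] * x1 ** 2 + p["x2sq"] * x2 ** 2
            + p["x1x2"] * x1 * x2)

A = math.sqrt(2)  # assumption: the tabled +/-1.41 is a rounded +/-sqrt(2)
DESIGN = [(1, 1), (0, A), (-1, 1), (-A, 0), (-1, -1),
          (0, -A), (1, -1), (A, 0), (0, 0)]

# One line per distinct design point: CONC1, CONC2, U3Y and X2Z predictions.
for x1, x2 in DESIGN:
    u3y = predict_force("U3Y", x1, x2)
    x2z = predict_force("X2Z", x1, x2)
    print(f"{x1:6.2f} {x2:6.2f} {u3y:7.2f} {x2z:7.2f}")
```

Because the estimates in Table 4.21 are rounded to three decimals, a printed value may differ from Table 4.23 in the last digit.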
The combination of the binders at which the response surface estimates the
maximum strength is determined by differentiating the model with respect to X1 and
X2, equating the derivatives to zero, and solving the resulting equations. The two
sets of equations are:
$$ \text{U3Y:} \quad -2(3.966) X_1 + 4.515 X_2 = -1.886, \qquad 4.515 X_1 - 2(2.093) X_2 = -1.019 $$
and
$$ \text{X2Z:} \quad -2(3.966) X_1 + 2.745 X_2 = -1.886, \qquad 2.745 X_1 - 2(0.758) X_2 = -1.019 . $$
The concentrations at which the models estimate that the maximum strength occurs are
(1) for U3Y, (CONC1 = 0.975, CONC2 = 1.295) and (2) for X2Z, (CONC1 = 1.260,
CONC2 = 2.953). The maximum for X2Z occurs outside of the range of
experimentation, so one needs to be careful in assessing the usefulness of the estimate.
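The two pairs of equations above are 2 × 2 linear systems, so the stationary points can be reproduced with Cramer's rule. The following Python sketch (not the SAS® system used in the text) uses a hypothetical helper, `stationary_point`, and also checks the second-order condition for a maximum, a step the text does not state explicitly.

```python
def stationary_point(b1, b2, b11, b22, b12):
    """Solve the first-order conditions of a quadratic surface by Cramer's rule.

    Surface:    b0 + b1*x1 + b2*x2 + b11*x1**2 + b22*x2**2 + b12*x1*x2
    Conditions: 2*b11*x1 + b12*x2 = -b1   and   b12*x1 + 2*b22*x2 = -b2
    """
    a, b, c, d = 2 * b11, b12, b12, 2 * b22   # Hessian of the surface
    det = a * d - b * c
    if det == 0:
        raise ValueError("first-order conditions have no unique solution")
    x1 = (-b1 * d - b * (-b2)) / det
    x2 = (a * (-b2) - c * (-b1)) / det
    is_maximum = a < 0 and det > 0            # Hessian negative definite
    return x1, x2, is_maximum

# Slopes taken from Table 4.21.
u3y = stationary_point(1.886, 1.019, -3.966, -2.093, 4.515)
x2z = stationary_point(1.886, 1.019, -3.966, -0.758, 2.745)
print("U3Y:", round(u3y[0], 3), round(u3y[1], 3), u3y[2])
print("X2Z:", round(x2z[0], 3), round(x2z[1], 3), x2z[2])
```

Both Hessians are negative definite, so each stationary point is a maximum of its fitted surface. The X2Z value of CONC2 lies well beyond the largest design level of about 1.41.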
Table 4.23 contains the predicted values (PFORCE) for each point in the design
space and the two formulations. The maximum observed response means occur at
TABLE 4.23
Predicted Values of Force for the Two Formulations
at Each of the Design Points in the Experiment

                        U3Y                X2Z
CONC1    CONC2    PFORCE    STE      PFORCE    STE
 1.00     1.00    14.20     0.58     15.97     0.58
 0.00     1.41    10.09     0.58     14.96     0.58
–1.00     1.00     1.39     0.58      6.70     0.58
–1.41     0.00     2.24     0.52      4.44     0.52
–1.00    –1.00     8.39     0.58     10.16     0.58
 0.00    –1.41     7.21     0.58     12.08     0.58
 1.00    –1.00     3.13     0.58      8.44     0.58
 1.41     0.00     7.57     0.52      9.78     0.52
 0.00     0.00    12.84     0.37     15.04     0.37
 0.00     0.00    12.84     0.37     15.04     0.37
 0.00     0.00    12.84     0.37     15.04     0.37
 0.00     0.00    12.84     0.37     15.04     0.37
FIGURE 4.11 Predicted regression surface for formula U3Y over the range of concentrations.
FIGURE 4.12 Predicted regression surface for formula X2Z over the range of concentrations.
(CONC1, CONC2) = (1, 1) for each of the formulations. The estimated standard
errors associated with the predicted values are also included (STE). Remember that these
are estimated standard errors for the mean of the model and not estimated standard
deviations associated with an individual measurement at a given combination of
CONC1 and CONC2. The major conclusion that can be drawn by looking at the
predicted values is that X2Z produces stronger pills at each of the observed
combinations of the two binders.
Figures 4.11 and 4.12 are plots of the predicted response surfaces for each of the
two formulations. To make further comparisons between the two response surfaces, a
set of estimate statements were generated to construct a grid over the range of (CONC1,
CONC2) so that estimates of the differences of the two models could be evaluated
(along with the estimated standard errors). The difference between the two models is
$$ \text{Diff}(U3Y - X2Z) = (\mu_1 - \mu_2) + (\beta_{41} - \beta_{42}) X_2^2 + (\beta_{51} - \beta_{52}) X_1 X_2 $$
FIGURE 4.13 Surface of the difference between the two treatments’ response surfaces where
the dark areas are regions where the surfaces are not significantly different.
FIGURE 4.14 Contour plot for the surface of the differences between the two treatments’
response surfaces.
which is a comparison that involves three linearly independent parameters
(combinations of parameters): (µ1 – µ2), (β41 – β42), and (β51 – β52). To compare the response
surfaces over the grid, a Scheffé approach was used in which a critical value D was computed
as $D = \sqrt{3F_{.05,3,15}}$. If the computed difference divided by its estimated standard error
at a grid point was less in absolute value than D, then it was declared that the two
formulations produce similar mean responses at that point. Figure 4.13 is a graph of
the difference of the two response surfaces, where the large symbols indicate grid
points where the difference is not significantly different from zero (points where the
two formulations produce similar mean responses). Figure 4.14 is a contour plot of
the differences with contours in the regions where the differences are closest to zero.
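The grid comparison can be sketched as follows, in Python rather than the SAS® estimate statements used in the text. The function names are hypothetical. The standard error of the difference at a grid point depends on the covariance matrix of the estimates, which is not reported here, so it is left as an input. The F quantile is also an input; standard F tables give about 3.29 for F.05,3,15, which should be verified before use.

```python
import math

# Differences of the Table 4.21 estimates (U3Y minus X2Z).
D_INTERCEPT = 12.835 - 15.040   # mu_1 - mu_2
D_X2SQ = -2.093 - (-0.758)      # beta_41 - beta_42
D_X1X2 = 4.515 - 2.745          # beta_51 - beta_52

def surface_difference(x1, x2):
    """Estimated U3Y minus X2Z mean strength at (x1, x2)."""
    return D_INTERCEPT + D_X2SQ * x2 ** 2 + D_X1X2 * x1 * x2

def scheffe_critical_value(f_quantile, n_contrast_params=3):
    """D = sqrt(q * F), where F is the upper-alpha quantile on (q, error df)."""
    return math.sqrt(n_contrast_params * f_quantile)

def similar_at_point(x1, x2, std_err, critical_value):
    """True when |difference / std_err| < D, i.e., no significant difference.

    std_err is the estimated standard error of the difference at (x1, x2);
    it must be supplied by the caller (assumption: obtained from the fit).
    """
    return abs(surface_difference(x1, x2) / std_err) < critical_value

D = scheffe_critical_value(3.29)   # 3.29 is an approximate table value
print(round(D, 2), round(surface_difference(1, 1), 3))
```

With the common slopes cancelling, the difference at any grid point depends only on the intercepts and the X2² and X1X2 terms, which is why three parameters enter the Scheffé multiplier.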
REFERENCE
Special Issue on the Analysis of Covariance, Biometrics, September 1957, and Biometrics,
September 1982.
EXERCISES
EXERCISE 4.1: The average daily gain (ADG) during 3 to 9 months of age of
3 breeds of calves was studied. It was thought that ADG is partly an inherited trait;
thus the ADG of each calf’s sire (SADG) and the ADG of each calf’s dam (DADG),
measured when the parents were themselves 3 to 9 months of age, were
used as possible covariates. The data are in the following table. Use the analysis of
covariance strategy to construct an appropriate model and then use the model to
carry out comparisons of the three breeds.
Average Daily Gain Data for Exercise 4.1
Breed 1
Breed 2
Breed 3

ADG     SADG    DADG
2.80    2.1     2.4
3.36    2.8     2.2
3.12    3.0     1.7
2.75    2.0     2.1
2.82    2.3     1.8
3.08    2.8     2.0
2.49    1.2     2.1
1.83    1.4     1.0
2.54    1.8     1.7
2.53    2.3     2.0
2.87    1.9     2.5
2.19    2.3     1.1
3.03    1.2     3.0
2.60    1.8     2.3
3.84    3.0     2.3
3.93    3.1     2.0
4.17    3.9     2.3
3.96    3.3     2.2
4.31    2.6     3.2
4.45    3.0     3.2
4.02    2.6     2.7
4.45    3.0     3.3
4.38    3.7     3.1
EXERCISE 4.2: Carry out an analysis of covariance for the following data set by
determining the appropriate model and then making the needed treatment comparisons. Y is the response variable and X and Z are the covariates.
Data for Exercise 4.2

Treatment A
X       Y       Z
90.5    23.3    13.4
79.4     7.0    35.1
94.7    11.8    34.1
75.4    21.2    14.0
95.9    14.8    31.4
79.6    14.5    27.1
83.7    16.6    26.5
96.1    19.5    24.1
72.0    20.3    15.4
95.6    21.8    19.6

Treatment B
X       Y       Z
90.1    15.7    39.9
74.2    21.8    32.6
93.6    14.9    41.1
93.0     9.7    43.7
90.9    23.3    34.1
85.7    11.4    41.1
88.2    11.1    42.5
96.9    18.5    41.3
89.6    12.4    42.3
82.2    18.1    37.0
83.9    22.0    33.5
73.1    20.7    33.1

Treatment C
X       Y       Z
75.0    14.3    48.3
86.6    10.3    49.1
87.7    11.3    50.9
91.5     4.0    42.9
95.2    16.4    55.8
85.9    23.6    59.8
79.8    14.8    51.1
91.2     6.1    45.6
91.8    22.3    61.0
5 Two-Way Treatment Structure and Analysis of Covariance in a Completely Randomized Design Structure
5.1 INTRODUCTION
The main difference between the analysis of a one-way treatment structure and that
of a two-way treatment structure is that the sums of squares for intercepts and slopes
can be partitioned into row treatment effects, column treatment effects, and row
treatment by column treatment interaction effects. One of the objectives of a good
analysis is to determine if a different slope is needed for each treatment combination
and, if not, to determine whether a different slope is needed for each row treatment
and whether a different slope is needed for each column treatment. Once an adequate
model has been determined in terms of the slope parameters, the analysis is completed
by making comparisons of interest between the planes or lines at various selected
values of the covariate. If there are unequal slopes for the levels of the row treatments
or column treatments or both, the analysis of the intercepts provides a comparison
of the regression surfaces at the value of the covariate equal to zero. Great care must
be used in the interpretation of results when there are unequal slopes in two-way
and higher order treatment structures. Several examples involving one covariate are
used to demonstrate these concepts.
5.2 THE MODEL
The cell means model for a two-way treatment structure in a completely randomized
design structure with one covariate is
$$ y_{ijk} = \alpha_{ij} + \beta_{ij} x_{ijk} + \varepsilon_{ijk}, \quad i = 1, 2, \ldots, s, \; j = 1, 2, \ldots, t, \; k = 1, 2, \ldots, n_{ij} \tag{5.1} $$
where the ijth cell consists of the ith level of the row treatment in combination with
the jth level of the column treatment, αij denotes the intercept for the ijth cell, and βij