4.6 EXAMPLE: MODELS THAT ARE QUADRATIC FUNCTIONS OF THE COVARIATE
Analysis of Messy Data, Volume III: Analysis of Covariance
Parameter Estimates

Term          Estimate    Std Error   t Ratio   Prob>|t|
Intercept    29.235454     2.483561     11.77     <.0001
HERB[1]      4.3416627     1.734624      2.50     0.0142
HERB[2]      0.0033234     1.723398      0.00     0.9985
HERB[3]      -3.562265     1.732177     -2.06     0.0428
HERB[4]      -1.910561       1.7287     -1.11     0.2722
HERB[5]       1.307541      1.74023      0.75     0.4545
HERB[6]      -4.310853     1.741336     -2.48     0.0153
HERB[7]      -0.778272     1.757427     -0.44     0.6590
SILT         0.2946062     0.079012      3.73     0.0003
CLAY         -0.276215     0.062854     -4.39     <.0001
OM           -2.243604     1.067268     -2.10     0.0385
FIGURE 4.6 Estimates of the parameters for the model with common slopes in each of the
three covariate directions.
Least Squares Means Table

Level   Least Sq Mean   Std Error      Mean
1           29.160413   1.8526300   29.1500
2           24.822073   1.8421232   25.3750
3           21.256485   1.8503389   22.3917
4           22.908189   1.8470845   22.0583
5           26.126291   1.8578805   26.5417
6           20.507897   1.8589159   18.9583
7           24.040478   1.8739979   25.0750
8           29.728174   1.8501630   29.0000
FIGURE 4.7 Least squares means for each of the herbicides.
FIGURE 4.8 Plot of the herbicide least squares means.
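The estimates in Figure 4.6 and the least squares means in Figure 4.7 are linked, and a short Python sketch can make the link concrete. It assumes that the output uses a sum-to-zero parameterization for HERB, so that the effect for level 8 (which is not printed) equals minus the sum of the other seven effects. That coding is an assumption, not something stated in the text, and the variable and function names are illustrative. Under that assumption, differences between herbicide effects should reproduce differences between least squares means, because with common slopes the covariate adjustment cancels in any pairwise difference.

```python
# Herbicide effect estimates HERB[1]..HERB[7] as printed in Figure 4.6.
herb_effects = [4.3416627, 0.0033234, -3.562265, -1.910561,
                1.307541, -4.310853, -0.778272]

# ASSUMPTION (not stated in the text): sum-to-zero coding, so the effect of
# level 8, which is not printed, is minus the sum of the other seven.
herb_effects.append(-sum(herb_effects))

# Least squares means for herbicides 1..8 as printed in Figure 4.7.
ls_means = [29.160413, 24.822073, 21.256485, 22.908189,
            26.126291, 20.507897, 24.040478, 29.728174]


def effect_diff(i, j):
    """Difference between the effects of herbicides i and j (1-based)."""
    return herb_effects[i - 1] - herb_effects[j - 1]


def ls_mean_diff(i, j):
    """Difference between the least squares means of herbicides i and j."""
    return ls_means[i - 1] - ls_means[j - 1]


for i, j in [(1, 2), (1, 8), (3, 6)]:
    print(f"{i} vs {j}: effects {effect_diff(i, j):.5f}, "
          f"LS means {ls_mean_diff(i, j):.5f}")
```

If the two columns printed by the loop agree to rounding, the sum-to-zero assumption is consistent with the reported output.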
© 2002 by CRC Press LLC
Multiple Covariants on One-Way Treatment Structure
LSMeans Differences Tukey HSD
Alpha = 0.050    Q = 3.10773

Entries are for LSMean[i] - LSMean[j]: the difference (Diff), its standard error (Std Err Dif), and the lower and upper confidence limits (Lower CL Dif, Upper CL Dif).

 i  Statistic          j=1       j=2       j=3       j=4       j=5       j=6       j=7       j=8
 1  Diff                 0   4.33834   7.90393   6.25222   3.03412   8.65252   5.11993   -0.5678
    Std Err Dif          0   2.61267    2.6116   2.62354   2.60354    2.6249   2.66077   2.62879
    Lower CL Dif         0   -3.7811   -0.2122    -1.901    -5.057   0.49501    -3.149   -8.7374
    Upper CL Dif         0   12.4578   16.0201   14.4055   11.1252     16.81   13.3889   7.60183
 2  Diff           -4.3383         0   3.56559   1.91388   -1.3042   4.31418    0.7816   -4.9061
    Std Err Dif    2.61267         0   2.60555   2.61271   2.61438   2.62435   2.62288   2.61434
    Lower CL Dif   -12.458         0   -4.5318   -6.2057    -9.429   -3.8416   -7.3696   -13.031
    Upper CL Dif   3.78114         0    11.663   10.0335   6.82058     12.47   8.93281   3.21858
 3  Diff           -7.9039   -3.5656         0   -1.6517   -4.8698   0.74859    -2.784   -8.4717
    Std Err Dif     2.6116   2.60555         0   2.62599   2.61022   2.63655   2.63413   2.62904
    Lower CL Dif    -16.02   -11.663         0   -9.8126   -12.982   -7.4451    -10.97   -16.642
    Upper CL Dif   0.21223   4.53178         0   6.50918   3.24206    8.9423   5.40219   -0.6013
 4  Diff           -6.2522   -1.9139    1.6517         0   -3.2181   2.40029   -1.1323     -6.82
    Std Err Dif    2.62354   2.61271   2.62599         0   2.63143   2.60998   2.62741   2.60371
    Lower CL Dif   -14.405   -10.034   -6.5092         0   -11.396   -5.7108   -9.2976   -14.912
    Upper CL Dif   1.90103   6.20574   9.81259         0   4.95969   10.5114   7.03302   1.27165
 5  Diff           -3.0341   1.30422   4.86981    3.2181         0   5.61839   2.08581   -3.6019
    Std Err Dif    2.60354   2.61438   2.61022   2.63143         0   2.63483   2.66455   2.63645
    Lower CL Dif   -11.125   -6.8206   -3.2421   -4.9597         0     -2.57   -6.1949   -11.795
    Upper CL Dif   5.05698   9.42901   12.9817   11.3959         0   13.8068   10.3665   4.59149
 6  Diff           -8.6525   -4.3142   -0.7486   -2.4003   -5.6184         0   -3.5326   -9.2203
    Std Err Dif     2.6248   2.62435   2.63655   2.60998   2.63483         0   2.65408    2.6166
    Lower CL Dif    -16.81    -12.47   -8.9423   -10.511   -13.807         0   -11.781   -17.353
    Upper CL Dif    -0.495    3.8416   7.44512   5.71084   2.56997         0   4.71558    -1.088
 7  Diff           -5.1199   -0.7816   2.78399   1.13229   -2.0858   3.53258         0   -5.6877
    Std Err Dif    2.66077   2.62288   2.63413   2.62741   2.66455   2.65408         0     2.623
    Lower CL Dif   -13.389   -8.9328   -5.4022    -7.033   -10.367   -4.7156         0   -13.839
    Upper CL Dif   3.14904   7.36961   10.9702    9.2976    6.1949   11.7807         0    2.4639
 8  Diff           0.56776    4.9061   8.47169   6.81999   3.60188   9.22028    5.6877         0
    Std Err Dif    2.62879   2.61434   2.62904   2.60371   2.63645    2.6166     2.623         0
    Lower CL Dif   -7.6018   -3.2186   0.30133   -1.2716   -4.5915   1.08795   -2.4639         0
    Upper CL Dif   8.73736   13.0308    16.642   14.9116   11.7953   17.3526   13.8393         0
FIGURE 4.9 Comparisons of the herbicide least squares means using Tukey’s HSD multiple
comparison procedure.
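Each interval in Figure 4.9 appears to be the difference plus or minus Q times its standard error, with Q = 3.10773 as printed in the output header. The following Python sketch recomputes two of the intervals from the printed differences and standard errors; the function name `tukey_interval` is illustrative and not part of any package.

```python
# Tukey HSD multiplier as printed in the header of Figure 4.9.
Q = 3.10773


def tukey_interval(diff, std_err, q=Q):
    """Return (lower, upper) limits: diff -/+ q * std_err."""
    half_width = q * std_err
    return diff - half_width, diff + half_width


# Herbicide 1 vs. herbicide 2: difference and standard error from Figure 4.9.
lower_12, upper_12 = tukey_interval(4.33834, 2.61267)

# Herbicide 1 vs. herbicide 6.
lower_16, upper_16 = tukey_interval(8.65252, 2.6249)

print(f"1 vs 2: ({lower_12:.4f}, {upper_12:.4f})")
print(f"1 vs 6: ({lower_16:.4f}, {upper_16:.4f})")
```

An interval that excludes zero, such as the one for herbicides 1 and 6, corresponds to a pair of least squares means declared different by the procedure.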
TABLE 4.9
Yields for Three Treatments and the Values of the Covariate

   Treatment 1       Treatment 2       Treatment 3
  Yield      X      Yield      X      Yield      X
    3.7    2.2        7.2    5.8        5.7    8.8
    4.7    3.5        2.5    9.9        6.1    9.0
    5.3    4.1        4.3    1.2        6.4    8.8
    5.5    4.1        6.9    6.0        9.6    4.1
    5.7    7.4        8.0    5.5        9.4    2.4
    3.8    3.0        3.7    9.7        8.2    4.1
    1.0    0.5        5.2    8.6       10.2    4.4
    3.3    2.4        8.0    5.0        6.0    8.9
    5.3    8.4        6.9    2.6        9.4    5.3
    5.3    8.5        4.3    8.4        5.0    9.5
    6.1    8.0        6.8    3.0        7.5    1.1
    5.8    6.3        7.6    5.2        5.3    0.1

At this point it has already been established that there is a relationship between Yield and both X and X², i.e., the corresponding slopes are different from zero (analysis not shown). The next step in the analysis of covariance is to decide whether common linear slopes and/or common quadratic slopes are possible simplifications of the model. The first hypotheses to be tested are

$$H_{01}: \beta_{11} = \beta_{12} = \beta_{13} \quad \text{vs.} \quad H_a: (\text{not } H_{01})$$

and

$$H_{02}: \beta_{21} = \beta_{22} = \beta_{23} \quad \text{vs.} \quad H_a: (\text{not } H_{02}).$$
Table 4.10 contains the SAS® system code statements necessary to test H01 and H02, as well as the results. The significance levels for the two tests are 0.6463 for X*TRT and 0.1364 for X*X*TRT. Because these are Type III tests, each is conditional on the other term being in the model: given unequal slopes for the X² terms, the model does not need unequal slopes for the X terms, and given unequal slopes for the X terms, the model does not need unequal slopes for the X² terms. Since the significance level for X*TRT is larger than that for X*X*TRT, X*TRT was deleted. The resulting model has a common slope for the X term and unequal slopes for the X² term. Table 4.11 contains the SAS® system code for fitting this model, which provides a test of the equality of the quadratic slopes. The results are in the lower part of Table 4.11. The significance level corresponding to X*X*TRT is 0.0000, indicating that the slopes in the X² direction are not equal for the three treatments.
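The Type III test for X*TRT in Table 4.10 can also be read as a comparison of two nested fits: the model of Table 4.10 (error SS 6.53 on 27 df) and the model of Table 4.11, which drops X*TRT (error SS 6.74 on 29 df). The Python sketch below computes the usual extra-sum-of-squares F statistic from those rounded table entries; the function name is illustrative, and because the inputs are rounded to two decimals the result only approximates the printed F value of 0.44.

```python
def extra_ss_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic comparing a reduced model with the full model it is nested in."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full  # mean square error of the full model
    return numerator / denominator


# Error sums of squares and degrees of freedom from Tables 4.11 and 4.10.
f_x_by_trt = extra_ss_f(sse_reduced=6.74, df_reduced=29,
                        sse_full=6.53, df_full=27)
print(f"F for dropping X*TRT: {f_x_by_trt:.2f}")
```

The numerator sum of squares, 6.74 - 6.53 = 0.21, matches the Type III sum of squares reported for X*TRT in Table 4.10.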
TABLE 4.10
Code to Fit a Model with Unequal Slopes in Both the X and X² Directions and Provide Tests for Equality of Each of the Two Sets of Slopes

PROC GLM DATA=QUAD; CLASS TRT;
MODEL YIELD = TRT X X*TRT X*X X*X*TRT / SS3;

Source        df        SS       MS    FValue    ProbF
Model          8    140.22    17.53     72.53   0.0000
Error         27      6.53     0.24
Corr Total    35    146.75

Source        df   SS (Type III)       MS    FValue    ProbF
TRT            2           16.00     8.00     33.11   0.0000
X              1           47.03    47.03    194.58   0.0000
X*TRT          2            0.21     0.11      0.44   0.6463
X*X            1           50.30    50.30    208.14   0.0000
X*X*TRT        2            1.04     0.52      2.15   0.1364

TABLE 4.11
Code to Fit a Model with a Common Slope in the X Direction and Unequal Slopes in the X² Direction and Test Equality of the X² Slopes

PROC GLM DATA=QUAD; CLASS TRT;
MODEL YIELD = TRT X X*X X*X*TRT / SS3;

Source        df        SS       MS    FValue    ProbF
Model          6    140.01    23.33    100.41   0.0000
Error         29      6.74     0.23
Corr Total    35    146.75

Source        df   SS (Type III)       MS    FValue    ProbF
TRT            2           73.63    36.81    158.41   0.0000
X              1           49.50    49.50    212.98   0.0000
X*X            1           53.86    53.86    231.76   0.0000
X*X*TRT        2           17.67     8.83     38.01   0.0000

The final model with a common slope for the X direction and unequal slopes in the X² direction is

$$y_{ij} = \alpha_i + \beta_1 X_{ij} + \beta_{2i} X_{ij}^2 + \varepsilon_{ij}, \quad i = 1, 2, 3 \text{ and } j = 1, 2, \ldots, 12. \tag{4.12}$$

The code in Table 4.12 fits Model 4.12 and also contains the analysis of variance table. Since the model used the NOINT option, the F statistic corresponding to TRT provides a test of $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$ vs. $H_a$: (not $H_0$). The F statistic corresponding to X provides a test of $H_0: \beta_1 = 0$ vs. $H_a$: (not $H_0$), and the F statistic corresponding to X*X*TRT provides a test of $H_0: \beta_{21} = \beta_{22} = \beta_{23} = 0$ vs. $H_a$: (not $H_0$). The significance levels are very small, indicating there is sufficient evidence to reject the null hypotheses. The estimates of the parameters of the model are in Table 4.13.
TABLE 4.12
Code to Fit Model 4.12 with Analysis of Variance Table

PROC GLM DATA=QUAD; CLASS TRT;
MODEL YIELD = TRT X X*X*TRT / NOINT SOLUTION SS3;

Source        df         SS        MS    FValue    ProbF
Model          7    1432.41    204.63    880.49   0.0000
Error         29       6.74      0.23
Uncor Tot     36    1439.15

Source        df   SS (Type III)       MS    FValue    ProbF
TRT            3           96.98    32.33    139.10   0.0000
X              1           49.50    49.50    212.98   0.0000
X*X*TRT        3           91.60    30.53    131.38   0.0000

TABLE 4.13
Estimates of the Parameters of Model 4.12

Parameter      Estimate    StdErr    tValue     Probt
TRT 1           -0.0102    0.3224     -0.03    0.9751
TRT 2            2.9889    0.3885      7.69    0.0000
TRT 3            5.3598    0.3163     16.94    0.0000
X                1.8520    0.1269     14.59    0.0000
X*X*TRT 1       -0.1432    0.0133    -10.75    0.0000
X*X*TRT 2       -0.1894    0.0112    -16.88    0.0000
X*X*TRT 3       -0.2000    0.0122    -16.45    0.0000

Table 4.14 contains the code to provide estimates of the regression models at several values of X. The first statement, LSMEANS TRT/STDERR PDIFF, provides adjusted means that are not interpretable since they are computed by evaluating the model at the average value of X (5.4389) and the average value of X² (37.9478), which does not correspond to any point on the regression models since the square of the mean of X is 5.4389² = 29.5816. The estimate statements in Table 4.15 are included to demonstrate the computation of the LSMEANS. The last estimate statement in Table 4.15 reproduces the computation for Treatment 1 used by LSMEANS TRT/STDERR PDIFF. The average value of X is 5.4388889 and the average value of X² is 37.9477778, while the square of the average value of X is 29.581512665. The last statement uses 37.9477778 for X² in the computation and provides a value of 4.630, the same result as the LSMEAN for Treatment 1 in Table 4.14. Thus the usual LSMEANS statement provides incorrect adjusted values. The third estimate statement in Table 4.15 uses the average value of X and the square of the average value of X in the computations, providing 5.828 as the adjusted mean for Treatment 1. The last two LSMEANS statements in Table 4.14, LSMEANS TRT/PDIFF AT MEANS and LSMEANS TRT/PDIFF AT X=5.4388889, use the correct computations and provide the adjusted mean for Treatment 1 of 5.828. The other two LSMEANS statements are used to obtain adjusted means at a large value of X (X = 9) and a small value of X (X = 1). The first two estimate statements in Table 4.15 are also used to demonstrate the computations of the adjusted means at X = 1 and X = 9. Pairwise comparisons of the treatment means for each set of adjusted means are in the columns labeled _1, _2, and _3 of Table 4.14.

TABLE 4.14
Code and Results for Computing Adjusted Means

LSMEANS TRT/STDERR PDIFF;
LSMEANS TRT/PDIFF AT X=1;
LSMEANS TRT/PDIFF AT X=9;
LSMEANS TRT/PDIFF AT MEANS;
LSMEANS TRT/PDIFF AT X=5.4388889;

LSMEAN set     TRT   LSMEAN   RowName       _1       _2       _3
Incorrect        1     4.63         1            0.0000   0.0000
                 2     5.88         2   0.0000            0.0000
                 3     7.84         3   0.0000   0.0000
X=1              1     1.70         1            0.0000   0.0000
                 2     4.65         2   0.0000            0.0000
                 3     7.01         3   0.0000   0.0000
X=9              1     5.06         1            0.0590   0.0493
                 2     4.32         2   0.0590            0.0000
                 3     5.83         3   0.0493   0.0000
MEANS            1     5.83         1            0.0000   0.0000
                 2     7.46         2   0.0000            0.0000
                 3     9.52         3   0.0000   0.0000
X=5.4388889      1     5.83         1            0.0000   0.0000
                 2     7.46         2   0.0000            0.0000
                 3     9.52         3   0.0000   0.0000

TABLE 4.15
Estimate Statements and Results Demonstrating the Computation of the LSMEANS

ESTIMATE 'TRT 1 AT X=1' TRT 1 0 0 X 1 X*X*TRT 1 0 0;
ESTIMATE 'TRT 1 AT X=9' TRT 1 0 0 X 9 X*X*TRT 81 0 0;
ESTIMATE 'TRT 1 AT X=5.4388889' TRT 1 0 0 X 5.4388889 X*X*TRT 29.581512665 0 0;
ESTIMATE 'TRT 1 LSM' TRT 1 0 0 X 5.4388889 X*X*TRT 37.9477778 0 0;

Parameter               Estimate   StdErr   tValue    Probt
TRT 1 AT X=1               1.699    0.244     6.95   0.0000
TRT 1 AT X=9               5.062    0.312    16.21   0.0000
TRT 1 AT X=5.4388889       5.828    0.163    35.74   0.0000
TRT 1 LSM                  4.630    0.145    31.83   0.0000

The graph in Figure 4.10 displays the data and estimated regression models for the three treatments with vertical lines at X = 1, X = 5.4388889 (the mean of X), and X = 9. The pairwise comparison significance levels in Table 4.14 indicate that the three regression models are significantly different at X = 1 and at the mean of X. At X = 9, Treatment 1 differs only marginally from the other two treatments (significance levels of 0.0590 and 0.0493 for the comparisons with Treatments 2 and 3, respectively), while Treatment 3 provides a significantly higher response than Treatment 2.

FIGURE 4.10 Graph of the quadratic regression models with the data points for Example 4.5. (Vertical axis: Yield; horizontal axis: Concentration of X; data and fitted model shown for TRT 1, TRT 2, and TRT 3.)
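The difference between the default LSMEANS computation and the adjusted means evaluated on the fitted curves can be reproduced by hand from the covariate values in Table 4.9 and the estimates in Table 4.13. The following Python sketch is only an illustration of that arithmetic (it does not call SAS, and all names are hypothetical); because it uses the rounded estimates from Table 4.13, its results agree with Tables 4.14 and 4.15 only to about two decimals.

```python
# Covariate values X from Table 4.9, 12 observations per treatment.
x_by_trt = {
    1: [2.2, 3.5, 4.1, 4.1, 7.4, 3.0, 0.5, 2.4, 8.4, 8.5, 8.0, 6.3],
    2: [5.8, 9.9, 1.2, 6.0, 5.5, 9.7, 8.6, 5.0, 2.6, 8.4, 3.0, 5.2],
    3: [8.8, 9.0, 8.8, 4.1, 2.4, 4.1, 4.4, 8.9, 5.3, 9.5, 1.1, 0.1],
}

# Rounded parameter estimates of Model 4.12 from Table 4.13.
alpha = {1: -0.0102, 2: 2.9889, 3: 5.3598}   # treatment intercepts
beta1 = 1.8520                               # common slope for X
beta2 = {1: -0.1432, 2: -0.1894, 3: -0.2000} # treatment slopes for X squared

all_x = [x for xs in x_by_trt.values() for x in xs]
mean_x = sum(all_x) / len(all_x)                    # average of X
mean_x_sq = sum(x * x for x in all_x) / len(all_x)  # average of X squared


def predict(trt, x, x_sq=None):
    """Evaluate Model 4.12; x_sq defaults to x**2 (a point on the curve)."""
    if x_sq is None:
        x_sq = x * x
    return alpha[trt] + beta1 * x + beta2[trt] * x_sq


# Default LSMEANS: mean of X paired with the mean of X squared (off the curve).
default_lsm = {t: predict(t, mean_x, mean_x_sq) for t in alpha}
# Adjusted means on the curve at the mean of X, at X = 1, and at X = 9.
at_mean = {t: predict(t, mean_x) for t in alpha}
at_1 = {t: predict(t, 1.0) for t in alpha}
at_9 = {t: predict(t, 9.0) for t in alpha}

for t in alpha:
    print(t, round(default_lsm[t], 2), round(at_mean[t], 2),
          round(at_1[t], 2), round(at_9[t], 2))
```

The gap between `default_lsm` and `at_mean` for each treatment is the quadratic slope times the difference between the mean of X² and the square of the mean of X, which is why the default values sit below the fitted curves here.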
4.7 EXAMPLE: COMPARING RESPONSE SURFACE MODELS
A drug company had two formulation processes with which to make a certain type of pill and wanted to determine the concentrations of two binders that produce the stronger pills. For each formulation, a treatment structure with design points from a two-factor rotatable central composite design with four center points, in a completely randomized design structure, was used to collect information to study the relationship between strength and the concentrations of the two binders. The force (lb/in.) required to fracture the pill is the dependent measure. The data for the two formulations, coded U3Y and X2Z, the force, and the coded concentrations of the two binders (CONC1, CONC2) are in Table 4.16. The objectives of the experiment are (1) for each formulation, to determine an adequate model to describe the data and estimate the combination of the binders that will produce a pill with maximum strength, and (2) to compare the response surfaces via analysis of covariance.
A quadratic response surface model was selected to describe the responses for each of the formulations and is expressed as

$$y_{ijkm} = \mu_i + \beta_{1i} x_{1j} + \beta_{2i} x_{2k} + \beta_{3i} x_{1j}^2 + \beta_{4i} x_{2k}^2 + \beta_{5i} x_{1j} x_{2k} + \varepsilon_{ijkm} \tag{4.13}$$
TABLE 4.16
Strength of Pills Made by Two Formulations with Concentrations of Two Binders

CONC1    CONC2    U3Y Force    X2Z Force
 1.00     1.00        14.14        16.02
 0.00     1.41        10.64        14.07
-1.00     1.00         1.47         7.20
-1.41     0.00         2.28         3.85
-1.00    -1.00         8.53        10.30
 0.00    -1.41         6.57        12.78
 1.00    -1.00         3.14         8.14
 1.41     0.00         8.16         9.45
 0.00     0.00        13.58        13.98
 0.00     0.00        11.73        15.66
 0.00     0.00        13.07        16.83
 0.00     0.00        12.24        14.41
TABLE 4.17
Analysis of Variance Table to Provide the Estimate of Pure Error for the Response Surface Models Used to Compare Pill Strength

PROC GLM DATA=RESPSUR; CLASS FORMULA CONC1 CONC2;
MODEL FORCE = FORMULA*CONC1*CONC2;

Source             df        SS       MS    FValue    ProbF
Model              17    458.69    26.98     23.01   0.0004
Error               6      7.04     1.17
Corrected Total    23    465.72

Source                  df   SS (Type III)       MS    FValue    ProbF
Formula*Conc1*Conc2     17          458.69    26.98     23.01   0.0004

In Model 4.13,
i = 1, 2 for the two formulations,
j = 1, 2, …, 5, for the levels of x1 (CONC1),
k = 1, 2, …, 5, for the levels of x2 (CONC2),
m = 1 or 4 for the replications per formulation.
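The Error line of Table 4.17 can be checked directly from Table 4.16. Every non-center design point is observed once per formulation, so only the four center-point replicates (CONC1 = CONC2 = 0) of each formulation contribute within-cell variation. The Python sketch below pools that variation over the two formulations; the names are illustrative, and the comparison with Table 4.17 is to rounding only.

```python
# Force at the four center-point replicates (CONC1 = CONC2 = 0), Table 4.16.
center_force = {
    "U3Y": [13.58, 11.73, 13.07, 12.24],
    "X2Z": [13.98, 15.66, 16.83, 14.41],
}


def within_ss(values):
    """Sum of squared deviations of the values about their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Pool the within-formulation sums of squares and degrees of freedom.
ss_pure_error = sum(within_ss(v) for v in center_force.values())
df_pure_error = sum(len(v) - 1 for v in center_force.values())
ms_pure_error = ss_pure_error / df_pure_error

print(f"SS = {ss_pure_error:.2f}, df = {df_pure_error}, MS = {ms_pure_error:.2f}")
```

The pooled mean square computed this way is a model-free variance estimate, since it does not depend on the form chosen for the response surface.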
Since there are four replications of the center point for each formulation, the
variation between these observations within a formulation provides an estimate of
pure error (or a model-free estimate of the variance). The estimate of the pure error
is $\hat{\sigma}^2_{PE} = 1.17$, which is the Mean Square Error from the analysis of variance in
Table 4.17. To compute the estimate of $\sigma^2_{PE}$ using PROC GLM of the SAS® system,