3.4 EFFECT OF DIET ON CHOLESTEROL LEVEL: AN EXCEPTION TO THE BASIC ANALYSIS OF COVARIANCE STRATEGY


C0317 ch03 frame Page 67 Monday, June 25, 2001 10:04 PM



Examples: One-Way Analysis of Covariance






TABLE 3.25
PROC GLM Code and Analysis of Variance Table to Provide a Test of the Hypothesis That All of the Slopes Are Zero for Example 3.4

proc glm data=one; class diet;
model post_chol = diet pre_chol*diet/noint ss3 solution;

Source          df   SS            MS           FValue   ProbF
Model            8   1428650.09    178581.26    732.28   0.0000
Error           24      5852.91       243.87
Uncor Total     32   1434503.00

Source          df   SS (Type III)   MS          FValue   ProbF
diet             4   59368.57        14842.14    60.86    0.0000
Pre_Chol*diet    4    2336.71          584.18     2.40    0.0785

TABLE 3.26
PROC GLM Code and Analysis of Variance for the Less Than Full Rank Model to Test the Equal Slopes Hypothesis for Example 3.4

proc glm; class diet;
model Post_chol = diet pre_chol Pre_chol*diet/noint;

Source              df   SS            MS           FValue   ProbF
Model                8   1428650.09    178581.26    732.28   0.0000
Error               24      5852.91       243.87
Uncorrected Total   32   1434503.00

Source          df   SS (Type III)   MS          FValue   ProbF
diet             4   59368.57        14842.14    60.86    0.0000
Pre_Chol         1       1.82            1.82     0.01    0.9318
Pre_Chol*diet    3    2334.81          778.27     3.19    0.0417

lines reveals that the slopes for diets 1 and 2 are positive while the slopes for diets 3 and 4 are negative. Neither the negative slopes nor the positive slopes are quite significantly different from zero (using a Bonferroni adjustment), but the positive slopes are significantly different from the negative slopes. Thus, a model with unequal slopes is needed to adequately describe the data set. Comparisons among the diets are accomplished by using the unequal slopes model. Since the slopes are unequal, the diets need to be compared at a minimum of three values of PRE_CHOL. For this study, the three values are the 75th percentile, median, and 25th percentile of the study's PRE_CHOL data, which are 281, 227.5, and 180. The least squares means computed at these three values are in Table 3.28. Pairwise comparisons among the levels of DIET were carried out using the Tukey method for multiple comparisons



© 2002 by CRC Press LLC






Analysis of Messy Data, Volume III: Analysis of Covariance



TABLE 3.27
Estimates of the Intercepts and Slopes for Full Rank Model of Table 3.25

Parameter          Estimate   StdErr    tValue   Probt
diet 1             137.63     27.27      5.05    0.0000
diet 2             195.74     32.03      6.11    0.0000
diet 3             223.73     33.71      6.64    0.0000
diet 4             276.60     23.67     11.69    0.0000
Pre_Chol*diet 1      0.2333    0.1113    2.10    0.0467
Pre_Chol*diet 2      0.0450    0.1367    0.33    0.7448
Pre_Chol*diet 3     –0.0232    0.1476   –0.16    0.8763
Pre_Chol*diet 4     –0.2333    0.1038   –2.25    0.0341



TABLE 3.28
PROC GLM Code and Corresponding Least Squares or Adjusted Means Evaluated at Three Values of PRE_CHOL of Example 3.4

lsmeans diet/pdiff stderr at pre_chol=281 adjust=Tukey; ***75th percentile;
lsmeans diet/pdiff stderr at pre_chol=227.5 adjust=Tukey; ***median or 50th percentile;
lsmeans diet/pdiff stderr at pre_chol=180 adjust=Tukey; ***25th percentile;

Pre Chol   Diet   LSMean   StdErr   LSMean Number
281        1      203.19   7.16     1
           2      208.39   8.81     2
           3      217.20   9.91     3
           4      211.05   8.26     4
227.5      1      190.71   5.69     1
           2      205.98   5.54     2
           3      218.45   5.53     3
           4      223.53   5.55     4
180        1      179.63   8.66     1
           2      203.84   8.87     2
           3      219.55   8.67     3
           4      234.61   7.02     4



within each level of PRE_CHOL. The significance levels of the Tukey comparisons are in Table 3.29.
There were no significant differences among the DIET means of POST_CHOL at PRE_CHOL=281, but the mean of DIET 1 is significantly lower than the means of




TABLE 3.29
Tukey Significance Levels for Comparing the Levels of Diet at Each Value of Pre_Chol

Pre Chol   Row Name   1        2        3        4
281        1                   0.9675   0.6653   0.8886
           2          0.9675            0.9092   0.9961
           3          0.6653   0.9092            0.9635
           4          0.8886   0.9961   0.9635
227.5      1                   0.2456   0.0094   0.0020
           2          0.2456            0.4013   0.1416
           3          0.0094   0.4013            0.9149
           4          0.0020   0.1416   0.9149
180        1                   0.2334   0.0164   0.0003
           2          0.2334            0.5918   0.0541
           3          0.0164   0.5918            0.5411
           4          0.0003   0.0541   0.5411



[Plot "Diet and Cholesterol": Post Cholesterol Level (vertical axis) versus Pre Cholesterol Level (horizontal axis), showing the data points and the fitted model line for each of Diets 1 to 4.]

FIGURE 3.12 Graph of the diet models and data as a function of the pre-diet values for Example 3.4.



DIETs 3 and 4 at PRE_CHOL=227.5 and 180. A graph of the data and of the estimated models is in Figure 3.12, indicating there are large diet differences at low pre-diet cholesterol levels and negligible differences between the diets at high pre-diet cholesterol levels.




This example demonstrates that, even though there is not enough information from the individual models to conclude that any slopes are different from zero, the slopes could be significantly different from each other when some are positive and some are negative. Hence, the analyst must be careful to check for this case when doing analysis of covariance.
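The two conclusions can be checked directly against the reported significance levels. The sketch below, written in Python purely as an illustration, compares the individual slope p-values from Table 3.27 with a Bonferroni-adjusted level and the equal slopes p-value from Table 3.26 with the unadjusted level; the choice of 0.05 as the overall level and all variable names are assumptions.

```python
# p-values of the individual slope t-tests, copied from Table 3.27.
slope_p_values = {1: 0.0467, 2: 0.7448, 3: 0.8763, 4: 0.0341}
# p-value of the equal slopes F-test (source Pre_Chol*diet), copied from Table 3.26.
equal_slopes_p = 0.0417

alpha = 0.05  # assumed overall significance level
# Bonferroni adjustment: split alpha evenly across the four slope tests.
bonferroni_alpha = alpha / len(slope_p_values)

# Diets whose slope is significantly different from zero after adjustment.
nonzero_slopes = [diet for diet, p in slope_p_values.items() if p < bonferroni_alpha]
# Whether the equal slopes hypothesis is rejected at the unadjusted level.
slopes_differ = equal_slopes_p < alpha

print("per-test level:", bonferroni_alpha)
print("slopes significantly different from zero:", nonzero_slopes)
print("equal slopes hypothesis rejected:", slopes_differ)
```

No individual slope passes the adjusted threshold, yet the equal slopes hypothesis is rejected, which is the exception to the basic strategy that this section describes.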



3.5 CHANGE FROM BASE LINE ANALYSIS USING

EFFECT OF DIET ON CHOLESTEROL LEVEL DATA

There is a lot of confusion about the analysis of change from base line data. It might be of much interest to the dietician to evaluate the change in cholesterol level from the base line measurement or pre-diet cholesterol level discussed in Section 3.4. Some researchers think that by calculating the change from base line and then using analysis of variance to analyze that change, there is no need to consider base line as a covariate in the modeling process. The data set from Example 3.4 is used in the following to shed some light on the analysis of change from base line data. Table 3.30 contains the analysis of variance of the change from base line data calculated for each person as post cholesterol minus pre cholesterol. The estimate of the variance from Table 3.30 is 2695.58 compared to 243.87 for the analysis of covariance model from Table 3.25. In fact, the estimate of the variance based on the analysis of variance of just the post cholesterol values (ignoring the pre measurements) is 292.49 (analysis is not shown). So the change from base line data has far more variability than the post diet cholesterol data. The analysis in Table 3.30 provides an F statistic for comparing diet means with a significance level of 0.2621. The analysis of variance on the post cholesterol (without the covariate) provides an F statistic with a significance level of 0.0054, and using multiple comparisons, one discovers that the mean cholesterol level of diet 1 is significantly less than the means of diets 3 and 4. Therefore, the analysis of change from base line data is not necessarily providing appropriate information about the effect of diets on a person's cholesterol level.
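To make the size of the variance inflation concrete, the short Python sketch below recomputes the F statistic of Table 3.30 from its mean squares and compares the three error variance estimates quoted in the text. It is only an arithmetic illustration; the variable names are assumptions, and every number is copied from the text or from Table 3.30.

```python
# Error mean squares (estimates of the variance) quoted in the text.
mse_change_anova = 2695.58  # change from base line, no covariate (Table 3.30)
mse_ancova = 243.87         # unequal slopes analysis of covariance (Table 3.25)
mse_post_anova = 292.49     # post cholesterol only, no covariate (analysis not shown)

# Diet mean square for the change scores, copied from Table 3.30.
ms_diet_change = 3787.03

# F statistic for diet in the change from base line analysis of variance.
f_change = ms_diet_change / mse_change_anova

# How many times larger the change-score error variance is than the others.
ratio_to_ancova = mse_change_anova / mse_ancova
ratio_to_post = mse_change_anova / mse_post_anova

print("F for diet on change scores:", round(f_change, 2))
print("variance ratio vs. analysis of covariance:", round(ratio_to_ancova, 1))
print("variance ratio vs. post-only analysis of variance:", round(ratio_to_post, 1))
```

With the error variance inflated roughly elevenfold relative to the analysis of covariance, the diet F statistic is small even though the post cholesterol analyses detect diet differences.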



TABLE 3.30
PROC GLM Code and Analysis of Variance of Change from Baseline, Post Minus Pre, without the Covariate

proc glm data=one;class diet;
model change=diet;

Source            df   SS          MS         FValue   ProbF
Model              3   11361.09    3787.03    1.40     0.2621
Error             28   75476.13    2695.58
Corrected Total   31   86837.22

Source   df   SS (Type III)   MS         FValue   ProbF
diet      3   11361.09        3787.03    1.40     0.2621






What happens if the covariate is also used in the analysis of the change from base line data? Assume there are t treatments, where y represents the response variable or post measurement and x denotes the covariate or pre measurement. Also, assume the simple linear regression model describes the relationship between the mean of y given x and x for each of the treatments. Then the model is

$$y_{ij} = \alpha_i + \beta_i x_{ij} + \varepsilon_{ij}, \quad i = 1, \ldots, t, \quad j = 1, \ldots, n_i.$$

Next compute the change from base line data as $c_{ij} = y_{ij} - x_{ij}$. The corresponding model for $c_{ij}$ is

$$c_{ij} = y_{ij} - x_{ij} = \alpha_i + (\beta_i - 1) x_{ij} + \varepsilon_{ij} = \alpha_i + \gamma_i x_{ij} + \varepsilon_{ij}, \quad i = 1, \ldots, t, \quad j = 1, \ldots, n_i,$$

where $\gamma_i = \beta_i - 1$, i.e., the slope of the model for $c_{ij}$ is equal to the slope for the $y_{ij}$ model minus 1. Thus testing $H_0\colon \gamma_1 = \cdots = \gamma_t = 0$ vs. $H_a\colon$ (not $H_0$) is equivalent to testing $H_0\colon \beta_1 = \cdots = \beta_t = 1$ vs. $H_a\colon$ (not $H_0$). Therefore, the analysis of the change from base line data without the covariate is appropriate only when the slopes of the models for $y_{ij}$ are all equal to 1. The following analyses demonstrate the importance of using the pre value as a covariate in the analysis of the change from base line. Tables 3.31 and 3.32 contain the results of fitting Model 2.1 to the change = post cholesterol minus pre cholesterol values, where the full rank model is fit to get the results in Table 3.31 and the less than full rank model is fit to get the results in Table 3.32. The estimate of the variance from Table 3.31 is 243.87, the same as the estimate of the variance obtained from the analysis of covariance model in Table 3.25. The F statistic for source Pre_Chol*diet in Table 3.32 tests the equal slopes hypothesis of Equation 2.11. The significance level is 0.0417, the same as in Table 3.26.
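The relationship between the two models can be seen numerically with a small least squares fit. The sketch below is in Python rather than the SAS used in the book, the pre and post values are hypothetical numbers invented for illustration (they are not the Example 3.4 data), and `fit_line` is an illustrative helper. It fits a line to the post values and to the change values for one treatment group and compares the results.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, sse)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_bar - b * x_bar              # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, sse


# Hypothetical pre and post values for one treatment group (not from the book).
pre = [180.0, 205.0, 227.0, 250.0, 281.0, 300.0]
post = [176.0, 190.0, 188.0, 201.0, 199.0, 210.0]
# Change from base line: post minus pre.
change = [y - x for x, y in zip(pre, post)]

a_post, b_post, sse_post = fit_line(pre, post)
a_chg, b_chg, sse_chg = fit_line(pre, change)

print("post model:   intercept", round(a_post, 3), "slope", round(b_post, 4))
print("change model: intercept", round(a_chg, 3), "slope", round(b_chg, 4))
print("residual sums of squares:", round(sse_post, 3), round(sse_chg, 3))
```

The intercepts and residual sums of squares coincide and the slopes differ by exactly 1, so using the covariate with the change scores carries the same information as the analysis of covariance on the post values.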



TABLE 3.31
PROC GLM Code and Analysis of Variance Table for Change from Base Line Data with the Covariate to Test Slopes Equal Zero

proc glm data=one;class diet;
model change=diet Pre_chol*diet/solution noint;

Source              df   SS          MS          FValue   ProbF
Model                8   92122.09    11515.26    47.22    0.0000
Error               24    5852.91      243.87
Uncorrected Total   32   97975.00

Source          df   SS (Type III)   MS          FValue   ProbF
diet             4   59368.57        14842.14    60.86    0.0000
Pre_Chol*diet    4   69623.21        17405.80    71.37    0.0000






TABLE 3.32
PROC GLM Code and Analysis of Variance for the Change from Base Line Data to Provide the Test of the Slopes Equal Hypothesis

proc glm data=one;class diet;
model change=diet Pre_chol Pre_chol*diet;

Source            df   SS          MS          FValue   ProbF
Model              7   80984.31    11569.19    47.44    0.0000
Error             24    5852.91      243.87
Corrected Total   31   86837.22

Source          df   SS (Type III)   MS          FValue   ProbF
diet             3    3718.77         1239.59      5.08   0.0073
Pre_Chol         1   60647.40        60647.40    248.69   0.0000
Pre_Chol*diet    3    2334.81          778.27      3.19   0.0417



TABLE 3.33
Estimates of the Parameter from Full Rank Analysis of Covariance Model for Change from Base Line

Parameter          Estimate   StdErr   tValue   Probt
diet 1             137.63     27.27      5.05   0.0000
diet 2             195.74     32.03      6.11   0.0000
diet 3             223.73     33.71      6.64   0.0000
diet 4             276.60     23.67     11.69   0.0000
Pre_Chol*diet 1     –0.767     0.111    –6.89   0.0000
Pre_Chol*diet 2     –0.955     0.137    –6.98   0.0000
Pre_Chol*diet 3     –1.023     0.148    –6.93   0.0000
Pre_Chol*diet 4     –1.233     0.104   –11.88   0.0000



The estimates of the intercepts and slopes for the model in Table 3.31 are displayed in Table 3.33. The intercepts are identical to those in Table 3.27 and the slopes are the slopes in Table 3.27 minus 1. Just as in the analysis of the post cholesterol data, an unequal slopes model is needed to adequately describe the data set. The adjusted means or least squares means are computed at pre cholesterol levels of 281, 227.5, and 180. Those least squares means are presented in Table 3.34. Most of the time it is not of interest to consider the t-tests associated with the least squares means. The t-statistic is a test of the hypothesis that the respective population mean is equal to zero. Here the adjusted means are change from base line values, and those for diets 3 and 4 at Pre_chol=227.5 and for diet 1 at Pre_chol=180 are not significantly different from zero. Table 3.35 consists of the significance levels for Tukey adjusted multiple comparisons for all pairwise comparisons of the diets' means within the Pre_chol values of 281, 227.5, and 180. These significance levels are identical to those in Table 3.29.
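Both relationships can be verified from the printed tables. The following Python sketch is only a consistency check, with illustrative variable names; the values are copied from Tables 3.27, 3.28, 3.33, and 3.34. It confirms that the change slopes equal the post slopes minus 1 and that each least squares mean of the change equals the corresponding post cholesterol least squares mean minus the PRE_CHOL value, up to rounding in the printed estimates.

```python
# Slopes for the post cholesterol model (Table 3.27) and the change model (Table 3.33).
post_slopes = {1: 0.2333, 2: 0.0450, 3: -0.0232, 4: -0.2333}
change_slopes = {1: -0.767, 2: -0.955, 3: -1.023, 4: -1.233}

# Least squares means of POST_CHOL (Table 3.28) and of the change (Table 3.34).
post_lsmeans = {
    281: {1: 203.19, 2: 208.39, 3: 217.20, 4: 211.05},
    227.5: {1: 190.71, 2: 205.98, 3: 218.45, 4: 223.53},
    180: {1: 179.63, 2: 203.84, 3: 219.55, 4: 234.61},
}
change_lsmeans = {
    281: {1: -77.81, 2: -72.61, 3: -63.80, 4: -69.95},
    227.5: {1: -36.79, 2: -21.52, 3: -9.05, 4: -3.97},
    180: {1: -0.37, 2: 23.84, 3: 39.55, 4: 54.61},
}

# Largest discrepancy between each change slope and (post slope - 1).
max_slope_gap = max(
    abs(change_slopes[d] - (post_slopes[d] - 1.0)) for d in post_slopes
)

# Largest discrepancy between each change LS mean and (post LS mean - PRE_CHOL).
max_lsmean_gap = max(
    abs(change_lsmeans[pre][d] - (post_lsmeans[pre][d] - pre))
    for pre in post_lsmeans
    for d in post_lsmeans[pre]
)

print("largest slope discrepancy:", round(max_slope_gap, 4))
print("largest least squares mean discrepancy:", round(max_lsmean_gap, 4))
```

Since every adjusted mean within a PRE_CHOL value is shifted by the same constant, the pairwise differences among diets, and hence the Tukey significance levels, are unchanged.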




TABLE 3.34
PROC GLM Code and Least Squares Means for the Change from Base Line Analysis

lsmeans diet/pdiff stderr at pre_chol=281 adjust=Tukey;***75th percentile;
lsmeans diet/pdiff stderr at pre_chol=227.5 adjust=Tukey;***median or 50th percentile;
lsmeans diet/pdiff stderr at pre_chol=180 adjust=Tukey;***25th percentile;

Pre_CHOL   Diet   LSMean   StdErr   Probt    LSMeanNumber
281        1      –77.81   7.16     0.0000   1
           2      –72.61   8.81     0.0000   2
           3      –63.80   9.91     0.0000   3
           4      –69.95   8.26     0.0000   4
227.5      1      –36.79   5.69     0.0000   1
           2      –21.52   5.54     0.0007   2
           3       –9.05   5.53     0.1148   3
           4       –3.97   5.55     0.4820   4
180        1       –0.37   8.66     0.9660   1
           2       23.84   8.87     0.0128   2
           3       39.55   8.67     0.0001   3
           4       54.61   7.02     0.0000   4



TABLE 3.35
Tukey Adjusted Significance Levels for Pairwise Comparisons of the Diets’ Means at Three Levels of Pre Cholesterol

PRE_CHOL   RowName   _1       _2       _3       _4
281        1                  0.9675   0.6653   0.8886
           2         0.9675            0.9092   0.9961
           3         0.6653   0.9092            0.9635
           4         0.8886   0.9961   0.9635
227.5      1                  0.2456   0.0094   0.0020
           2         0.2456            0.4013   0.1416
           3         0.0094   0.4013            0.9149
           4         0.0020   0.1416   0.9149
180        1                  0.2334   0.0164   0.0003
           2         0.2334            0.5918   0.0541
           3         0.0164   0.5918            0.5411
           4         0.0003   0.0541   0.5411






In summary, when it is of interest to evaluate change from base line data, do the analysis, but still consider the base line values as possible covariates. The only time the analysis of variance of change from base line data is appropriate is when the slopes are all equal to 1. As this example shows, analyzing change from base line values without the covariate can have a considerable effect on the estimate of the variance and thus on the conclusions one draws from the data analysis. So, carry out the analysis on the change from base line variables, but also consider the base line values as possible covariates.



3.6 SHOE TREAD DESIGN DATA FOR EXCEPTION

TO THE BASIC STRATEGY

The data in Table 3.36 are the times it took males to run an obstacle course with a particular tread design on the soles of their shoes (Tread Time, sec). To help remove the effect of person-to-person differences, the time required to run the same course while wearing a slick-soled shoe was also measured (Slick Time, sec). Fifteen subjects were available for the study and were randomly assigned to one of three tread designs, five per design. The data are from a one-way treatment structure with one covariate in a completely randomized design structure. It is of interest to compare mean times for tread designs for a constant time to run the course with the slick-soled shoes. Table 3.37 contains the analysis to test the hypothesis that the slopes are all equal to zero (Equation 2.6), which one fails to reject (p = 0.2064). None of the individual slopes is significantly different from zero, but they are all of magnitude about 0.3.
The main problem is that there are only five observations per treatment group, and it is difficult to detect a nonzero slope when the sample size is small. The basic



TABLE 3.36
Obstacle Course Time Data for Three Shoe Tread Designs

     Tread Design 1            Tread Design 2            Tread Design 3
Slick Time   Tread Time   Slick Time   Tread Time   Slick Time   Tread Time
34           36           37           29           58           38
40           36           50           40           57           32
48           38           38           35           36           29
35           32           52           34           55           34
42           39           45           29           48           31

Note: Tread Time (sec) denotes time to run the course with the assigned tread and Slick Time (sec) denotes time to run the same course using a slick-soled shoe to be considered as a possible covariate for Example 3.6.






TABLE 3.37
PROC GLM Code, Analysis of Variance Table, and Estimates of the Parameters for the Full Rank Model for Example 3.6

proc glm data=two; class tread_ds;
model Tread_time = tread_ds Slick_Time*tread_ds/noint solution ss3;

Source              df   SS          MS         FValue   ProbF
Model                6   17620.18    2936.70    282.65   0.0000
Error                9      93.51      10.39
Uncorrected Total   15   17713.69

Source                df   SS (Type III)   MS      FValue   ProbF
tread_ds               3   122.92          40.97   3.94     0.0476
Slick_Time*tread_ds    3    58.04          19.35   1.86     0.2064

Parameter               Estimate   StdErr   tValue   Probt
tread_ds 1              23.51      11.75    2.00     0.0765
tread_ds 2              20.07      10.55    1.90     0.0897
tread_ds 3              18.24       8.88    2.05     0.0703
Slick_Time*tread_ds 1    0.325      0.294   1.11     0.2975
Slick_Time*tread_ds 2    0.301      0.236   1.28     0.2342
Slick_Time*tread_ds 3    0.286      0.173   1.65     0.1324



TABLE 3.38
PROC GLM Code and the Analysis of Variance Table for the Analysis of Tread Time without the Covariate for Example 3.6

proc glm data=two; class tread_ds;where tread_time ne .;
model Tread_time = tread_ds /solution;

Source            df   SS       MS      FValue   ProbF
Model              2    38.05   19.03   1.51     0.2608
Error             12   151.55   12.63
Corrected Total   14   189.60

Source     df   SS      MS      FValue   ProbF
tread_ds    2   38.05   19.03   1.51     0.2608



strategy says to continue the analysis of the shoe tread designs via analysis of variance, i.e., without the covariate. The analysis of variance of the time to run the obstacle course to compare the tread designs without using the covariate is in Table 3.38. The analysis of variance indicates there are no significant differences among the shoe tread design means. Table 3.39 displays the means for each of the tread designs, and the significance levels indicate the means of the tread designs are not significantly different. If the basic strategy is ignored and other models are used (such as a common slope model), it becomes evident that the covariate is important






TABLE 3.39
PROC GLM Code, Least Squares Means and p-Values for Making Pairwise Comparisons among Shoe Tread Design Means for Tread Time (sec) of Example 3.6 without the Covariate

lsmeans tread_ds/stderr pdiff;

tread_ds   LSMean   StdErr   Probt    LSMeanNumber
1          36.40    1.59     0.0000   1
2          33.40    1.59     0.0000   2
3          32.74    1.59     0.0000   3

RowName   _1       _2       _3
1                  0.2067   0.1294
2         0.2067            0.7740
3         0.1294   0.7740



TABLE 3.40
PROC GLM Code, Analysis of Variance Table and Parameter Estimates for the Common Slope Model of Example 3.6

proc glm data=two; class tread_ds;
model Tread_time = tread_ds Slick_Time/solution;

Source            df   SS       MS      FValue   ProbF
Model              3    95.96   31.99   3.76     0.0444
Error             11    93.64    8.51
Corrected Total   14   189.60

Source       df   SS (Type III)   MS      FValue   ProbF
tread_ds      2   85.95           42.97   5.05     0.0278
Slick_Time    1   57.91           57.91   6.80     0.0243

Parameter    Estimate   StdErr   tValue   Probt
Intercept    17.67      5.92     2.98     0.0125
tread_ds 1    6.92      2.23     3.10     0.0100
tread_ds 2    2.54      1.98     1.28     0.2261
tread_ds 3    0.00
Slick_Time    0.298     0.114    2.61     0.0243



in the comparison of the tread designs. The common slope model analysis is displayed in Table 3.40, which indicates there is a significant effect due to the covariate (p = 0.0243), i.e., the common slope is significantly different from zero. Thus, while there is not enough information in the data of any individual tread design to conclude that its slope is different from zero, the combined data sets do provide an estimate of a common slope that is significantly different from zero.
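The gain from pooling shows up in the standard errors. The Python sketch below is an arithmetic illustration with assumed variable names, not the book's SAS analysis. It recomputes the t ratios from the estimates and standard errors in Tables 3.37 and 3.40 and squares the common slope t ratio, which for a single-degree-of-freedom effect gives its F statistic up to rounding of the printed values.

```python
# (estimate, standard error) of each tread design's slope, copied from Table 3.37.
separate_slopes = {1: (0.325, 0.294), 2: (0.301, 0.236), 3: (0.286, 0.173)}
# Common slope estimate and standard error, copied from Table 3.40.
common_slope = (0.298, 0.114)


def t_ratio(estimate, std_err):
    """t statistic for testing that a parameter equals zero."""
    return estimate / std_err


# t ratios for the three separate slopes and for the pooled (common) slope.
separate_t = {d: round(t_ratio(*es), 2) for d, es in separate_slopes.items()}
common_t = round(t_ratio(*common_slope), 2)

# For a single-df effect, the F statistic is the square of the t statistic.
common_f = t_ratio(*common_slope) ** 2

print("separate slope t ratios:", separate_t)
print("common slope t ratio:", common_t)
print("common slope F (t squared):", round(common_f, 2))
```

The slope estimates are all close to 0.3, but the common slope has a smaller standard error than any separate slope, which is why only the pooled estimate reaches significance.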






TABLE 3.41
PROC GLM Code, Least Squares Means, and p-Values for Comparing the Shoe Tread Design Means

lsmeans tread_ds/stderr pdiff e;

tread_ds   LSMean   StdErr   Probt    LSMeanNumber
1          37.94    1.43     0.0000   1
2          33.57    1.31     0.0000   2
3          31.03    1.46     0.0000   3

RowName   _1       _2       _3
1                  0.0436   0.0100
2         0.0436            0.2261
3         0.0100   0.2261



The overall test of TREAD_DS in Table 3.40 indicates there is enough information to conclude that the tread designs yield different times, a different decision from that of the analysis without the covariate (Table 3.38).
The adjusted means (LSMEANS at Slick Time = 44.88667) in Table 3.41 indicate that designs 2 and 3 are possibly better than design 1. If a Bonferroni adjustment is used, then runners using design 3 run significantly faster than runners using design 1 (α = 0.05). The graph of the estimated regression lines with a common slope is in Figure 3.13.
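As a rough check on these statements, the adjusted means can be rebuilt from the parameter estimates in Table 3.40, and the Bonferroni decision can be read off the pairwise p-values in Table 3.41. The sketch below uses Python for the arithmetic; the variable names are illustrative assumptions, and because the printed estimates are rounded the reconstructed means match Table 3.41 only to within a few hundredths.

```python
# Parameter estimates of the common slope model, copied from Table 3.40.
intercept = 17.67
design_effects = {1: 6.92, 2: 2.54, 3: 0.00}  # design 3 is the reference level
common_slope = 0.298
mean_slick_time = 44.88667  # Slick Time value at which the LSMEANS are evaluated

# Adjusted mean = intercept + design effect + slope * covariate value.
adjusted_means = {
    design: round(intercept + effect + common_slope * mean_slick_time, 2)
    for design, effect in design_effects.items()
}

# p-values of the pairwise comparisons, copied from Table 3.41.
pairwise_p = {(1, 2): 0.0436, (1, 3): 0.0100, (2, 3): 0.2261}

# Bonferroni adjustment: split alpha = 0.05 across the three comparisons.
bonferroni_alpha = 0.05 / len(pairwise_p)
significant_pairs = [pair for pair, p in pairwise_p.items() if p < bonferroni_alpha]

print("adjusted means:", adjusted_means)
print("per-comparison level:", round(bonferroni_alpha, 4))
print("pairs significantly different:", significant_pairs)
```

Only the comparison of designs 1 and 3 survives the adjustment, in agreement with the statement above.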

This example shows two important aspects of analysis of covariance. First, there

could be enough evidence to conclude that a common slope model is necessary to



[Plot "Shoe Tread Designs": Time (sec) with Tread Soles (vertical axis) versus Time (sec) with Slick Soles (horizontal axis), showing the data points and the fitted model line for each of Designs 1 to 3.]

FIGURE 3.13 Plot of data and estimated models for each of the three tread designs for the common slope model of Example 3.6.


