9 Sample Size for Power = .80 in Single-Sample Case

Chapter 12

Robey and Barcikowski (1984) have given power tables for various alpha levels for the
single group repeated-measures design. Their tables assume a common correlation for the
repeated measures, which generally will not be tenable (especially in longitudinal studies);
however, a later paper by Green (1990) indicated that use of an estimated average
correlation (from all the correlations among the repeated measures) is fine. Selected results from
their work are presented in Table 12.9, which indicates sample size needed for power = .80
for small, medium, and large effect sizes at alpha = .01, .05, .10, and .20 for two through
seven repeated measures. We give two examples to show how to use the table.
Table 12.9: Sample Sizes Needed for Power = .80 in Single-Group Repeated Measures

                                    Number of repeated measures
Average corr.   Effect size(a)      2      3      4      5      6      7

α = .01
.30             .12               404    324    273    238    214    195
                .30                68     56     49     44     41     39
                .49                28     24     22     21     21     21
.50             .14               298    239    202    177    159    146
                .35                51     43     38     35     33     31
                .57                22     19     18     18     18     18
.80             .22               123    100     86     76     69     65
                .56                22     20     19     18     18     18
                .89                11     11     11     12     12     13

α = .05
.30             .12               268    223    192    170    154    141
                .30                45     39     35     32     30     29
                .49                19     17     16     16     16     16
.50             .14               199    165    142    126    114    106
                .35                34     30     27     25     24     23
                .57                14     14     13     13     13     14
.80             .22                82     69     60     54     50     47
                .56                15     14     13     13     14     14
                .89                 8      8      8      9     10     10

α = .10
.30             .12               209    178    154    137    125    116
                .30                35     31     28     26     25     24
                .49                14     14     13     13     13     13
.50             .14               154    131    114    102     93     87
                .35                26     24     22     20     20     19
                .57                11     11     11     11     11     12
.80             .22                64     55     49     44     41     39
                .56                12     11     11     11     12     12
                .89                 6      7      7      8      9      9

(Continued)


Repeated-Measures Analysis

Table 12.9: (Continued)

                                    Number of repeated measures
Average corr.   Effect size(a)      2      3      4      5      6      7

α = .20
.30             .12               149    130    114    103     94     87
                .30                25     23     21     20     19     19
                .49                10     10     10     10     11     11
.50             .14               110     96     85     76     70     65
                .35                19     17     16     16     15     15
                .57                 8      8      8      9      9     10
.80             .22                45     40     36     33     31     30
                .56                 8      8      9      9     10     10
                .89                 4      5      6      7      8      8

(a) These are small, medium, and large effect sizes, and are obtained from the corresponding effect size
measures for independent samples ANOVA (i.e., .10, .25, and .40) by dividing by \sqrt{1 - correl}. Thus, for example,

$$.14 = \frac{.10}{\sqrt{1 - .50}}, \quad \text{and} \quad .57 = \frac{.40}{\sqrt{1 - .50}}.$$
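The conversion described in the table footnote can be sketched in a few lines of Python. This is only an illustration under the footnote's stated rule (divide the independent-samples ANOVA effect size by the square root of 1 minus the average correlation); the function and variable names are hypothetical and do not come from the text.

```python
import math

# Independent-samples ANOVA benchmarks quoted in the footnote to Table 12.9.
ANOVA_EFFECT_SIZES = {"small": 0.10, "medium": 0.25, "large": 0.40}


def repeated_measures_effect_size(anova_effect_size, avg_corr):
    """Divide the ANOVA effect size by sqrt(1 - average correlation).

    The name of this helper is an assumption made for illustration.
    """
    if not 0.0 <= avg_corr < 1.0:
        raise ValueError("avg_corr must be at least 0 and less than 1")
    return anova_effect_size / math.sqrt(1.0 - avg_corr)


# Print the converted effect sizes for the three average correlations in the table.
for corr in (0.30, 0.50, 0.80):
    converted = {
        label: round(repeated_measures_effect_size(f, corr), 2)
        for label, f in ANOVA_EFFECT_SIZES.items()
    }
    print(corr, converted)
```

With an average correlation of .50 this gives .14 and .57 for the small and large effects, matching the footnote. For an average correlation of .30 the large effect computes to about .48, slightly below the tabled .49, so the tabled entries should be treated as the authoritative values.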

Example 12.1
An investigator has a three-treatment design; that is, each of the subjects is exposed
to three treatments. He uses r = .80 as his estimate of the average correlation of the
subjects’ responses to the three treatments. How many subjects will he need for
power = .80 at the .05 level, if he anticipates a medium effect size?
Reference to Table 12.9 with correl = .80, effect size = .56, k = 3, and α = .05 shows
that only 14 subjects are needed.
Example 12.2
An investigator will be carrying out a longitudinal study, measuring the subjects at five
points in time. She wishes to detect a large effect size at the .10 level of significance,
and estimates that the average correlation among the five measures will be about .50.
How many subjects will she need?
Reference to Table 12.9 with correl = .50, effect size = .57, k = 5, and α = .10 shows
that 11 subjects are needed.
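The table lookups in Examples 12.1 and 12.2 can be mimicked with a small dictionary. The sketch below stores only the rows of Table 12.9 needed for the two examples plus their neighbors; the dictionary layout and the `required_n` helper are assumptions made for illustration, not part of the original tables.

```python
# Subset of Table 12.9: (alpha, average correlation, effect-size label)
# maps to the required n for k = 2, 3, 4, 5, 6, 7 repeated measures.
SAMPLE_SIZES = {
    (0.05, 0.80, "small"): [82, 69, 60, 54, 50, 47],
    (0.05, 0.80, "medium"): [15, 14, 13, 13, 14, 14],
    (0.05, 0.80, "large"): [8, 8, 8, 9, 10, 10],
    (0.10, 0.50, "small"): [154, 131, 114, 102, 93, 87],
    (0.10, 0.50, "medium"): [26, 24, 22, 20, 20, 19],
    (0.10, 0.50, "large"): [11, 11, 11, 11, 11, 12],
}


def required_n(alpha, avg_corr, effect, k):
    """Look up the sample size for power = .80 (hypothetical helper)."""
    if not 2 <= k <= 7:
        raise ValueError("the table covers 2 through 7 repeated measures")
    return SAMPLE_SIZES[(alpha, avg_corr, effect)][k - 2]


# Example 12.1: r = .80, medium effect, k = 3, alpha = .05
print(required_n(0.05, 0.80, "medium", 3))
# Example 12.2: r = .50, large effect, k = 5, alpha = .10
print(required_n(0.10, 0.50, "large", 5))
```

The two calls return 14 and 11, the answers given in the examples. Extending the dictionary to the full table is a matter of copying the remaining rows.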
12.10 MULTIVARIATE MATCHED-PAIRS ANALYSIS
It was mentioned in Chapter 4 that often in comparing intact groups the subjects are
matched or paired on variables known or presumed to be related to performance on
the dependent variable(s). This is done so that if a significant difference is found, the
investigator can be more confident it was the treatment(s) that “caused” the difference.
In Chapter 4 we gave a univariate example, where kindergarteners were compared
against nonkindergarteners on first-grade readiness, after they were matched on IQ,
SES, and number of children in the family.
Now consider a multivariate example, that is, where there are several dependent
variables. Kvet (1982) was interested in determining whether excusing elementary
school children from regular classroom instruction for the study of instrumental
music affected sixth-grade reading, language, and mathematics achievement. These
were the three dependent variables. Instrumental and noninstrumental students from
four public school districts were used in the study. We consider the analysis from just
one of the districts. The instrumental and noninstrumental students were matched
on the following variables: sex, race, IQ, cumulative achievement in fifth grade,
elementary school attended, sixth-grade classroom teacher, and instrumental music
outside the school.
Table 12.10 shows the control lines for running the analysis on SAS and SPSS. Note
that we compute three difference variables, on which the multivariate analysis is done,
and that it is these difference variables that are used in the MODEL (SAS) and GLM
(SPSS) statements. We are testing whether these three difference variables (considered
jointly) differ significantly from the 0 vector, that is, whether the group mean
differences on all three variables are jointly 0.
Again we obtain a T² value, as for the single sample multivariate repeated-measures
analysis; however, the exact F transformation is somewhat different:

$$F = \frac{N - p}{(N - 1)\,p}\, T^{2}, \quad \text{with } p \text{ and } (N - p) \text{ df},$$

where N is the number of matched pairs and p is the number of difference variables.
The multivariate test results shown in Table 12.11 indicate that the instrumental
group does not differ from the noninstrumental group on the set of three difference
variables (F = .9115, p = .46). Thus, the classroom time taken by the instrumental
group does not appear to adversely affect their achievement in these three basic
academic areas.
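As a check on the F transformation, the following Python sketch recovers the reported F from the Hotelling-Lawley trace printed in Table 12.11. It assumes the standard one-sample relationship T² = (N − 1) × trace, which is not stated in the text, and uses N = 17 matched pairs (the number of data lines in Table 12.10) and p = 3 difference variables. The function name is hypothetical.

```python
def hotelling_f(t2, n_pairs, n_vars):
    """Convert Hotelling's T^2 to an exact F with (p, N - p) df.

    Implements F = (N - p) / ((N - 1) * p) * T^2 from the text.
    """
    if n_pairs <= n_vars:
        raise ValueError("need more matched pairs than difference variables")
    f_value = (n_pairs - n_vars) / ((n_pairs - 1) * n_vars) * t2
    return f_value, n_vars, n_pairs - n_vars


N_PAIRS = 17                 # matched pairs listed in Table 12.10
N_VARS = 3                   # Readdiff, Langdiff, Mathdiff
HL_TRACE = 0.19533160        # Hotelling-Lawley trace from Table 12.11

# Assumption: for a one-sample test, T^2 = (N - 1) * Hotelling-Lawley trace.
t2 = (N_PAIRS - 1) * HL_TRACE

f_value, df1, df2 = hotelling_f(t2, N_PAIRS, N_VARS)
print(round(t2, 4), round(f_value, 4), df1, df2)
```

The resulting F is about .9115 on 3 and 14 degrees of freedom, which agrees with the SAS and SPSS output.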
Table 12.10: SAS and SPSS Control Lines for Multivariate Matched-Pairs Analysis

SAS

DATA MatchedPairs;
INPUT read1 read2 lang1 lang2 math1 math2;
LINES;
62 67 72 66 67 35
66 66 96 87 74 63
70 74 69 73 85 63
85 99 99 71 91 60
82 83 69 99 63 66
55 61 52 74 55 67
91 99 99 99 99 87
78 62 79 69 54 65
85 99 99 75 66 61
95 87 99 96 82 82
87 91 87 82 98 85
96 99 96 76 74 61
54 60 69 80 66 71
69 60 87 80 69 71
87 87 88 99 95 82
78 72 66 76 52 74
72 58 74 69 59 58
PROC PRINT DATA = MatchedPairs;
RUN;
DATA MatchedPairs; SET MatchedPairs;
Readdiff = read1-read2;
Langdiff = lang1-lang2;
Mathdiff = math1-math2;
RUN;
PROC GLM;
MODEL Readdiff Langdiff Mathdiff = /;
MANOVA H = INTERCEPT;
RUN;

SPSS

DATA LIST FREE/read1 read2 lang1 lang2 math1 math2.
BEGIN DATA.
62 67 72 66 67 35   95 87 99 96 82 82
66 66 96 87 74 63   87 91 87 82 98 85
70 74 69 73 85 63   96 99 96 76 74 61
85 99 99 71 91 60   54 60 69 80 66 71
82 83 69 99 63 66   69 60 87 80 69 71
55 61 52 74 55 67   87 87 88 99 95 82
91 99 99 99 99 87   78 72 66 76 52 74
78 62 79 69 54 65   72 58 74 69 59 58
85 99 99 75 66 61
END DATA.
COMPUTE Readdiff = read1-read2.
COMPUTE Langdiff = lang1-lang2.
COMPUTE Mathdiff = math1-math2.
LIST.
GLM Readdiff Langdiff Mathdiff
  /INTERCEPT=INCLUDE
  /EMMEANS=TABLES(OVERALL)
  /PRINT=DESCRIPTIVE.

Table 12.11: Multivariate Test Results for Matched Pairs Example

SAS Output
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1 M=0.5 N=6

Statistic                 Value         F Value   Num DF   Den DF   Pr > F
Wilks’ Lambda             0.83658794    0.91      3        14       0.4604
Pillai’s Trace            0.16341206    0.91      3        14       0.4604
Hotelling-Lawley Trace    0.19533160    0.91      3        14       0.4604
Roy’s Greatest Root       0.19533160    0.91      3        14       0.4604

SPSS Output
Multivariate Tests(a)

Effect                            Value   F         Hypothesis df   Error df   Sig.
Intercept   Pillai’s Trace        .163    .912(b)   3.000           14.000     .460
            Wilks’ Lambda         .837    .912(b)   3.000           14.000     .460
            Hotelling’s Trace     .195    .912(b)   3.000           14.000     .460
            Roy’s Largest Root    .195    .912(b)   3.000           14.000     .460

(a) Design: Intercept
(b) Exact statistic

12.11 ONE-BETWEEN AND ONE-WITHIN DESIGN
We now add a grouping (between) variable to the one-way repeated measures design.
This design, having one-between and one-within subjects factor, is often called a
split plot design. For this design, we consider hypothetical data from a study
comparing the relative efficacy of a behavior modification approach to dieting versus a
behavior modification plus exercise approach (combination treatment) on weight
loss for a group of overweight women. There is also a control group in this study.
In this experimental design, 12 women are randomly assigned to each of the three
treatment conditions, and weight loss is measured 2 months, 4 months, and 6 months
after the program begins. Note that weight loss is relative to the weight measured at
the previous occasion.
When a between-subjects variable is included in this design, there are two additional
assumptions. One new assumption is the homogeneity of the covariance matrices on
the repeated measures for the groups. That is, the population variances and covariances
for the repeated measures are assumed to be the same for all groups. In our example,
the group sizes are equal, and in this case a violation of the equal covariance matrices
assumption is not serious. That is, the within-subjects tests (for the within-subject
main effect and the interaction) are robust (with respect to type I error) against a
violation of this assumption (see Stevens, 1986, chap. 6). However, if the group sizes
are substantially unequal, then a violation is serious, and Stevens (1986) indicated
in Table 6.5 what should be added to test this assumption. A key assumption for the
validity of the within-subjects tests that was also in place for the single-group repeated
measures is the assumption of sphericity that now applies to the repeated measures
within each of the groups. It is still the case here that the unadjusted univariate F tests
for the within-subjects effects are not robust to a violation of sphericity. Note that the
combination of the sphericity and homogeneity of the covariance matrices assumption
has been called multisample sphericity. The second new assumption is homogeneity
of variance for the between-subjects main effect test. This assumption applies not to
the raw scores but to the average of the outcome scores across the repeated measures
for each subject. As with the typical between-subjects homogeneity assumption, the
procedure is robust when the between-subjects group sizes are similar, but a liberal or
conservative F test may result if group sizes are quite discrepant and these variances
are not the same.
Table 12.12 provides the SAS and SPSS commands for the overall tests associated
with this analysis. Table 12.13 provides selected SAS and SPSS results. Note that
this analysis can be considered as a two-way ANOVA. As such, we will test main
effects for diet and time, as well as the interaction between these two factors. The
time main effect and the time-by-diet interaction are within-subjects effects because
they involve change in means or change in treatment effects across time. The
univariate tests for these effects appear in the first output selections for SAS and SPSS
in Table 12.13. Using the Greenhouse–Geisser procedure, the main effect of time
is statistically significant (p < .001) as is the time-by-diet interaction (p = .003).
(Note that these effects are also significant using the multivariate approach, which
is not shown to conserve space.) The last output selections for SAS and SPSS in
Table 12.13 indicate that the main effect of diet is also statistically significant, F(2,
33) = 4.69, p = .016.
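The F ratios quoted above follow directly from the sums of squares and degrees of freedom in Table 12.13. The short Python sketch below redoes that arithmetic; the helper names are hypothetical, and the Greenhouse-Geisser epsilon is inferred here as the ratio of the adjusted to the unadjusted degrees of freedom in the SPSS output rather than taken from a printed epsilon value.

```python
def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def f_tests(effects, error):
    """Return {effect name: F} using a common error term (SS, df)."""
    ms_error = mean_square(*error)
    return {
        name: mean_square(ss, df) / ms_error
        for name, (ss, df) in effects.items()
    }


# Type III sums of squares and df copied from the SAS portion of Table 12.13.
WITHIN_EFFECTS = {"time": (181.3518519, 2), "time*diet": (20.9259259, 4)}
WITHIN_ERROR = (67.7222222, 66)
BETWEEN_EFFECTS = {"diet": (36.9074074, 2)}
BETWEEN_ERROR = (129.8611111, 33)

within_f = f_tests(WITHIN_EFFECTS, WITHIN_ERROR)
between_f = f_tests(BETWEEN_EFFECTS, BETWEEN_ERROR)

# Greenhouse-Geisser epsilon implied by the adjusted df for time (1.556 of 2).
gg_epsilon = 1.556 / 2

for name, f_value in {**within_f, **between_f}.items():
    print(name, round(f_value, 2))
print("implied G-G epsilon", round(gg_epsilon, 3))
```

The computed ratios round to 88.37 for time, 5.10 for the time-by-diet interaction, and 4.69 for diet, in line with the output. The implied epsilon of roughly .78 shows how far the adjusted tests shrink the degrees of freedom.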
To interpret the significant effects, we display in Table 12.14 the means involved
in the main effects and interaction as well as a plot of the cell means for the two
factors. Recall that graphically an interaction is evidenced by nonparallel lines. In
this graph you can see that the profiles for diets 1 and 2 are essentially parallel;
however, the profile for diet 3 is definitely not parallel with the profiles for diets
1 and 2. In particular, it is the relatively greater weight loss at time 2 for
diet 3 (i.e., 5.9 pounds) that is making the profile distinctly nonparallel. The main
effect of diet, evident in Table 12.14, indicates that the population row means are
not equal. The sample means suggest that weight loss, averaging across time, is
greatest for diet 3. The main effect of time indicates that the population column
means differ. The sample column means suggest that weight loss is greater after
months 2 and 4 than after month 6. In addition to the graph, the cell means in
Table 12.14 can also be used to describe the interaction. Note that weight loss for
each treatment was relatively large at 2 months, but only those in the diet 3
condition experienced essentially the same weight loss at 2 and 4 months, whereas the
weight loss for the other two treatments tapered off at the 4-month period. This
created much larger differences between the diet groups at 4 months relative to
the other months.

Table 12.12: SAS and SPSS Control Lines for One-Between and One-Within Repeated Measures Analysis

SAS

    DATA weight;
    INPUT diet wgtloss1 wgtloss2 wgtloss3;
    LINES;
    1 4 3 3
    1 4 4 3
    1 4 3 1
    1 3 2 1
    1 5 3 2
    1 6 5 4
    1 6 5 4
    1 5 4 1
    1 3 3 2
    1 5 4 1
    1 4 2 2
    1 5 2 1
    2 6 3 2
    2 5 4 1
    2 7 6 3
    2 6 4 2
    2 3 2 1
    2 5 5 4
    2 4 3 1
    2 4 2 1
    2 6 5 3
    2 7 6 4
    2 4 3 2
    2 7 4 3
    3 8 4 2
    3 3 6 3
    3 7 7 4
    3 4 7 1
    3 9 7 3
    3 2 4 1
    3 3 5 1
    3 6 5 2
    3 6 6 3
    3 9 5 2
    3 7 9 4
    3 8 6 1
    PROC GLM;
(1) CLASS diet;
(2) MODEL wgtloss1 wgtloss2 wgtloss3 = diet / NOUNI;
    REPEATED time 3 / SUMMARY MEAN;
    RUN;

SPSS

    DATA LIST FREE/diet wgtloss1 wgtloss2 wgtloss3.
    BEGIN DATA.
    1 4 3 3 1 4 4 3 1 4 3 1
    1 3 2 1 1 5 3 2 1 6 5 4
    1 6 5 4 1 5 4 1 1 3 3 2
    1 5 4 1 1 4 2 2 1 5 2 1
    2 6 3 2 2 5 4 1 2 7 6 3
    2 6 4 2 2 3 2 1 2 5 5 4
    2 4 3 1 2 4 2 1 2 6 5 3
    2 7 6 4 2 4 3 2 2 7 4 3
    3 8 4 2 3 3 6 3 3 7 7 4
    3 4 7 1 3 9 7 3 3 2 4 1
    3 3 5 1 3 6 5 2 3 6 6 3
    3 9 5 2 3 7 9 4 3 8 6 1
    END DATA.
(2) GLM wgtloss1 wgtloss2 wgtloss3 BY diet
      /WSFACTOR=time 3
(3)   /PLOT=PROFILE(time*diet)
(4)   /EMMEANS=TABLES(time) COMPARE ADJ(BONFERRONI)
      /PRINT=DESCRIPTIVE
(5)   /WSDESIGN=time
      /DESIGN=diet.

(1) CLASS indicates diet is a grouping (or classification) variable.
(2) MODEL (in SAS) and GLM (in SPSS) indicate that the weight scores are a function of diet.
(3) PLOT requests a profile plot.
(4) The EMMEANS statement requests the marginal means (pooling over diet) and Bonferroni-adjusted multiple comparisons associated with the within-subjects factor (time).
(5) WSDESIGN requests statistical testing associated with the within-subjects time factor, and the DESIGN command requests testing results for the between-subjects diet factor.

Table 12.13: Selected Output for One-Between One-Within Design

SAS Results

The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

                                                                      Adj Pr > F
Source         DF   Type III SS    Mean Square   F Value   Pr > F    G-G       H-F-L
time            2   181.3518519    90.6759259    88.37     <.0001    <.0001    <.0001
time*diet       4    20.9259259     5.2314815     5.10     0.0012    0.0033    0.0029
Error(time)    66    67.7222222     1.0260943

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects

Source   DF   Type III SS    Mean Square   F Value   Pr > F
diet      2    36.9074074    18.4537037    4.69      0.0161
Error    33   129.8611111     3.9351852

SPSS Results

Tests of Within-Subjects Effects
Measure: MEASURE_1

Source                             Type III Sum of Squares   df       Mean Square   F        Sig.
time          Sphericity Assumed   181.352                    2        90.676       88.370   .000
              Greenhouse-Geisser   181.352                    1.556   116.574       88.370   .000
              Huynh-Feldt          181.352                    1.717   105.593       88.370   .000
              Lower-bound          181.352                    1.000   181.352       88.370   .000
time * diet   Sphericity Assumed    20.926                    4         5.231        5.098   .001
              Greenhouse-Geisser    20.926                    3.111     6.726        5.098   .003
              Huynh-Feldt           20.926                    3.435     6.092        5.098   .002
              Lower-bound           20.926                    2.000    10.463        5.098   .012
Error(time)   Sphericity Assumed    67.722                   66         1.026
              Greenhouse-Geisser    67.722                   51.337     1.319
              Huynh-Feldt           67.722                   56.676     1.195
              Lower-bound           67.722                   33.000     2.052

Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source      Type III Sum of Squares   df   Mean Square   F         Sig.
Intercept   1688.231                   1   1688.231      429.009   .000
Diet          36.907                   2     18.454        4.689   .016
Error        129.861                  33      3.935


Table 12.14: Cell and Marginal Means for the One-Between One-Within Design

                          TIME
DIETS             1        2        3        ROW MEANS
1                 4.50     3.33     2.083    3.304
2                 5.33     3.917    2.250    3.832
3                 6.00     5.917    2.250    4.722
COLUMN MEANS      5.278    4.389    2.194

[Figure: profile plot of mean weight loss (vertical axis) against time (1, 2, 3; horizontal axis), with separate lines for Diet 1, Diet 2, and Diet 3.]
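The cell and marginal means in Table 12.14 can be recomputed from the raw scores listed in Table 12.12. The Python sketch below does this with the standard library only; the data layout and variable names are assumptions made for illustration, and the marginal means are taken as simple averages of the cell means, which is appropriate here only because every diet group has 12 women.

```python
from statistics import mean

# (wgtloss1, wgtloss2, wgtloss3) for each woman, keyed by diet (Table 12.12).
DATA = {
    1: [(4, 3, 3), (4, 4, 3), (4, 3, 1), (3, 2, 1), (5, 3, 2), (6, 5, 4),
        (6, 5, 4), (5, 4, 1), (3, 3, 2), (5, 4, 1), (4, 2, 2), (5, 2, 1)],
    2: [(6, 3, 2), (5, 4, 1), (7, 6, 3), (6, 4, 2), (3, 2, 1), (5, 5, 4),
        (4, 3, 1), (4, 2, 1), (6, 5, 3), (7, 6, 4), (4, 3, 2), (7, 4, 3)],
    3: [(8, 4, 2), (3, 6, 3), (7, 7, 4), (4, 7, 1), (9, 7, 3), (2, 4, 1),
        (3, 5, 1), (6, 5, 2), (6, 6, 3), (9, 5, 2), (7, 9, 4), (8, 6, 1)],
}

# Cell means: one list of three time-point means per diet.
cell_means = {
    diet: [mean(scores) for scores in zip(*subjects)]
    for diet, subjects in DATA.items()
}

# Marginal means (equal group sizes, so averaging cell means is sufficient).
row_means = {diet: mean(means) for diet, means in cell_means.items()}
column_means = [mean(col) for col in zip(*cell_means.values())]

for diet, means in cell_means.items():
    print(diet, [round(m, 3) for m in means], round(row_means[diet], 3))
print("column means", [round(m, 3) for m in column_means])
```

The cell and column means agree with the table. The row means for diets 1 and 2 differ from the tabled values in the third decimal place, which appears to reflect rounding of the cell means before averaging.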

12.12 POST HOC PROCEDURES FOR THE ONE-BETWEEN AND ONE-WITHIN DESIGN
In the previous section, we presented and discussed statistical test results for the main
effects and interaction. We also used cell and marginal means and a graph to describe
results. When three or more levels of a factor are present in a design, researchers may also
wish to conduct follow-up tests for specific effects of interest. In our example, an
investigator would likely focus on simple effects given the interaction between diet and time. We
will provide testing procedures for such simple effects, but for completeness, we briefly
discuss pairwise comparisons associated with the diet and time main effects. Note that for
the follow-up procedures discussed in this section, there is more than one way to obtain
results via SAS and SPSS. In this section, we use procedures that, while not always the most
efficient, are intended to help you better understand the comparisons you are making.
12.12.1 Comparisons Involving Main Effects
As an example of this, to conduct pairwise comparisons for the means involved in a
statistically significant main effect of the between-subjects factor (here, diet), you can
simply compute the average of each participant’s scores across the time points of the
