12.9 SAMPLE SIZE FOR POWER = .80 IN SINGLE-SAMPLE CASE
Chapter 12
Robey and Barcikowski (1984) have given power tables for various alpha levels for the
single group repeated-measures design. Their tables assume a common correlation for the
repeated measures, which generally will not be tenable (especially in longitudinal studies);
however, a later paper by Green (1990) indicated that use of an estimated average correlation (from all the correlations among the repeated measures) is fine. Selected results from
their work are presented in Table 12.9, which indicates sample size needed for power = .80
for small, medium, and large effect sizes at alpha = .01, .05, .10, and .20 for two through
seven repeated measures. We give two examples to show how to use the table.
Table 12.9: Sample Sizes Needed for Power = .80 in Single-Group Repeated Measures

                                 Number of repeated measures
Average corr.  Effect size(a)     2     3     4     5     6     7

α = .01
.30            .12              404   324   273   238   214   195
               .30               68    56    49    44    41    39
               .49               28    24    22    21    21    21
.50            .14              298   239   202   177   159   146
               .35               51    43    38    35    33    31
               .57               22    19    18    18    18    18
.80            .22              123   100    86    76    69    65
               .56               22    20    19    18    18    18
               .89               11    11    11    12    12    13

α = .05
.30            .12              268   223   192   170   154   141
               .30               45    39    35    32    30    29
               .49               19    17    16    16    16    16
.50            .14              199   165   142   126   114   106
               .35               34    30    27    25    24    23
               .57               14    14    13    13    13    14
.80            .22               82    69    60    54    50    47
               .56               15    14    13    13    14    14
               .89                8     8     8     9    10    10

α = .10
.30            .12              209   178   154   137   125   116
               .30               35    31    28    26    25    24
               .49               14    14    13    13    13    13
.50            .14              154   131   114   102    93    87
               .35               26    24    22    20    20    19
               .57               11    11    11    11    11    12
.80            .22               64    55    49    44    41    39
               .56               12    11    11    11    12    12
               .89                6     7     7     8     9     9

(Continued)
Repeated-Measures Analysis
Table 12.9: (Continued)

                                 Number of repeated measures
Average corr.  Effect size(a)     2     3     4     5     6     7

α = .20
.30            .12              149   130   114   103    94    87
               .30               25    23    21    20    19    19
               .49               10    10    10    10    11    11
.50            .14              110    96    85    76    70    65
               .35               19    17    16    16    15    15
               .57                8     8     8     9     9    10
.80            .22               45    40    36    33    31    30
               .56                8     8     9     9    10    10
               .89                4     5     6     7     8     8

(a) These are small, medium, and large effect sizes, and are obtained from the corresponding effect size
measures for independent samples ANOVA (i.e., .10, .25, and .40) by dividing by $\sqrt{1 - \text{correl}}$. Thus, for example,
$.14 = .10 / \sqrt{1 - .50}$, and $.57 = .40 / \sqrt{1 - .50}$.
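The conversion described in the table footnote can be checked numerically. The following is a minimal sketch, assuming the divisor is the square root of one minus the average correlation (which reproduces the footnote's two worked values); the function name is hypothetical and used only for illustration.

```python
import math


def repeated_measures_effect_size(anova_effect_size, avg_corr):
    """Divide an independent-samples ANOVA effect size by sqrt(1 - avg_corr).

    Hypothetical helper for illustration; the divisor follows the table footnote.
    """
    return anova_effect_size / math.sqrt(1.0 - avg_corr)


# The two values worked in the footnote (average correlation = .50)
small = repeated_measures_effect_size(0.10, 0.50)
large = repeated_measures_effect_size(0.40, 0.50)
print(round(small, 2), round(large, 2))  # 0.14 0.57
```

Because the divisor shrinks as the average correlation rises, the same independent-samples effect size translates into a larger repeated-measures effect size, which is why the tabled sample sizes drop sharply at correl = .80.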
Example 12.1
An investigator has a three-treatment design; that is, each of the subjects is exposed
to three treatments. He uses r = .80 as his estimate of the average correlation of the
subjects’ responses to the three treatments. How many subjects will he need for
power = .80 at the .05 level, if he anticipates a medium effect size?
Reference to Table 12.9 with correl = .80, effect size = .56, k = 3, and α = .05 shows
that only 14 subjects are needed.
Example 12.2
An investigator will be carrying out a longitudinal study, measuring the subjects at five
points in time. She wishes to detect a large effect size at the .10 level of significance,
and estimates that the average correlation among the five measures will be about .50.
How many subjects will she need?
Reference to Table 12.9 with correl = .50, effect size = .57, k = 5, and α = .10 shows
that 11 subjects are needed.
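The two table lookups above can be expressed as a small dictionary lookup. This is only a sketch: it holds the six rows of Table 12.9 relevant to Examples 12.1 and 12.2, and the names `TABLE_12_9_SUBSET` and `required_n` are hypothetical.

```python
# Subset of Table 12.9, keyed by (alpha, average correlation, effect size).
# Each list gives the sample sizes for k = 2, 3, 4, 5, 6, 7 repeated measures.
TABLE_12_9_SUBSET = {
    (0.05, 0.80, 0.22): [82, 69, 60, 54, 50, 47],
    (0.05, 0.80, 0.56): [15, 14, 13, 13, 14, 14],
    (0.05, 0.80, 0.89): [8, 8, 8, 9, 10, 10],
    (0.10, 0.50, 0.14): [154, 131, 114, 102, 93, 87],
    (0.10, 0.50, 0.35): [26, 24, 22, 20, 20, 19],
    (0.10, 0.50, 0.57): [11, 11, 11, 11, 11, 12],
}


def required_n(alpha, avg_corr, effect_size, k):
    """Return the tabled sample size for power = .80 (hypothetical helper)."""
    if not 2 <= k <= 7:
        raise ValueError("Table 12.9 covers 2 through 7 repeated measures")
    return TABLE_12_9_SUBSET[(alpha, avg_corr, effect_size)][k - 2]


example_12_1 = required_n(0.05, 0.80, 0.56, 3)  # medium effect, three treatments
example_12_2 = required_n(0.10, 0.50, 0.57, 5)  # large effect, five time points
print(example_12_1, example_12_2)  # 14 11
```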
12.10 MULTIVARIATE MATCHED-PAIRS ANALYSIS
It was mentioned in Chapter 4 that often in comparing intact groups the subjects are
matched or paired on variables known or presumed to be related to performance on
the dependent variable(s). This is done so that if a significant difference is found, the
investigator can be more confident it was the treatment(s) that “caused” the difference.
In Chapter 4 we gave a univariate example, where kindergarteners were compared
against nonkindergarteners on first-grade readiness, after they were matched on IQ,
SES, and number of children in the family.
Now consider a multivariate example, that is, where there are several dependent
variables. Kvet (1982) was interested in determining whether excusing elementary
school children from regular classroom instruction for the study of instrumental
music affected sixth-grade reading, language, and mathematics achievement. These
were the three dependent variables. Instrumental and noninstrumental students from
four public school districts were used in the study. We consider the analysis from just
one of the districts. The instrumental and noninstrumental students were matched
on the following variables: sex, race, IQ, cumulative achievement in fifth grade,
elementary school attended, sixth-grade classroom teacher, and instrumental music
outside the school.
Table 12.10 shows the control lines for running the analysis on SAS and SPSS. Note
that we compute three difference variables, on which the multivariate analysis is done,
and that it is these difference variables that are used in the MODEL (SAS) and GLM
(SPSS) statements. We are testing whether these three difference variables (considered
jointly) differ significantly from the 0 vector, that is, whether the group mean differences on all three variables are jointly 0.
Again we obtain a $T^2$ value, as for the single sample multivariate repeated-measures
analysis; however, the exact F transformation is somewhat different:
$$F = \frac{N - p}{(N - 1)\,p}\, T^2, \quad \text{with } p \text{ and } (N - p) \text{ df},$$
where N is the number of matched pairs and p is the number of difference variables.
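As a worked illustration of this transformation, the arithmetic below uses the values for the present example. It assumes N = 17 (the number of matched pairs listed in Table 12.10), p = 3, and that $T^2$ can be recovered as $(N - 1)$ times the Hotelling-Lawley trace reported in Table 12.11; that last relation is the standard one for a single-sample test but is not stated in the text.

```latex
\begin{align*}
T^2 &= (N - 1) \times 0.19533160 = 16 \times 0.19533160 \approx 3.125 \\
F   &= \frac{N - p}{(N - 1)\,p}\, T^2
     = \frac{17 - 3}{(17 - 1)(3)} \times 3.125 \approx 0.91,
     \quad \text{with } 3 \text{ and } 14 \text{ df}
\end{align*}
```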
The multivariate test results shown in Table 12.11 indicate that the instrumental
group does not differ from the noninstrumental group on the set of three difference
variables (F = .9115, p = .46). Thus, the classroom time taken by the instrumental
group does not appear to adversely affect their achievement in these three basic academic areas.
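The matched-pairs test can also be computed directly from the difference scores. The sketch below uses only the Python standard library and the data from Table 12.10; the helper `solve` and all variable names are hypothetical, and the code is an illustration of the one-sample Hotelling $T^2$ on differences rather than a reproduction of what SAS or SPSS do internally.

```python
# Matched-pairs scores from Table 12.10: read1 read2 lang1 lang2 math1 math2
PAIRS = [
    (62, 67, 72, 66, 67, 35), (66, 66, 96, 87, 74, 63), (70, 74, 69, 73, 85, 63),
    (85, 99, 99, 71, 91, 60), (82, 83, 69, 99, 63, 66), (55, 61, 52, 74, 55, 67),
    (91, 99, 99, 99, 99, 87), (78, 62, 79, 69, 54, 65), (85, 99, 99, 75, 66, 61),
    (95, 87, 99, 96, 82, 82), (87, 91, 87, 82, 98, 85), (96, 99, 96, 76, 74, 61),
    (54, 60, 69, 80, 66, 71), (69, 60, 87, 80, 69, 71), (87, 87, 88, 99, 95, 82),
    (78, 72, 66, 76, 52, 74), (72, 58, 74, 69, 59, 58),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    size = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for i in range(size - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x


# Difference variables: Readdiff, Langdiff, Mathdiff
diffs = [(r1 - r2, l1 - l2, m1 - m2) for r1, r2, l1, l2, m1, m2 in PAIRS]
n, p = len(diffs), 3
means = [sum(d[j] for d in diffs) / n for j in range(p)]

# Sample covariance matrix of the difference variables (divisor n - 1)
cov = [[sum((d[i] - means[i]) * (d[j] - means[j]) for d in diffs) / (n - 1)
        for j in range(p)] for i in range(p)]

# T^2 = n * dbar' S^{-1} dbar, then the exact F with p and n - p df
t2 = n * sum(mean_j * x_j for mean_j, x_j in zip(means, solve(cov, means)))
f_stat = (n - p) / ((n - 1) * p) * t2
print(f"T2 = {t2:.3f}, F = {f_stat:.4f}, df = ({p}, {n - p})")
```

The resulting F should agree with the value of about .91 on 3 and 14 degrees of freedom reported in Table 12.11. A p value is not computed here because that would require an F distribution function from outside the standard library.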
12.11 ONE-BETWEEN AND ONE-WITHIN DESIGN
We now add a grouping (between) variable to the one-way repeated measures design.
This design, having one-between and one-within subjects factor, is often called a
split plot design. For this design, we consider hypothetical data from a study comparing the relative efficacy of a behavior modification approach to dieting versus a
Table 12.10: SAS and SPSS Control Lines for Multivariate Matched-Pairs Analysis

SAS

DATA MatchedPairs;
INPUT read1 read2 lang1 lang2 math1 math2;
LINES;
62 67 72 66 67 35
66 66 96 87 74 63
70 74 69 73 85 63
85 99 99 71 91 60
82 83 69 99 63 66
55 61 52 74 55 67
91 99 99 99 99 87
78 62 79 69 54 65
85 99 99 75 66 61
95 87 99 96 82 82
87 91 87 82 98 85
96 99 96 76 74 61
54 60 69 80 66 71
69 60 87 80 69 71
87 87 88 99 95 82
78 72 66 76 52 74
72 58 74 69 59 58
PROC PRINT DATA = MatchedPairs;
RUN;
DATA MatchedPairs; SET MatchedPairs;
Readdiff = read1-read2;
Langdiff = lang1-lang2;
Mathdiff = math1-math2;
RUN;
PROC GLM;
MODEL Readdiff Langdiff Mathdiff = /;
MANOVA H = INTERCEPT;
RUN;

SPSS

DATA LIST FREE/read1 read2 lang1 lang2 math1 math2.
BEGIN DATA.
62 67 72 66 67 35   95 87 99 96 82 82
66 66 96 87 74 63   87 91 87 82 98 85
70 74 69 73 85 63   96 99 96 76 74 61
85 99 99 71 91 60   54 60 69 80 66 71
82 83 69 99 63 66   69 60 87 80 69 71
55 61 52 74 55 67   87 87 88 99 95 82
91 99 99 99 99 87   78 72 66 76 52 74
78 62 79 69 54 65   72 58 74 69 59 58
85 99 99 75 66 61
END DATA.
COMPUTE Readdiff = read1-read2.
COMPUTE Langdiff = lang1-lang2.
COMPUTE Mathdiff = math1-math2.
LIST.
GLM Readdiff Langdiff Mathdiff
/INTERCEPT=INCLUDE
/EMMEANS=TABLES(OVERALL)
/PRINT=DESCRIPTIVE.
Table 12.11: Multivariate Test Results for Matched Pairs Example

SAS Output

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1 M=0.5 N=6

Statistic                Value        F Value   Num DF   Den DF   Pr > F
Wilks’ Lambda            0.83658794   0.91      3        14       0.4604
Pillai’s Trace           0.16341206   0.91      3        14       0.4604
Hotelling-Lawley Trace   0.19533160   0.91      3        14       0.4604
Roy’s Greatest Root      0.19533160   0.91      3        14       0.4604

SPSS Output

Multivariate Tests(a)

Effect      Statistic            Value   F         Hypothesis df   Error df   Sig.
Intercept   Pillai’s Trace       .163    .912(b)   3.000           14.000     .460
            Wilks’ Lambda        .837    .912(b)   3.000           14.000     .460
            Hotelling’s Trace    .195    .912(b)   3.000           14.000     .460
            Roy’s Largest Root   .195    .912(b)   3.000           14.000     .460

(a) Design: Intercept
(b) Exact statistic
behavior modification plus exercise approach (combination treatment) on weight
loss for a group of overweight women. There is also a control group in this study.
In this experimental design, 12 women are randomly assigned to each of the three
treatment conditions, and weight loss is measured 2 months, 4 months, and 6 months
after the program begins. Note that weight loss is relative to the weight measured at
the previous occasion.
When a between-subjects variable is included in this design, there are two additional
assumptions. One new assumption is the homogeneity of the covariance matrices on
the repeated measures for the groups. That is, the population variances and covariances
for the repeated measures are assumed to be the same for all groups. In our example,
the group sizes are equal, and in this case a violation of the equal covariance matrices
assumption is not serious. That is, the within-subjects tests (for the within-subject
main effect and the interaction) are robust (with respect to type I error) against a
violation of this assumption (see Stevens, 1986, chap. 6). However, if the group sizes
are substantially unequal, then a violation is serious, and Stevens (1986) indicated
in Table 6.5 what should be added to test this assumption. A key assumption for the
validity of the within-subjects tests that was also in place for the single-group repeated
measures is the assumption of sphericity that now applies to the repeated measures
within each of the groups. It is still the case here that the unadjusted univariate F tests
for the within-subjects effects are not robust to a violation of sphericity. Note that the
combination of the sphericity and homogeneity of the covariance matrices assumption
has been called multisample sphericity. The second new assumption is homogeneity
of variance for the between-subjects main effect test. This assumption applies not to
the raw scores but to the average of the outcome scores across the repeated measures
for each subject. As with the typical between-subjects homogeneity assumption, the
procedure is robust when the between-subjects group sizes are similar, but a liberal or
conservative F test may result if group sizes are quite discrepant and these variances
are not the same.
Table 12.12 provides the SAS and SPSS commands for the overall tests associated
with this analysis. Table 12.13 provides selected SAS and SPSS results. Note that
this analysis can be considered as a two-way ANOVA. As such, we will test main
effects for diet and time, as well as the interaction between these two factors. The
time main effect and the time-by-diet interaction are within-subjects effects because
they involve change in means or change in treatment effects across time. The univariate tests for these effects appear in the first output selections for SAS and SPSS
in Table 12.13. Using the Greenhouse–Geisser procedure, the main effect of time
is statistically significant (p < .001) as is the time-by-diet interaction (p = .003).
(Note that these effects are also significant using the multivariate approach, which
is not shown to conserve space.) The last output selections for SAS and SPSS in
Table 12.13 indicate that the main effect of diet is also statistically significant, F(2,
33) = 4.69, p = .016.
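The between-subjects test for diet can be reproduced as a one-way ANOVA on each subject's average across the three time points, which is the "Average" transformed variable named in the SPSS output. The sketch below uses the data from Table 12.12 and only the standard library. Variable names are hypothetical, and multiplying the sums of squares by the number of repeated measures is an assumption made so that they are on the same scale as the tabled values; the F ratio does not depend on it.

```python
# Weight-loss data from Table 12.12: diet -> (wgtloss1, wgtloss2, wgtloss3) per woman
DATA = {
    1: [(4, 3, 3), (4, 4, 3), (4, 3, 1), (3, 2, 1), (5, 3, 2), (6, 5, 4),
        (6, 5, 4), (5, 4, 1), (3, 3, 2), (5, 4, 1), (4, 2, 2), (5, 2, 1)],
    2: [(6, 3, 2), (5, 4, 1), (7, 6, 3), (6, 4, 2), (3, 2, 1), (5, 5, 4),
        (4, 3, 1), (4, 2, 1), (6, 5, 3), (7, 6, 4), (4, 3, 2), (7, 4, 3)],
    3: [(8, 4, 2), (3, 6, 3), (7, 7, 4), (4, 7, 1), (9, 7, 3), (2, 4, 1),
        (3, 5, 1), (6, 5, 2), (6, 6, 3), (9, 5, 2), (7, 9, 4), (8, 6, 1)],
}

k = 3  # number of repeated measures

# Average each woman's scores across the time points
subject_means = {diet: [sum(s) / k for s in scores] for diet, scores in DATA.items()}
all_means = [m for ms in subject_means.values() for m in ms]
grand_mean = sum(all_means) / len(all_means)

# One-way ANOVA on the subject averages, scaled by k
ss_diet = k * sum(len(ms) * (sum(ms) / len(ms) - grand_mean) ** 2
                  for ms in subject_means.values())
ss_error = k * sum((m - sum(ms) / len(ms)) ** 2
                   for ms in subject_means.values() for m in ms)
df_diet = len(subject_means) - 1
df_error = len(all_means) - len(subject_means)
f_diet = (ss_diet / df_diet) / (ss_error / df_error)
print(f"SS diet = {ss_diet:.3f}, SS error = {ss_error:.3f}, "
      f"F({df_diet}, {df_error}) = {f_diet:.2f}")
```

Working from subject averages also makes clear why the homogeneity of variance assumption for this test concerns the averaged scores rather than the raw scores.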
To interpret the significant effects, we display in Table 12.14 the means involved
in the main effects and interaction as well as a plot of the cell means for the two
factors. Recall that graphically an interaction is evidenced by nonparallel lines. In
this graph you can see that the profiles for diets 1 and 2 are essentially parallel;
however, the profile for diet 3 is definitely not parallel with the profiles for diets
1 and 2. In particular, it is the relatively greater weight loss at time 2 for
diet 3 (i.e., 5.9 pounds) that is making the profile distinctly nonparallel. The main
effect of diet, evident in Table 12.14, indicates that the population row means are
not equal. The sample means suggest that weight loss, averaging across time, is
greatest for diet 3. The main effect of time indicates that the population column
means differ. The sample column means suggest that weight loss is greater after
months 2 and 4 than after month 6. In addition to the graph, the cell means in
Table 12.14 can also be used to describe the interaction. Note that weight loss for
each treatment was relatively large at 2 months, but only those in the diet 3 condition experienced essentially the same weight loss at 2 and 4 months, whereas the
weight loss for the other two treatments tapered off at the 4-month period. This
created much larger differences between the diet groups at 4 months relative to
the other months.
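The cell and marginal means discussed above can be recomputed from the raw data in Table 12.12. This is a minimal standard-library sketch with hypothetical variable names. Because the group sizes are equal, averaging the cell means gives the same marginal means as averaging the raw scores.

```python
# Weight-loss data from Table 12.12: diet -> (wgtloss1, wgtloss2, wgtloss3) per woman
DATA = {
    1: [(4, 3, 3), (4, 4, 3), (4, 3, 1), (3, 2, 1), (5, 3, 2), (6, 5, 4),
        (6, 5, 4), (5, 4, 1), (3, 3, 2), (5, 4, 1), (4, 2, 2), (5, 2, 1)],
    2: [(6, 3, 2), (5, 4, 1), (7, 6, 3), (6, 4, 2), (3, 2, 1), (5, 5, 4),
        (4, 3, 1), (4, 2, 1), (6, 5, 3), (7, 6, 4), (4, 3, 2), (7, 4, 3)],
    3: [(8, 4, 2), (3, 6, 3), (7, 7, 4), (4, 7, 1), (9, 7, 3), (2, 4, 1),
        (3, 5, 1), (6, 5, 2), (6, 6, 3), (9, 5, 2), (7, 9, 4), (8, 6, 1)],
}
TIMES = range(3)

# Cell means: one mean per diet at each time point
cell_means = {diet: [sum(s[t] for s in scores) / len(scores) for t in TIMES]
              for diet, scores in DATA.items()}

# Marginal means (equal group sizes, so cell means can simply be averaged)
row_means = {diet: sum(ms) / len(ms) for diet, ms in cell_means.items()}
col_means = [sum(cell_means[d][t] for d in cell_means) / len(cell_means) for t in TIMES]

for diet, ms in cell_means.items():
    print(diet, [round(m, 3) for m in ms], round(row_means[diet], 3))
print("columns", [round(m, 3) for m in col_means])
```

The row means for diets 1 and 2 computed this way may differ from the tabled values in the third decimal place, presumably because the tabled row means were obtained from rounded cell means.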
Table 12.12: SAS and SPSS Control Lines for One-Between and One-Within Repeated Measures Analysis

SAS

DATA weight;
INPUT diet wgtloss1 wgtloss2 wgtloss3;
LINES;
1 4 3 3
1 4 4 3
1 4 3 1
1 3 2 1
1 5 3 2
1 6 5 4
1 6 5 4
1 5 4 1
1 3 3 2
1 5 4 1
1 4 2 2
1 5 2 1
2 6 3 2
2 5 4 1
2 7 6 3
2 6 4 2
2 3 2 1
2 5 5 4
2 4 3 1
2 4 2 1
2 6 5 3
2 7 6 4
2 4 3 2
2 7 4 3
3 8 4 2
3 3 6 3
3 7 7 4
3 4 7 1
3 9 7 3
3 2 4 1
3 3 5 1
3 6 5 2
3 6 6 3
3 9 5 2
3 7 9 4
3 8 6 1
PROC GLM;
(1) CLASS diet;
(2) MODEL wgtloss1 wgtloss2 wgtloss3 = diet / NOUNI;
    REPEATED time 3 / SUMMARY MEAN;
RUN;

SPSS

DATA LIST FREE/diet wgtloss1 wgtloss2 wgtloss3.
BEGIN DATA.
1 4 3 3 1 4 4 3 1 4 3 1
1 3 2 1 1 5 3 2 1 6 5 4
1 6 5 4 1 5 4 1 1 3 3 2
1 5 4 1 1 4 2 2 1 5 2 1
2 6 3 2 2 5 4 1 2 7 6 3
2 6 4 2 2 3 2 1 2 5 5 4
2 4 3 1 2 4 2 1 2 6 5 3
2 7 6 4 2 4 3 2 2 7 4 3
3 8 4 2 3 3 6 3 3 7 7 4
3 4 7 1 3 9 7 3 3 2 4 1
3 3 5 1 3 6 5 2 3 6 6 3
3 9 5 2 3 7 9 4 3 8 6 1
END DATA.
(2) GLM wgtloss1 wgtloss2 wgtloss3 BY diet
    /WSFACTOR=time 3
(3) /PLOT=PROFILE(time*diet)
(4) /EMMEANS=TABLES(time) COMPARE ADJ(BONFERRONI)
    /PRINT=DESCRIPTIVE
(5) /WSDESIGN=time
    /DESIGN=diet.

(1) CLASS indicates diet is a grouping (or classification) variable.
(2) MODEL (in SAS) and GLM (in SPSS) indicate that the weight scores are a function of diet.
(3) PLOT requests a profile plot.
(4) The EMMEANS statement requests the marginal means (pooling over diet) and Bonferroni-adjusted multiple comparisons associated with the within-subjects factor (time).
(5) WSDESIGN requests statistical testing associated with the within-subjects time factor, and the DESIGN command requests testing results for the between-subjects diet factor.
Table 12.13: Selected Output for One-Between One-Within Design

SAS Results

The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

                                                                     Adj Pr > F
Source        DF   Type III SS    Mean Square   F Value   Pr > F   G–G      H-F-L
time           2   181.3518519    90.6759259    88.37     <.0001   <.0001   <.0001
time*diet      4    20.9259259     5.2314815     5.10     0.0012   0.0033   0.0029
Error(time)   66    67.7222222     1.0260943

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects

Source   DF   Type III SS    Mean Square   F Value   Pr > F
diet      2    36.9074074    18.4537037    4.69      0.0161
Error    33   129.8611111     3.9351852

SPSS Results

Tests of Within-Subjects Effects
Measure: MEASURE_1

Source                             Type III Sum of Squares   df       Mean Square   F        Sig.
time          Sphericity Assumed   181.352                    2        90.676       88.370   .000
              Greenhouse-Geisser   181.352                    1.556   116.574       88.370   .000
              Huynh-Feldt          181.352                    1.717   105.593       88.370   .000
              Lower-bound          181.352                    1.000   181.352       88.370   .000
time * diet   Sphericity Assumed    20.926                    4         5.231        5.098   .001
              Greenhouse-Geisser    20.926                    3.111     6.726        5.098   .003
              Huynh-Feldt           20.926                    3.435     6.092        5.098   .002
              Lower-bound           20.926                    2.000    10.463        5.098   .012
Error(time)   Sphericity Assumed    67.722                   66         1.026
              Greenhouse-Geisser    67.722                   51.337     1.319
              Huynh-Feldt           67.722                   56.676     1.195
              Lower-bound           67.722                   33.000     2.052

Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source      Type III Sum of Squares   df   Mean Square   F         Sig.
Intercept   1688.231                   1   1688.231      429.009   .000
diet          36.907                   2     18.454        4.689   .016
Error        129.861                  33      3.935
Table 12.14: Cell and Marginal Means for the One-Between One-Within Design

                        TIME
DIETS           1       2       3       ROW MEANS
1               4.50    3.33    2.083   3.304
2               5.33    3.917   2.250   3.832
3               6.00    5.917   2.250   4.722
COLUMN MEANS    5.278   4.389   2.194

[Figure: profile plot of mean weight loss (vertical axis) against time (horizontal axis, 1 to 3), with separate lines for Diet 1, Diet 2, and Diet 3.]
12.12 POST HOC PROCEDURES FOR THE ONE-BETWEEN
AND ONE-WITHIN DESIGN
In the previous section, we presented and discussed statistical test results for the main
effects and interaction. We also used cell and marginal means and a graph to describe
results. When three or more levels of a factor are present in a design, researchers may also
wish to conduct follow-up tests for specific effects of interest. In our example, an investigator would likely focus on simple effects given the interaction between diet and time. We
will provide testing procedures for such simple effects, but for completeness, we briefly
discuss pairwise comparisons associated with the diet and time main effects. Note that for
the follow-up procedures discussed in this section, there is more than one way to obtain
results via SAS and SPSS. In this section, we use procedures that, while not always the most
efficient, are intended to help you better understand the comparisons you are making.
12.12.1 Comparisons Involving Main Effects
As an example of this, to conduct pairwise comparisons for the means involved in a
statistically significant main effect of the between-subjects factor (here, diet), you can
simply compute the average of each participant’s scores across the time points of the