1: Single-Factor ANOVA and the F Test
FIGURE 15.1 Two possible ANOVA data sets when three populations are under investigation, shown in panels (a) and (b), each marking the mean of Sample 1, Sample 2, and Sample 3: green circle = observation from Population 1; orange circle = observation from Population 2; blue circle = observation from Population 3.
After looking at the data in Figure 15.1(a), almost anyone would readily agree
that the claim $\mu_1 = \mu_2 = \mu_3$ appears to be false. Not only are the three sample means
different, but also the three samples are clearly separated. In other words, differences
between the three sample means are quite large relative to the variability within each
sample. (If all data sets gave such obvious messages, statisticians would not be in such
great demand!)
The situation pictured in Figure 15.1(b) is much less clear-cut. The sample
means are as different as they were in the first data set, but now there is considerable
overlap among the three samples. The separation between sample means here can
plausibly be attributed to substantial variability in the populations (and therefore the
samples) rather than to differences between $\mu_1$, $\mu_2$, and $\mu_3$. The phrase analysis of
variance comes from the idea of analyzing variability in the data to see how much can
be attributed to differences in the $\mu$’s and how much is due to variability in the individual populations. In Figure 15.1(a), the within-sample variability is small relative
to the between-sample variability, whereas in Figure 15.1(b), a great deal more of the
total variability is due to variation within each sample. If differences between the
sample means can be explained by within-sample variability, there is no compelling
reason to reject $H_0$.
Notations and Assumptions
Notation in single-factor ANOVA is a natural extension of the notation used in
Chapter 11 for comparing two population or treatment means.
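ANOVA Notation
$k$ = number of populations or treatments being compared

| Population or treatment | 1 | 2 | … | $k$ |
|---|---|---|---|---|
| Population or treatment mean | $\mu_1$ | $\mu_2$ | … | $\mu_k$ |
| Population or treatment variance | $\sigma_1^2$ | $\sigma_2^2$ | … | $\sigma_k^2$ |
| Sample size | $n_1$ | $n_2$ | … | $n_k$ |
| Sample mean | $\bar{x}_1$ | $\bar{x}_2$ | … | $\bar{x}_k$ |
| Sample variance | $s_1^2$ | $s_2^2$ | … | $s_k^2$ |

$N = n_1 + n_2 + \cdots + n_k$ (the total number of observations in the data set)
$T$ = grand total = sum of all $N$ observations in the data set = $n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k$
$\bar{\bar{x}}$ = grand mean = $\dfrac{T}{N}$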
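The notation above can be illustrated with a short Python sketch. The helper name `grand_mean` is hypothetical (it is not from the text), and the inputs are the sample sizes and sample means reported later in Example 15.4, chosen because the sample sizes are unequal.

```python
def grand_mean(sizes, means):
    """Return (N, T, grand mean) from per-sample sizes and sample means."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    n_total = sum(sizes)                                    # N = n1 + ... + nk
    grand_total = sum(n * m for n, m in zip(sizes, means))  # T = n1*xbar1 + ... + nk*xbark
    return n_total, grand_total, grand_total / n_total


# Summary values from Example 15.4 (treatments P+P, P+S, G+P, G+S).
N, T, xbarbar = grand_mean([14, 14, 13, 16], [0.064, -0.286, -2.023, -2.250])
print(N, round(T, 1), round(xbarbar, 2))
```

With unequal sample sizes the grand mean is a weighted average of the sample means, so it generally differs from the simple average of the $\bar{x}$ values.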
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Chapter 15 Analysis of Variance
A decision between $H_0$ and $H_a$ is based on examining the $\bar{x}$ values to see whether
observed discrepancies are small enough to be attributable simply to sampling variability or whether an alternative explanation for the differences is more plausible.
EXAMPLE 15.1
An Indicator of Heart Attack Risk
The article “Could Mean Platelet Volume Be a Predictive Marker for Acute Myocardial
Infarction?” (Medical Science Monitor [2005]: 387-392) described an experiment in
which four groups of patients seeking treatment for chest pain were compared with respect to mean platelet volume (MPV, measured in fL). The four groups considered were
based on the clinical diagnosis and were (1) noncardiac chest pain, (2) stable angina
pectoris, (3) unstable angina pectoris, and (4) myocardial infarction (heart attack). The
purpose of the study was to determine if the mean MPV differed for the four groups,
and in particular if the mean MPV was different for the heart attack group, because then
MPV could be used as an indicator of heart attack risk and an antiplatelet treatment
could be administered in a timely fashion, potentially reducing the risk of heart attack.
To carry out this study, patients seen for chest pain were divided into groups
according to diagnosis. The researchers then selected a random sample of 35 from
each of the resulting $k = 4$ groups. The researchers believed that this sampling process
would result in samples that were representative of the four populations of interest
and that could be regarded as if they were random samples from these four populations. Table 15.1 presents summary values given in the paper.
TABLE 15.1 Summary Values for MPV Data of Example 15.1

| Group Number | Group Description | Sample Size | Sample Mean | Sample Standard Deviation |
|---|---|---|---|---|
| 1 | Noncardiac chest pain | 35 | 10.89 | 0.69 |
| 2 | Stable angina pectoris | 35 | 11.25 | 0.74 |
| 3 | Unstable angina pectoris | 35 | 11.37 | 0.91 |
| 4 | Myocardial infarction (heart attack) | 35 | 11.75 | 1.07 |
With $\mu_i$ denoting the true mean MPV for group $i$ ($i = 1, 2, 3, 4$), let’s consider
the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Figure 15.2 shows a comparative boxplot for the four samples (based on data consistent with summary values given in the
paper). The mean MPV for the heart attack sample is larger than for the other three
samples and the boxplot for the heart attack sample appears to be shifted a bit higher
than the boxplots for the other three samples. However, because the four boxplots
show substantial overlap, it is not obvious whether $H_0$ is true or false. In situations
such as this, we need a formal test procedure.
FIGURE 15.2 Boxplots for Example 15.1 (groups: noncardiac, stable angina, unstable angina, myocardial infarction; axis: MPV, scale 9 to 13).
As with the inferential methods of previous chapters, the validity of the ANOVA
test for $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ requires some assumptions.
Assumptions for ANOVA
1. Each of the k population or treatment response distributions is normal.
2. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ (The k normal distributions have identical standard
deviations.)
3. The observations in the sample from any particular one of the k populations or treatments are independent of one another.
4. When comparing population means, the k random samples are selected
independently of one another. When comparing treatment means, treatments are assigned at random to subjects or objects (or subjects are
assigned at random to treatments).
In practice, the test based on these assumptions works well as long as the assumptions are not too badly violated. If the sample sizes are reasonably large, normal probability plots or boxplots of the data in each sample are helpful in checking the assumption of normality. Often, however, sample sizes are so small that a separate
normal probability plot or boxplot for each sample is of little value in checking normality. In this case, a single combined plot can be constructed by first subtracting $\bar{x}_1$
from each observation in the first sample, $\bar{x}_2$ from each value in the second sample,
and so on and then constructing a normal probability plot or boxplot of all N deviations
from their respective means. A normal probability plot of these deviations should be reasonably straight. Figure 15.3
shows such a normal probability plot for the data of Example 15.1.
FIGURE 15.3 A normal probability plot using the combined data of Example 15.1 (vertical axis: Residual; horizontal axis: Normal score, scale −3 to 3).
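The combined plot described above starts from pooled deviations, which can be sketched in Python. This is a minimal illustration, not the procedure used to draw Figure 15.3: the sample values are hypothetical, the helper names are invented here, and the normal scores use Blom-style plotting positions $(i - 0.375)/(n + 0.25)$, which is an assumption because the text does not say how normal scores are computed.

```python
from statistics import NormalDist, mean


def pooled_residuals(samples):
    """Subtract each sample's own mean from its observations and pool the results."""
    residuals = []
    for sample in samples:
        center = mean(sample)
        residuals.extend(x - center for x in sample)
    return residuals


def normal_scores(n):
    """Approximate normal scores for n ordered values (assumed Blom-style positions)."""
    dist = NormalDist()
    return [dist.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


# Hypothetical small samples, used only to show the mechanics.
samples = [[10.2, 11.1, 10.6], [11.4, 11.0, 11.9], [12.3, 11.6, 12.1]]
residuals = sorted(pooled_residuals(samples))
scores = normal_scores(len(residuals))
for score, residual in zip(scores, residuals):
    print(f"{score:6.2f}  {residual:6.2f}")
```

Plotting the sorted residuals against the scores gives the combined normal probability plot; a roughly linear pattern is what supports the normality assumption.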
There is a formal procedure for testing the equality of population standard deviations. Unfortunately, it is quite sensitive to even a small departure from the normality
assumption, so we do not recommend its use. Instead, we suggest that the ANOVA
F test (to be described later in this section) can safely be used if the largest of the
sample standard deviations is at most twice the smallest one. The largest standard
deviation in Example 15.1 is $s_4 = 1.07$, which is only about 1.5 times the smallest
standard deviation ($s_1 = 0.69$). The book Beyond ANOVA: The Basics of Applied Statistics by Rupert Miller (see the references in the back of the book) is a good source for alternative methods of analysis if there appears to be a violation of assumptions.
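The rule of thumb for standard deviations is easy to check in code. The function name `sd_ratio_ok` and its `max_ratio` parameter are hypothetical; the input values are the sample standard deviations from Table 15.1.

```python
def sd_ratio_ok(sample_sds, max_ratio=2.0):
    """Return (ratio, passes) for the largest-to-smallest sample SD rule of thumb."""
    smallest, largest = min(sample_sds), max(sample_sds)
    if smallest <= 0:
        raise ValueError("standard deviations must be positive")
    ratio = largest / smallest
    return ratio, ratio <= max_ratio


ratio, passes = sd_ratio_ok([0.69, 0.74, 0.91, 1.07])  # Example 15.1
print(f"ratio = {ratio:.2f}, rule satisfied: {passes}")
```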
The analysis of variance test procedure is based on the following measures of
variation in the data.
DEFINITION
A measure of differences among the sample means is the treatment sum of
squares, denoted by SSTr and given by
$$\text{SSTr} = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$
A measure of variation within the k samples, called error sum of squares and
denoted by SSE, is
$$\text{SSE} = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$
Each sum of squares has an associated df:
treatment df = $k - 1$
error df = $N - k$
A mean square is a sum of squares divided by its df. In particular,
$$\text{mean square for treatments} = \text{MSTr} = \frac{\text{SSTr}}{k - 1}$$
$$\text{mean square for error} = \text{MSE} = \frac{\text{SSE}}{N - k}$$
The number of error degrees of freedom comes from adding the number of degrees of freedom associated with each of the sample variances:
$$(n_1 - 1) + (n_2 - 1) + \cdots + (n_k - 1) = n_1 + n_2 + \cdots + n_k - 1 - 1 - \cdots - 1 = N - k$$
EXAMPLE 15.2
Heart Attack Calculations
Let’s return to the mean platelet volume (MPV) data of Example 15.1. The grand
mean $\bar{\bar{x}}$ was computed to be 11.315. Notice that because the sample sizes are all
equal, the grand mean is just the average of the four sample means (this will not usually be the case when the sample sizes are unequal). With $\bar{x}_1 = 10.89$, $\bar{x}_2 = 11.25$,
$\bar{x}_3 = 11.37$, $\bar{x}_4 = 11.75$, and $n_1 = n_2 = n_3 = n_4 = 35$,
$$\begin{aligned}
\text{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 35(10.89 - 11.315)^2 + 35(11.25 - 11.315)^2 + 35(11.37 - 11.315)^2 + 35(11.75 - 11.315)^2 \\
&= 6.322 + 0.148 + 0.106 + 6.623 \\
&= 13.199
\end{aligned}$$
Because $s_1 = 0.69$, $s_2 = 0.74$, $s_3 = 0.91$, and $s_4 = 1.07$,
$$\begin{aligned}
\text{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= (35 - 1)(0.69)^2 + (35 - 1)(0.74)^2 + (35 - 1)(0.91)^2 + (35 - 1)(1.07)^2 \\
&= 101.888
\end{aligned}$$
The numbers of degrees of freedom are
treatment df = k − 1 = 3
error df = N − k = 35 + 35 + 35 + 35 − 4 = 136
from which
$$\text{MSTr} = \frac{\text{SSTr}}{k - 1} = \frac{13.199}{3} = 4.400$$
$$\text{MSE} = \frac{\text{SSE}}{N - k} = \frac{101.888}{136} = 0.749$$
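The calculations of Example 15.2 need only the sample sizes, sample means, and sample standard deviations, so they can be reproduced with a short Python sketch. The function name `anova_from_summary` and the dictionary keys are hypothetical choices made for this illustration.

```python
def anova_from_summary(sizes, means, sds):
    """Compute single-factor ANOVA quantities from per-sample summary values."""
    k = len(sizes)
    n_total = sum(sizes)
    grand = sum(n * m for n, m in zip(sizes, means)) / n_total
    sstr = sum(n * (m - grand) ** 2 for n, m in zip(sizes, means))  # between samples
    sse = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))         # within samples
    df_treatment, df_error = k - 1, n_total - k
    mstr, mse = sstr / df_treatment, sse / df_error
    return {
        "grand_mean": grand, "SSTr": sstr, "SSE": sse,
        "df_treatment": df_treatment, "df_error": df_error,
        "MSTr": mstr, "MSE": mse, "F": mstr / mse,
    }


# Summary values from Table 15.1 (MPV data).
result = anova_from_summary(
    sizes=[35, 35, 35, 35],
    means=[10.89, 11.25, 11.37, 11.75],
    sds=[0.69, 0.74, 0.91, 1.07],
)
for name, value in result.items():
    print(f"{name:>12}: {value:.3f}")
```

Small differences in the last decimal place relative to the hand calculation come from rounding intermediate values.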
Both MSTr and MSE are quantities whose values can be calculated once sample
data are available; that is, they are statistics. Each of these statistics varies in value
from data set to data set. Both statistics MSTr and MSE have sampling distributions,
and these sampling distributions have mean values. The following box describes the
key relationship between MSTr and MSE and the mean values of these two
statistics.
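When $H_0$ is true ($\mu_1 = \mu_2 = \cdots = \mu_k$),
$$\mu_{\text{MSTr}} = \mu_{\text{MSE}}$$
However, when $H_0$ is false,
$$\mu_{\text{MSTr}} > \mu_{\text{MSE}}$$
and the greater the differences among the $\mu$’s, the larger $\mu_{\text{MSTr}}$ will be relative to $\mu_{\text{MSE}}$.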
According to this result, when $H_0$ is true, we expect the two mean squares to be
close to one another, whereas we expect MSTr to substantially exceed MSE when
some $\mu$’s differ greatly from others. Thus, a calculated MSTr that is much larger than
MSE casts doubt on $H_0$. In Example 15.2, MSTr = 4.400 and MSE = 0.749, so
MSTr is about 6 times as large as MSE. Can this difference be attributed solely to
sampling variability, or is the ratio MSTr/MSE large enough to suggest that $H_0$ is
false? Before we can describe a formal test procedure, it is necessary to revisit F distributions, first introduced in multiple regression analysis (Chapter 14).
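The relationship between the mean values of MSTr and MSE can be explored with a small simulation. The sketch below is illustrative only: the population means, the common standard deviation of 1, the sample size of 5, the number of repetitions, and the seed are all arbitrary assumptions, and the function names are hypothetical.

```python
import random


def mean_squares(samples):
    """Return (MSTr, MSE) for a list of samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    sstr = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    return sstr / (k - 1), sse / (n_total - k)


def average_mean_squares(mus, sigma=1.0, n=5, reps=4000, seed=1):
    """Average MSTr and MSE over many simulated normal data sets."""
    rng = random.Random(seed)
    total_mstr = total_mse = 0.0
    for _ in range(reps):
        samples = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in mus]
        mstr, mse = mean_squares(samples)
        total_mstr += mstr
        total_mse += mse
    return total_mstr / reps, total_mse / reps


null_mstr, null_mse = average_mean_squares([10.0, 10.0, 10.0])  # H0 true
alt_mstr, alt_mse = average_mean_squares([10.0, 11.0, 12.0])    # H0 false
print(f"H0 true:  average MSTr = {null_mstr:.2f}, average MSE = {null_mse:.2f}")
print(f"H0 false: average MSTr = {alt_mstr:.2f}, average MSE = {alt_mse:.2f}")
```

When the population means are equal the two averages should be close to each other, and when the means differ the average MSTr should be clearly larger while the average MSE stays about the same.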
Many ANOVA test procedures are based on a family of probability distributions
called F distributions. An F distribution always arises in connection with a ratio. A
particular F distribution is obtained by specifying both numerator degrees of freedom
(df1) and denominator degrees of freedom (df2). Figure 15.4 shows an F curve for a
particular choice of df1 and df2. All F tests in this book are upper-tailed, so P-values
are areas under the F curve to the right of the calculated values of F.
FIGURE 15.4 An F curve and P-value for an upper-tailed test (F curve for particular df1, df2; shaded area to the right of the calculated F = P-value for upper-tailed F test).
Tabulation of these upper-tail areas is cumbersome, because there are two degrees
of freedom rather than just one (as in the case of t distributions). For selected (df1,
df2) pairs, the F table (Appendix Table 6) gives only the four numbers that capture
tail areas .10, .05, .01, and .001, respectively. Here are the four numbers for df1 = 4,
df2 = 10 along with the statements that can be made about the P-value:

| Tail area | .10 | .05 | .01 | .001 |
|---|---|---|---|---|
| Value | 2.61 | 3.48 | 5.99 | 11.28 |

The four tabled values divide the F axis into five regions, labeled a through e from left to right:
a. F < 2.61 → tail area = P-value > .10
b. 2.61 < F < 3.48 → .05 < P-value < .10
c. 3.48 < F < 5.99 → .01 < P-value < .05
d. 5.99 < F < 11.28 → .001 < P-value < .01
e. F > 11.28 → P-value < .001
If F = 7.12, then .001 < P-value < .01. If a test with $\alpha = .05$ is used, $H_0$ should be
rejected, because P-value ≤ $\alpha$. The most frequently used statistical computer packages can provide exact P-values for F tests.
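The table lookup just described can be written as a small Python function. This is a sketch of the bracketing logic only; the function name `bracket_p_value` is hypothetical, and the critical values are the four tabled numbers quoted above for df1 = 4, df2 = 10.

```python
def bracket_p_value(f_value, table):
    """Bracket the P-value using (tail area, critical value) pairs from an F table.

    Returns (lower, upper) meaning lower <= P-value < upper, with 0.0 and 1.0
    used when the table gives no tighter bound.
    """
    lower, upper = 0.0, 1.0
    # Walk from the largest tail area (smallest critical value) to the smallest.
    for tail_area, critical_value in sorted(table, reverse=True):
        if f_value > critical_value:
            upper = tail_area   # F is beyond this value, so P-value < tail_area
        else:
            lower = tail_area   # F is not beyond this value, so P-value >= tail_area
            break
    return lower, upper


# Tabled values for df1 = 4, df2 = 10.
f_table_4_10 = [(0.10, 2.61), (0.05, 3.48), (0.01, 5.99), (0.001, 11.28)]
print(bracket_p_value(7.12, f_table_4_10))
```

For F = 7.12 the function returns the bounds .001 and .01, matching region d in the list above.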
The Single-Factor ANOVA F Test
Null hypothesis: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
Test statistic: $F = \dfrac{\text{MSTr}}{\text{MSE}}$
When $H_0$ is true and the ANOVA assumptions are reasonable, F has an F distribution with df1 = k − 1 and df2 = N − k.
Values of F more inconsistent with $H_0$ than what was observed in the data are
values even farther out in the upper tail, so the P-value is the area captured in
the upper tail of the corresponding F curve. Appendix Table 6, a statistical
software package, or a graphing calculator can be used to determine P-values for
F tests.
EXAMPLE 15.3
Heart Attacks Revisited
The two mean squares for the MPV data given in Example 15.1 were calculated in
Example 15.2 as
MSTr = 4.400
MSE = 0.749
The value of the F statistic is then
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{4.400}{0.749} = 5.87$$
with df1 = k − 1 = 3 and df2 = N − k = 140 − 4 = 136. Using df1 = 3 and
df2 = 120 (the closest value to 136 that appears in the table), Appendix Table 6
shows that 5.78 captures tail area .001. Since 5.87 > 5.78, it follows that P-value =
captured tail area < .001. The P-value is smaller than any reasonable $\alpha$, so there is
compelling evidence for rejecting $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. We can conclude that the
mean MPV is not the same for all four patient populations. Techniques for determining which means differ are introduced in Section 15.2.
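Software reports an exact upper-tail area rather than a bracket. As a rough illustration of what such a calculation involves, the sketch below approximates the area with Simpson's rule, using the standard identity that the upper-tail area of an F distribution equals a regularized incomplete beta function. This is not how any particular package computes P-values; the function name `f_upper_tail`, the step count, and the restriction to df2 of at least 2 are assumptions of this sketch, and accuracy may degrade for df1 = 1 with very small F values.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=20000):
    """Approximate P(F > f_value) for an F distribution with (df1, df2) df.

    Uses P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f), where I_x
    is the regularized incomplete beta function, integrated by Simpson's rule.
    """
    if df1 <= 0 or df2 < 2:
        raise ValueError("this sketch assumes df1 > 0 and df2 >= 2")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    if f_value <= 0:
        return 1.0
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        # Beta(a, b) density; at t = 0 it is 0 when a > 1 and 1/B(a, b) when a = 1.
        if t == 0.0:
            return math.exp(-log_beta) if a == 1.0 else 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


p_mpv = f_upper_tail(5.87, 3, 136)  # Example 15.3
print(f"approximate P-value: {p_mpv:.5f}")
```

The result for Example 15.3 falls below .001, in agreement with the conclusion drawn from Appendix Table 6.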
EXAMPLE 15.4
Hormones and Body Fat
The article “Growth Hormone and Sex Steroid Administration in Healthy Aged
Women and Men” (Journal of the American Medical Association [2002]: 2282–2292)
described an experiment to investigate the effect of four treatments on various
body characteristics. In this double-blind experiment, each of 57 female subjects age
65 or older was assigned at random to one of the following four treatments: (1) placebo “growth hormone” and placebo “steroid” (denoted by P + P); (2) placebo
“growth hormone” and the steroid estradiol (denoted by P + S); (3) growth hormone
and placebo “steroid” (denoted by G + P); and (4) growth hormone and the steroid
estradiol (denoted by G + S).
The following table lists data on change in body fat mass over the 26-week period
following the treatments that are consistent with summary quantities given in the
article.
CHANGE IN BODY FAT MASS (KG)

| Treatment | P + P | P + S | G + P | G + S |
|---|---|---|---|---|
| | 0.1 | −0.1 | −1.6 | −3.1 |
| | 0.6 | 0.2 | −0.4 | −3.2 |
| | 2.2 | 0.0 | 0.4 | −2.0 |
| | 0.7 | −0.4 | −2.0 | −2.0 |
| | −2.0 | −0.9 | −3.4 | −3.3 |
| | 0.7 | −1.1 | −2.8 | −0.5 |
| | 0.0 | 1.2 | −2.2 | −4.5 |
| | −2.6 | 0.1 | −1.8 | −0.7 |
| | −1.4 | 0.7 | −3.3 | −1.8 |
| | 1.5 | −2.0 | −2.1 | −2.3 |
| | 2.8 | −0.9 | −3.6 | −1.3 |
| | 0.3 | −3.0 | −0.4 | −1.0 |
| | −1.0 | 1.0 | −3.1 | −5.6 |
| | −1.0 | 1.2 | | −2.9 |
| | | | | −1.6 |
| | | | | −0.2 |
| $n$ | 14 | 14 | 13 | 16 |
| $\bar{x}$ | 0.064 | −0.286 | −2.023 | −2.250 |
| $s$ | 1.545 | 1.218 | 1.264 | 1.468 |
| $s^2$ | 2.387 | 1.484 | 1.598 | 2.155 |

Also, $N = 57$, grand total = −65.4, and $\bar{\bar{x}} = \dfrac{-65.4}{57} = -1.15$.
Let’s carry out an F test to see whether actual mean change in body fat mass differs for the four treatments.
1. Let $\mu_1$, $\mu_2$, $\mu_3$, and $\mu_4$ denote the true mean change in body fat for treatments
P + P, P + S, G + P, and G + S, respectively.
2. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
3. $H_a$: At least two among $\mu_1$, $\mu_2$, $\mu_3$, and $\mu_4$ are different.
4. Significance level: $\alpha = .01$
5. Test statistic: $F = \dfrac{\text{MSTr}}{\text{MSE}}$
6. Assumptions: Figure 15.5 shows boxplots of the data from each of the four
samples. The boxplots are roughly symmetric, and there are no outliers. The largest standard deviation ($s_1 = 1.545$) is not more than twice as big as the smallest
($s_2 = 1.218$). The subjects were randomly assigned to treatments. The assumptions of ANOVA are reasonable.
FIGURE 15.5 Boxplots for the data of Example 15.4 (treatments P + P, P + S, G + P, G + S; axis: change in body fat mass, scale −6 to 3).
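7. Computation:
$$\begin{aligned}
\text{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 14(0.064 - (-1.15))^2 + 14(-0.286 - (-1.15))^2 + 13(-2.023 - (-1.15))^2 + 16(-2.250 - (-1.15))^2 \\
&= 60.37
\end{aligned}$$
treatment df = k − 1 = 3
$$\begin{aligned}
\text{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= 13(2.387) + 13(1.484) + 12(1.598) + 15(2.155) \\
&= 101.81
\end{aligned}$$
error df = N − k = 57 − 4 = 53
Thus,
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{\text{SSTr}/\text{treatment df}}{\text{SSE}/\text{error df}} = \frac{60.37/3}{101.81/53} = \frac{20.12}{1.92} = 10.48$$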
8. P-value: Appendix Table 6 shows that for df1 = 3 and df2 = 60 (the closest
tabled df to df = 53), the value 6.17 captures upper-tail area .001. Because F =
10.48 > 6.17, it follows that P-value < .001.
9. Conclusion: Since P-value ≤ $\alpha$, we reject $H_0$. The mean change in body fat mass
is not the same for all four treatments.
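The computation in Step 7 can also be carried out directly from the raw observations. The Python sketch below assumes the observations are grouped by treatment in the order they are listed in the data table, a grouping that reproduces the reported sample sizes and means; the function name `one_way_anova` is hypothetical.

```python
from statistics import mean

# Change in body fat mass (kg), Example 15.4, grouped by treatment (assumed grouping).
data = {
    "P+P": [0.1, 0.6, 2.2, 0.7, -2.0, 0.7, 0.0, -2.6, -1.4, 1.5, 2.8, 0.3, -1.0, -1.0],
    "P+S": [-0.1, 0.2, 0.0, -0.4, -0.9, -1.1, 1.2, 0.1, 0.7, -2.0, -0.9, -3.0, 1.0, 1.2],
    "G+P": [-1.6, -0.4, 0.4, -2.0, -3.4, -2.8, -2.2, -1.8, -3.3, -2.1, -3.6, -0.4, -3.1],
    "G+S": [-3.1, -3.2, -2.0, -2.0, -3.3, -0.5, -4.5, -0.7, -1.8, -2.3, -1.3, -1.0,
            -5.6, -2.9, -1.6, -0.2],
}


def one_way_anova(groups):
    """Return SSTr, SSE, their df, and F for a dict of treatment -> observations."""
    samples = list(groups.values())
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand = sum(sum(s) for s in samples) / n_total
    means = [mean(s) for s in samples]
    sstr = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_treatment, df_error = k - 1, n_total - k
    f_stat = (sstr / df_treatment) / (sse / df_error)
    return {"SSTr": sstr, "SSE": sse, "df_treatment": df_treatment,
            "df_error": df_error, "F": f_stat}


res = one_way_anova(data)
print({name: round(value, 2) for name, value in res.items()})
```

Working from the raw data avoids the rounding in the summary statistics, so the sums of squares may differ from a hand calculation in the second decimal place.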
Summarizing an ANOVA
ANOVA calculations are often summarized in a tabular format called an ANOVA
table. To understand such a table, we must define one more sum of squares.
Total sum of squares, denoted by SSTo, is given by
$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} (x - \bar{\bar{x}})^2$$
with associated df = N − 1.
The relationship between the three sums of squares SSTo, SSTr, and SSE is
$$\text{SSTo} = \text{SSTr} + \text{SSE}$$
which is called the fundamental identity for single-factor ANOVA.
The quantity SSTo, the sum of squared deviations about the grand mean, is a
measure of total variability in the data set consisting of all k samples. The quantity
SSE results from measuring variability separately within each sample and then combining as indicated in the formula for SSE. Such within-sample variability is present
regardless of whether or not $H_0$ is true. The magnitude of SSTr, on the other hand,
has much to do with whether the null hypothesis is true or false. The more the $\mu$’s
differ from one another, the larger SSTr will tend to be. Thus, SSTr represents variation that can (at least to some extent) be explained by any differences between means.
An informal paraphrase of the fundamental identity for single-factor ANOVA is
total variation = explained variation + unexplained variation
Once any two of the sums of squares have been calculated, the remaining one is
easily obtained from the fundamental identity. Often SSTo and SSTr are calculated
first (using computational formulas given in the online appendix to this chapter), and
then SSE is obtained by subtraction: SSE = SSTo − SSTr. All the degrees of freedom, sums of squares, and mean squares are entered in an ANOVA table, as displayed
in Table 15.2. The P-value usually appears to the right of F when the analysis is done
by a statistical software package.
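The fundamental identity can be checked numerically. The following Python sketch uses a small hypothetical data set (three samples of unequal size) and a hypothetical helper name, `sums_of_squares`; it computes each sum of squares from its own definition and then compares the two sides of the identity.

```python
def sums_of_squares(samples):
    """Return (SSTo, SSTr, SSE), each computed from its own definition."""
    all_obs = [x for s in samples for x in s]
    grand = sum(all_obs) / len(all_obs)
    means = [sum(s) / len(s) for s in samples]
    ssto = sum((x - grand) ** 2 for x in all_obs)                          # total
    sstr = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))  # explained
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)     # unexplained
    return ssto, sstr, sse


# Hypothetical observations, chosen only to make the arithmetic easy to follow.
samples = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 2.0]]
ssto, sstr, sse = sums_of_squares(samples)
print(f"SSTo = {ssto:.2f}, SSTr = {sstr:.2f}, SSE = {sse:.2f}")
print("identity holds:", abs(ssto - (sstr + sse)) < 1e-9)
```

Because the identity always holds, computing any two of the three quantities is enough, which is why SSE is often obtained by subtraction.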
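TABLE 15.2 General Format for a Single-Factor ANOVA Table

| Source of Variation | df | Sum of Squares | Mean Square | F |
|---|---|---|---|---|
| Treatments | $k - 1$ | SSTr | $\text{MSTr} = \dfrac{\text{SSTr}}{k - 1}$ | $F = \dfrac{\text{MSTr}}{\text{MSE}}$ |
| Error | $N - k$ | SSE | $\text{MSE} = \dfrac{\text{SSE}}{N - k}$ | |
| Total | $N - 1$ | SSTo | | |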
An ANOVA table from Minitab for the change in body fat mass data of Example
15.4 is shown in Table 15.3. The reported P-value is .000, consistent with our previous conclusion that P-value < .001.
TABLE 15.3 An ANOVA Table from Minitab for the Data of Example 15.4
One-way ANOVA

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Factor | 3 | 60.37 | 20.12 | 10.48 | 0.000 |
| Error | 53 | 101.81 | 1.92 | | |
| Total | 56 | 162.18 | | | |
EXERCISES 15.1 - 15.13
15.1 Give as much information as you can about the
P-value for an upper-tailed F test in each of the following
situations.
a. df1 = 4, df2 = 15, F = 5.37
b. df1 = 4, df2 = 15, F = 1.90
c. df1 = 4, df2 = 15, F = 4.89
d. df1 = 3, df2 = 20, F = 14.48
e. df1 = 3, df2 = 20, F = 2.69
f. df1 = 4, df2 = 50, F = 3.24
15.2 Give as much information as you can about the
P-value of the single-factor ANOVA F test in each of
the following situations.
a. $k = 5$, $n_1 = n_2 = n_3 = n_4 = n_5 = 4$, F = 5.37
b. $k = 5$, $n_1 = n_2 = n_3 = 5$, $n_4 = n_5 = 4$, F = 2.83
c. $k = 3$, $n_1 = 4$, $n_2 = 5$, $n_3 = 6$, F = 5.02
d. $k = 3$, $n_1 = n_2 = 4$, $n_3 = 6$, F = 15.90
e. $k = 4$, $n_1 = n_2 = 15$, $n_3 = 12$, $n_4 = 10$, F = 1.75
15.3 Employees of a certain state university system can
choose from among four different health plans. Each
plan differs somewhat from the others in terms of hospitalization coverage. Four samples of recently hospitalized
individuals were selected, each sample consisting of
people covered by a different health plan. The length of
the hospital stay (number of days) was determined for
each individual selected.
a. What hypotheses would you test to decide whether
mean length of stay was related to health plan?
(Note: Carefully define the population characteristics
of interest.)
b. If each sample consisted of eight individuals and the
value of the ANOVA F statistic was F = 4.37, what
conclusion would be appropriate for a test with
$\alpha = .01$?
c. Answer the question posed in Part (b) if the F value
given there resulted from sample sizes $n_1 = 9$,
$n_2 = 8$, $n_3 = 7$, and $n_4 = 8$.
15.4 The accompanying summary statistics for a measure of social marginality for samples of youths, young
adults, adults, and seniors appeared in the paper “Perceived Causes of Loneliness in Adulthood” (Journal of
Social Behavior and Personality [2000]: 67–84). The
social marginality score measured actual and perceived
social rejection, with higher scores indicating greater social rejection. For purposes of this exercise, assume that it
is reasonable to regard the four samples as representative
of the U.S. population in the corresponding age groups
and that the distributions of social marginality scores for
these four groups are approximately normal with the
same standard deviation. Is there evidence that the mean
social marginality score is not the same for all four age
groups? Test the relevant hypotheses using $\alpha = .01$.

| Age Group | Youths | Young Adults | Adults | Seniors |
|---|---|---|---|---|
| Sample Size | 106 | 255 | 314 | 36 |
| $\bar{x}$ | 2.00 | 3.40 | 3.07 | 2.84 |
| $s$ | 1.56 | 1.68 | 1.66 | 1.89 |
15.5 The authors of the paper “Age and Violent
Content Labels Make Video Games Forbidden Fruits
for Youth” (Pediatrics [2009]: 870–876) carried out an
experiment to determine if restrictive labels on video
games actually increased the attractiveness of the game for
young game players. Participants read a description of a
new video game and were asked how much they wanted
to play the game. The description also included an age
rating. Some participants read the description with an age
restrictive label of 7+, indicating that the game was not
appropriate for children under the age of 7. Others read
the same description, but with an age restrictive label of
12+, 16+, or 18+. The data below for 12- to 13-year-old
boys are fictitious, but are consistent with summary statistics given in the paper. (The sample sizes in the actual
experiment were larger.) For purposes of this exercise, you
can assume that the boys were assigned at random to one
of the four age label treatments (7+, 12+, 16+, and
18+). Data shown are the boys’ ratings of how much they
wanted to play the game on a scale of 1 to 10. Do the data
provide convincing evidence that the mean rating associated with the game description by 12- to 13-year-old boys
is not the same for all four restrictive rating labels? Test the
appropriate hypotheses using a significance level of .05.

| 7+ label | 12+ label | 16+ label | 18+ label |
|---|---|---|---|
| 6 | 8 | 7 | 10 |
| 6 | 7 | 9 | 9 |
| 6 | 8 | 8 | 6 |
| 5 | 5 | 6 | 8 |
| 4 | 7 | 7 | 7 |
| 8 | 9 | 4 | 6 |
| 6 | 5 | 8 | 8 |
| 1 | 8 | 9 | 9 |
| 2 | 4 | 6 | 10 |
| 4 | 7 | 7 | 8 |
15.6 The paper referenced in the previous exercise
also gave data for 12- to 13-year-old girls. Data consistent with summary values in the paper are shown below.
Do the data provide convincing evidence that the mean
rating associated with the game description for 12- to
13-year-old girls is not the same for all four age restrictive rating labels? Test the appropriate hypotheses using
$\alpha = .05$.

| 7+ label | 12+ label | 16+ label | 18+ label |
|---|---|---|---|
| 4 | 4 | 6 | 8 |
| 7 | 5 | 4 | 6 |
| 6 | 4 | 8 | 6 |
| 5 | 6 | 6 | 5 |
| 3 | 3 | 10 | 7 |
| 6 | 5 | 8 | 4 |
| 4 | 3 | 6 | 10 |
| 5 | 8 | 6 | 6 |
| 10 | 5 | 8 | 8 |
| 5 | 9 | 5 | 7 |
15.7 The paper “Women’s and Men’s Eating Behavior Following Exposure to Ideal-Body Images and
Text” (Communication Research [2006]: 507–529)
describes an experiment in which 74 men were assigned
at random to one of four treatments:
1. Viewed slides of fit, muscular men
2. Viewed slides of fit, muscular men accompanied by
diet and fitness-related text
3. Viewed slides of fit, muscular men accompanied by
text not related to diet and fitness
4. Did not view any slides
The participants then went to a room to complete a
questionnaire. In this room, bowls of pretzels were set
out on the tables. A research assistant noted how many
pretzels were consumed by each participant while completing the questionnaire. Data consistent with summary
quantities given in the paper are given in the accompanying table. Do these data provide convincing evidence
that the mean number of pretzels consumed is not the
same for all four treatments? Test the relevant hypotheses
using a significance level of .05.
Treatment 1 Treatment 2 Treatment 3 Treatment 4
8 7 4 13 2 6 8 0 4 9 1 5 2 0 3 5 2 5 7 5
(continued)
Treatment 1 Treatment 2 Treatment 3 Treatment 4
1 5 8 11 5 1 0 6 4 10 7 0 12 8 6 2 7 8 8 5 14 9 0 6 3 12 5 6 10 8 6 2 10 0 3 4 4 5 5 7 8 4 0 6 3 2 0 0 3 4 2 4 1 1
15.8 Can use of an online plagiarism-detection system
reduce plagiarism in student research papers? The paper
“Plagiarism and Technology: A Tool for Coping with
Plagiarism” (Journal of Education for Business [2005]:
149–152) describes a study in which randomly selected research papers submitted by students during five semesters
were analyzed for plagiarism. For each paper, the percentage
of plagiarized words in the paper was determined by an
online analysis. In each of the five semesters, students were
told during the first two class meetings that they would have
to submit an electronic version of their research papers and
that the papers would be reviewed for plagiarism. Suppose
that the number of papers sampled in each of the five semesters and the means and standard deviations for percentage of plagiarized words are as given in the accompanying
table. For purposes of this exercise, assume that the conditions necessary for the ANOVA F test are reasonable. Do
these data provide evidence to support the claim that mean
percentage of plagiarized words is not the same for all five
semesters? Test the appropriate hypotheses using $\alpha = .05$.

| Semester | n | Mean | Standard deviation |
|---|---|---|---|
| 1 | 39 | 6.31 | 3.75 |
| 2 | 42 | 3.31 | 3.06 |
| 3 | 32 | 1.79 | 3.25 |
| 4 | 32 | 1.83 | 3.13 |
| 5 | 34 | 1.50 | 2.37 |