4.6 Testing the Utility of a Model: The Analysis of Variance F-Test
176 Chapter 4 Multiple Regression Models
Note that the denominator of the F statistic, MSE, represents the unexplained
(or error) variability in the model. The numerator, MS(Model), represents the
variability in y explained (or accounted for) by the model. (For this reason,
the test is often called the ‘‘analysis-of-variance’’ F -test.) Since F is the ratio of
the explained variability to the unexplained variability, the larger the proportion
of the total variability accounted for by the model, the larger the F statistic.
To determine when the ratio becomes large enough that we can conﬁdently
reject the null hypothesis and conclude that the model is more useful than no model
at all for predicting y, we compare the calculated F statistic to a tabulated F -value
with k df in the numerator and [n − (k + 1)] df in the denominator. Recall that
tabulations of the F -distribution for various values of α are given in Tables 3, 4, 5,
and 6 of Appendix D.
Rejection region: F > Fα , where F is based on k numerator and n − (k + 1)
denominator degrees of freedom (see Figure 4.4).
However, since statistical software printouts report the observed signiﬁcance level
(p-value) of the test, most researchers simply compare the selected α value to the
p-value to make the decision.
The analysis of variance F -test for testing the usefulness of the model is
summarized in the next box.
Testing Global Usefulness of the Model: The Analysis of Variance F-Test
H0: β1 = β2 = · · · = βk = 0 (All model terms are unimportant for predicting y)
Ha: At least one βi ≠ 0 (At least one model term is useful for predicting y)
Test statistic:
\[
F = \frac{(SS_{yy} - SSE)/k}{SSE/[n - (k + 1)]}
  = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]}
  = \frac{\text{Mean square (Model)}}{\text{Mean square (Error)}}
\]
where n is the sample size and k is the number of terms in the model.
Rejection region: F > Fα, with k numerator degrees of freedom and [n − (k + 1)]
denominator degrees of freedom,
or
α > p-value, where p-value = P(F > Fc) and Fc is the computed value of the test
statistic.
Assumptions: The standard regression assumptions about the random error
component (Section 4.2).
Example 4.3
Refer to Example 4.2, in which an antique collector modeled the auction price y
of grandfather clocks as a function of the age of the clock, x1 , and the number of
bidders, x2 . The hypothesized ﬁrst-order model is
y = β0 + β1 x1 + β2 x2 + ε
Figure 4.4 Rejection region for the global F-test (upper-tail area α of the F-distribution, to the right of Fα)
A sample of 32 observations is obtained, with the results summarized in the
MINITAB printout in Figure 4.5. Conduct the global F -test of model usefulness at
the α = .05 level of signiﬁcance.
Figure 4.5 MINITAB regression printout for grandfather clock model
Solution
The elements of the global test of the model follow:
H0 : β1 = β2 = 0
[Note: k = 2]
Ha : At least one of the two model coefﬁcients is nonzero
Test statistic: F = 120.19 (shaded in Figure 4.5)
p-value = .000 (shaded in Figure 4.5)
Conclusion: Since α = .05 exceeds the observed signiﬁcance level, p = .000, the data
provide strong evidence that at least one of the model coefﬁcients is nonzero. The
overall model appears to be statistically useful for predicting auction prices.
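As a rough cross-check of the test statistic, the R² form of the F formula in the box can be evaluated directly. The sketch below is illustrative only. The function names are hypothetical; n = 32 and k = 2 come from this example, and R² = .892 is the rounded value reported for the same model in Section 4.8, so the result (about 119.8) differs slightly from the 120.19 shown on the printout. The p-value helper relies on a closed form that holds only when the numerator has exactly 2 degrees of freedom; it is not a general F-distribution routine.

```python
def global_f_from_r2(r2, n, k):
    """Global F statistic computed from R^2, sample size n, and k model terms."""
    df_error = n - (k + 1)
    return (r2 / k) / ((1.0 - r2) / df_error)


def f_upper_tail_two_numerator_df(f_value, df_error):
    """P(F > f_value) for an F-distribution with 2 numerator df.

    Assumption: valid only for exactly 2 numerator degrees of freedom,
    where the upper-tail area reduces to (1 + 2f/df_error)^(-df_error/2).
    """
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)


# Grandfather clock model: n = 32 observations, k = 2 terms, R^2 = .892 (rounded)
n, k, r2 = 32, 2, 0.892
f_stat = global_f_from_r2(r2, n, k)
p_value = f_upper_tail_two_numerator_df(f_stat, n - (k + 1))

print(f"F = {f_stat:.2f}")                  # close to the printout's 120.19
print(f"Reject H0 at alpha = .05: {p_value < 0.05}")
```

In practice the p-value would normally be read from the software printout, as the text describes; the closed form is used here only to keep the sketch free of external libraries.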
Can we be sure that the best prediction model has been found if the global
F -test indicates that a model is useful? Unfortunately, we cannot. The addition
of other independent variables may improve the usefulness of the model. (See
the accompanying box.) We consider more complex multiple regression models in
Sections 4.10–4.12.
Caution
A rejection of the null hypothesis H0 : β1 = β2 = · · · = βk = 0 in the global
F -test leads to the conclusion [with 100(1 − α)% conﬁdence] that the model
is statistically useful. However, statistically ‘‘useful’’ does not necessarily mean
‘‘best.’’ Another model may prove even more useful in terms of providing more
reliable estimates and predictions. This global F -test is usually regarded as a
test that the model must pass to merit further consideration.
4.7 Inferences About the Individual β Parameters
Inferences about the individual β parameters in a model are obtained using either a
conﬁdence interval or a test of hypothesis, as outlined in the following two boxes.∗
Test of an Individual Parameter Coefficient in the Multiple Regression Model

                    ONE-TAILED TESTS                 TWO-TAILED TEST
                    H0: βi = 0      H0: βi = 0       H0: βi = 0
                    Ha: βi < 0      Ha: βi > 0       Ha: βi ≠ 0
Rejection region:   t < −tα         t > tα           |t| > tα/2

Test statistic (all three tests):
\[
t = \frac{\hat{\beta}_i}{s_{\hat{\beta}_i}}
\]
where tα and tα/2 are based on n − (k + 1) degrees of freedom and
n = Number of observations
k + 1 = Number of β parameters in the model
Note: Most statistical software programs report two-tailed p-values on their
output. To find the appropriate p-value for a one-tailed test, make the following
adjustment to P = two-tailed p-value:
For Ha: βi > 0, p-value = P/2 if t > 0; p-value = 1 − P/2 if t < 0
For Ha: βi < 0, p-value = 1 − P/2 if t > 0; p-value = P/2 if t < 0
Assumptions: See Section 4.2 for assumptions about the probability distribution
for the random error component ε.
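The one-tailed adjustment described in the note can be written as a small helper. This is a minimal sketch: the function name, its argument names, and the sample inputs (P = .04, t = 2.1) are hypothetical and chosen only to exercise both branches of the rule.

```python
def one_tailed_p_value(two_tailed_p, t_stat, alternative):
    """Convert a software-reported two-tailed p-value to a one-tailed p-value.

    alternative: "greater" for Ha: beta_i > 0, "less" for Ha: beta_i < 0.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = two_tailed_p / 2.0
    # Use P/2 when the sign of t agrees with the direction of Ha,
    # and 1 - P/2 when the estimate points the other way.
    agrees_with_ha = (t_stat > 0) if alternative == "greater" else (t_stat < 0)
    return half if agrees_with_ha else 1.0 - half


# Hypothetical inputs: two-tailed P = .04 with a positive t statistic
print(one_tailed_p_value(0.04, 2.1, "greater"))  # P/2
print(one_tailed_p_value(0.04, 2.1, "less"))     # 1 - P/2
```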
∗ The formulas for computing β̂i and its standard error are so complex that the only reasonable way to present them
is by using matrix algebra. We do not assume a prerequisite of matrix algebra for this text and, in any case, we
think the formulas can be omitted in an introductory course without serious loss. They are programmed into all
statistical software packages with multiple regression routines and are presented in some of the texts listed in
the references.
A 100(1 − α)% Confidence Interval for a β Parameter
\[
\hat{\beta}_i \pm (t_{\alpha/2})\, s_{\hat{\beta}_i}
\]
where tα/2 is based on n − (k + 1) degrees of freedom and
n = Number of observations
k + 1 = Number of β parameters in the model
We illustrate these methods with another example.
Example 4.4
Refer to Examples 4.1–4.3. A collector of antique grandfather clocks knows that the
price (y) received for the clocks increases linearly with the age (x1 ) of the clocks.
Moreover, the collector hypothesizes that the auction price (y) of the clocks will
increase linearly as the number of bidders (x2 ) increases. Use the information on
the SAS printout, Figure 4.6, to:
(a) Test the hypothesis that the mean auction price of a clock increases as the
number of bidders increases when age is held constant, that is, β2 > 0. Use
α = .05.
(b) Form a 95% conﬁdence interval for β1 and interpret the result.
Solution
(a) The hypotheses of interest concern the parameter β2 . Speciﬁcally,
H0 : β2 = 0
Ha : β2 > 0
Figure 4.6 SAS regression output for the auction price model, Example 4.4
The test statistic is a t statistic formed by dividing the sample estimate β̂2 of
the parameter β2 by the estimated standard error of β̂2 (denoted s_β̂2). These
estimates, β̂2 = 85.953 and s_β̂2 = 8.729, as well as the calculated t-value, are
highlighted on the SAS printout, Figure 4.6.
Test statistic:
\[
t = \frac{\hat{\beta}_2}{s_{\hat{\beta}_2}} = \frac{85.953}{8.729} = 9.85
\]
The p-value for the two-tailed test of hypothesis, Ha: β2 ≠ 0, is also shown
on the printout under Pr > |t|. This value (highlighted) is less than .0001. To
obtain the p-value for the one-tailed test, Ha: β2 > 0, we divide this p-value in
half (appropriate here because t > 0). Consequently, the observed significance
level for our upper-tailed test is
\[
p\text{-value} < \frac{.0001}{2} = .00005
\]
Since α = .05 exceeds this p-value, we have sufficient evidence to
reject H0. Thus, the collector can conclude that the mean auction price of a
clock increases as the number of bidders increases, when age is held constant.
(b) A 95% confidence interval for β1 is (from the box):
\[
\hat{\beta}_1 \pm (t_{\alpha/2})\, s_{\hat{\beta}_1} = \hat{\beta}_1 \pm (t_{.025})\, s_{\hat{\beta}_1}
\]
Substituting β̂1 = 12.74, s_β̂1 = .905 (both obtained from the SAS printout,
Figure 4.6) and t.025 = 2.045 (based on n − (k + 1) = 29 df, from Table C.2)
into the equation, we obtain
12.74 ± (2.045)(.905) = 12.74 ± 1.85
or (10.89, 14.59). This interval is also shown (highlighted) on the SAS printout.
Thus, we are 95% confident that β1 falls between 10.89 and 14.59. Since β1
is the slope of the line relating auction price (y) to age of the clock (x1),
we conclude that the mean auction price increases by between $10.89 and $14.59
for every 1-year increase in age, holding number of bidders (x2) constant.
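The arithmetic in both parts of this example can be reproduced in a few lines. The sketch below uses only the estimates, standard errors, and tabled t-value quoted in the solution; the function names are hypothetical.

```python
def t_statistic(estimate, std_error):
    """t statistic for H0: beta_i = 0."""
    return estimate / std_error


def confidence_interval(estimate, std_error, t_crit):
    """Interval estimate +/- t_crit * std_error for a beta parameter."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


# Part (a): beta2-hat = 85.953, standard error = 8.729
t_beta2 = t_statistic(85.953, 8.729)

# Part (b): beta1-hat = 12.74, standard error = .905, t.025 = 2.045 (29 df)
low, high = confidence_interval(12.74, 0.905, 2.045)

print(f"t = {t_beta2:.2f}")                          # t = 9.85
print(f"95% CI for beta1: ({low:.2f}, {high:.2f})")  # (10.89, 14.59)
```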
After we have determined that the overall model is useful for predicting y
using the F -test (Section 4.6), we may elect to conduct one or more t-tests on
the individual β parameters (as in Example 4.4). However, the test (or tests) to
be conducted should be decided a priori, that is, prior to ﬁtting the model. Also,
we should limit the number of t-tests conducted to avoid the potential problem
of making too many Type I errors. Generally, the regression analyst will conduct
t-tests only on the ‘‘most important’’ β’s. We provide insight in identifying the most
important β’s in a linear model in the next several sections.
Recommendation for Checking the Utility of a Multiple Regression
Model
1. First, conduct a test of overall model adequacy using the F -test, that is, test
H0 : β1 = β2 = · · · = βk = 0
If the model is deemed adequate (i.e., if you reject H0 ), then proceed to
step 2. Otherwise, you should hypothesize and ﬁt another model. The new
model may include more independent variables or higher-order terms.
2. Conduct t-tests on those β parameters in which you are particularly interested
(i.e., the ‘‘most important’’ β’s). These usually involve only the β’s associated
with higher-order terms (x², x1x2, etc.). However, it is a safe practice to limit
the number of β’s that are tested. Conducting a series of t-tests leads to a
high overall Type I error rate α.
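To give a feel for how quickly the overall Type I error rate can grow, the sketch below computes the chance of at least one false rejection across c tests, each run at level α. It rests on an assumption that is not made in the text: that the tests are independent. t-tests on coefficients from the same fitted model are generally correlated, so the first figure is only an idealized illustration. The Bonferroni bound cα holds without that assumption. Function names are hypothetical.

```python
def familywise_error_if_independent(alpha, num_tests):
    """P(at least one Type I error) across num_tests tests.

    Assumption: the tests are independent, which is an idealization here.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_bound(alpha, num_tests):
    """Upper bound on the overall Type I error rate; no independence needed."""
    return min(1.0, alpha * num_tests)


alpha = 0.05
for c in (1, 3, 5, 10):
    approx = familywise_error_if_independent(alpha, c)
    bound = bonferroni_bound(alpha, c)
    print(f"{c:2d} tests: about {approx:.3f} if independent, at most {bound:.2f}")
```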
We conclude this section with a ﬁnal caution about conducting t-tests on
individual β parameters in a model.
Caution
Extreme care should be exercised when conducting t-tests on the individual β
parameters in a ﬁrst-order linear model for the purpose of determining which
independent variables are useful for predicting y and which are not. If you fail
to reject H0 : βi = 0, several conclusions are possible:
1. There is no relationship between y and xi .
2. A straight-line relationship between y and xi exists (holding the other x’s
in the model fixed), but a Type II error occurred.
3. A relationship between y and xi (holding the other x’s in the model
fixed) exists, but is more complex than a straight-line relationship (e.g.,
a curvilinear relationship may be appropriate).
The most you can say about a β parameter test is that there is either
sufficient (if you reject H0 : βi = 0) or insufficient (if you do not reject
H0 : βi = 0) evidence of a linear (straight-line) relationship between y and xi .
4.8 Multiple Coefficients of Determination: R² and Ra²
Recall from Chapter 3 that the coefficient of determination, r², is a measure of how
well a straight-line model fits a data set. To measure how well a multiple regression
model fits a set of data, we compute the multiple regression equivalent of r², called
the multiple coefficient of determination and denoted by the symbol R².
Definition 4.1 The multiple coefficient of determination, R², is defined as
\[
R^2 = 1 - \frac{SSE}{SS_{yy}}, \qquad 0 \le R^2 \le 1
\]
where SSE = Σ(yi − ŷi)², SSyy = Σ(yi − ȳ)², and ŷi is the predicted value of
yi for the multiple regression model.
Just as for the simple linear model, R² represents the fraction of the sample
variation of the y-values (measured by SSyy) that is explained by the least squares
regression model. Thus, R² = 0 implies a complete lack of fit of the model to the
data, and R² = 1 implies a perfect fit, with the model passing through every data
point. In general, the closer the value of R² is to 1, the better the model fits the data.
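The definition translates directly into code. In the sketch below the data are hypothetical (they are not the clock data): the y-values are invented, and the fitted values are the least squares predictions from a straight line in x = 1, ..., 5. The two extreme cases, R² = 1 and R² = 0, are also shown.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE/SSyy for observed values y and predicted values y_hat."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / ss_yy


# Hypothetical data: y observed at x = 1..5, with least squares fitted values
y = [10.0, 12.0, 15.0, 19.0, 24.0]
y_hat = [9.0, 12.5, 16.0, 19.5, 23.0]   # from the fitted line 5.5 + 3.5x

print(f"R^2 = {r_squared(y, y_hat):.3f}")               # SSE = 3.5, SSyy = 126
print(f"Perfect fit: {r_squared(y, y):.1f}")            # predictions equal y
y_bar = sum(y) / len(y)
print(f"No fit: {r_squared(y, [y_bar] * len(y)):.1f}")  # predict the mean only
```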
To illustrate, consider the first-order model for the grandfather clock auction
price presented in Examples 4.1–4.4. A portion of the SPSS printout of the analysis
is shown in Figure 4.7. The value R² = .892 is highlighted on the printout. This
relatively high value of R² implies that using the independent variables age and
number of bidders in a first-order model explains 89.2% of the total sample variation
(measured by SSyy) in auction price y. Thus, R² is a sample statistic that tells how
well the model fits the data and thereby represents a measure of the usefulness of
the entire model.
Figure 4.7 A portion of the SPSS regression output for the auction price model
A large value of R² computed from the sample data does not necessarily mean
that the model provides a good fit to all of the data points in the population. For
example, a first-order linear model that contains three parameters will provide a
perfect fit to a sample of three data points and R² will equal 1. Likewise, you will
always obtain a perfect fit (R² = 1) to a set of n data points if the model contains
exactly n parameters. Consequently, if you want to use the value of R² as a measure
of how useful the model will be for predicting y, it should be based on a sample
that contains substantially more data points than the number of parameters in the
model.
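The smallest instance of this effect is a straight line (two parameters) fit to two data points. The sketch below uses hypothetical points and hypothetical function names; it shows the simplest case of the general statement, not the three-parameter example mentioned above.

```python
def fit_line_through_two_points(p1, p2):
    """Intercept and slope of the line passing through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope


def r_squared(y, y_hat):
    """R^2 = 1 - SSE/SSyy."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / ss_yy


# Hypothetical sample of n = 2 points and a model with 2 parameters
points = [(1.0, 3.0), (4.0, 9.0)]
b0, b1 = fit_line_through_two_points(*points)
y = [py for _, py in points]
y_hat = [b0 + b1 * px for px, _ in points]

# The line reproduces both points, so SSE = 0 and R^2 = 1 regardless of
# whether x has any real relationship with y in the population.
print(f"R^2 = {r_squared(y, y_hat):.1f}")
```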
Caution
In a multiple regression analysis, use the value of R² as a measure of how useful
a linear model will be for predicting y only if the sample contains substantially
more data points than the number of β parameters in the model.
As an alternative to using R² as a measure of model adequacy, the adjusted
multiple coefficient of determination, denoted Ra², is often reported. The formula
for Ra² is shown in the box.
Definition 4.2 The adjusted multiple coefficient of determination is given by
\[
R_a^2 = 1 - \left[\frac{n - 1}{n - (k + 1)}\right]\left(\frac{SSE}{SS_{yy}}\right)
      = 1 - \left[\frac{n - 1}{n - (k + 1)}\right](1 - R^2)
\]
Note: Ra² ≤ R² and, for poor-fitting models, Ra² may be negative.
R² and Ra² have similar interpretations. However, unlike R², Ra² takes into account
(‘‘adjusts’’ for) both the sample size n and the number of β parameters in the model.
Ra² can never exceed R² and, more importantly, cannot be ‘‘forced’’ to 1 by
simply adding more and more independent variables to the model. Consequently,
analysts prefer the more conservative Ra² when choosing a measure of model
adequacy. The value of Ra² is also highlighted in Figure 4.7. Note that Ra² = .885, a
value only slightly smaller than R².
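The value Ra² = .885 can be recovered from the second form of the formula in Definition 4.2, using R² = .892, n = 32, and k = 2 for the clock model. The sketch below also includes a hypothetical poor-fitting model (R² = .02, n = 10, k = 3), which is not from the text, to show how the adjusted value can turn negative. The function name is hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 from R^2, sample size n, and k model terms."""
    return 1.0 - ((n - 1) / (n - (k + 1))) * (1.0 - r2)


# Grandfather clock model: R^2 = .892, n = 32, k = 2
print(f"{adjusted_r2(0.892, 32, 2):.3f}")   # 0.885

# Hypothetical poor-fitting model: small R^2, few observations, several terms
print(f"{adjusted_r2(0.02, 10, 3):.2f}")    # -0.47
```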
Despite their utility, R² and Ra² are only sample statistics. Consequently, it is
dangerous to judge the usefulness of the model based solely on these values. A
prudent analyst will use the analysis-of-variance F-test for testing the global utility
of the multiple regression model. Once the model has been deemed ‘‘statistically’’
useful with the F-test, the more conservative value of Ra² is used to describe the
proportion of variation in y explained by the model.
4.8 Exercises
4.1 Degrees of freedom. How is the number of
degrees of freedom available for estimating σ², the
variance of ε, related to the number of independent
variables in a regression model?
4.2 Accounting and Machiavellianism. Refer to the
Behavioral Research in Accounting (January 2008)
study of Machiavellian traits (e.g., manipulation,
cunning, duplicity, deception, and bad faith) in
accountants, Exercise 1.47 (p. 41). Recall that a
Machiavellian (‘‘Mach’’) rating score was determined for each in a sample of accounting alumni
of a large southwestern university. For one portion
of the study, the researcher modeled an accountant’s Mach score (y) as a function of age (x1 ),
gender (x2 ), education (x3 ), and income (x4 ). Data
on n = 198 accountants yielded the results shown
in the table.
INDEPENDENT VARIABLE    t-VALUE FOR H0: βi = 0    p-VALUE
Age (x1)                 0.10                     > .10
Gender (x2)             −0.55                     > .10
Education (x3)           1.95                     < .01
Income (x4)              0.52                     > .10
Overall model: R² = .13, F = 4.74 (p-value < .01)
(a) Write the equation of the hypothesized model
relating y to x1 , x2 , x3 , and x4 .
(b) Conduct a test of overall model utility. Use
α = .05.
(c) Interpret the coefficient of determination, R².
(d) Is there sufﬁcient evidence (at α = .05) to say
that income is a statistically useful predictor of
Mach score?
4.3 Study of adolescents with ADHD. Children with
attention-deﬁcit/hyperactivity disorder (ADHD)
were monitored to evaluate their risk for substance
(e.g., alcohol, tobacco, illegal drug) use (Journal
of Abnormal Psychology, August 2003). The following data were collected on 142 adolescents
diagnosed with ADHD:
y = frequency of marijuana use the past 6 months
x1 = severity of inattention (5-point scale)
x2 = severity of impulsivity–hyperactivity
(5-point scale)
x3 = level of oppositional–deﬁant and conduct
disorder (5-point scale)
(a) Write the equation of a ﬁrst-order model for
E(y).
(b) The coefﬁcient of determination for the model
is R² = .08. Interpret this value.
(c) The global F -test for the model yielded a
p-value less than .01. Interpret this result.
(d) The t-test for H0 : β1 = 0 resulted in a p-value
less than .01. Interpret this result.
(e) The t-test for H0 : β2 = 0 resulted in a p-value
greater than .05. Interpret this result.
(f) The t-test for H0 : β3 = 0 resulted in a p-value
greater than .05. Interpret this result.
4.4 Characteristics of lead users. During new product development, companies often involve ‘‘lead
users’’ (i.e., creative individuals who are on the
leading edge of an important market trend).
Creativity and Innovation Management (February
2008) published an article on identifying the social
network characteristics of lead users of children’s
computer games. Data were collected for n = 326
children and the following variables measured:
lead-user rating (y, measured on a 5-point scale),
gender (x1 = 1 if female, 0 if male), age (x2 , years),
degree of centrality (x3 , measured as the number of direct ties to other peers in the network),
and betweenness centrality (x4 , measured as the
number of shortest paths between peers). A first-order model for y was fit to the data, yielding the
following least squares prediction equation:
yˆ = 3.58 + .01x1 − .06x2 − .01x3 + .42x4
(a) Give two properties of the errors of prediction that result from using the method of least
squares to obtain the parameter estimates.
(b) Give a practical interpretation of the estimate of
β4 in the model.
(c) A test of H0 : β4 = 0 resulted in a two-tailed
p-value of .002. Make the appropriate conclusion at α = .05.
4.5 Runs scored in baseball. In Chance (Fall 2000),
statistician Scott Berry built a multiple regression
model for predicting total number of runs scored
by a Major League Baseball team during a season.
Using data on all teams over a 9-year period (a
sample of n = 234), the results in the next table
(p. 184) were obtained.
(a) Write the least squares prediction equation for
y = total number of runs scored by a team in
a season.
(b) Conduct a test of H0 : β7 = 0 against Ha : β7 < 0
at α = .05. Interpret the results.
(c) Form a 95% conﬁdence interval for β5 . Interpret the results.
(d) Predict the number of runs scored by your
favorite Major League Baseball team last
year. How close is the predicted value to the
actual number of runs scored by your team?
(Note: You can ﬁnd data on your favorite team
on the Internet at www.mlb.com.)
INDEPENDENT VARIABLE     β ESTIMATE    STANDARD ERROR
Intercept                  3.70           15.00
Walks (x1)                  .34             .02
Singles (x2)                .49             .03
Doubles (x3)                .72             .05
Triples (x4)               1.14             .19
Home Runs (x5)             1.51             .05
Stolen Bases (x6)           .26             .05
Caught Stealing (x7)       −.14             .14
Strikeouts (x8)            −.10             .01
Outs (x9)                  −.10             .01
Source: Berry, S. M. ‘‘A statistician reads the sports pages:
Modeling offensive ability in baseball,’’ Chance, Vol. 13,
No. 4, Fall 2000 (Table 2).
4.6 Earnings of Mexican street vendors. Detailed
interviews were conducted with over 1,000 street
vendors in the city of Puebla, Mexico, in order
to study the factors inﬂuencing vendors’ incomes
(World Development, February 1998). Vendors
were deﬁned as individuals working in the street,
and included vendors with carts and stands on
wheels and excluded beggars, drug dealers, and
prostitutes. The researchers collected data on gender, age, hours worked per day, annual earnings,
and education level. A subset of these data appears
in the accompanying table.
(a) Write a ﬁrst-order model for mean annual
earnings, E(y), as a function of age (x1 ) and
hours worked (x2 ).
SAS output for Exercise 4.6
STREETVEN
VENDOR NUMBER    ANNUAL EARNINGS y    AGE x1    HOURS WORKED PER DAY x2
 21              $2841                29        12
 53               1876                21         8
 60               2934                62        10
184               1552                18        10
263               3065                40        11
281               3670                50        11
354               2005                65         5
401               3215                44         8
515               1930                17         8
633               2010                70         6
677               3111                20         9
710               2882                29         9
800               1683                15         5
914               1817                14         7
997               4066                33        12
Source: Adapted from Smith, P. A., and Metzger, M. R.
‘‘The return to education: Street vendors in Mexico,’’
World Development, Vol. 26, No. 2, Feb. 1998, pp.
289–296.
(b) The model was ﬁt to the data using SAS. Find
the least squares prediction equation on the
printout shown below.
(c) Interpret the estimated β coefﬁcients in your
model.
(d) Conduct a test of the global utility of the model
(at α = .01). Interpret the result.
(e) Find and interpret the value of Ra².
(f) Find and interpret s, the estimated standard
deviation of the error term.
(g) Is age (x1 ) a statistically useful predictor of
annual earnings? Test using α = .01.
(h) Find a 95% conﬁdence interval for β2 . Interpret the interval in the words of the problem.
4.7 Urban population estimation using satellite
images. Can the population of an urban area be
estimated without taking a census? In Geographical Analysis (January 2007) geography professors
at the University of Wisconsin–Milwaukee and
Ohio State University demonstrated the use of
satellite image maps for estimating urban population. A portion of Columbus, Ohio, was partitioned
into n = 125 census block groups and satellite
imagery was obtained. For each census block,
the following variables were measured: population
density (y), proportion of block with low-density
residential areas (x1 ), and proportion of block with
high-density residential areas (x2 ). A ﬁrst-order
model for y was ﬁt to the data with the following
results:
ŷ = −.0304 + 2.006x1 + 5.006x2, R² = .686
(a) Give a practical interpretation of each
β-estimate in the model.
(b) Give a practical interpretation of the coefficient of determination, R².
(c) State H0 and Ha for a test of overall model
adequacy.
(d) Refer to part c. Compute the value of the test
statistic.
(e) Refer to parts c and d. Make the appropriate
conclusion at α = .01.
4.8 Novelty of a vacation destination. Many tourists
choose a vacation destination based on the newness or uniqueness (i.e., the novelty) of the
itinerary. Texas A&M University professor J. Petrick investigated the relationship between novelty
and vacationing golfers’ demographics (Annals
of Tourism Research, Vol. 29, 2002). Data were
obtained from a mail survey of 393 golf vacationers to a large coastal resort in southeastern United States. Several measures of novelty
level (on a numerical scale) were obtained for
each vacationer, including ‘‘change from routine,’’
‘‘thrill,’’ ‘‘boredom-alleviation,’’ and ‘‘surprise.’’
The researcher employed four independent variables in a regression model to predict each of the
novelty measures. The independent variables were
x1 = number of rounds of golf per year, x2 = total
number of golf vacations taken, x3 = number of
years played golf, and x4 = average golf score.
(a) Give the hypothesized equation of a ﬁrst-order
model for y = change from routine.
(b) A test of H0 : β3 = 0 versus Ha : β3 < 0 yielded a
p-value of .005. Interpret this result if α = .01.
(c) The estimate of β3 was found to be negative.
Based on this result (and the result of part
b), the researcher concluded that ‘‘those who
have played golf for more years are less apt
to seek change from their normal routine in
their golf vacations.’’ Do you agree with this
statement? Explain.
(d) The regression results for the three other
dependent novelty measures are summarized
in the table below. Give the null hypothesis for
testing the overall adequacy of each ﬁrst-order
regression model.
DEPENDENT VARIABLE      F-VALUE    p-VALUE    R²
Thrill                  5.56       < .001     .055
Boredom-alleviation     3.02       .018       .030
Surprise                3.33       .011       .023
Source: Reprinted from Annals of Tourism Research,
Vol. 29, Issue 2, J. F. Petrick, ‘‘An examination of golf
vacationers’ novelty,” Copyright © 2002, with permission
from Elsevier.
(e) Give the rejection region for the test, part d,
using α = .01.
(f) Use the test statistics reported in the table and
the rejection region from part e to conduct
the test for each of the dependent measures of
novelty.
(g) Verify that the p-values in the table support
your conclusions in part f.
(h) Interpret the values of R² reported in the table.
4.9 Highway crash data analysis. Researchers at
Montana State University have written a tutorial
on an empirical method for analyzing before and
after highway crash data (Montana Department
of Transportation, Research Report, May 2004).
The initial step in the methodology is to develop
a Safety Performance Function (SPF)—a mathematical model that estimates crash occurrence for
a given roadway segment. Using data collected
for over 100 roadway segments, the researchers
ﬁt the model, E(y) = β0 + β1 x1 + β2 x2 , where
y = number of crashes per 3 years, x1 = roadway
length (miles), and x2 = AADT (average annual
daily trafﬁc) (number of vehicles). The results are
shown in the following tables.
Interstate Highways
VARIABLE        PARAMETER ESTIMATE    STANDARD ERROR    t-VALUE
Intercept       1.81231               .50568            3.58
Length (x1)      .10875               .03166            3.44
AADT (x2)        .00017               .00003            5.19

Non-Interstate Highways
VARIABLE        PARAMETER ESTIMATE    STANDARD ERROR    t-VALUE
Intercept       1.20785               .28075            4.30
Length (x1)      .06343               .01809            3.51
AADT (x2)        .00056               .00012            4.86
(a) Give the least squares prediction equation for
the interstate highway model.
(b) Give practical interpretations of the β estimates, part a.
(c) Refer to part a. Find a 99% conﬁdence interval
for β1 and interpret the result.
(d) Refer to part a. Find a 99% conﬁdence interval
for β2 and interpret the result.
(e) Repeat parts a–d for the non-interstate highway model.
4.10 Snow geese feeding trial. Refer to the Journal
of Applied Ecology (Vol. 32, 1995) study of the
feeding habits of baby snow geese, Exercise 3.46
(p. 127). The data on gosling weight change, digestion efﬁciency, acid-detergent ﬁber (all measured
as percentages) and diet (plants or duck chow) for
42 feeding trials are saved in the SNOWGEESE
ﬁle. (The table shows selected observations.) The
botanists were interested in predicting weight
change (y) as a function of the other variables. The
ﬁrst-order model E(y) = β0 + β1 x1 + β2 x2 , where
x1 is digestion efﬁciency and x2 is acid-detergent
ﬁber, was ﬁt to the data. The MINITAB printout
is given below.
SNOWGEESE (First and last five trials)
FEEDING                 WEIGHT        DIGESTION         ACID-DETERGENT
TRIAL      DIET         CHANGE (%)    EFFICIENCY (%)    FIBER (%)
 1         Plants       −6             0                28.5
 2         Plants       −5             2.5              27.5
 3         Plants       −4.5           5                27.5
 4         Plants        0             0                32.5
 5         Plants        2             0                32
38         Duck Chow     9            59                 8.5
39         Duck Chow    12            52.5               8
40         Duck Chow     8.5          75                 6
41         Duck Chow    10.5          72.5               6.5
42         Duck Chow    14            69                 7
Source: Gadallah, F. L., and Jefferies, R. L. ‘‘Forage quality in brood rearing areas of the lesser snow goose and the
growth of captive goslings,’’ Journal of Applied Ecology,
Vol. 32, No. 2, 1995, pp. 281–282 (adapted from Figures 2
and 3).
(a) Find the least squares prediction equation for
weight change, y.
(b) Interpret the β-estimates in the equation,
part a.
(c) Conduct the F-test for overall model adequacy
using α = .01.
(d) Find and interpret the values of R² and Ra².
Which is the preferred measure of model fit?
(e) Conduct a test to determine if digestion efficiency, x1, is a useful linear predictor of weight
change. Use α = .01.
(f) Form a 99% confidence interval for β2. Interpret the result.
MINITAB output for Exercise 4.10
4.11 Deep space survey of quasars. A quasar is a
distant celestial object (at least 4 billion light-years away) that provides a powerful source of
radio energy. The Astronomical Journal (July
1995) reported on a study of 90 quasars detected
by a deep space survey. The survey enabled
astronomers to measure several different quantitative characteristics of each quasar, including
redshift range, line flux (erg/cm² · s), line luminosity (erg/s), AB1450 magnitude, absolute magnitude,
and rest frame equivalent width. The data for a
sample of 25 large (redshift) quasars are listed in the
table on p. 187.
(a) Hypothesize a first-order model for equivalent
width, y, as a function of the first four variables
in the table.