
# 4 Fitting the Model: The Method of Least Squares


must be solved to find the (k + 1) estimated coefficients β̂0, β̂1, ..., β̂k are often
difficult (tedious and time-consuming) to solve with a calculator. Consequently, we
resort to the use of statistical computer software and present output from SAS,
SPSS, and MINITAB in examples and exercises.
Example 4.1

A collector of antique grandfather clocks sold at auction believes that the price
received for the clocks depends on both the age of the clocks and the number of
bidders at the auction. Thus, he hypothesizes the first-order model
y = β0 + β1 x1 + β2 x2 + ε
where
y = Auction price (dollars)
x1 = Age of clock (years)
x2 = Number of bidders
A sample of 32 auction prices of grandfather clocks, along with their age and the
number of bidders, is given in Table 4.1.
(a) Use scattergrams to plot the sample data. Interpret the plots.
(b) Use the method of least squares to estimate the unknown parameters β0 , β1 ,
and β2 of the model.
(c) Find the value of SSE that is minimized by the least squares method.

GFCLOCKS

Table 4.1 Auction price data

| Age, x1 | Number of Bidders, x2 | Auction Price, y | Age, x1 | Number of Bidders, x2 | Auction Price, y |
|---|---|---|---|---|---|
| 127 | 13 | $1,235 | 170 | 14 | $2,131 |
| 115 | 12 | 1,080 | 182 | 8 | 1,550 |
| 127 | 7 | 845 | 162 | 11 | 1,884 |
| 150 | 9 | 1,522 | 184 | 10 | 2,041 |
| 156 | 6 | 1,047 | 143 | 6 | 845 |
| 182 | 11 | 1,979 | 159 | 9 | 1,483 |
| 156 | 12 | 1,822 | 108 | 14 | 1,055 |
| 132 | 10 | 1,253 | 175 | 8 | 1,545 |
| 137 | 9 | 1,297 | 108 | 6 | 729 |
| 113 | 9 | 946 | 179 | 9 | 1,792 |
| 137 | 15 | 1,713 | 111 | 15 | 1,175 |
| 117 | 11 | 1,024 | 187 | 8 | 1,593 |
| 137 | 8 | 1,147 | 111 | 7 | 785 |
| 153 | 6 | 1,092 | 115 | 7 | 744 |
| 117 | 13 | 1,152 | 194 | 5 | 1,356 |
| 126 | 10 | 1,336 | 168 | 7 | 1,262 |


Solution
(a) MINITAB side-by-side scatterplots for examining the bivariate relationships
between y and x1 , and between y and x2 , are shown in Figure 4.2. Of the
two variables, age (x1 ) appears to have the stronger linear relationship with
auction price (y).

Figure 4.2 MINITAB side-by-side scatterplots for the data of Table 4.1

(b) The model hypothesized is fit to the data in Table 4.1 with SAS. A portion
of the printout is reproduced in Figure 4.3. The least squares estimates of
the β parameters (highlighted) are β̂0 = −1,339, β̂1 = 12.74, and β̂2 = 85.95.
Therefore, the equation that minimizes SSE for this data set (i.e., the least
squares prediction equation) is

ŷ = −1,339 + 12.74x1 + 85.95x2

Figure 4.3 SAS regression output for the auction price model, Example 4.1

(c) The minimum value of the sum of the squared errors, also highlighted in
Figure 4.3, is SSE = 516,727.
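As a check on the SAS output, the least squares fit can be reproduced from the Table 4.1 data with NumPy. This is only a sketch (the data vectors below are transcribed from the table, and NumPy's solver stands in for the SAS/SPSS/MINITAB output the text relies on):

```python
import numpy as np

# Table 4.1 data: age x1, number of bidders x2, auction price y (32 clocks)
age = [127, 170, 115, 182, 127, 162, 150, 184, 156, 143, 182, 159, 156, 108,
       132, 175, 137, 108, 113, 179, 137, 111, 117, 187, 137, 111, 153, 115,
       117, 194, 126, 168]
bidders = [13, 14, 12, 8, 7, 11, 9, 10, 6, 6, 11, 9, 12, 14, 10, 8,
           9, 6, 9, 9, 15, 15, 11, 8, 8, 7, 6, 7, 13, 5, 10, 7]
price = [1235, 2131, 1080, 1550, 845, 1884, 1522, 2041, 1047, 845, 1979, 1483,
         1822, 1055, 1253, 1545, 1297, 729, 946, 1792, 1713, 1175, 1024, 1593,
         1147, 785, 1092, 744, 1152, 1356, 1336, 1262]

# Design matrix with a column of 1's for the intercept beta0
X = np.column_stack([np.ones(len(age)), age, bidders])
y = np.array(price, dtype=float)

# Least squares estimates: minimize sum of squared residuals
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sse = float(resid @ resid)

print(beta)  # approx [-1339, 12.74, 85.95]
print(sse)   # approx 516,727
```

The estimates agree (to rounding) with the highlighted values on the SAS printout in Figure 4.3.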
Example 4.2

Problem
Refer to the ﬁrst-order model for auction price (y) considered in Example 4.1.
Interpret the estimates of the β parameters in the model.

Solution
The least squares prediction equation, as given in Example 4.1, is ŷ = −1,339 +
12.74x1 + 85.95x2. We know that with first-order models, β1 represents the slope of
the line relating y to x1 for fixed x2. That is, β1 measures the change in E(y) for
every one-unit increase in x1 when the other independent variable in the model is
held fixed. A similar statement can be made about β2: β2 measures the change in
E(y) for every one-unit increase in x2 when the other x in the model is held fixed.
Consequently, we obtain the following interpretations:

β̂1 = 12.74: We estimate the mean auction price E(y) of an antique clock to
increase $12.74 for every 1-year increase in age (x1) when the number of bidders
(x2) is held fixed.

β̂2 = 85.95: We estimate the mean auction price E(y) of an antique clock to
increase $85.95 for every one-bidder increase in the number of bidders (x2)
when age (x1) is held fixed.

The value β̂0 = −1,339 does not have a meaningful interpretation in this example.
To see this, note that ŷ = β̂0 when x1 = x2 = 0. Thus, β̂0 = −1,339 represents the
estimated mean auction price when the values of all the independent variables are
set equal to 0. Because an antique clock with these characteristics—an age of 0
years and 0 bidders on the clock—is not practical, the value of β̂0 has no meaningful
interpretation. In general, β̂0 will not have a practical interpretation unless it makes
sense to set the values of the x's simultaneously equal to 0.
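The partial-slope interpretations can be verified numerically from the fitted equation: holding one x fixed and increasing the other by one unit changes ŷ by exactly the corresponding coefficient. A small sketch (the function name `y_hat` is just for illustration):

```python
# Prediction equation from Example 4.1: y-hat = -1339 + 12.74*x1 + 85.95*x2
def y_hat(age, bidders):
    return -1339 + 12.74 * age + 85.95 * bidders

# One extra year of age, bidders held fixed at 10: change equals beta1-hat
diff_age = y_hat(151, 10) - y_hat(150, 10)

# One extra bidder, age held fixed at 150: change equals beta2-hat
diff_bid = y_hat(150, 11) - y_hat(150, 10)

print(round(diff_age, 2), round(diff_bid, 2))  # 12.74 85.95
```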

4.5 Estimation of σ², the Variance of ε

Recall that σ² is the variance of the random error ε. As such, σ² is an important
measure of model utility. If σ² = 0, all the random errors will equal 0 and the
prediction equation ŷ will be identical to E(y); that is, E(y) will be estimated
without error. In contrast, a large value of σ² implies large (absolute) values of ε
and larger deviations between the prediction equation ŷ and the mean value E(y).
Consequently, the larger the value of σ², the greater will be the error in estimating
the model parameters β0, β1, ..., βk and the error in predicting a value of y for
a specific set of values of x1, x2, ..., xk. Thus, σ² plays a major role in making
inferences about β0, β1, ..., βk, in estimating E(y), and in predicting y for specific
values of x1, x2, ..., xk.

Since the variance σ² of the random error ε will rarely be known, we must
use the results of the regression analysis to estimate its value. Recall that σ² is the
variance of the probability distribution of the random error ε for a given set of
values for x1, x2, ..., xk; hence, it is the mean value of the squares of the deviations
of the y-values (for given values of x1, x2, ..., xk) about the mean value E(y).∗ Since
∗ Because y = E(y) + ε, the error ε is equal to the deviation y − E(y). Also, by definition, the variance of a random
variable is the expected value of the square of the deviation of the random variable from its mean. According
to our model, E(ε) = 0. Therefore, σ² = E(ε²).

the predicted value ŷᵢ estimates E(y) for each of the data points, it seems natural
to use

SSE = Σ(yᵢ − ŷᵢ)²

to construct an estimator of σ².

Estimator of σ² for Multiple Regression Model with k Independent Variables

s² = MSE = SSE / (n − Number of estimated β parameters) = SSE / [n − (k + 1)]
For example, in the first-order model of Example 4.1, we found that SSE =
516,727. We now want to use this quantity to estimate the variance of ε. Recall
that the estimator for the straight-line model is s² = SSE/(n − 2), and note that
the denominator is (n − Number of estimated β parameters), which is (n − 2) in the
straight-line model. Since we must estimate three parameters, β0, β1, and β2, in the
first-order model in Example 4.1, the estimator of σ² is

s² = SSE / (n − 3)
The numerical estimate for this example is

s² = SSE / (32 − 3) = 516,727 / 29 = 17,818

In many computer printouts and textbooks, s² is called the mean square for error
(MSE). This estimate of σ² is highlighted in the SAS printout in Figure 4.3.

The units of the estimated variance are squared units of the dependent variable
y. Since the dependent variable y in this example is auction price in dollars, the
units of s² are (dollars)². This makes meaningful interpretation of s² difficult, so we
use the standard deviation s to provide a more meaningful measure of variability.
In this example,
In this example,
s = √17,818 = 133.5

which is highlighted on the SAS printout in Figure 4.3 (next to Root MSE). One
useful interpretation of the estimated standard deviation s is that the interval ±2s
will provide a rough approximation to the accuracy with which the model will predict
future values of y for given values of x. Thus, in Example 4.1, we expect the model
to provide predictions of auction price to within about ±2s = ±2(133.5) = ±267
dollars.†
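The computation of s and the rough ±2s prediction bound above can be sketched as:

```python
import math

# MSE from Example 4.1: SSE = 516,727 with n - (k + 1) = 29 degrees of freedom
s2 = 516727 / 29

# Estimated standard deviation of the random error epsilon
s = math.sqrt(s2)
print(round(s, 1))   # 133.5, the Root MSE on the SAS printout

# Rough approximation to prediction accuracy: about +/- 2s dollars
print(round(2 * s))  # 267
```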
For the general multiple regression model
y = β0 + β1 x1 + β2 x2 + · · · + βk xk + ε
we must estimate the (k + 1) parameters β0 , β1 , β2 , . . . , βk . Thus, the estimator of
σ 2 is SSE divided by the quantity (n − Number of estimated β parameters).
† The ±2s approximation will improve as the sample size is increased. We provide more precise methodology
for the construction of prediction intervals in Section 4.9.