4 Fitting the Model: The Method of Least Squares

The normal equations that must be solved to find the (k + 1) estimated coefficients βˆ0, βˆ1, . . . , βˆk are often difficult (tedious and time-consuming) to solve with a calculator. Consequently, we resort to the use of statistical computer software and present output from SAS, SPSS, and MINITAB in examples and exercises.

Example 4.1

A collector of antique grandfather clocks sold at auction believes that the price received for the clocks depends on both the age of the clocks and the number of bidders at the auction. Thus, he hypothesizes the first-order model

y = β0 + β1x1 + β2x2 + ε

where

y = Auction price (dollars)
x1 = Age of clock (years)
x2 = Number of bidders

A sample of 32 auction prices of grandfather clocks, along with their age and the number of bidders, is given in Table 4.1.

(a) Use scattergrams to plot the sample data. Interpret the plots.
(b) Use the method of least squares to estimate the unknown parameters β0, β1, and β2 of the model.
(c) Find the value of SSE that is minimized by the least squares method.

GFCLOCKS

Table 4.1 Auction price data

Age, x1   Bidders, x2   Price, y     Age, x1   Bidders, x2   Price, y
127       13            $1,235       170       14            $2,131
115       12             1,080       182        8             1,550
127        7               845       162       11             1,884
150        9             1,522       184       10             2,041
156        6             1,047       143        6               845
182       11             1,979       159        9             1,483
156       12             1,822       108       14             1,055
132       10             1,253       175        8             1,545
137        9             1,297       108        6               729
113        9               946       179        9             1,792
137       15             1,713       111       15             1,175
117       11             1,024       187        8             1,593
137        8             1,147       111        7               785
153        6             1,092       115        7               744
117       13             1,152       194        5             1,356
126       10             1,336       168        7             1,262


Solution

(a) MINITAB side-by-side scatterplots for examining the bivariate relationships between y and x1, and between y and x2, are shown in Figure 4.2. Of the two variables, age (x1) appears to have the stronger linear relationship with auction price (y).

[Figure 4.2 MINITAB side-by-side scatterplots for the data of Table 4.1]

(b) The model hypothesized is fit to the data in Table 4.1 with SAS. A portion of the printout is reproduced in Figure 4.3. The least squares estimates of the β parameters (highlighted) are βˆ0 = −1,339, βˆ1 = 12.74, and βˆ2 = 85.95. Therefore, the equation that minimizes SSE for this data set (i.e., the least squares prediction equation) is

yˆ = −1,339 + 12.74x1 + 85.95x2

[Figure 4.3 SAS regression output for the auction price model, Example 4.1]


(c) The minimum value of the sum of the squared errors, also highlighted in Figure 4.3, is SSE = 516,727.
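The fit in parts (b) and (c) can be reproduced numerically. A sketch using NumPy's least squares solver on the 32 observations of Table 4.1 (this is not part of the original text, which uses SAS):

```python
# Reproducing the least squares fit of Example 4.1 with NumPy.
# Data are the 32 observations from Table 4.1 (age x1, bidders x2, price y).
import numpy as np

age = [127, 115, 127, 150, 156, 182, 156, 132, 137, 113, 137, 117, 137, 153, 117, 126,
       170, 182, 162, 184, 143, 159, 108, 175, 108, 179, 111, 187, 111, 115, 194, 168]
bidders = [13, 12, 7, 9, 6, 11, 12, 10, 9, 9, 15, 11, 8, 6, 13, 10,
           14, 8, 11, 10, 6, 9, 14, 8, 6, 9, 15, 8, 7, 7, 5, 7]
price = [1235, 1080, 845, 1522, 1047, 1979, 1822, 1253, 1297, 946, 1713, 1024,
         1147, 1092, 1152, 1336, 2131, 1550, 1884, 2041, 845, 1483, 1055, 1545,
         729, 1792, 1175, 1593, 785, 744, 1356, 1262]

# Design matrix X = [1, x1, x2]; lstsq solves the normal equations.
X = np.column_stack([np.ones(32), age, bidders])
y = np.array(price, dtype=float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Minimized sum of squared errors.
sse = float(np.sum((y - X @ beta) ** 2))
print(beta)  # approximately [-1339, 12.74, 85.95], as in Figure 4.3
print(sse)   # approximately 516,727
```

The solver reproduces the SAS estimates βˆ0 ≈ −1,339, βˆ1 ≈ 12.74, βˆ2 ≈ 85.95 and the minimized SSE ≈ 516,727 reported in the example.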

Example 4.2

Refer to the first-order model for auction price (y) considered in Example 4.1. Interpret the estimates of the β parameters in the model.

Solution

The least squares prediction equation, as given in Example 4.1, is yˆ = −1,339 + 12.74x1 + 85.95x2. We know that with first-order models, β1 represents the slope of the line relating y to x1 for fixed x2. That is, β1 measures the change in E(y) for every one-unit increase in x1 when the other independent variable in the model is held fixed. A similar statement can be made about β2: β2 measures the change in E(y) for every one-unit increase in x2 when the other x in the model is held fixed. Consequently, we obtain the following interpretations:

βˆ1 = 12.74: We estimate the mean auction price E(y) of an antique clock to increase $12.74 for every 1-year increase in age (x1) when the number of bidders (x2) is held fixed.

βˆ2 = 85.95: We estimate the mean auction price E(y) of an antique clock to increase $85.95 for every one-bidder increase in the number of bidders (x2) when age (x1) is held fixed.

The value βˆ0 = −1,339 does not have a meaningful interpretation in this example. To see this, note that yˆ = βˆ0 when x1 = x2 = 0. Thus, βˆ0 = −1,339 represents the estimated mean auction price when the values of all the independent variables are set equal to 0. Because an antique clock with these characteristics (an age of 0 years and 0 bidders on the clock) is not practical, the value of βˆ0 has no meaningful interpretation. In general, βˆ0 will not have a practical interpretation unless it makes sense to set the values of the x's simultaneously equal to 0.
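The slope interpretations above amount to simple differences in predicted values; a short sketch using the fitted coefficients reported in Example 4.1 (the specific clock used for the prediction is an illustrative choice, not from the text):

```python
# Fitted prediction equation from Example 4.1: yhat = -1339 + 12.74*x1 + 85.95*x2
def yhat(age, bidders):
    return -1339 + 12.74 * age + 85.95 * bidders

# Predicted mean price for a (hypothetical) 150-year-old clock with 10 bidders.
p = yhat(150, 10)

# Holding bidders fixed, one extra year of age changes yhat by exactly b1...
delta_age = yhat(151, 10) - yhat(150, 10)
# ...and holding age fixed, one extra bidder changes yhat by exactly b2.
delta_bidders = yhat(150, 11) - yhat(150, 10)
print(p, delta_age, delta_bidders)
```

The two differences recover βˆ1 = 12.74 and βˆ2 = 85.95, which is exactly what "change in E(y) per one-unit increase, other x held fixed" means in a first-order model.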

4.5 Estimation of σ2, the Variance of ε

Recall that σ2 is the variance of the random error ε. As such, σ2 is an important measure of model utility. If σ2 = 0, all the random errors will equal 0 and the prediction equation yˆ will be identical to E(y); that is, E(y) will be estimated without error. In contrast, a large value of σ2 implies large (absolute) values of ε and larger deviations between the prediction equation yˆ and the mean value E(y). Consequently, the larger the value of σ2, the greater will be the error in estimating the model parameters β0, β1, . . . , βk and the error in predicting a value of y for a specific set of values of x1, x2, . . . , xk. Thus, σ2 plays a major role in making inferences about β0, β1, . . . , βk, in estimating E(y), and in predicting y for specific values of x1, x2, . . . , xk.

Since the variance σ2 of the random error ε will rarely be known, we must use the results of the regression analysis to estimate its value. Recall that σ2 is the variance of the probability distribution of the random error ε for a given set of values for x1, x2, . . . , xk; hence, it is the mean value of the squares of the deviations of the y-values (for given values of x1, x2, . . . , xk) about the mean value E(y).∗ Since the predicted value yˆ estimates E(y) for each of the data points, it seems natural to use

SSE = Σ(yi − yˆi)2

to construct an estimator of σ2.

∗ Because y = E(y) + ε, ε is equal to the deviation y − E(y). Also, by definition, the variance of a random variable is the expected value of the square of the deviation of the random variable from its mean. According to our model, E(ε) = 0. Therefore, σ2 = E(ε2).

Estimator of σ2 for Multiple Regression Model with k Independent Variables

s2 = MSE = SSE / (n − Number of estimated β parameters) = SSE / [n − (k + 1)]
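Why divide by n − (k + 1) rather than n? Dividing SSE by its degrees of freedom makes MSE an unbiased estimator of σ2. A small simulation sketch illustrates this (the simulation, its "true" parameter values, and its sample design are illustrative assumptions, not from the text):

```python
# Simulation sketch (illustrative, not from the text): over repeated samples,
# MSE = SSE / (n - (k + 1)) averages to sigma^2, illustrating unbiasedness.
import numpy as np

rng = np.random.default_rng(42)
n, k, sigma = 30, 2, 50.0
# Fixed design: intercept column plus two predictors (ranges loosely echo Table 4.1).
X = np.column_stack([np.ones(n),
                     rng.uniform(100, 200, n),
                     rng.integers(5, 16, n).astype(float)])
beta_true = np.array([-1300.0, 12.0, 85.0])  # hypothetical "true" parameters

mses = []
for _ in range(2000):
    y = X @ beta_true + rng.normal(0.0, sigma, n)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ b) ** 2)
    mses.append(sse / (n - (k + 1)))         # divide by n - (k+1) = 27 df

mean_mse = float(np.mean(mses))
print(mean_mse)  # close to sigma^2 = 2500
```

Dividing by n instead would systematically underestimate σ2, because the fitted plane is, by construction, closer to the sample points than the true mean surface is.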

For example, in the first-order model of Example 4.1, we found that SSE = 516,727. We now want to use this quantity to estimate the variance of ε. Recall that the estimator for the straight-line model is s2 = SSE/(n − 2) and note that the denominator is (n − Number of estimated β parameters), which is (n − 2) in the straight-line model. Since we must estimate three parameters, β0, β1, and β2, in the first-order model of Example 4.1, the estimator of σ2 is

s2 = SSE / (n − 3)

The numerical estimate for this example is

s2 = SSE / (32 − 3) = 516,727 / 29 = 17,818

In many computer printouts and textbooks, s2 is called the mean square for error (MSE). This estimate of σ2 is highlighted in the SAS printout in Figure 4.3.

The units of the estimated variance are squared units of the dependent variable y. Since the dependent variable y in this example is auction price in dollars, the units of s2 are (dollars)2. This makes meaningful interpretation of s2 difficult, so we use the standard deviation s to provide a more meaningful measure of variability. In this example,

s = √17,818 = 133.5

which is highlighted on the SAS printout in Figure 4.3 (next to Root MSE). One useful interpretation of the estimated standard deviation s is that the interval ±2s will provide a rough approximation to the accuracy with which the model will predict future values of y for given values of x. Thus, in Example 4.1, we expect the model to provide predictions of auction price to within about ±2s = ±2(133.5) = ±267 dollars.†
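The arithmetic chain from SSE to the ±2s bound can be sketched in a few lines:

```python
# Computing s^2, s, and the rough +/- 2s prediction bound for Example 4.1,
# starting from SSE = 516,727 with n = 32 observations and k = 2 predictors.
import math

sse, n, k = 516727, 32, 2
s2 = sse / (n - (k + 1))  # MSE = SSE / (n - 3) = 516,727 / 29
s = math.sqrt(s2)         # Root MSE
bound = 2 * s             # rough prediction accuracy, in dollars
print(round(s2), round(s, 1), round(bound))  # 17818, 133.5, 267
```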

For the general multiple regression model

y = β0 + β1x1 + β2x2 + · · · + βkxk + ε

we must estimate the (k + 1) parameters β0, β1, β2, . . . , βk. Thus, the estimator of σ2 is SSE divided by the quantity (n − Number of estimated β parameters).

† The ±2s approximation will improve as the sample size is increased. We provide more precise methodology for the construction of prediction intervals in Section 4.9.

Source: William Mendenhall, A Second Course in Statistics: Regression Analysis, 7th ed., Prentice Hall, 2011.