B.8 A Confidence Interval for a Linear Function of the β Parameters; a Confidence Interval for E(y)

742 Appendix B The Mechanics of a Multiple Regression Analysis

Figure B.4 Marginal productivity
when x = 2 is the rate of change of E(y) with respect to x, evaluated at x = 2.∗ The marginal productivity for a value of x, denoted by the symbol dE(y)/dx, can be shown (proof omitted) to be

dE(y)/dx = β1 + 2β2x

Therefore, the marginal productivity at x = 2 is

dE(y)/dx = β1 + 2β2(2) = β1 + 4β2
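The claimed derivative dE(y)/dx = β1 + 2β2x is easy to check numerically. A minimal Python sketch, using hypothetical β values (none are specified at this point in the text):

```python
# Check dE(y)/dx = b1 + 2*b2*x numerically for the quadratic mean function
# E(y) = b0 + b1*x + b2*x**2.  The beta values here are hypothetical,
# chosen only for illustration.
b0, b1, b2 = 1.0, 0.5, -0.2

def mean_y(x):
    return b0 + b1 * x + b2 * x ** 2

def marginal(x):                 # the claimed derivative
    return b1 + 2 * b2 * x

# Central-difference approximation of the slope at x = 2
h = 1e-6
approx = (mean_y(2 + h) - mean_y(2 - h)) / (2 * h)
print(round(approx, 4), round(marginal(2), 4))   # -> -0.3 -0.3
```

Both numbers equal β1 + 4β2 for these β's, matching the formula above.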
For x = 2, both E(y) and the marginal productivity are linear functions of the
unknown parameters β0 , β1 , β2 in the model. The problem we pose in this section
is that of finding confidence intervals for linear functions of β parameters or testing
hypotheses concerning their values. The information necessary to solve this problem
is rarely given in a standard multiple regression analysis computer printout, but we
can find these confidence intervals or values of the appropriate test statistics from
knowledge of (X′X)−1.
For the model

y = β0 + β1x1 + · · · + βkxk + ε

we can make an inference about a linear function of the β parameters, say,

a0β0 + a1β1 + · · · + akβk

where a0, a1, . . . , ak are known constants. We will use the corresponding linear function of the least squares estimates,

ℓ = a0β̂0 + a1β̂1 + · · · + akβ̂k

as our best estimate of a0β0 + a1β1 + · · · + akβk.
Then, under the assumptions on the random error ε (stated in Section 4.2), the sampling distribution of the estimator ℓ will be normal, with mean and standard error as given in the first box on page 743. This indicates that ℓ is an unbiased estimator of

E(ℓ) = a0β0 + a1β1 + · · · + akβk

and that its sampling distribution would appear as shown in Figure B.5.

∗ If you have had calculus, you can see that the marginal productivity for y given x is the first derivative of E(y) = β0 + β1x + β2x² with respect to x.


Figure B.5 Sampling distribution for ℓ

Mean and Standard Error of ℓ

E(ℓ) = a0β0 + a1β1 + · · · + akβk

σℓ = √(σ²a′(X′X)−1a)

where σ² is the variance of ε, (X′X)−1 is the inverse matrix obtained in fitting the least squares model to the set of data, and

a = [a0  a1  a2  · · ·  ak]′   (a column vector)

It can be demonstrated that a 100(1 − α)% confidence interval for E(ℓ) is as shown in the next box.
A 100(1 − α)% Confidence Interval for E(ℓ)

ℓ ± tα/2√(s²a′(X′X)−1a)

where

E(ℓ) = a0β0 + a1β1 + · · · + akβk

ℓ = a0β̂0 + a1β̂1 + · · · + akβ̂k

a = [a0  a1  a2  · · ·  ak]′

s² and (X′X)−1 are obtained from the least squares procedure, and tα/2 is based on the number of degrees of freedom associated with s².
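The interval in the box above takes only a few lines to compute once (X′X)−1 is in hand. Below is a minimal Python sketch; the function name, the illustrative numbers (a simple-regression case, hypothetical rather than from the text), and the choice to pass the t critical value in by hand (avoiding a statistics-library dependency) are all our own:

```python
import numpy as np

def ci_linear_function(a, beta_hat, XtX_inv, s2, t_val):
    """100(1 - alpha)% CI for a linear function of the betas:
    l_hat +/- t * sqrt(s2 * a'(X'X)^-1 a).  t_val is t_{alpha/2}
    for n - (k + 1) df, looked up separately (e.g., from a t table)."""
    a = np.asarray(a, dtype=float)
    l_hat = a @ beta_hat                         # a0*b0_hat + ... + ak*bk_hat
    half = t_val * np.sqrt(s2 * a @ XtX_inv @ a)
    return l_hat - half, l_hat + half

# Hypothetical simple-regression numbers: estimate beta0 + 2*beta1
lo, hi = ci_linear_function(
    a=[1, 2],
    beta_hat=np.array([1.2, 0.4]),
    XtX_inv=np.array([[0.5, -0.1],
                      [-0.1, 0.05]]),
    s2=0.25,
    t_val=2.447,          # t_.025 with 6 df
)
print(round(lo, 2), round(hi, 2))   # -> 1.33 2.67
```

The same function reproduces every interval in this section and the next once the appropriate a, s², and t values are supplied.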

The linear function of the β parameters that is most often the focus of our attention is

E(y) = β0 + β1x1 + · · · + βkxk

That is, we want to find a confidence interval for E(y) for specific values of x1, x2, . . . , xk. For this special case,

ℓ = ŷ
and the a matrix is

a = [1  x1  x2  · · ·  xk]′

where the symbols x1, x2, . . . , xk in the a matrix indicate the specific numerical values assumed by these variables. Thus, the procedure for forming a confidence interval for E(y) is as shown in the box.

A 100(1 − α)% Confidence Interval for E(y)

ŷ ± tα/2√(s²a′(X′X)−1a)

where

E(y) = β0 + β1x1 + β2x2 + · · · + βkxk

ℓ = ŷ = β̂0 + β̂1x1 + · · · + β̂kxk

a = [1  x1  x2  · · ·  xk]′

s² and (X′X)−1 are obtained from the least squares analysis, and tα/2 is based on the number of degrees of freedom associated with s², namely, n − (k + 1).
Example B.10

Refer to the data in Example B.5 for sales revenue y and advertising expenditure x.
Find a 95% confidence interval for the mean sales revenue E(y) when advertising
expenditure is x = 4.

Solution
The confidence interval for E(y) for a given value of x is

ŷ ± tα/2√(s²a′(X′X)−1a)

Consequently, we need to find and substitute the values of a′(X′X)−1a, tα/2, and ŷ into this formula. Since we wish to estimate

E(y) = β0 + β1x = β0 + β1(4) = β0 + 4β1    when x = 4

it follows that the coefficients of β0 and β1 are a0 = 1 and a1 = 4, and thus,

a = [1  4]′

From Examples B.5 and B.7, ŷ = −.1 + .7x,

(X′X)−1 = [ 1.1  −.3 ]
          [ −.3   .1 ]

and s² = .367. Then,

a′(X′X)−1a = [1  4] [ 1.1  −.3 ] [ 1 ]
                    [ −.3   .1 ] [ 4 ]

We first calculate

a′(X′X)−1 = [1  4] [ 1.1  −.3 ] = [−.1  .1]
                   [ −.3   .1 ]

Then,

a′(X′X)−1a = [−.1  .1] [ 1 ] = .3
                       [ 4 ]

The t value, t.025, based on 3 df is 3.182. So, a 95% confidence interval for the mean sales revenue with an advertising expenditure of 4 is

ŷ ± tα/2√(s²a′(X′X)−1a)

Since ŷ = −.1 + .7x = −.1 + (.7)(4) = 2.7, the 95% confidence interval for E(y) when x = 4 is

2.7 ± (3.182)√((.367)(.3))
2.7 ± 1.1
Notice that this is exactly the same result as obtained in Example 3.4.
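The matrix arithmetic in this example can be reproduced with a few lines of numpy. This is only a check of the computation above, not part of the original solution:

```python
import numpy as np

# Reproduce the two-step product a'(X'X)^-1 a from Example B.10
a = np.array([1.0, 4.0])
XtX_inv = np.array([[1.1, -0.3],
                    [-0.3, 0.1]])

step1 = a @ XtX_inv            # a'(X'X)^-1   -> [-.1  .1]
quad = step1 @ a               # a'(X'X)^-1 a -> .3

# Full interval: yhat +/- t * sqrt(s2 * quad), with s2 and t from the text
s2, t025 = 0.367, 3.182
yhat = -0.1 + 0.7 * 4          # 2.7
half = t025 * np.sqrt(s2 * quad)
print(round(yhat, 1), round(half, 1))    # -> 2.7 1.1
```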
Example B.11

An economist recorded a measure of productivity y and the size x for each of 100 companies producing cement. A regression model,

y = β0 + β1x + β2x² + ε

fit to the n = 100 data points produced the following results:

ŷ = 2.6 + .7x − .2x²

where x is coded to take values in the interval −2 < x < 2,† and

(X′X)−1 = [  .0025   .0005  −.0070 ]
          [  .0005   .0055   0     ]
          [ −.0070   0       .0050 ]

s = .14
Find a 95% confidence interval for the marginal increase in productivity given that
the coded size of a plant is x = 1.5.

Solution
The mean value of y for a given value of x is

E(y) = β0 + β1x + β2x²

Therefore, the marginal increase in y for x = 1.5 is

dE(y)/dx = β1 + 2β2x = β1 + 2(1.5)β2
† We give a formula for coding observational data in Section 5.6.


Figure B.6 A graph of ŷ = 2.6 + .7x − .2x²

Or,

E(ℓ) = β1 + 3β2    when x = 1.5

Note from the prediction equation, ŷ = 2.6 + .7x − .2x², that β̂1 = .7 and β̂2 = −.2. Therefore,

ℓ = β̂1 + 3β̂2 = .7 + 3(−.2) = .1
and

a = [a0  a1  a2]′ = [0  1  3]′

We next calculate

a′(X′X)−1a = [0  1  3] [  .0025   .0005  −.0070 ] [ 0 ]
                       [  .0005   .0055   0     ] [ 1 ] = .0505
                       [ −.0070   0       .0050 ] [ 3 ]

Then, since s is based on n − (k + 1) = 100 − 3 = 97 df, t.025 ≈ 1.96, and a 95% confidence interval for the marginal increase in productivity when x = 1.5 is

ℓ ± t.025√(s²a′(X′X)−1a)

or

.1 ± (1.96)√((.14)²(.0505))
.1 ± .062
Thus, the marginal increase in productivity, the slope of the tangent to the curve E(y) = β0 + β1x + β2x², is estimated to lie in the interval .1 ± .062 at x = 1.5. A graph of ŷ = 2.6 + .7x − .2x² is shown in Figure B.6.
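The quadratic form a′(X′X)−1a = .0505 and the resulting interval can be verified with numpy. This check is our addition, not part of the original example:

```python
import numpy as np

# Example B.11's pieces: a = [0, 1, 3]' picks out beta1 + 3*beta2
a = np.array([0.0, 1.0, 3.0])
XtX_inv = np.array([[ 0.0025, 0.0005, -0.0070],
                    [ 0.0005, 0.0055,  0.0000],
                    [-0.0070, 0.0000,  0.0050]])
beta_hat = np.array([2.6, 0.7, -0.2])   # from yhat = 2.6 + .7x - .2x^2
s = 0.14

l_hat = a @ beta_hat                    # .7 + 3(-.2) = .1
quad = a @ XtX_inv @ a                  # -> .0505
half = 1.96 * np.sqrt(s**2 * quad)      # large-sample t ~ z for 97 df
print(round(l_hat, 2), round(quad, 4), round(half, 3))   # -> 0.1 0.0505 0.062
```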

B.9 A Prediction Interval for Some Value of y to Be Observed in the Future
We have indicated in Sections 3.9 and 4.12 that two of the most important
applications of the least squares predictor yˆ are estimating the mean value of
y (the topic of the preceding section) and predicting a new value of y, yet


unobserved, for specific values of x1 , x2 , . . . , xk . The difference between these two
inferential problems (when each would be pertinent) was explained in Chapters 3
and 4, but we give another example to make certain that the distinction is
clear.
Suppose you are the manager of a manufacturing plant and that y, the daily
profit, is a function of various process variables x1 , x2 , . . . , xk . Suppose you want to
know how much money you would make in the long run if the x’s are set at specific
values. For this case, you would be interested in finding a confidence interval for the
mean profit per day, E(y). In contrast, suppose you planned to operate the plant for
just one more day! Then you would be interested in predicting the value of y, the
profit associated with tomorrow’s production.
We have indicated that the error of prediction is always larger than the error
of estimating E(y). You can see this by comparing the formula for the prediction
interval (shown in the next box) with the formula for the confidence interval for
E(y) that was given in Section B.8.

A 100(1 − α)% Prediction Interval for y

ŷ ± tα/2√(s² + s²a′(X′X)−1a) = ŷ ± tα/2√(s²[1 + a′(X′X)−1a])

where

ŷ = β̂0 + β̂1x1 + · · · + β̂kxk

s² and (X′X)−1 are obtained from the least squares analysis,

a = [1  x1  x2  · · ·  xk]′

contains the numerical values of x1, x2, . . . , xk, and tα/2 is based on the number of degrees of freedom associated with s², namely, n − (k + 1).
Example B.12

Refer to the sales–advertising expenditure example (Example B.10). Find a 95%
prediction interval for the sales revenue next month, if it is known that next month’s
advertising expenditure will be x = 4.

Solution
The 95% prediction interval for sales revenue y is

ŷ ± tα/2√(s²[1 + a′(X′X)−1a])

From Example B.10, when x = 4, ŷ = −.1 + .7x = −.1 + (.7)(4) = 2.7, s² = .367, t.025 = 3.182, and a′(X′X)−1a = .3. Then the 95% prediction interval for y is

2.7 ± (3.182)√((.367)(1 + .3))
2.7 ± 2.2
You will find that this is the same solution as obtained in Example 3.5.
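The only difference between the two intervals is the extra s² under the radical. Computing both half-widths with the numbers from Examples B.10 and B.12 makes the comparison concrete (our illustration, not part of the text):

```python
import numpy as np

# Same ingredients, two intervals (numbers from Examples B.10 and B.12)
s2, t025, quad, yhat = 0.367, 3.182, 0.3, 2.7

ci_half = t025 * np.sqrt(s2 * quad)         # estimating the mean E(y)
pi_half = t025 * np.sqrt(s2 * (1 + quad))   # predicting one new y
print(round(ci_half, 1), round(pi_half, 1)) # -> 1.1 2.2
```

The prediction interval is always the wider of the two, because predicting a single y must absorb the variability of ε as well as the sampling error of ŷ.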


B.9 Exercises
B.22 Refer to Exercise B.16. Find a 90% confidence
interval for E(y) when x = 1. Interpret the interval.

B.23 Refer to Exercise B.16. Suppose you plan to observe y for x = 1. Find a 90% prediction interval for that value of y. Interpret the interval.

B.24 Refer to Exercise B.17. Find a 90% confidence
interval for E(y) when x = 2. Interpret the interval.

B.25 Refer to Exercise B.17. Find a 90% prediction
interval for a value of y to be observed in the
future when x = 2. Interpret the interval.

B.26 Refer to Exercise B.18. Find a 90% confidence
interval for the mean value of y when x = 1. Interpret the interval.

B.27 Refer to Exercise B.18. Find a 90% prediction
interval for a value of y to be observed in the
future when x = 1.

B.28 The productivity (items produced per hour) per worker on a manufacturing assembly line is expected to increase as piecework pay rate (in dollars) increases; it is expected to stabilize after a certain pay rate has been reached. The productivity of five different workers was recorded for each of five piecework pay rates, $.80, $.90, $1.00, $1.10, and $1.20, thus giving n = 25 data points. A multiple regression analysis using a second-order model,

E(y) = β0 + β1x + β2x²

gave

ŷ = 2.08 + 8.42x − 1.65x²

SSE = 26.62, SSyy = 784.11, and

(X′X)−1 = [  .020  −.010   .015 ]
          [ −.010   .040  −.006 ]
          [  .015  −.006   .028 ]

(a) Find s².
(b) Find a 95% confidence interval for the mean productivity when the pay rate is $1.10. Interpret this interval.
(c) Find a 95% prediction interval for the production of an individual worker who is paid at a rate of $1.10 per piece. Interpret the interval.
(d) Find R² and interpret the value.

Summary
Except for the tedious process of inverting a matrix (discussed in Appendix C), we
have covered the major steps performed by a computer in fitting a linear statistical
model to a set of data using the method of least squares. We have also explained
how to find the confidence intervals, prediction intervals, and values of test statistics
that would be pertinent in a regression analysis.
In addition to providing a better understanding of a multiple regression analysis,
the most important contributions of this appendix are contained in Sections B.8 and
B.9. If you want to make a specific inference concerning the mean value of y or
any linear function of the β parameters and if you are unable to obtain the results
from the computer package you are using, you will find the contents of Sections
B.8 and B.9 very useful. Since you will almost always be able to find a computer program package to find (X′X)−1, you will be able to calculate the desired confidence interval(s) and so forth on your own.

Supplementary Exercises
B.29. Use the method of least squares to fit a straight line to the six data points:

x    −5   −3   −1    1    3    5
y   1.1  1.9  3.0  3.8  5.1  6.0

(a) Construct Y and X matrices for the data.
(b) Find X′X and X′Y.

(c) Find the least squares estimates,

β̂ = (X′X)−1X′Y

[Note: See Theorem A.1 for information on finding (X′X)−1.]
(d) Give the prediction equation.
(e) Find SSE and s².

(f) Does the model contribute information for the prediction of y? Test H0: β1 = 0. Use α = .05.
(g) Find r² and interpret its value.
(h) Find a 90% confidence interval for E(y) when x = .5. Interpret the interval.

B.30. An experiment was conducted to investigate the effect of extrusion pressure P and temperature T on the strength y of a new type of plastic. Two plastic specimens were prepared for each of five combinations of pressure and temperature. The specimens were then tested in random order, and the breaking strength for each specimen was recorded. The independent variables were coded to simplify computations, that is,

x1 = (P − 200)/10     x2 = (T − 400)/25

The n = 10 data points are listed in the table.

y            x1    x2
5.2; 5.0     −2     2
.3; −.1      −1    −1
−1.2; −1.1    0    −2
2.2; 2.0      1    −1
6.2; 6.1      2     2

(a) Give the Y and X matrices needed to fit the model y = β0 + β1x1 + β2x2 + ε.
(b) Find the least squares prediction equation.
(c) Find SSE and s².
(d) Does the model contribute information for the prediction of y? Test using α = .05.
(e) Find R² and interpret its value.
(f) Test the null hypothesis that β1 = 0. Use α = .05. What is the practical implication of the test?
(g) Find a 90% confidence interval for the mean strength of the plastic for x1 = −2 and x2 = 2.
(h) Suppose a single specimen of the plastic is to be installed in the engine mount of a Douglas DC-10 aircraft. Find a 90% prediction interval for the strength of this specimen if x1 = −2 and x2 = 2.

B.31. Suppose we obtained two replications of the experiment described in Exercise B.17; that is, two values of y were observed for each of the six values of x. The data are shown at the bottom of the page.
(a) Suppose (as in Exercise B.17) you wish to fit the model E(y) = β0 + β1x. Construct Y and X matrices for the data. [Hint: Remember, the Y matrix must be of dimension 12 × 1.]
(b) Find X′X and X′Y.
(c) Compare the X′X matrix for two replications of the experiment with the X′X matrix obtained for a single replication (part b of Exercise B.17). What is the relationship between the elements in the two matrices?
(d) Observe the (X′X)−1 matrix for a single replication (see part c of Exercise B.17). Verify that the (X′X)−1 matrix for two replications contains elements that are equal to 1/2 of the values of the corresponding elements in the (X′X)−1 matrix for a single replication of the experiment. [Hint: Show that the product of the (X′X)−1 matrix (for two replications) and the X′X matrix from part c equals the identity matrix I.]
(e) Find the prediction equation.
(f) Find SSE and s².
(g) Do the data provide sufficient information to indicate that x contributes information for the prediction of y? Test using α = .05.
(h) Find r² and interpret its value.

B.32. Refer to Exercise B.31.
(a) Find a 90% confidence interval for E(y) when x = 4.5. Interpret the interval.
(b) Suppose we wish to predict the value of y if, in the future, x = 4.5. Find a 90% prediction interval for y and interpret the interval.

B.33. Refer to Exercise B.31. Suppose you replicated the experiment described in Exercise B.17 three times; that is, you collected three observations on y for each value of x. Then n = 18.
(a) What would be the dimensions of the Y matrix?
(b) Write the X matrix for three replications. Compare with the X matrices for one and for two replications. Note the pattern.
(c) Examine the X′X matrices obtained for one and two replications of the experiment (obtained in Exercises B.17 and B.31, respectively). Deduce the values of the elements of the X′X matrix for three replications.
(d) Look at your answer to Exercise B.31, part d. Deduce the values of the elements in the (X′X)−1 matrix for three replications.
(e) Suppose you wanted to find a 90% confidence interval for E(y) when x = 4.5 based on three replications of the experiment. Find the value of a′(X′X)−1a that appears in the confidence interval and compare with the value of a′(X′X)−1a that would be obtained for a single replication of the experiment.

Data for Exercise B.31:

x    1    2    3    4    5    6
y   1.1  1.8  2.0  3.8  4.1  5.0
     .5  2.0  2.9  3.4  5.0  5.8

(f) Approximately how much of a reduction in the width of the confidence interval is obtained by using three versus two replications? [Note: The values of s computed from the two sets of data will almost certainly be different.]


Appendix C  A Procedure for Inverting a Matrix

There are several different methods for inverting matrices. All are tedious and
time-consuming. Consequently, in practice, you will invert almost all matrices using
a computer. The purpose of this section is to present one method for inverting small
(2 × 2 or 3 × 3) matrices manually, thus giving you an appreciation of the enormous
computing problem involved in inverting large matrices (and, consequently, in
fitting linear models containing many terms to a set of data). In particular, you will
be able to understand why rounding errors creep into the inversion process and,
consequently, why two different computer programs might invert the same matrix
and produce inverse matrices with slightly different corresponding elements.
The procedure we will demonstrate to invert a matrix A requires us to perform
a series of operations on the rows of the A matrix. For example, suppose
A = [  1  −2 ]
    [ −2   6 ]

We will identify two different ways to operate on a row of a matrix:∗
1. We can multiply every element in one particular row by a constant, c. For example, we could operate on the first row of the A matrix by multiplying every element in the row by a constant, say, 2. Then the resulting row would be [2 −4].
2. We can operate on a row by multiplying another row of the matrix by a
constant and then adding (or subtracting) the elements of that row to elements
in corresponding positions in the row operated upon. For example, we could
operate on the first row of the A matrix by multiplying the second row by a
constant, say, 2:
2[−2 6] = [−4 12]
Then we add this row to row 1:
[(1 − 4)(−2 + 12)] = [−3 10]
Note one important point. We operated on the first row of the A matrix. Although
we used the second row of the matrix to perform the operation, the second row
would remain unchanged. Therefore, the row operation on the A matrix that we
have just described would produce the new matrix,
[ −3  10 ]
[ −2   6 ]

∗ We omit a third row operation, because it would add little and could be confusing.

752 Appendix C A Procedure for Inverting a Matrix
Matrix inversion using row operations is based on an elementary result from
matrix algebra. It can be shown (proof omitted) that performing a series of row
operations on a matrix A is equivalent to multiplying A by a matrix B (i.e., row
operations produce a new matrix, BA). This result is used as follows: Place the A
matrix and an identity matrix I of the same dimensions side by side. Then perform
the same series of row operations on both A and I until the A matrix has been
changed into the identity matrix I. This means that you have multiplied both A and
I by some matrix B such that:




A (the matrix to be inverted)      I = [ 1  0  0  · · ·  0 ]
                                       [ 0  1  0  · · ·  0 ]
                                       [ 0  0  1  · · ·  0 ]
                                       [ ⋮  ⋮  ⋮         ⋮ ]
                                       [ 0  0  0  · · ·  1 ]

          ← Row operations change A to I →

I (= BA)                           B (= BI)
BA = I and BI = B
Since BA = I, it follows that B = A−1 . Therefore, as the A matrix is transformed by
row operations into the identity matrix I, the identity matrix I is transformed into
A−1 , that is,
BI = B = A−1
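This row-operation procedure is Gauss–Jordan elimination. A short Python sketch of the idea follows; the function is our own illustration, and it omits the pivoting and singularity checks a robust routine would need (so it assumes a nonzero diagonal at each step):

```python
import numpy as np

def invert_by_row_ops(A):
    """Apply the same row operations to A and I until A becomes I;
    I then holds A^-1.  No pivoting: assumes each diagonal pivot is
    nonzero, which is fine for small well-behaved matrices."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    I = np.eye(n)
    for j in range(n):
        # Operation type 1: scale row j so the pivot becomes 1
        pivot = A[j, j]
        A[j] /= pivot
        I[j] /= pivot
        # Operation type 2: subtract multiples of row j from the other rows
        for i in range(n):
            if i != j:
                m = A[i, j]
                A[i] -= m * A[j]
                I[i] -= m * I[j]
    return I

B = invert_by_row_ops([[1, -2], [-2, 6]])
print(B)    # [[3.  1. ]
            #  [1.  0.5]]
```

Applied to the A matrix of Example C.1 below, it returns [[3, 1], [1, 0.5]], the inverse the example's row operations are working toward.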
We show you how this procedure works with two examples.
Example C.1

Find the inverse of the matrix
A = [  1  −2 ]
    [ −2   6 ]

Solution
Place the A matrix and a 2 × 2 identity matrix side by side and then perform the following series of row operations (we indicate by an arrow the row operated upon in each operation):

A = [  1  −2 ]        I = [ 1  0 ]
    [ −2   6 ]            [ 0  1 ]

OPERATION 1: Multiply the first row by 2 and add it to the second row:

  [ 1  −2 ]               [ 1  0 ]
→ [ 0   2 ]               [ 2  1 ]

OPERATION 2: Multiply the second row by 1/2:

  [ 1  −2 ]               [ 1   0  ]
→ [ 0   1 ]               [ 1  1/2 ]
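Continuing the row operations will eventually turn the left matrix into I and the right matrix into A−1. numpy's built-in inverse can confirm the destination (this check is our addition, not part of the example):

```python
import numpy as np

# The matrix from Example C.1
A = np.array([[1.0, -2.0],
              [-2.0, 6.0]])

A_inv = np.linalg.inv(A)       # should come out as [[3, 1], [1, 0.5]]
print(np.round(A_inv, 6))
print(np.round(A @ A_inv, 6))  # the identity matrix
```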