1: Simple Linear Regression Model


13.1 Simple Linear Regression Model



FIGURE 13.1 A deviation from the deterministic part of a probabilistic model. [Figure: graph of y = 50 − 10x + x²; the observation (4, 30) lies e = 4 above the curve, whose height at x = 4 is 26.]



Simple Linear Regression

The simple linear regression model is a special case of the general probabilistic model

in which the deterministic function f (x) is linear (so its graph is a straight line).



DEFINITION

The simple linear regression model assumes that there is a line with vertical or y intercept α and slope β, called the population regression line. When a value of the independent variable x is fixed and an observation on the dependent variable y is made,

y = α + βx + e

Without the random deviation e, all observed (x, y) points would fall exactly on the population regression line. The inclusion of e in the model equation recognizes that points will deviate from the line by a random amount.

Figure 13.2 shows two observations in relation to the population regression line.

FIGURE 13.2 Two observations and deviations from the population regression line. [Figure: the population regression line with slope β and vertical intercept α; the observation when x = x1 lies above the line (positive deviation e1) and the observation when x = x2 lies below it (negative deviation e2).]



Before we make an observation on y for any particular value of x, we are uncertain about the value of e. It could be negative, positive, or even 0. Also, it might be quite large in magnitude (a point far from the population regression line) or quite small (a point very close to the line). In this chapter, we make some assumptions about the distribution of e in repeated sampling at any particular x value.

Chapter 13 Simple Linear Regression and Correlation: Inferential Methods
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.



Basic Assumptions of the Simple Linear Regression Model

1. The distribution of e at any particular x value has mean value 0. That is, μ_e = 0.
2. The standard deviation of e (which describes the spread of its distribution) is the same for any particular value of x. This standard deviation is denoted by σ.
3. The distribution of e at any particular x value is normal.
4. The random deviations e1, e2, . . . , en associated with different observations are independent of one another.



These assumptions about the e term in the simple linear regression model also imply that there is variability in the y values observed at any particular value of x. Consider y when x has some fixed value x*, so that

y = α + βx* + e

Because α and β are fixed numbers, α + βx* is also a fixed number. The sum of a fixed number and a normally distributed variable (e) is also a normally distributed variable (the bell-shaped curve is simply relocated), so y itself has a normal distribution. Furthermore, μ_e = 0 implies that the mean value of y is α + βx*, the height of the population regression line above the value x*. Finally, because there is no variability in the fixed number α + βx*, the standard deviation of y is the same as the standard deviation of e. These properties are summarized in the following box.
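The reasoning above can be checked with a small simulation. The Python sketch below draws many values of y = α + βx* + e at one fixed x*, assuming e is normal with mean 0 and standard deviation σ. The parameter values (α = 2, β = 0.5, σ = 1, x* = 10), the seed, and the function name are illustrative assumptions only; none of them come from the text.

```python
import random
import statistics

def simulate_y_at_fixed_x(alpha, beta, sigma, x_star, n_draws, seed=0):
    """Draw n_draws values of y = alpha + beta*x_star + e, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    return [alpha + beta * x_star + rng.gauss(0.0, sigma) for _ in range(n_draws)]

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, SIGMA, X_STAR = 2.0, 0.5, 1.0, 10.0

ys = simulate_y_at_fixed_x(ALPHA, BETA, SIGMA, X_STAR, n_draws=50_000)
sample_mean = statistics.fmean(ys)
sample_sd = statistics.stdev(ys)

print(f"height of the line at x*: {ALPHA + BETA * X_STAR:.3f}")
print(f"sample mean of y:         {sample_mean:.3f}")
print(f"sample sd of y:           {sample_sd:.3f} (sigma = {SIGMA})")
```

With a large number of draws, the sample mean should land close to α + βx* and the sample standard deviation close to σ, which is what the argument predicts.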



At any fixed x value, y has a normal distribution, with

(mean y value for fixed x) = (height of the population regression line above x) = α + βx

and

(standard deviation of y for a fixed x) = σ

The slope β of the population regression line is the average change in y associated with a 1-unit increase in x. The y intercept α is the height of the population line when x = 0. The value of σ determines the extent to which (x, y) observations deviate from the population line. When σ is small, most observations will be quite close to the line, but when σ is large, there are likely to be some large deviations.



The key features of the model are illustrated in Figures 13.3 and 13.4. Notice that the three normal curves in Figure 13.3 have identical spreads. This is a consequence of σ_e = σ, which implies that the variability in the y values at a particular x does not depend on the value of x.




FIGURE 13.3 Illustration of the simple linear regression model. [Figure: the population regression line y = α + βx (line of mean values), with a normal curve drawn at each of three different x values x1, x2, and x3; each curve has mean value equal to the height of the line at that x (α + βx1, α + βx2, α + βx3) and standard deviation σ.]

FIGURE 13.4 Data from the simple linear regression model: (a) small σ; (b) large σ. [Figure: two scatterplots, each shown with its population regression line.]

EXAMPLE 13.1 Stand on Your Head to Lose Weight?



The authors of the article “On Weight Loss by Wrestlers Who Have Been Standing on Their Heads” (paper presented at the Sixth International Conference on Statistics, Combinatorics, and Related Areas, Forum for Interdisciplinary Mathematics, 1999, with the data also appearing in A Quick Course in Statistical Process Control, Mick Norton, Pearson Prentice Hall, 2005) stated that “amateur wrestlers who are overweight near the end of the weight certification period, but just barely so, have been known to stand on their heads for a minute or two, get on their feet, step back on the scale, and establish that they are in the desired weight class. Using a headstand as the method of last resort has become a fairly common practice in amateur wrestling.”

Does this really work? Data were collected in an experiment in which weight loss was recorded for each wrestler after exercising for 15 minutes and then doing a headstand for 1 minute 45 seconds. Based on these data, the authors of the article concluded that there was in fact a demonstrable weight loss that was greater than that for a control group that exercised for 15 minutes but did not do the headstand. (The authors give a plausible explanation for why this might be the case based on the way blood and other body fluids collect in the head during the headstand and the effect of weighing while these fluids are draining immediately after standing.) The authors also concluded that a simple linear regression model was a reasonable way to describe the relationship between the variables

y = weight loss (in pounds)
and
x = body weight prior to exercise and headstand (in pounds)

Suppose that the actual model equation has α = 0, β = 0.001, and σ = 0.09 (these values are consistent with the findings in the article). The population regression line is shown in Figure 13.5.

FIGURE 13.5 The population regression line for Example 13.1. [Figure: the population regression line y = 0.001x, with the mean y value of 0.19 marked at x = 190.]



If the distribution of the random errors at any fixed weight (x value) is normal, then the variable y = weight loss is normally distributed with

μ_y = α + βx = 0 + 0.001x
σ_y = σ = 0.09

For example, when x = 190 (corresponding to a 190-pound wrestler), weight loss has mean value

μ_y = 0 + 0.001(190) = 0.19

Because the standard deviation of y is σ = 0.09, the interval 0.19 ± 2(0.09) = (0.01, 0.37) includes y values that are within 2 standard deviations of the mean value for y when x = 190. Roughly 95% of the weight loss observations made for 190-pound wrestlers will be in this range.
The slope β = 0.001 is the change in average weight loss associated with each additional pound of body weight.
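The arithmetic in this example can be reproduced in a few lines. The following Python sketch uses the values supposed in Example 13.1 (α = 0, β = 0.001, σ = 0.09) and the standard library's `statistics.NormalDist` to compute the mean weight loss at x = 190, the 2-standard-deviation interval, and the normal probability of falling in that interval. The variable names are illustrative choices, not part of the example.

```python
from statistics import NormalDist

# Values supposed in Example 13.1.
alpha, beta, sigma = 0.0, 0.001, 0.09
x = 190  # body weight in pounds

mean_y = alpha + beta * x  # height of the population regression line at x
weight_loss = NormalDist(mu=mean_y, sigma=sigma)

lower = mean_y - 2 * sigma
upper = mean_y + 2 * sigma
prob_within = weight_loss.cdf(upper) - weight_loss.cdf(lower)

print(f"mean weight loss at x = {x}: {mean_y:.2f} lb")
print(f"2-sd interval: ({lower:.2f}, {upper:.2f})")
print(f"probability of a value in the interval: {prob_within:.4f}")
```

The computed probability is the exact normal-curve area within 2 standard deviations of the mean, which the example summarizes as "roughly 95%".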



More insight into model properties can be gained by thinking of the population of all (x, y) pairs as consisting of many smaller populations. Each one of these smaller populations contains pairs for which x has a fixed value. For example, suppose that in a large population of college students the variables

x = grade point average in major courses
and
y = starting salary after graduation

are related according to the simple linear regression model. Then there is the population of all pairs with x = 3.20 (corresponding to all students with a grade point average of 3.20 in major courses), the population of all pairs having x = 2.75, and so on. The model assumes that for each such population, y is normally distributed with the same standard deviation, and the mean y value (rather than y itself) is linearly related to x.
In practice, the judgment of whether the simple linear regression model is appropriate must be based on how the data were collected and on a scatterplot of the data. The sample observations should be independent of one another. In addition, the scatterplot should show a linear rather than a curved pattern, and the vertical spread of points should be relatively homogeneous throughout the range of x values. Figure 13.6 shows plots with three different patterns; only the first is consistent with the model assumptions.

FIGURE 13.6 Some commonly encountered patterns in scatterplots: (a) consistent with the simple linear regression model; (b) suggests a nonlinear probabilistic model; (c) suggests that variability in y changes with x. [Figure: three scatterplots of y versus x, panels (a), (b), and (c).]



Estimating the Population Regression Line

For the remainder of this chapter, we will proceed with the view that the basic assumptions of the simple linear regression model are reasonable. The values of α and β (y intercept and slope of the population regression line) will almost never be known to an investigator. Instead, these values must first be estimated from the sample data (x1, y1), . . . , (xn, yn).
Let a and b denote point estimates of α and β, respectively. These estimates are based on the method of least squares introduced in Chapter 5. The sum of squared vertical deviations of points in the scatterplot from the least-squares line is smaller than for any other line.



The point estimates of β, the slope, and α, the y intercept of the population regression line, are the slope and y intercept, respectively, of the least-squares line. That is,

b = point estimate of β = Sxy / Sxx
a = point estimate of α = ȳ − bx̄

where

Sxy = Σxy − (Σx)(Σy)/n and Sxx = Σx² − (Σx)²/n

The estimated regression line is the familiar least-squares line

ŷ = a + bx

Let x* denote a specified value of the predictor variable x. Then a + bx* has two different interpretations:

1. It is a point estimate of the mean y value when x = x*.
2. It is a point prediction of an individual y value to be observed when x = x*.
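The formulas for b and a translate directly into code. The following Python sketch is one possible implementation; the function name `least_squares_line` and the small data set are made up for illustration and do not appear in the text.

```python
def least_squares_line(xs, ys):
    """Return (a, b), the intercept and slope of the least-squares line.

    Uses b = Sxy / Sxx and a = y_bar - b * x_bar, with
    Sxy = sum(xy) - sum(x)*sum(y)/n and Sxx = sum(x^2) - (sum(x))^2/n.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    if s_xx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b = s_xy / s_xx
    a = sum_y / n - b * sum_x / n
    return a, b

# Made-up data lying exactly on the line y = 1 + 2x.
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
x_star = 2.5
print(f"a = {a}, b = {b}, a + b*x* = {a + b * x_star}")
```

The single number a + bx* printed at the end serves as both the point estimate of the mean y value at x* and the point prediction of an individual y value at x*.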






EXAMPLE 13.2 Mother’s Age and Baby’s Birth Weight



Medical researchers have noted that adolescent females are much more likely to deliver low-birth-weight babies than are adult females. Because low-birth-weight babies have higher mortality rates, a number of studies have examined the relationship between birth weight and mother’s age for babies born to young mothers.
One such study is described in the article “Body Size and Intelligence in 6-Year-Olds: Are Offspring of Teenage Mothers at Risk?” (Maternal and Child Health Journal [2009]: 847–856). The following data on

x = maternal age (in years)
and
y = birth weight of baby (in grams)

are consistent with summary values given in the referenced article and also with data published by the National Center for Health Statistics.



OBSERVATION     1     2     3     4     5     6     7     8     9    10
x              15    17    18    15    16    19    17    16    18    19
y            2289  3393  3271  2648  2897  3327  2970  2535  3138  3573



A scatterplot of the data is given in Figure 13.7. The scatterplot shows a linear pattern

and the spread in the y values appears to be similar across the range of x values. This

supports the appropriateness of the simple linear regression model.

FIGURE 13.7 Scatterplot of the data from Example 13.2. [Scatterplot: baby’s weight (g) on the vertical axis versus mother’s age (yr) on the horizontal axis.]

The summary statistics (computed from the given sample data) are

n = 10   Σx = 170   Σy = 30,041
Σx² = 2910   Σxy = 515,600   Σy² = 91,785,351




from which

Sxy = Σxy − (Σx)(Σy)/n = 515,600 − (170)(30,041)/10 = 4903.0
Sxx = Σx² − (Σx)²/n = 2910 − (170)²/10 = 20.0
x̄ = 170/10 = 17.0   ȳ = 30,041/10 = 3004.1

This gives

b = Sxy / Sxx = 4903.0 / 20.0 = 245.15
a = ȳ − bx̄ = 3004.1 − (245.15)(17.0) = −1163.45

The equation of the estimated regression line is then

ŷ = a + bx = −1163.45 + 245.15x

A point estimate of the mean birth weight of babies born to 18-year-old mothers results from substituting x = 18 into the estimated equation:

(estimated mean y when x = 18) = a + bx
= −1163.45 + 245.15(18)
= 3249.25 grams

Similarly, we would predict the birth weight of a baby to be born to a particular 18-year-old mother to be

(predicted y value when x = 18) = a + b(18) = 3249.25 grams

The point estimate and the point prediction are identical, because the same x

value was used in each calculation. However, the interpretation of each is different.

One represents our prediction of the weight of a single baby whose mother is 18,

whereas the other represents our estimate of the mean weight of all babies born to

18-year-old mothers. This distinction will become important in Section 13.4, when

we consider interval estimates and predictions.
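The hand calculations in this example can be reproduced from the ten observations. The Python sketch below recomputes Sxy, Sxx, b, and a from the data table and evaluates the fitted line at x = 18; the variable names and the helper `predict` are illustrative choices.

```python
# Data from Example 13.2: maternal age (years) and birth weight (grams).
ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 19]
weights = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]

n = len(ages)
sum_x, sum_y = sum(ages), sum(weights)

# Sxy and Sxx from the computational formulas.
s_xy = sum(x * y for x, y in zip(ages, weights)) - sum_x * sum_y / n
s_xx = sum(x * x for x in ages) - sum_x ** 2 / n

b = s_xy / s_xx                # slope of the least-squares line
a = sum_y / n - b * sum_x / n  # intercept: y_bar - b * x_bar

def predict(x):
    """Height of the estimated regression line at x."""
    return a + b * x

print(f"Sxy = {s_xy:.1f}, Sxx = {s_xx:.1f}")
print(f"b = {b:.2f}, a = {a:.2f}")
print(f"fitted value at x = 18: {predict(18):.2f} grams")
```

The same fitted value is reported whether the goal is to estimate the mean birth weight for all 18-year-old mothers or to predict the weight of one particular baby; only the interpretation differs.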

The least-squares line could have also been fit using a graphing calculator or a

statistical software package. Minitab output for the data of this example is shown

here. Note that Minitab has rounded the values of the estimated coefficients in the

equation of the regression line, which would result in small differences in predictions

based on the line.

Regression Analysis: Birth Weight versus Maternal Age

The regression equation is
Birth Weight = -1163 + 245 Maternal Age

Predictor        Coef   SE Coef      T      P
Constant      -1163.4     783.1  -1.49  0.176
Maternal Age   245.15     45.91   5.34  0.001

S = 205.308   R-Sq = 78.1%   R-Sq(adj) = 75.4%



In Example 13.2, the x values in the sample ranged from 15 to 19. An estimate or prediction should not be attempted for any x value much outside this range. Without sample data for such values, there is no evidence that the estimated linear relationship can be extrapolated very far. Statisticians refer to this potential pitfall as the danger of extrapolation.
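One way to guard against extrapolation in software is to refuse predictions for x values outside the observed range. The Python sketch below is only an illustration: the function `predict_in_range` and its `tolerance` parameter are hypothetical, and how far outside the range counts as "much outside" is a judgment call that the text does not quantify. The coefficients and range are those of Example 13.2.

```python
def predict_in_range(a, b, x, x_min, x_max, tolerance=0.0):
    """Return a + b*x, but refuse x values outside the observed range.

    tolerance is a hypothetical allowance for values slightly outside
    [x_min, x_max]; choosing it is a matter of judgment.
    """
    if not (x_min - tolerance <= x <= x_max + tolerance):
        raise ValueError(
            f"x = {x} is outside the observed range [{x_min}, {x_max}]; "
            "the fitted line may not apply there"
        )
    return a + b * x

# Estimated line and observed x range from Example 13.2.
A_HAT, B_HAT = -1163.45, 245.15
X_MIN, X_MAX = 15, 19

print(predict_in_range(A_HAT, B_HAT, 18, X_MIN, X_MAX))

try:
    predict_in_range(A_HAT, B_HAT, 30, X_MIN, X_MAX)
except ValueError as err:
    print("refused:", err)
```

Raising an error rather than returning a number makes the caller decide explicitly whether extrapolating is defensible in a given application.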



Estimating σ² and σ
The value of σ determines the extent to which observed points (x, y) tend to fall close to or far away from the population regression line. A point estimate of σ is based on

SSResid = Σ(y − ŷ)²

where ŷ1 = a + bx1, . . . , ŷn = a + bxn are the fitted or predicted y values and the residuals are y1 − ŷ1, . . . , yn − ŷn. SSResid is a measure of the extent to which the sample data spread out about the estimated regression line.



DEFINITION

The statistic for estimating the variance σ² is

s_e² = SSResid / (n − 2)

where

SSResid = Σ(y − ŷ)² = Σy² − aΣy − bΣxy

The subscript e in s_e² reminds us that we are estimating the variance of the “errors” or residuals.
The estimate of σ is the estimated standard deviation

s_e = √(s_e²)

The number of degrees of freedom associated with estimating σ² or σ in simple linear regression is n − 2.
The estimates and number of degrees of freedom here have analogs in our previous work involving a single sample x1, x2, . . . , xn. The sample variance s² had a numerator of Σ(x − x̄)², a sum of squared deviations (residuals), and denominator n − 1, the number of degrees of freedom associated with s² and s. The use of x̄ as an estimate of μ in the formula for s² reduces the number of degrees of freedom by 1, from n to n − 1. In simple linear regression, estimation of α and β results in a loss of 2 degrees of freedom, leaving n − 2 as the number of degrees of freedom for SSResid, s_e², and s_e.

The coefficient of determination was defined previously (see Chapter 5) as

r² = 1 − SSResid / SSTo

where

SSTo = Σ(y − ȳ)² = Σy² − (Σy)²/n = Syy



The value of r² can now be interpreted as the proportion of observed y variation that can be explained by (or attributed to) the model relationship. The estimate s_e also gives another assessment of model performance. Roughly speaking, the value of σ represents the magnitude of a typical deviation of a point (x, y) in the population from the population regression line. Similarly, in a rough sense, s_e is the magnitude of a typical sample deviation (residual) from the least-squares line. The smaller the value of s_e, the closer the points in the sample fall to the line and the better the line does in predicting y from x.
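As a check on these definitions, the quantities SSResid, s_e, and r² can be computed for the birth weight data of Example 13.2 and compared with the Minitab output shown there (S = 205.308, R-Sq = 78.1%). The Python sketch below is one way to do this; the variable names are illustrative, and the last lines also evaluate the computational formula Σy² − aΣy − bΣxy as a cross-check.

```python
import math

# Data from Example 13.2: maternal age (years) and birth weight (grams).
ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 19]
weights = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(weights) / n

# Least-squares slope and intercept.
s_xx = sum((x - x_bar) ** 2 for x in ages)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual and total sums of squares from their definitions.
ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, weights))
ss_to = sum((y - y_bar) ** 2 for y in weights)

s_e_squared = ss_resid / (n - 2)  # n - 2 degrees of freedom
s_e = math.sqrt(s_e_squared)
r_squared = 1 - ss_resid / ss_to

# Cross-check with the computational formula for SSResid.
ss_resid_shortcut = (
    sum(y * y for y in weights)
    - a * sum(weights)
    - b * sum(x * y for x, y in zip(ages, weights))
)

print(f"SSResid = {ss_resid:.2f} (shortcut: {ss_resid_shortcut:.2f})")
print(f"s_e = {s_e:.3f}, r^2 = {r_squared:.3f}")
```

The shortcut formula subtracts large numbers of similar size, so in hand calculations it is sensitive to rounding in a and b; the definition-based sum of squared residuals is the safer route in code.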



EXAMPLE 13.3 Predicting Election Outcomes

The authors of the paper “Inferences of Competence from Faces Predict Election Outcomes” (Science [2005]: 1623–1626) found that they could successfully predict the outcome of a U.S. congressional election substantially more than half the time based on the facial appearance of the candidates. In the study described in the paper, participants were shown photos of two candidates for a U.S. Senate or House of Representatives election. Each participant was asked to look at the photos and then indicate which candidate he or she thought was more competent. The two candidates were labeled A and B. If a participant recognized either candidate, data from that participant were not used in the analysis. The proportion of participants who chose candidate A as the more competent was computed. After the election, the difference in votes (candidate A − candidate B) expressed as a proportion of the total votes cast in the election was also computed. This difference falls between +1 and −1. It is 0 for an election where both candidates receive the same number of votes, positive for an election where candidate A received more votes than candidate B (with +1 indicating that candidate A received all of the votes), and negative for an election where candidate A received fewer votes than candidate B.
This process was carried out for a large number of congressional races. A subset of the resulting data (approximate values read from a graph that appears in the paper) is given in the accompanying table, which also includes the predicted values and residuals for the least-squares line fit to these data.



Competent     Difference in      Predicted
Proportion    Vote Proportion    y Value      Residual
0.20          −0.70              −0.389       −0.311
0.23          −0.40              −0.347       −0.053
0.40          −0.35              −0.109       −0.241
0.35           0.18              −0.179        0.359
0.40           0.38              −0.109        0.489
0.45          −0.10              −0.040       −0.060
0.50           0.20               0.030        0.170
0.55          −0.30               0.100       −0.400
0.60           0.30               0.170        0.130
0.68           0.18               0.281       −0.101
0.70           0.50               0.309        0.191
0.76           0.22               0.393       −0.173



The scatterplot (Figure 13.8) suggests a positive linear relationship between
x = proportion of participants who judged candidate A as the more competent
and
y = difference in vote proportion.




The summary statistics are

n = 12   Σx = 5.82   Σy = 0.11
Σx² = 3.1804   Σxy = 0.5526   Σy² = 1.5101

from which we calculate

b = 1.3957   a = −0.6678
SSResid = 0.81228   SSTo = 1.50909

Thus,

r² = 1 − SSResid / SSTo = 1 − 0.81228 / 1.50909 = 1 − 0.538 = 0.462
s_e² = SSResid / (n − 2) = 0.81228 / 10 = 0.081

and

s_e = √0.081 = 0.285

FIGURE 13.8 Minitab scatterplot for Example 13.3. [Scatterplot: difference in vote proportion on the vertical axis versus competent proportion on the horizontal axis.]



Approximately 46.2% of the observed variation in the difference in vote proportion y can be attributed to the probabilistic linear relationship with proportion of participants who judged the candidate to be more competent based on facial appearance alone. The magnitude of a typical sample deviation from the least-squares line is about 0.285, which is reasonably small in comparison to the y values themselves. The model appears to be useful for estimation and prediction; in Section 13.2, we show how a model utility test can be used to judge whether this is indeed the case.
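The values reported in this example can be recovered from the six summary statistics alone. The Python sketch below does so, assuming the standard identity SSResid = SSTo − b·Sxy for a least-squares line (an identity the example itself does not state); the variable names are illustrative.

```python
import math

# Summary statistics reported in Example 13.3.
n = 12
sum_x, sum_y = 5.82, 0.11
sum_x2, sum_xy, sum_y2 = 3.1804, 0.5526, 1.5101

s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
ss_to = sum_y2 - sum_y ** 2 / n  # SSTo = Syy

b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Assumed identity for a least-squares fit: SSResid = SSTo - b * Sxy.
ss_resid = ss_to - b * s_xy

r_squared = 1 - ss_resid / ss_to
s_e = math.sqrt(ss_resid / (n - 2))

print(f"b = {b:.4f}, a = {a:.4f}")
print(f"SSResid = {ss_resid:.5f}, SSTo = {ss_to:.5f}")
print(f"r^2 = {r_squared:.3f}, s_e = {s_e:.3f}")
```

Working from summary statistics is convenient when the raw data are unavailable, but the results inherit any rounding already present in the reported sums.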



A key assumption of the simple linear regression model is that the random deviation e in the model equation is normally distributed. In Section 13.3, we will indicate

how the residuals can be used to determine whether this is plausible.






EXERCISES 13.1 - 13.11

13.1 Let x be the size of a house (in square feet) and y

be the amount of natural gas used (therms) during a

specified period. Suppose that for a particular community, x and y are related according to the simple linear

regression model with

β = slope of population regression line = 0.017
α = y intercept of population regression line = −5.0

Houses in this community range in size from 1000

to 3000 square feet.

a. What is the equation of the population regression

line?

b. Graph the population regression line by first finding the point on the line corresponding to x = 1000 and then the point corresponding to x = 2000, and drawing a line through these points.

c. What is the mean value of gas usage for houses with

2100 sq. ft. of space?

d. What is the average change in usage associated with

a 1 sq. ft. increase in size?

e. What is the average change in usage associated with

a 100 sq. ft. increase in size?

f. Would you use the model to predict mean usage for

a 500 sq. ft. house? Why or why not?



13.2 The flow rate in a device used for air quality measurement depends on the pressure drop x (inches of water) across the device’s filter. Suppose that for x values between 5 and 20, these two variables are related according to the simple linear regression model with population regression line y = −0.12 + 0.095x.

a. What is the mean flow rate for a pressure drop of

10 inches? A drop of 15 inches?

b. What is the average change in flow rate associated

with a 1 inch increase in pressure drop? Explain.



13.3 The paper “Predicting Yolk Height, Yolk Width, Albumen Length, Eggshell Weight, Egg Shape Index, Eggshell Thickness, Egg Surface Area of Japanese Quails Using Various Egg Traits as Regressors” (International Journal of Poultry Science [2008]: 85–88) suggests that the simple linear regression model is reasonable for describing the relationship between y = eggshell thickness (in micrometers) and x = egg length (mm) for quail eggs. Suppose that the population regression line is y = 0.135 + 0.003x and that σ = 0.005. Then, for a fixed x value, y has a normal distribution with mean 0.135 + 0.003x and standard deviation 0.005.

a. What is the mean eggshell thickness for quail eggs

that are 15 mm in length? For quail eggs that are

17 mm in length?

b. What is the probability that a quail egg with a length

of 15 mm will have a shell thickness that is greater

than 0.18 mm?

c. Approximately what proportion of quail eggs of

length 14 mm has a shell thickness of greater than

.175? Less than .178?



13.4 A sample of small cars was selected, and the values of x = horsepower and y = fuel efficiency (mpg) were determined for each car. Fitting the simple linear regression model gave the estimated regression equation ŷ = 44.0 − 0.150x.
a. How would you interpret b = −0.150?
b. Substituting x = 100 gives ŷ = 29.0. Give two different interpretations of this number.
c. What happens if you predict efficiency for a car with a 300-horsepower engine? Why do you think this has occurred?
d. Interpret r² = 0.680 in the context of this problem.
e. Interpret s_e = 3.0 in the context of this problem.



13.5 Suppose that a simple linear regression model is appropriate for describing the relationship between y = house price (in dollars) and x = house size (in square feet) for houses in a large city. The population regression line is y = 23,000 + 47x and σ = 5000.

a. What is the average change in price associated with

one extra square foot of space? With an additional

100 sq. ft. of space?

b. What proportion of 1800 sq. ft. homes would be

priced over $110,000? Under $100,000?



13.6 a. Explain the difference between the line y = α + βx and the line ŷ = a + bx.
b. Explain the difference between β and b.
c. Let x* denote a particular value of the independent variable. Explain the difference between α + βx* and a + bx*.
d. Explain the difference between σ and s_e.


