13.1 Simple Linear Regression Model
FIGURE 13.1 A deviation from the deterministic part of a probabilistic model. [The graph of y = 50 − 10x + x² is shown; at x = 4 the curve has height 26, and the observation (4, 30) lies a deviation e = 4 above it.]
Simple Linear Regression
The simple linear regression model is a special case of the general probabilistic model
in which the deterministic function f (x) is linear (so its graph is a straight line).
DEFINITION
The simple linear regression model assumes that there is a line with vertical or y intercept α and slope β, called the population regression line. When a value of the independent variable x is fixed and an observation on the dependent variable y is made,

    y = α + βx + e

where e is a random deviation (random error).
Without the random deviation e, all observed (x, y) points would fall exactly on
the population regression line. The inclusion of e in the model equation recognizes that points will deviate from the line by a random amount.
Figure 13.2 shows two observations in relation to the population regression line.
FIGURE 13.2 Two observations and deviations from the population regression line. [The population regression line, with slope β and vertical intercept α, is shown with an observation at x = x₁ lying above the line (positive deviation e₁) and an observation at x = x₂ lying below the line (negative deviation e₂).]
Before we make an observation on y for any particular value of x, we are uncertain
about the value of e. It could be negative, positive, or even 0. Also, it might be quite
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Chapter 13 Simple Linear Regression and Correlation: Inferential Methods
large in magnitude (a point far from the population regression line) or quite small (a
point very close to the line). In this chapter, we make some assumptions about the
distribution of e in repeated sampling at any particular x value.
Basic Assumptions of the Simple Linear Regression Model
1. The distribution of e at any particular x value has mean value 0. That is, μₑ = 0.
2. The standard deviation of e (which describes the spread of its distribution) is the same for any particular value of x. This standard deviation is denoted by σ.
3. The distribution of e at any particular x value is normal.
4. The random deviations e1, e2, . . . , en associated with different observations
are independent of one another.
These assumptions about the e term in the simple linear regression model also
imply that there is variability in the y values observed at any particular value of x.
Consider y when x has some fixed value x*, so that

    y = α + βx* + e
Because α and β are fixed numbers, α + βx* is also a fixed number. The sum of a fixed number and a normally distributed variable (e) is also a normally distributed variable (the bell-shaped curve is simply relocated), so y itself has a normal distribution. Furthermore, μₑ = 0 implies that the mean value of y is α + βx*, the height of the population regression line above the value x*. Finally, because there is no variability in the fixed number α + βx*, the standard deviation of y is the same as the standard deviation of e. These properties are summarized in the following box.
At any fixed x value, y has a normal distribution, with

    (mean y value for fixed x) = (height of the population regression line above x) = α + βx

and

    (standard deviation of y for a fixed x) = σ
The slope β of the population regression line is the average change in y associated with a 1-unit increase in x. The y intercept α is the height of the population line when x = 0. The value of σ determines the extent to which (x, y) observations deviate from the population line. When σ is small, most observations will be quite close to the line, but when σ is large, there are likely to be some large deviations.
The key features of the model are illustrated in Figures 13.3 and 13.4. Notice that the three normal curves in Figure 13.3 have identical spreads. This is a consequence of σₑ = σ, which implies that the variability in the y values at a particular x does not depend on the value of x.
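These model properties are easy to see in a short simulation. The sketch below uses hypothetical values (α = 50, β = 10, σ = 4, chosen only for illustration) and checks that repeated y observations at a fixed x are centered at α + βx with standard deviation σ:

```python
import random
import statistics

# Hypothetical population parameters (for illustration only)
alpha, beta, sigma = 50.0, 10.0, 4.0

def observe_y(x, rng):
    """One observation from the simple linear regression model:
    y = alpha + beta*x + e, where e is normal with mean 0 and sd sigma."""
    e = rng.gauss(0.0, sigma)        # random deviation e
    return alpha + beta * x + e

rng = random.Random(1)
x_star = 3.0
ys = [observe_y(x_star, rng) for _ in range(100_000)]

# Mean of y at fixed x should be close to alpha + beta*x_star = 80,
# and the standard deviation should be close to sigma = 4.
print(round(statistics.mean(ys), 1), round(statistics.stdev(ys), 1))
```

Changing `x_star` shifts the center of the distribution along the line but leaves the spread unchanged, which is exactly the constant-σ assumption.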
FIGURE 13.3 Illustration of the simple linear regression model. [The population regression line y = α + βx (the line of mean values) is shown with normal curves centered at the heights α + βx₁, α + βx₂, and α + βx₃ above three different x values x₁, x₂, x₃; each curve has standard deviation σ.]

FIGURE 13.4 Data from the simple linear regression model: (a) small σ; (b) large σ. [Each panel shows a scatter of data points around the population regression line.]
EXAMPLE 13.1  Stand on Your Head to Lose Weight?
The authors of the article “On Weight Loss by Wrestlers Who Have Been Standing on Their Heads” (paper presented at the Sixth International Conference on Statistics, Combinatorics, and Related Areas, Forum for Interdisciplinary Mathematics, 1999, with the data also appearing in A Quick Course in Statistical Process Control, Mick Norton, Pearson Prentice Hall, 2005) stated that “amateur wrestlers
who are overweight near the end of the weight certification period, but just barely
so, have been known to stand on their heads for a minute or two, get on their feet,
step back on the scale, and establish that they are in the desired weight class. Using
a headstand as the method of last resort has become a fairly common practice in
amateur wrestling.”
Does this really work? Data were collected in an experiment in which weight loss
was recorded for each wrestler after exercising for 15 minutes and then doing a headstand for 1 minute 45 seconds. Based on these data, the authors of the article concluded that there was in fact a demonstrable weight loss that was greater than that for
a control group that exercised for 15 minutes but did not do the headstand. (The
authors give a plausible explanation for why this might be the case based on the way
blood and other body fluids collect in the head during the headstand and the effect
of weighing while these fluids are draining immediately after standing.) The authors
also concluded that a simple linear regression model was a reasonable way to describe
the relationship between the variables
y = weight loss (in pounds)
and
x = body weight prior to exercise and headstand (in pounds)
Suppose that the actual model equation has α = 0, β = 0.001, and σ = 0.09 (these values are consistent with the findings in the article). The population regression line is shown in Figure 13.5.
FIGURE 13.5 The population regression line for Example 13.1. [The line y = 0.001x is shown; its height at x = 190 is the mean y value 0.19.]
If the distribution of the random errors at any fixed weight (x value) is normal, then the variable y = weight loss is normally distributed with

    μᵧ = α + βx = 0 + 0.001x
    σᵧ = σ = 0.09
For example, when x = 190 (corresponding to a 190-pound wrestler), weight loss has mean value

    μᵧ = 0 + 0.001(190) = 0.19

Because the standard deviation of y is σ = 0.09, the interval 0.19 ± 2(0.09) = (0.01, 0.37) includes y values that are within 2 standard deviations of the mean value for y when x = 190. Roughly 95% of the weight loss observations made for 190-pound wrestlers will be in this range.
The slope β = 0.001 is the change in average weight loss associated with each additional pound of body weight.
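The mean, the 2-standard-deviation interval, and the "roughly 95%" figure can all be checked directly with the standard library's normal distribution, using the model values from this example:

```python
from statistics import NormalDist

# Model values from Example 13.1
alpha, beta, sigma = 0.0, 0.001, 0.09
x = 190                                 # a 190-pound wrestler

mean_y = alpha + beta * x               # mean weight loss, 0.19 pounds
lo, hi = mean_y - 2 * sigma, mean_y + 2 * sigma   # (0.01, 0.37)

# Proportion of weight-loss observations within 2 SDs of the mean
y_dist = NormalDist(mu=mean_y, sigma=sigma)
prop = y_dist.cdf(hi) - y_dist.cdf(lo)

print(round(mean_y, 2), (round(lo, 2), round(hi, 2)), round(prop, 3))
# → 0.19 (0.01, 0.37) 0.954
```

The exact normal probability within 2 standard deviations is about 0.954, which is the source of the usual "roughly 95%" statement.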
More insight into model properties can be gained by thinking of the population
of all (x, y) pairs as consisting of many smaller populations. Each one of these smaller
populations contains pairs for which x has a fixed value. For example, suppose that in
a large population of college students the variables
x = grade point average in major courses
and
y = starting salary after graduation
are related according to the simple linear regression model. Then there is the population of all pairs with x = 3.20 (corresponding to all students with a grade point average of 3.20 in major courses), the population of all pairs having x = 2.75, and so on.
The model assumes that for each such population, y is normally distributed with
the same standard deviation, and the mean y value (rather than y itself ) is linearly
related to x.
In practice, the judgment of whether the simple linear regression model is appropriate must be based on how the data were collected and on a scatterplot of the
data. The sample observations should be independent of one another. In addition,
the scatterplot should show a linear rather than a curved pattern, and the vertical
spread of points should be relatively homogeneous throughout the range of x values.
Figure 13.6 shows plots with three different patterns; only the first is consistent with
the model assumptions.
FIGURE 13.6 Some commonly encountered patterns in scatterplots: (a) consistent with the simple linear regression model; (b) suggests a nonlinear probabilistic model; (c) suggests that variability in y changes with x.
Estimating the Population Regression Line
For the remainder of this chapter, we will proceed with the view that the basic assumptions of the simple linear regression model are reasonable. The values of α and β (the y intercept and slope of the population regression line) will almost never be known to an investigator. Instead, these values must first be estimated from the sample data (x₁, y₁), . . . , (xₙ, yₙ).
Let a and b denote point estimates of α and β, respectively. These estimates are
based on the method of least squares introduced in Chapter 5. The sum of squared
vertical deviations of points in the scatterplot from the least-squares line is smaller
than for any other line.
The point estimates of β, the slope, and α, the y intercept of the population regression line, are the slope and y intercept, respectively, of the least-squares line. That is,

    b = point estimate of β = Sxy / Sxx
    a = point estimate of α = ȳ − b x̄

where

    Sxy = Σxy − (Σx)(Σy)/n    and    Sxx = Σx² − (Σx)²/n

The estimated regression line is the familiar least-squares line

    ŷ = a + bx
Let x* denote a specified value of the predictor variable x. Then a + bx* has two different interpretations:
1. It is a point estimate of the mean y value when x = x*.
2. It is a point prediction of an individual y value to be observed when x = x*.
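The formulas in the box translate directly into code. A minimal sketch (the function and variable names here are our own, and the small data set is made up purely to exercise the functions):

```python
def least_squares(xs, ys):
    """Point estimates b (slope) and a (intercept) via the
    computational formulas b = Sxy / Sxx and a = ybar - b * xbar."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    b = s_xy / s_xx
    a = sum_y / n - b * sum_x / n
    return a, b

def predict(a, b, x_star):
    """a + b*x*: estimate of the mean y at x*, or prediction of one y."""
    return a + b * x_star

# Tiny illustrative data set
a, b = least_squares([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
print(round(b, 2), round(a, 2))          # → 1.94 0.15
print(round(predict(a, b, 2), 2))        # → 4.03
```

Note that the single number `predict(a, b, x_star)` serves as both the point estimate of the mean y and the point prediction of an individual y; only the interpretation differs.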
EXAMPLE 13.2
Mother’s Age and Baby’s Birth Weight
Medical researchers have noted that adolescent females are much more likely to
deliver low-birth-weight babies than are adult females. Because low-birth-weight babies have higher mortality rates, a number of studies have examined the relationship
between birth weight and mother’s age for babies born to young mothers.
One such study is described in the article “Body Size and Intelligence in 6-Year-
Olds: Are Offspring of Teenage Mothers at Risk?” (Maternal and Child Health
Journal [2009]: 847–856). The following data on
x = maternal age (in years)
and
y = birth weight of baby (in grams)
are consistent with summary values given in the referenced article and also with data
published by the National Center for Health Statistics.
Observation:    1     2     3     4     5     6     7     8     9    10
x:             15    17    18    15    16    19    17    16    18    19
y:           2289  3393  3271  2648  2897  3327  2970  2535  3138  3573
A scatterplot of the data is given in Figure 13.7. The scatterplot shows a linear pattern
and the spread in the y values appears to be similar across the range of x values. This
supports the appropriateness of the simple linear regression model.
FIGURE 13.7 Scatterplot of the data from Example 13.2. [Baby's weight (g), from 2500 to 3500, is plotted against mother's age (yr), from 15 to 19.]
The summary statistics (computed from the given sample data) are

    n = 10    Σx = 170    Σy = 30,041
    Σx² = 2910    Σxy = 515,600    Σy² = 91,785,351
from which

    Sxy = Σxy − (Σx)(Σy)/n = 515,600 − (170)(30,041)/10 = 4903.0
    Sxx = Σx² − (Σx)²/n = 2910 − (170)²/10 = 20.0
    x̄ = 170/10 = 17.0    ȳ = 30,041/10 = 3004.1

This gives

    b = Sxy / Sxx = 4903.0 / 20.0 = 245.15
    a = ȳ − b x̄ = 3004.1 − (245.15)(17.0) = −1163.45

The equation of the estimated regression line is then

    ŷ = a + bx = −1163.45 + 245.15x
A point estimate of the mean birth weight of babies born to 18-year-old mothers results from substituting x = 18 into the estimated equation:

    (estimated mean y when x = 18) = a + b(18)
                                   = −1163.45 + 245.15(18)
                                   = 3249.25 grams

Similarly, we would predict the birth weight of a baby to be born to a particular 18-year-old mother to be

    (predicted y value when x = 18) = a + b(18) = 3249.25 grams
The point estimate and the point prediction are identical, because the same x
value was used in each calculation. However, the interpretation of each is different.
One represents our prediction of the weight of a single baby whose mother is 18,
whereas the other represents our estimate of the mean weight of all babies born to
18-year-old mothers. This distinction will become important in Section 13.4, when
we consider interval estimates and predictions.
The least-squares line could have also been fit using a graphing calculator or a
statistical software package. Minitab output for the data of this example is shown
here. Note that Minitab has rounded the values of the estimated coefficients in the
equation of the regression line, which would result in small differences in predictions
based on the line.
Regression Analysis: Birth Weight versus Maternal Age

The regression equation is
Birth Weight = –1163 + 245 Maternal Age

Predictor         Coef   SE Coef      T      P
Constant       –1163.4     783.1  –1.49  0.176
Maternal Age    245.15     45.91   5.34  0.001

S = 205.308   R-Sq = 78.1%   R-Sq(adj) = 75.4%
In Example 13.2, the x values in the sample ranged from 15 to 19. An estimate or prediction should not be attempted for any x value much outside this range. Without sample data for such values, there is no evidence that the estimated linear relationship can be extrapolated very far. Statisticians refer to this potential pitfall as the danger of extrapolation.
Estimating σ² and σ
The value of σ determines the extent to which observed points (x, y) tend to fall close to or far away from the population regression line. A point estimate of σ is based on

    SSResid = Σ(y − ŷ)²

where ŷ₁ = a + bx₁, . . . , ŷₙ = a + bxₙ are the fitted or predicted y values and the residuals are y₁ − ŷ₁, . . . , yₙ − ŷₙ. SSResid is a measure of the extent to which the sample data spread out about the estimated regression line.
DEFINITION
The statistic for estimating the variance σ² is

    sₑ² = SSResid / (n − 2)

where

    SSResid = Σ(y − ŷ)² = Σy² − aΣy − bΣxy

The subscript e in sₑ² reminds us that we are estimating the variance of the “errors” or residuals.

The estimate of σ is the estimated standard deviation

    sₑ = √sₑ²

The number of degrees of freedom associated with estimating σ² or σ in simple linear regression is n − 2.
The estimates and number of degrees of freedom here have analogs in our previous work involving a single sample x₁, x₂, . . . , xₙ. The sample variance s² had a numerator of Σ(x − x̄)², a sum of squared deviations (residuals), and denominator n − 1, the number of degrees of freedom associated with s² and s. The use of x̄ as an estimate of μ in the formula for s² reduces the number of degrees of freedom by 1, from n to n − 1. In simple linear regression, estimation of α and β results in a loss of 2 degrees of freedom, leaving n − 2 as the number of degrees of freedom for SSResid, sₑ², and sₑ.
The coefficient of determination was defined previously (see Chapter 5) as

    r² = 1 − SSResid / SSTo

where

    SSTo = Σ(y − ȳ)² = Σy² − (Σy)²/n = Syy
The value of r² can now be interpreted as the proportion of observed y variation that can be explained by (or attributed to) the model relationship. The estimate sₑ also gives another assessment of model performance. Roughly speaking, the value of σ represents the magnitude of a typical deviation of a point (x, y) in the population from the population regression line. Similarly, in a rough sense, sₑ is the magnitude
of a typical sample deviation (residual) from the least-squares line. The smaller the
value of se, the closer the points in the sample fall to the line and the better the line
does in predicting y from x.
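Both summary measures follow directly from the residuals. A minimal sketch (function names are our own), checked against Example 13.2, where Minitab reported S = 205.308 and R-Sq = 78.1%:

```python
def fit_summary(xs, ys, a, b):
    """SSResid, SSTo, s_e (on n - 2 df), and r^2 for the line a + bx."""
    n = len(xs)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    ss_resid = sum(r * r for r in residuals)          # SSResid
    y_bar = sum(ys) / n
    ss_to = sum((y - y_bar) ** 2 for y in ys)         # SSTo
    s_e = (ss_resid / (n - 2)) ** 0.5                 # estimate of sigma
    r_sq = 1 - ss_resid / ss_to                       # coefficient of determination
    return ss_resid, ss_to, s_e, r_sq

# Data and fitted line from Example 13.2 (a = -1163.45, b = 245.15)
age    = [15, 17, 18, 15, 16, 19, 17, 16, 18, 19]
weight = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]
ss_resid, ss_to, s_e, r_sq = fit_summary(age, weight, -1163.45, 245.15)
print(round(s_e, 1), round(r_sq, 3))   # → 205.3 0.781
```

The agreement with the Minitab output (S = 205.308, R-Sq = 78.1%) confirms that those two quantities are exactly sₑ and r².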
EXAMPLE 13.3
Predicting Election Outcomes
The authors of the paper “Inferences of Competence from Faces Predict Election
Outcomes” (Science [2005]: 1623–1626) found that they could successfully predict
the outcome of a U.S. congressional election substantially more than half the time
based on the facial appearance of the candidates. In the study described in the paper,
participants were shown photos of two candidates for a U.S. Senate or House of
Representatives election. Each participant was asked to look at the photos and then
indicate which candidate he or she thought was more competent. The two candidates
were labeled A and B. If a participant recognized either candidate, data from that
participant were not used in the analysis. The proportion of participants who chose
candidate A as the more competent was computed. After the election, the difference in votes (candidate A − candidate B) expressed as a proportion of the total votes cast in the election was also computed. This difference falls between −1 and +1. It is 0 for an election where both candidates receive the same number of votes, positive for an election where candidate A received more votes than candidate B (with +1 indicating that candidate A received all of the votes), and negative for an election where candidate A received fewer votes than candidate B.
This process was carried out for a large number of congressional races. A subset
of the resulting data (approximate values read from a graph that appears in the paper)
is given in the accompanying table, which also includes the predicted values and residuals for the least-squares line fit to these data.
Competent    Difference in      Predicted
Proportion   Vote Proportion    y Value     Residual
   0.20          −0.70          −0.389      −0.311
   0.23          −0.40          −0.347      −0.053
   0.40          −0.35          −0.109      −0.241
   0.35           0.18          −0.179       0.359
   0.40           0.38          −0.109       0.489
   0.45          −0.10          −0.040      −0.060
   0.50           0.20           0.030       0.170
   0.55          −0.30           0.100      −0.400
   0.60           0.30           0.170       0.130
   0.68           0.18           0.281      −0.101
   0.70           0.50           0.309       0.191
   0.76           0.22           0.393      −0.173
The scatterplot (Figure 13.8) suggests a positive linear relationship between
x = proportion of participants who judged candidate A as the more competent
and
y = difference in vote proportion.
The summary statistics are

    n = 12    Σx = 5.82    Σy = 0.11
    Σx² = 3.1804    Σxy = 0.5526    Σy² = 1.5101

from which we calculate

    b = 1.3957    a = −0.6678
    SSResid = 0.81228    SSTo = 1.50909

Thus,

    r² = 1 − SSResid/SSTo = 1 − 0.81228/1.50909 = 1 − 0.538 = 0.462

    sₑ² = SSResid/(n − 2) = 0.81228/10 = 0.081

and

    sₑ = √0.081 = 0.285
FIGURE 13.8 Minitab scatterplot for Example 13.3. [Difference in vote proportion, from −0.75 to 0.50, is plotted against competent proportion, from 0.2 to 0.8.]
Approximately 46.2% of the observed variation in the difference in vote proportion y can be attributed to the probabilistic linear relationship with the proportion of participants who judged the candidate to be more competent based on facial appearance alone. The magnitude of a typical sample deviation from the least-squares line is about .285, which is reasonably small in comparison to the y values themselves. The model appears to be useful for estimation and prediction; in Section 13.2, we show how a model utility test can be used to judge whether this is indeed the case.
A key assumption of the simple linear regression model is that the random deviation e in the model equation is normally distributed. In Section 13.3, we will indicate
how the residuals can be used to determine whether this is plausible.
EXERCISES 13.1–13.11
13.1 Let x be the size of a house (in square feet) and y be the amount of natural gas used (therms) during a specified period. Suppose that for a particular community, x and y are related according to the simple linear regression model with

    β = slope of population regression line = 0.017
    α = y intercept of population regression line = −5.0

Houses in this community range in size from 1000 to 3000 square feet.
a. What is the equation of the population regression line?
b. Graph the population regression line by first finding the point on the line corresponding to x = 1000 and then the point corresponding to x = 2000, and drawing a line through these points.
c. What is the mean value of gas usage for houses with
2100 sq. ft. of space?
d. What is the average change in usage associated with
a 1 sq. ft. increase in size?
e. What is the average change in usage associated with
a 100 sq. ft. increase in size?
f. Would you use the model to predict mean usage for
a 500 sq. ft. house? Why or why not?
13.2 The flow rate in a device used for air quality measurement depends on the pressure drop x (inches of water) across the device’s filter. Suppose that for x values between 5 and 20, these two variables are related according to the simple linear regression model with population regression line y = −0.12 + 0.095x.
a. What is the mean flow rate for a pressure drop of
10 inches? A drop of 15 inches?
b. What is the average change in flow rate associated
with a 1 inch increase in pressure drop? Explain.
13.3 The paper “Predicting Yolk Height, Yolk Width, Albumen Length, Eggshell Weight, Egg Shape Index, Eggshell Thickness, Egg Surface Area of Japanese Quails Using Various Egg Traits as Regressors” (International Journal of Poultry Science [2008]: 85–88) suggests that the simple linear regression model is reasonable for describing the relationship between y = eggshell thickness (in micrometers) and x = egg length (mm) for quail eggs. Suppose that the population regression line is y = 0.135 + 0.003x and that σ = 0.005. Then, for a fixed x value, y has a normal distribution with mean 0.135 + 0.003x and standard deviation 0.005.
a. What is the mean eggshell thickness for quail eggs
that are 15 mm in length? For quail eggs that are
17 mm in length?
b. What is the probability that a quail egg with a length
of 15 mm will have a shell thickness that is greater
than 0.18 mm?
c. Approximately what proportion of quail eggs of length 14 mm has a shell thickness greater than 0.175? Less than 0.178?
13.4 A sample of small cars was selected, and the values of x = horsepower and y = fuel efficiency (mpg) were determined for each car. Fitting the simple linear regression model gave the estimated regression equation ŷ = 44.0 − 0.150x.
a. How would you interpret b = −0.150?
b. Substituting x = 100 gives ŷ = 29.0. Give two different interpretations of this number.
c. What happens if you predict efficiency for a car with a 300-horsepower engine? Why do you think this has occurred?
d. Interpret r² = 0.680 in the context of this problem.
e. Interpret sₑ = 3.0 in the context of this problem.
13.5 Suppose that a simple linear regression model is appropriate for describing the relationship between y = house price (in dollars) and x = house size (in square feet) for houses in a large city. The population regression line is y = 23,000 + 47x and σ = 5000.
a. What is the average change in price associated with
one extra square foot of space? With an additional
100 sq. ft. of space?
b. What proportion of 1800 sq. ft. homes would be
priced over $110,000? Under $100,000?
13.6 a. Explain the difference between the line y = α + βx and the line ŷ = a + bx.
b. Explain the difference between β and b.
c. Let x* denote a particular value of the independent variable. Explain the difference between α + βx* and a + bx*.
d. Explain the difference between σ and sₑ.