2.1 Step 1 – Estimation of the Marginal Implicit Price of Open Space
S.-H. Cho
where $\ln p_i$ is the natural log of the value of house $i$; $x_i$ is a vector of factors determining the value of house $i$; $\ln o_i$ is the natural log of open space in the vicinity of house $i$; $\ln \hat{o}_i$ is the predicted value from (2); the instruments are variables that are correlated with $\ln o_i$ and uncorrelated with $\varepsilon_i$; and $(\varepsilon_i, \eta_i)$ are random disturbances with expected values of zero and unknown variances. The instruments used in (2) are identified in Table 1.
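The role of the instruments can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the first-step regression in (2) is an ordinary least squares fit of $\ln o_i$ on the instruments with an intercept; the excerpt does not state the exact form of (2). The function name `first_stage_fitted` and the simulated data are hypothetical.

```python
import numpy as np

def first_stage_fitted(ln_o, Z):
    """Fitted values from an OLS regression of ln(open space) on instruments Z.

    Assumption: a plain OLS first stage with an intercept; the chapter's
    actual first-step specification may differ.
    """
    Z1 = np.column_stack([np.ones(len(ln_o)), Z])        # add intercept column
    coef, *_ = np.linalg.lstsq(Z1, ln_o, rcond=None)     # least-squares coefficients
    return Z1 @ coef

# Simulated (hypothetical) data: 200 houses, 3 instruments.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))
ln_o = 1.0 + Z @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.1, size=200)

ln_o_hat = first_stage_fitted(ln_o, Z)   # predicted ln(open space) for the second step
resid = ln_o - ln_o_hat                  # first-stage residuals
```

By construction the fitted values are a linear combination of the instruments, so the residuals are orthogonal to every instrument; this is what lets the predicted value stand in for the endogenous open-space variable in the hedonic equation.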
The GWR hedonic model with spatially autocorrelated disturbances is:

$$\ln p_i = \sum_k \beta_k(u_i, v_i)\,\hat{x}_{ik} + \varepsilon_i, \qquad \varepsilon_i = \lambda \sum_{j=1,\, j \neq i}^{n} w_{ij}\,\varepsilon_j + \xi_i, \qquad \xi_i \sim \mathrm{iid}(0, \sigma^2), \tag{3}$$

where $\hat{x}_{ik}$ is a vector of exogenous variables, including the predicted value $\ln \hat{o}_i$; $(u_i, v_i)$ denotes the coordinates of the $i$th location in the housing market; $\beta_k(u_i, v_i)$ represents the local parameters associated with house $i$; $w_{ij}$ is an element of an $n$ by $n$ spatial weighting matrix between points $i$ and $j$; and $\lambda$ is a spatial error autoregressive parameter.
The specification in (3) allows a continuous surface of parameter values with spatially autocorrelated disturbances, and measurements taken at certain points denote
the spatial heterogeneity of the surface (Fotheringham et al. 2002). Previous studies
have found that a log transformation of the distance and area explanatory variables
generally performs better than a simple linear functional form, as the log transformation captures the declining effects of these distance variables (Bin and Polasky 2004;
Iwata et al. 2000; Mahan et al. 2000). Thus, a natural log transformation of the
distance and area-related variables is used in this study.
Given estimation of (3), GWR residuals are tested for spatial error autocorrelation using a Lagrange Multiplier (LM) test (Anselin 1988). A row-standardized
inverse distance matrix was used to test the hypothesis of spatial error independence. Rejection of the null hypothesis suggests a GWR-spatial autoregressive error
model (GWR-SEM) as a way to address spatial heterogeneity and spatial error
autocorrelation. The GWR-SEM combines well-founded methods typically used in
conventional spatial econometric analyses, i.e., the Cochrane–Orcutt method of filtering dependent and explanatory variables to address spatial error autocorrelation
(Anselin 1988), with local regression techniques in a parametric framework. The
filtering mechanism $(I - \lambda W)$ partials out spatial error autocorrelation associated
with the explanatory and dependent variables while estimating local coefficients. It
helps to envision GWR as running n parametric regressions at n locations to control
spatial heterogeneity, and then testing whether the residuals generated by these local
regressions are spatially correlated. If the hypothesis of no spatial autocorrelation is
rejected, conventional methods are applied to filter the dependent and explanatory
variables (e.g., Anselin 1988, p. 183), and the GWR model is estimated again using
the transformed variables.
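The weighting and filtering steps described above can be sketched in Python as follows. The sketch assumes Euclidean distances between distinct house coordinates and takes the autoregressive parameter as given; the function names `inverse_distance_weights` and `spatial_filter`, the toy coordinates, and the value 0.4 are hypothetical.

```python
import numpy as np

def inverse_distance_weights(coords):
    """Row-standardized inverse-distance weight matrix W (zero diagonal).

    Assumption: all coordinate pairs are distinct, so off-diagonal
    distances are strictly positive.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    with np.errstate(divide="ignore"):
        w = 1.0 / dist                      # diagonal becomes inf here
    np.fill_diagonal(w, 0.0)                # a house is not its own neighbor
    return w / w.sum(axis=1, keepdims=True) # each row sums to one

def spatial_filter(v, W, lam):
    """Apply the filter (I - lam * W) to a vector or matrix of variables."""
    return v - lam * (W @ v)

# Toy example: three houses on a line, with distances 5, 10, and 5.
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
W = inverse_distance_weights(coords)
filtered_ones = spatial_filter(np.ones(3), W, lam=0.4)   # lam is a made-up value
```

Because W is row-standardized, filtering a constant column simply rescales it by (1 - lam), so an intercept survives the transformation as a rescaled constant.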
A convenient procedure to estimate $\lambda$ is Kelejian and Prucha's (1998) generalized moments approach, based on the set of GWR residuals. Given determination of $\lambda$, the closed-form solution to (3) is given by (4).
Demand for Open Space and Urban Sprawl
Table 1 Variable names, definitions, and descriptive statistics

| Variable (Unit) | Definition | Mean | Std. Dev. |
|---|---|---|---|
| Dependent variable | | | |
| Housing price ($) | Sale price adjusted to 2000 by the housing price index | 129,610.227 | 95,460.498 |
| Variables closely associated with urban sprawl | | | |
| Income^a ($) | Median household income | 51,505.871 | 20,940.122 |
| Finished area^a (feet²) | Total finished square footage of house | 1,929.689 | 975.633 |
| Lot size^a (feet²) | Total parcel square footage | 25,895.720 | 69,956.690 |
| Housing density^a (houses per acre) | Housing density for census-block group | 1.105 | 0.927 |
| Open space (10³ feet²) | Area of open space within a buffer of 1.0 mile drawn around each house sale transaction | 53,822.711 | 15,490.449 |
| Price of open space ($) | Marginal implicit price of increasing additional 10,000 ft² of open space within 1.0-mile buffer (assuming individual housing price and open-space area) | 47.618 | 38.610 |
| Structural variables | | | |
| Age^a (year) | Year house was built subtracted from 2006 | 29.207 | 21.733 |
| Brick^a | Dummy variable for brick siding (1 if brick, 0 otherwise) | 0.254 | 0.435 |
| Pool^a | Dummy variable for swimming pool (1 if pool, 0 otherwise) | 0.055 | 0.229 |
| Garage^a | Dummy variable for garage (1 if garage, 0 otherwise) | 0.635 | 0.481 |
| Bedroom^a | Number of bedrooms in house | 3.068 | 0.647 |
| Stories^a | Height of house in number of stories | 1.340 | 0.474 |
| Fireplace^a | Number of fireplaces in house | 0.729 | 0.575 |
| Quality of construction^a | Dummy variable for quality of construction (1 if excellent, very good and good, 0 otherwise) | 0.352 | 0.478 |
| Condition of structure^a | Dummy variable for condition of structure (1 if excellent, very good and good, 0 otherwise) | 0.734 | 0.442 |
| Distance variables | | | |
| Distance to CBD^a (feet) | Distance to the central business district | 44,552.592 | 20,713.081 |
| Distance to greenway^a (feet) | Distance to nearest greenway | 7,886.866 | 5,573.062 |

(continued)
Table 1 (continued)

| Variable (Unit) | Definition | Mean | Std. Dev. |
|---|---|---|---|
| Distance to railroad^a (feet) | Distance to nearest railroad | 6,978.618 | 5,463.655 |
| Distance to sidewalk^a (feet) | Distance to nearest sidewalk | 3,060.270 | 4,229.282 |
| Distance to park^a (feet) | Distance to nearest park | 8,652.930 | 5,556.530 |
| Park size^a (feet²) | Size of nearest park | 1,454.759 | 5,094.984 |
| Distance to golf course^a (feet) | Distance to nearest golf course | 10,680.078 | 4,942.615 |
| Distance to water body^a (feet) | Distance to nearest stream, lake, river, or other water body | 8,440.579 | 5,884.047 |
| Size of water body^a (1,000 feet²) | Size of nearest water body | 19,632.026 | 39,026.745 |
| High school district dummy variables (1 if in School District) | | | |
| Doyle^a | Dummy variable for Doyle High School District | 0.077 | 0.266 |
| Bearden^a | Dummy variable for Bearden High School District | 0.157 | 0.363 |
| Carter^a | Dummy variable for Carter High School District | 0.027 | 0.161 |
| Central^a | Dummy variable for Central High School District | 0.092 | 0.290 |
| Fulton^a | Dummy variable for Fulton High School District | 0.053 | 0.224 |
| Gibbs^a | Dummy variable for Gibbs High School District | 0.055 | 0.228 |
| Halls^a | Dummy variable for Halls High School District | 0.057 | 0.231 |
| Karns^a | Dummy variable for Karns High School District | 0.147 | 0.354 |
| Powell^a | Dummy variable for Powell High School District | 0.065 | 0.247 |
| Farragut^a | Dummy variable for Farragut High School District | 0.148 | 0.355 |
| Austin^a | Dummy variable for Austin High School District | 0.014 | 0.116 |
| Census block-group variables | | | |
| Vacancy rate^a (ratio) | Vacancy rate for census-block group (2000) | 0.063 | 0.031 |
| Unemployment rate^a (ratio) | Unemployment rate for census-block group (2000) | 0.037 | 0.029 |
| Travel time to work^a (min) | Average travel time to work for census-block group (2000) | 22.519 | 3.314 |
| Other variables | | | |
| Knoxville^a | Dummy variable for City of Knoxville (1 if Knoxville, 0 otherwise) | 0.343 | 0.475 |

(continued)
Table 1 (continued)

| Variable (Unit) | Definition | Mean | Std. Dev. |
|---|---|---|---|
| Flood^a | Dummy variable for 500-year floodplain (1 if in stream protection area, 0 otherwise) | 0.010 | 0.097 |
| Interface^a | Dummy variable for rural–urban interface (1 if in census block of mixed rural–urban housing, 0 otherwise) | 0.223 | 0.417 |
| Urban growth area^a | Dummy variable for urban growth area (1 if in urban growth area, 0 otherwise) | 0.083 | 0.276 |
| Planned growth area^a | Dummy variable for planned growth area (1 if in planned growth area, 0 otherwise) | 0.431 | 0.495 |
| Season^a | Dummy variable for season of sale (1 if April through September, 0 otherwise) | 0.559 | 0.497 |
| Prime interest rate^a | Average prime interest rate less average inflation rate | 4.267 | 2.104 |

^a Indicates instrumental variables used in the first step estimation

$$\hat{\beta}(u_i, v_i) = \left(X'(I - \lambda W)'A(I - \lambda W)X\right)^{-1} X'(I - \lambda W)'A(I - \lambda W)P \tag{4}$$
which is analogous to the GLS estimator in the spatial econometric literature, $\beta_{SEM} = (X'(I - \lambda W)'(I - \lambda W)X)^{-1} X'(I - \lambda W)'(I - \lambda W)y$, except that it generates $i$ sets of local parameters. Here $A$ is an $n$ by $n$ diagonal matrix with a set of weights corresponding with each observation. The matrix $A$ (which is a function of $u_i$ and $v_i$) addresses spatial heterogeneity, with diagonal elements identifying the location of other houses relative to house $i$ and zeros in off-diagonal positions (Fotheringham et al. 2002). Houses near house $i$ have more influence in the estimation of the parameters associated with house $i$ than houses located farther away.
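As an illustration of how the estimator in (4) could be evaluated at a single regression point, consider the following Python sketch. It assumes a Gaussian distance-decay kernel with a fixed bandwidth for the diagonal of $A$, which the text does not specify, and it treats $\lambda$ as already estimated. The function name `gwr_sem_local_beta`, its arguments, and the simulated data are hypothetical.

```python
import numpy as np

def gwr_sem_local_beta(X, P, W, lam, coords, i, bandwidth):
    """Local coefficients at regression point i, following the form of (4).

    Assumption: the diagonal of A is a Gaussian kernel of the distance
    from house i; the chapter's kernel and bandwidth are not given.
    """
    n = X.shape[0]
    d2 = ((coords - coords[i]) ** 2).sum(axis=1)     # squared distances to house i
    a = np.exp(-0.5 * d2 / bandwidth ** 2)           # diagonal elements of A
    F = np.eye(n) - lam * W                          # spatial filter (I - lam W)
    Xf, Pf = F @ X, F @ P                            # filtered regressors and prices
    XtA = Xf.T * a                                   # X'(I - lam W)'A
    return np.linalg.solve(XtA @ Xf, XtA @ Pf)

# Simulated (hypothetical) data with a noise-free linear relationship.
rng = np.random.default_rng(1)
n = 40
coords = rng.uniform(0.0, 10.0, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
b_true = np.array([2.0, 0.5, -1.0])
P = X @ b_true

W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)                 # row-standardize

beta_0 = gwr_sem_local_beta(X, P, W, lam=0.3, coords=coords, i=0, bandwidth=3.0)
```

With noise-free data the filtered prices equal the filtered regressors times the true coefficients, so the local estimate recovers those coefficients regardless of the kernel weights, which is a convenient sanity check on the algebra.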
When $\lambda = 0$, (4) generates the usual GWR estimates. Pseudo-standard errors for the $i$ sets of regression parameters are based on the covariance matrix (cov):

$$\mathrm{cov}(\hat{\beta}(u_i, v_i)) = \sigma_i^2 \left(X'(I - \lambda W)'A(I - \lambda W)X\right)^{-1} \tag{5}$$

where $\sigma_i^2 = e'(I - \lambda W)'A(I - \lambda W)e/(q - k)$ is the variance associated with the $i$th regression point (Fotheringham et al. 2002).⁴ Statistical significance of the estimates from the GWR-SEM at the $i$th regression point is evaluated with the pseudo-t tests derived from the pseudo-standard errors of the location-specific covariance matrices. Based on the GWR-SEM, the marginal implicit price of an additional 10,000 ft² of open space is estimated.

⁴ Those standard errors do not take into consideration the first stage estimation. Further studies will consider a covariance matrix adjusted for the first stage regression.
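The covariance matrix in (5) and the resulting pseudo-t statistics might be computed along the following lines. This Python sketch makes several assumptions: the residual vector $e$ is supplied by the caller, $q$ is passed in as a number because the text does not define it, and the diagonal of $A$ is given as a vector. The names `gwr_sem_local_cov` and `pseudo_t` and the simulated data are hypothetical.

```python
import numpy as np

def gwr_sem_local_cov(X, e, W, lam, a, q):
    """Covariance matrix in the form of (5) at one regression point.

    a : 1-D array holding the diagonal of A for this regression point.
    q : scalar used in the degrees-of-freedom term (q - k); its exact
        definition is not given in the text, so it is an input here.
    """
    n, k = X.shape
    F = np.eye(n) - lam * W                      # spatial filter (I - lam W)
    Xf, ef = F @ X, F @ e
    sigma2 = (ef * a) @ ef / (q - k)             # e'(I - lam W)'A(I - lam W)e / (q - k)
    return sigma2 * np.linalg.inv((Xf.T * a) @ Xf)

def pseudo_t(beta, cov):
    """Pseudo-t statistics: coefficients over their pseudo-standard errors."""
    return beta / np.sqrt(np.diag(cov))

# Simulated (hypothetical) data; with lam = 0, A = I, and q = n the
# expression collapses to the ordinary OLS covariance matrix.
rng = np.random.default_rng(2)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta

cov = gwr_sem_local_cov(X, e, np.zeros((n, n)), lam=0.0, a=np.ones(n), q=n)
t = pseudo_t(beta, cov)
```

As footnote 4 notes for the chapter's own standard errors, a calculation of this kind does not adjust for the first stage estimation.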
2.2 Step 2 – Open-Space Demand Estimation
The demand for open space is estimated using the marginal implicit price of open
space estimated in the first step as a proxy for the price of open space. The demand
equation for open space in the GWR framework is:
$$\ln o_i = \zeta(u_i, v_i)\,\ln \hat{p}_i + \sum_k \alpha_k(u_i, v_i)\,x_{ik} + \nu_i, \qquad k = 1, \ldots, m \tag{6}$$

where $\ln \hat{p}_i$ is the natural log of the estimated marginal implicit price of open space for house $i$,⁵ and $x_{ik}$ is the $k$th of $m$ variables determining the demand for open space for house $i$. The $x_{ik}$ include variables closely associated with urban sprawl (e.g.,
income, house and lot size, and housing density), structural attributes of the house,
census-block group variables (e.g., vacancy rate, unemployment rate, and travel time
to work), distance measures to amenities (e.g., lakes, parks) or disamenities (e.g.,
railroads), school districts, and other spatial dummy variables (e.g., urban growth
area and planned growth area) (see Table 1 for the complete list). The statistical
significance of the local estimates at the i th regression point is evaluated with t-tests
derived from the standard errors of the location-specific covariance matrices.
Another concern in regression models with many explanatory variables is multicollinearity, which occurs when two (or more) independent variables are linearly related. Multicollinearity may inflate estimates of standard errors, rendering hypothesis testing inconclusive. Multicollinearity can be detected by variance inflation factors (VIF) (Maddala 1992). VIFs are a scaled version of the multiple correlation coefficients between a variable and the rest of the independent variables (Maddala 1983). There is no clear guideline for how large the VIF must be to reflect serious multicollinearity, but a rule of thumb is that multicollinearity may be a problem if the VIF for an independent variable is greater than ten (Gujarati 1995). In the demand for open space equation, the VIFs were lower than ten for all but three variables, namely the dummy variables differentiating the rural–urban interface (22), the City of Knoxville (12), and the Bearden high school district (11). In general, multicollinearity does not appear to be too great a concern because many of the location-specific coefficients were significantly different from zero at the 5% level.⁶ Nevertheless, the three variables with high VIFs were retained because there was insufficient justification for excluding them.

⁵ Covariance of $\hat{p}_i$ is not adjusted for first stage regression.
⁶ If the VIF is large but the coefficient is significant, multicollinearity is not a problem with respect to the estimation of the standard errors. If a coefficient is significant using a weak t-test caused by collinearity (inflated standard error), it would be significant using the stronger t-test associated with the lack of collinearity (uninflated standard error).
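For readers who wish to reproduce this kind of diagnostic, a minimal Python sketch of the VIF calculation follows. It uses the standard definition, VIF = 1/(1 - R²), where R² comes from regressing one explanatory variable on all the others with an intercept. The function name `variance_inflation_factors` and the simulated data are hypothetical and are not drawn from the study's data set.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X (X should not include an intercept column)."""
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)   # auxiliary R-squared
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Simulated (hypothetical) regressors: the third column is nearly a copy
# of the first, so both should show VIFs well above the rule-of-thumb of ten.
rng = np.random.default_rng(3)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
```

In this simulated example the independent second column should have a VIF close to one, while the two nearly collinear columns exceed the threshold of ten discussed above.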
3 Study Area and Data
Knox County, Tennessee was chosen as a case study for this research because (1)
Knoxville is the eighth most sprawling U.S. metropolitan region (Ewing et al. 2002),
and (2) the area consists of both rapid and slow regions of housing growth. Knox
County is located in East Tennessee, one of the three “Grand Divisions” in the state.
The City of Knoxville is the county seat of Knox County. Knoxville comprises
101 miles² of the 526 miles² within Knox County. Total populations of Knoxville
and the Knoxville Metropolitan Area were 173,890 and 655,400 in 2000, respectively (US Census Bureau 2002). The University of Tennessee and the headquarters
of the Tennessee Valley Authority (TVA) are near downtown Knoxville, and the US
Department of Energy’s Oak Ridge National Laboratory is 15 miles northwest of
Knoxville. These institutions are the major employers of the area. Maryville is
located approximately 14 miles southwest of Knoxville and it is home to ALCOA,
the largest producer of aluminum in the United States. Farragut, a bedroom community, is located along the edge of the western end of Knox County (see Fig. 1). The
Great Smoky Mountains National Park, the most-visited national park in the United States, and a large
quantity of lake acreage (17 miles² of water bodies) developed by the TVA are on
Knoxville’s doorstep.
It is important to note that push/pull factors of the geography surrounding
the study area were not modeled because data were not available. However, to
our knowledge, no other hedonic studies have successfully addressed this issue.
Admittedly, these omitted factors may cause some estimates to be biased. But understanding this context beforehand aids in the interpretation of patterns generated by
mapped coefficients. It is also important to note that the results of this study may not
be representative of other urban areas. The data set does not represent most typical
urban areas, and because of the local amenities and job opportunities, Knox County
may be more of an outlier case compared to other rapidly growing metropolitan
areas. Nevertheless, the methods used in this case study can be applied to other
urban areas where similar data exist.
This research used five GIS data sets: individual parcel data, satellite imagery
data, census-block group data, boundary data, and environmental feature data. The
individual parcel data, i.e., sales price, lot size, and structural information, were
obtained from the Knoxville, Knox County, Knoxville Utilities Board Geographic
Information System (KGIS 2009), and the Knox County Tax Assessor’s Office. Data
were used for single-family home sales transactions between 1998 and 2002 in Knox
County, Tennessee. A total of 22,704 single-family home sales were recorded during this period. Of the 22,704 houses sold, 15,500 were randomly selected for this
analysis. County officials suggested that sales prices below $40,000 were probably
gifts, donations, or inheritances, and would therefore not reflect true market value.