x <- readRDS('C:/temp/model_object.RDS')
mod <- lm(bldgval ~ bldg_sqft_sum + gis_acres, data=x)
summary(mod)
##
## Call:
## lm(formula = bldgval ~ bldg_sqft_sum + gis_acres, data = x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -52257986 -74999 4154 80163 129886375
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.990e+03 1.512e+03 -5.944 2.78e-09 ***
## bldg_sqft_sum 1.040e+02 3.079e-01 337.571 < 2e-16 ***
## gis_acres 6.268e+05 2.399e+03 261.288 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 810900 on 351151 degrees of freedom
## (2418 observations deleted due to missingness)
## Multiple R-squared: 0.3591, Adjusted R-squared: 0.3591
## F-statistic: 9.839e+04 on 2 and 351151 DF, p-value: < 2.2e-16
“As building value (bldgval) increases, so does our error.” or “Our model is better at predicting home values for less expensive homes.”
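A residuals-versus-fitted plot is one common way to see this pattern. The sketch below uses simulated data, not the parcel data behind the model above: the column names are reused only for readability, and the noise is deliberately built to grow with building size. The same `fitted()` and `resid()` calls should work on `mod`.

```r
# Simulated illustration only: these are NOT the parcel data used above.
set.seed(42)
n <- 2000
bldg_sqft_sum <- runif(n, 500, 5000)

# Assumption: error spread grows in proportion to building size
bldgval <- 100 * bldg_sqft_sum + rnorm(n, mean = 0, sd = 20 * bldg_sqft_sum)
sim <- data.frame(bldgval, bldg_sqft_sum)

sim_mod <- lm(bldgval ~ bldg_sqft_sum, data = sim)

# A funnel shape (wider on the right) suggests non-constant error variance
plot(fitted(sim_mod), resid(sim_mod),
     xlab = "Fitted building value", ylab = "Residual")
abline(h = 0, lty = 2)

# Compare residual spread below and above the median fitted value
upper <- fitted(sim_mod) > median(fitted(sim_mod))
sd_lower <- sd(resid(sim_mod)[!upper])
sd_upper <- sd(resid(sim_mod)[upper])
c(lower_half = sd_lower, upper_half = sd_upper)
```

If the spread in the upper half is clearly larger, the model's errors are not constant across the range of predicted values.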
Everything is related to everything else, but near things are more related than distant things.
Follows directly from Tobler’s Law
Figure: Actual Population -> Random Population. Source: https://mgimond.github.io/Spatial/spatial-autocorrelation.html
Figure: Actual Elevation -> Random Elevation. Source: https://mgimond.github.io/Spatial/spatial-autocorrelation.html
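The actual-versus-random comparison can be put into numbers with Moran's I. The sketch below is a minimal base R version under stated assumptions: the "elevation" surface is a made-up gradient on a 10 x 10 grid, neighbours are cells that share an edge (rook contiguity), and `morans_i()` is a helper written for this example. In practice a package such as spdep (`moran.test()`) would usually be used instead.

```r
set.seed(1)
n <- 10

# Hypothetical surface: a smooth gradient plus a little noise
surface <- outer(1:n, 1:n, "+") + matrix(rnorm(n * n, sd = 0.5), n, n)

# Rook-contiguity weights: 1 where two cells share an edge, 0 otherwise.
# expand.grid() varies row fastest, matching as.vector(surface).
cells <- expand.grid(row = 1:n, col = 1:n)
W <- (as.matrix(dist(cells, method = "manhattan")) == 1) * 1

# Moran's I = (N / sum of weights) * sum(w_ij * z_i * z_j) / sum(z_i^2)
morans_i <- function(values, W) {
  z <- values - mean(values)
  (length(z) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

i_actual <- morans_i(as.vector(surface), W)          # values in place
i_random <- morans_i(sample(as.vector(surface)), W)  # same values, shuffled
c(actual = i_actual, random = i_random)
```

Shuffling keeps the values themselves but destroys their arrangement, so the shuffled surface should score near zero while the original scores strongly positive.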
Two aspects of spatial variation:
https://en.wikipedia.org/wiki/File:Annual_Average_Temperature_Map.jpg
1. We can take note of this, and include it in our analysis considerations…
2. … or we can use it for voter suppression!
Our choice of spatial reference frame is itself a significant determinant of the statistical and other patterns we observe.
x = % of population over 62; y = % vote for Republican congressional candidates (1968)
Openshaw, S. and P.J. Taylor. 1979. “A million or so correlation coefficients: Three experiments on the modifiable areal unit problem.” Pp. 127-144 in Statistical Methods in the Spatial Sciences, edited by N. Wrigley. London: Pion.
Invalid transfer of conclusions from spatially aggregated analysis to smaller areas or even to the individual level.
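A small simulation can make the risk concrete. Everything below is invented for illustration: 20 hypothetical areas with 50 individuals each, where a shared area-level effect pushes `x` and `y` up together while, inside each area, individuals with higher `x` tend to have lower `y`.

```r
set.seed(7)
n_areas <- 20
n_per_area <- 50
area <- rep(seq_len(n_areas), each = n_per_area)

# Assumed data-generating process (illustrative only)
area_effect <- rnorm(n_areas)[area]   # shared by everyone in an area
dev_x <- rnorm(length(area))          # individual deviation within the area
x <- area_effect + dev_x
y <- area_effect - 0.8 * dev_x + rnorm(length(area), sd = 0.5)
df <- data.frame(area, x, y)

# Correlation of area means (what an aggregated analysis sees)
area_means <- aggregate(cbind(x, y) ~ area, data = df, FUN = mean)
r_area <- cor(area_means$x, area_means$y)

# Average correlation among individuals within each area
r_within <- mean(sapply(split(df, df$area), function(d) cor(d$x, d$y)))

c(area_level = r_area, within_area = r_within)
```

Here the area-level correlation is strongly positive even though the relationship among individuals inside each area is negative, so reading the aggregate result as a statement about individuals would get the sign wrong.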
Comparing Dasymetric population data to standard census geometries:
Scale effects are fundamental and should be considered before spatial analysis
Dependent (original 6 × 6 grid)
87 | 95 | 72 | 37 | 44 | 24
40 | 55 | 55 | 38 | 88 | 34
41 | 30 | 26 | 35 | 38 | 24
14 | 56 | 37 | 34 | 8 | 18
49 | 44 | 51 | 67 | 17 | 37
55 | 25 | 33 | 32 | 59 | 54
Independent (original 6 × 6 grid)
72 | 75 | 85 | 29 | 58 | 30
50 | 60 | 49 | 46 | 84 | 23
21 | 46 | 22 | 42 | 45 | 14
19 | 36 | 48 | 23 | 8 | 29
38 | 47 | 52 | 52 | 22 | 48
58 | 40 | 46 | 38 | 35 | 55
Dependent (3 × 3 grid: means of 2 × 2 blocks)
69.25 | 50.5 | 47.5
35.25 | 33 | 22
43.25 | 45.75 | 41.75
Independent (3 × 3 grid: means of 2 × 2 blocks)
64.25 | 52.25 | 48.75
30.5 | 33.75 | 24
45.75 | 47 | 40
Dependent (2 × 2 grid: means of 3 × 3 blocks)
55.67 | 40.22
40.44 | 36.22
Independent (2 × 2 grid: means of 3 × 3 blocks)
53.33 | 41.22
42.67 | 34.44
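The grids above can be reproduced by averaging non-overlapping blocks of the original cells, which also makes it easy to check how the correlation between the two variables changes with the level of aggregation. The sketch below transcribes the values from the original grids; `block_mean()` is a helper written for this example, not a function from any package.

```r
# Values transcribed row by row from the original grids
dep <- matrix(c(
  87, 95, 72, 37, 44, 24,
  40, 55, 55, 38, 88, 34,
  41, 30, 26, 35, 38, 24,
  14, 56, 37, 34,  8, 18,
  49, 44, 51, 67, 17, 37,
  55, 25, 33, 32, 59, 54), nrow = 6, byrow = TRUE)

ind <- matrix(c(
  72, 75, 85, 29, 58, 30,
  50, 60, 49, 46, 84, 23,
  21, 46, 22, 42, 45, 14,
  19, 36, 48, 23,  8, 29,
  38, 47, 52, 52, 22, 48,
  58, 40, 46, 38, 35, 55), nrow = 6, byrow = TRUE)

# Average non-overlapping k x k blocks of a matrix (hypothetical helper)
block_mean <- function(m, k) {
  block_row <- as.vector((row(m) - 1) %/% k)  # block index of each cell's row
  block_col <- as.vector((col(m) - 1) %/% k)  # block index of each cell's column
  unname(tapply(as.vector(m), list(block_row, block_col), mean))
}

# Correlation between the two variables at each level of aggregation
for (k in c(1, 2, 3)) {
  d <- block_mean(dep, k)
  i <- block_mean(ind, k)
  cat(sprintf("%d x %d grid: n = %2d cells, r = %.3f\n",
              nrow(d), ncol(d), length(d), cor(as.vector(d), as.vector(i))))
}
```

Note that the coarsest grid has only four cells, so its correlation rests on very few observations; that loss of information is part of the scale effect.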
Median Household Income
library(tidycensus)  # get_acs() needs a Census API key (see census_api_key())
library(sf)
library(tmap)
library(dplyr)       # provides %>%

# Median household income at three geographies, reprojected to EPSG:2913.
# The same ACS year is used throughout so the three scales are comparable.
cnty <- tidycensus::get_acs(
  geography = 'county', variables = 'B19049_001E',
  state = "OR", county = c('Clackamas', 'Multnomah'),
  year = 2020, geometry = TRUE) %>%
  st_transform(2913)
trct <- tidycensus::get_acs(
  geography = 'tract', variables = 'B19049_001E',
  state = "OR", county = c('Clackamas', 'Multnomah'),
  year = 2020, geometry = TRUE) %>%
  st_transform(2913)
bg <- tidycensus::get_acs(
  geography = 'block group', variables = 'B19049_001E',
  state = "OR", county = c('Clackamas', 'Multnomah'),
  year = 2020, geometry = TRUE) %>%
  st_transform(2913)
Percent Hispanic or Latino
# B01001_001E = total population; B03002_012E = Hispanic or Latino population.
# pct_hl is stored as a proportion (0 to 1).
cnty <- tidycensus::get_acs(
  geography = 'county',
  variables = c('B01001_001E', 'B03002_012E'),
  state = "OR",
  year = 2020,
  county = c('Clackamas', 'Multnomah'),
  geometry = TRUE,
  output = 'wide') %>%
  mutate(pct_hl = B03002_012E / B01001_001E) %>%
  st_transform(2913)
trct <- tidycensus::get_acs(
  geography = 'tract',
  variables = c('B01001_001E', 'B03002_012E'),
  state = "OR",
  year = 2020,
  county = c('Clackamas', 'Multnomah'),
  geometry = TRUE,
  output = 'wide') %>%
  mutate(pct_hl = B03002_012E / B01001_001E) %>%
  st_transform(2913)
bg <- tidycensus::get_acs(
  geography = 'block group',
  variables = c('B01001_001E', 'B03002_012E'),
  state = "OR",
  year = 2020,
  county = c('Clackamas', 'Multnomah'),
  geometry = TRUE,
  output = 'wide') %>%
  mutate(pct_hl = B03002_012E / B01001_001E) %>%
  st_transform(2913)
Findings about the effects of area-based attributes could be affected by how contextual units or neighborhoods are geographically delineated and the extent to which these areal units deviate from the ‘true causally relevant’ geographic context.
Mei-Po Kwan (2012) The Uncertain Geographic Context Problem, Annals of the Association of American Geographers, 102:5, 958-968, DOI: 10.1080/00045608.2012.687349
Arises because of our limited knowledge about the precise spatial and temporal configuration of each individual’s true geographic context, not because of the use of a particular scheme of areal division, zonal aggregation, or spatial scale.
Kwan, Mei-Po. 2012. “How GIS can help address the uncertain geographic context problem in social science research.” Annals of GIS 18:245-255.
Source: XKCD
For our study area we’ve included information on:
Maybe our initial model has some issues…