R for Spatial Analysis

Jackson Voelkel

November 18th, 2016

Introduction

Me

  • Background
  • Computer Experience

What is R?

  • Language
  • Application
  • “Do it all” Workbench

What can I do in R?

  • Advanced (or not so advanced) statistics
  • Powerful plotting
  • Read almost any file type
  • Write almost any file type
  • Almost anything you’d ever want to!

Reasons to use R:

  • Programmatic
  • Reproducible
  • Cross Platform: PC, Mac, and Linux!
  • Low(ish) level
  • Open Source and community driven
  • Free (as in beer)
  • Free (as in speech)

Drawbacks to using R:

  • Steep learning curve
  • Can be incredibly complex for simple tasks
  • Not a GUI, and not overtly intuitive

R won’t replace spatial analysis platforms like ArcGIS; however, its language-based format allows incredibly powerful, complex, and custom geoprocessing tasks to be performed.

What does R look like?

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

model<-lm(mpg~wt,data=mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
rsquared<-summary(model)$adj.r.squared

plot(mpg~wt,data=mtcars)
abline(model,col="red",lwd=2)
text(4,26,paste0("Adj. R-squared: ",round(rsquared,4)))

par(mfrow=c(2,2))
hist(mtcars$mpg,breaks=10)
plot(density(mtcars$mpg)) # density() has no 'breaks' argument
boxplot(mtcars)
plot(mtcars[,1])

Like other programming languages, R has the ability to “save” objects for use later:

first_object<-570
second_object<-430
first_object-second_object
## [1] 140

R is also used heavily in data analysis, so it is great at manipulating data. Here’s a subset:

library(knitr)
kable(table(iris[seq(1,150,by=25),c("Petal.Width","Species")]))
Petal.Width   setosa   versicolor   virginica
0.2                2            0           0
1.4                0            2           0
1.8                0            0           1
2.5                0            0           1

Standard Data Types in R

Vector

numbers<-c(1,2,4,7,10,11,12,19,26)
letters<-c("a","b","d","g","j","k","l","s","z") # masks R's built-in 'letters' constant
numbers
## [1]  1  2  4  7 10 11 12 19 26
letters
## [1] "a" "b" "d" "g" "j" "k" "l" "s" "z"

Data frame

n_and_l<-data.frame(numbers=numbers,letters=letters)
n_and_l
##   numbers letters
## 1       1       a
## 2       2       b
## 3       4       d
## 4       7       g
## 5      10       j
## 6      11       k
## 7      12       l
## 8      19       s
## 9      26       z

Matrix

n_matrix<-matrix(numbers,nrow=3,ncol=3,byrow=T)
n_matrix
##      [,1] [,2] [,3]
## [1,]    1    2    4
## [2,]    7   10   11
## [3,]   12   19   26

Data manipulation

Subsetting data

n_and_l[1,2]
## [1] a
## Levels: a b d g j k l s z
n_and_l$numbers
## [1]  1  2  4  7 10 11 12 19 26
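
Logical conditions use the same bracket notation. A quick sketch:

n_and_l[n_and_l$numbers>10,] # keep only the rows where 'numbers' exceeds 10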

Data manipulation

Adding data

colnames(n_and_l)
## [1] "numbers" "letters"
n_and_l$squared<-n_and_l$numbers^2
n_and_l
##   numbers letters squared
## 1       1       a       1
## 2       2       b       4
## 3       4       d      16
## 4       7       g      49
## 5      10       j     100
## 6      11       k     121
## 7      12       l     144
## 8      19       s     361
## 9      26       z     676

Spatial Data in R - Vector Data

Uses most of the same functions as ‘base’ R.

Types of Spatial Data Objects

  • Spatial Points
  • Spatial Lines
  • Spatial Polygons
  • Raster
  • Special Custom Objects (e.g. datacubes, networks, etc)

Reading in Geospatial Data

R’s adaptation of the Geospatial Data Abstraction Library (GDAL), via the rgdal package, is a perfect example of its incorporation of a wide swath of existing open source libraries.

library(rgdal)
cities<-readOGR("I:/Research/Shares/gisdata/PortlandRLIS/RLISArchive/2016/2016_May/BOUNDARY","cty_fill",verbose = F)

par(mar=c(0,0,0,0))
plot(cities,col=cities$CITYNAME)

Just like with a table, this new “SpatialPolygonsDataFrame” is subsettable:

par(mar=c(0,0,0,0))
pdx<-cities[cities$CITYNAME == "Portland",]
plot(pdx, col="cornsilk")

Subsetting by Area:

par(mar=c(0,0,0,0))
bigCities<-cities[cities$AREA >= quantile(cities$AREA,0.90),]
plot(bigCities,col=bigCities$CITYNAME)

Spatial Subsetting

First, we will read in some points:

par(mar=c(0,0,0,0))
schools<-readOGR("I:/Research/Shares/gisdata/PortlandRLIS/RLISArchive/2016/2016_May/PLACES","schools",verbose = F)

plot(schools,pch= 19,cex=0.8,col=schools$DIST_NO)

Now, we’ll subset it:

par(mar=c(0,0,0,0))
pdx_schools<-schools[pdx,]
plot(pdx, col="cornsilk")
points(pdx_schools,col="red",pch=17)

Note that the same subsetting format is used to subset spatial data as it is to subset tables!

Learning the functionality of R can open up many packages for use… even though some packages have very specific syntax.

Most of the high-powered tools available out of the box in R are in the rgeos library.

library(rgeos)

Buffering by 1,200 ft (gBuffer’s width is in map units):

plot(gBuffer(pdx_schools,width=1200),col="grey")
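
Note that gBuffer() dissolves all of the buffers into a single geometry by default. If you want one buffer per school instead, a minimal sketch using rgeos’ byid argument:

buffers_by_school<-gBuffer(pdx_schools,width=1200,byid=TRUE) # one polygon per feature
plot(buffers_by_school,col="grey")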

Clipping:

par(mar=c(0,0,0,0))
metro_cbg<-readOGR("I:/Research/Shares/gisdata/PortlandRLIS/RLISArchive/2016/2016_May/CENSUS","tract2010", verbose = F)
pdx_cbg<-gIntersection(metro_cbg,pdx,byid=TRUE,drop_lower_td = FALSE)
plot(pdx_cbg,col=metro_cbg$TRACT_NO)

Transforming Data:

cities_utm<-spTransform(cities,"+proj=utm +zone=10 +ellps=GRS80 +datum=NAD83 +units=m +no_defs")
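
A quick sanity check on the result, using sp’s proj4string() and bbox():

proj4string(cities_utm) # confirm the new coordinate reference system
bbox(cities)            # extent in the original map units
bbox(cities_utm)        # extent in meters after transformation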

Table Manipulation

  • Joining (see the sketch at the end of this section)
  • Field Calculation
  • Column Name Manipulation
  • Shapefiles store data in a .dbf!

library(foreign)
metro_cbg_table<-read.dbf("I:/Research/Shares/gisdata/PortlandRLIS/RLISArchive/2016/2016_May/CENSUS/tract2010.dbf")
metro_cbg_table<-metro_cbg_table[,c(1,5,6,9)]
colnames(metro_cbg_table)
## [1] "STATE" "FIPS"  "POP10" "WHITE"

metro_cbg_table$non_w<-metro_cbg_table$POP10-metro_cbg_table$WHITE
head(metro_cbg_table)
##   STATE        FIPS POP10 WHITE non_w
## 1    41 41009970900  4877  4458   419
## 2    41 41009970300  4257  3955   302
## 3    41 41009970500  6287  5877   410
## 4    41 41009970600  6118  5706   412
## 5    41 41051005100  7926  6351  1575
## 6    41 41051005200  4525  3862   663

colnames(metro_cbg_table)<-c("s","f","total","w","poc")
head(metro_cbg_table)
##    s           f total    w  poc
## 1 41 41009970900  4877 4458  419
## 2 41 41009970300  4257 3955  302
## 3 41 41009970500  6287 5877  410
## 4 41 41009970600  6118 5706  412
## 5 41 41051005100  7926 6351 1575
## 6 41 41051005200  4525 3862  663

  • write tables with write.dbf to save them
  • caution: only use this on a copy of your data!
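
As for the “Joining” bullet: a minimal sketch of matching this table back onto the metro_cbg polygons loaded earlier, assuming the shapefile’s FIPS field is intact (the output path is hypothetical):

# join on the shared key: "FIPS" in the polygons, renamed to "f" in the table
metro_cbg$poc<-metro_cbg_table$poc[match(metro_cbg$FIPS,metro_cbg_table$f)]
# save the modified attribute table to a copy, never the original
write.dbf(metro_cbg@data,"C:/Local_Working/tract2010_copy.dbf")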

Plotting Vector Data

Basic Settings

Color ramps are simple to create. I’ll set up a few now that I’ll use throughout the presentation:

bw<-colorRampPalette(c("black","white"))(50)
blured<-colorRampPalette(c("blue","cornsilk1","red"))(50)
whigreen<-colorRampPalette(c("white","forestgreen"))(50)
redgreen<-colorRampPalette(c("darkgreen","yellow3","firebrick"))(50)
redredgreen<-colorRampPalette(c("darkgreen","yellow3","red3","firebrick"))(50)
greengreen<-colorRampPalette("forestgreen")(50)

Color Breaks

library(RColorBrewer)
library(classInt)

Color breaks are a little tricky in R; however, R gives more options than most software for break types:

Break options: fixed, sd, equal, pretty, quantile, kmeans, hclust, bclust, fisher, and jenks.

Hierarchical Clustering Breaks:

par(mar=c(0,0,0,0))
class <- classIntervals(cities@data$AREA, 15, style="hclust")
colcode_hc <- findColours(class, redgreen)
plot(cities,col=colcode_hc)

Standard Deviation Breaks:

par(mar=c(0,0,0,0))
class <- classIntervals(cities@data$AREA, 15, style="sd")
colcode_sd <- findColours(class, redgreen)
plot(cities,col=colcode_sd)

‘Pretty’ Breaks:

par(mar=c(0,0,0,0))
class <- classIntervals(cities@data$AREA, 15, style="pretty")
colcode_pretty <- findColours(class, redgreen)
plot(cities,col=colcode_pretty)

Functional Coloring

You can even write a function to handle your coloring options:

colorPDX<-function(){
  getCol<-function(x){
    if (x >= quantile(cities$AREA,0.95)){
      return("pink")
    } else {return("lightgreen")}
  }
  return(as.vector(sapply(as.vector(cities@data[,"AREA"]),getCol)))
}

par(mar=c(0,0,0,0))
plot(cities,col=colorPDX())

Raster Data

It’s just a matrix

Geospatial rasters are, at their core, matrices with a little extra data (e.g. resolution, projection).

m<-matrix(nrow = 4,ncol=4,c(1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7),dimnames = list(c("Row 1","Row 2","Row 3", "Row 4"),c("Col 1", "Col 2", "Col 3", "Col 4")))
m
##       Col 1 Col 2 Col 3 Col 4
## Row 1     1     5     9     4
## Row 2     2     6     1     5
## Row 3     3     7     2     6
## Row 4     4     8     3     7

Let’s turn that matrix above into a raster!

library(raster)
Matrix_Raster<-raster(m)
plot(Matrix_Raster)

Loading a Raster

Reading a geospatial raster is easy: all we have to do is pass a file path to the raster() function and assign the result to any name we choose. Here we will open up a Digital Elevation Model and a Digital Surface Model:

dem<-raster("C:/Local_Working/dem.tif")
dsm<-raster("C:/Local_Working/dsm.tif")

Now we have two objects of the class “RasterLayer”.

dem
## class       : RasterLayer 
## dimensions  : 948, 1005, 952740  (nrow, ncol, ncell)
## resolution  : 3, 3  (x, y)
## extent      : 744597.5, 747612.5, 1379506, 1382350  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=43 +lat_2=45.5 +lat_0=41.75 +lon_0=-120.5 +x_0=399999.9999999999 +y_0=0 +ellps=GRS80 +units=ft +no_defs 
## data source : C:\Local_Working\dem.tif 
## names       : dem

dsm
## class       : RasterLayer 
## dimensions  : 948, 1005, 952740  (nrow, ncol, ncell)
## resolution  : 3, 3  (x, y)
## extent      : 744597.5, 747612.5, 1379506, 1382350  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=43 +lat_2=45.5 +lat_0=41.75 +lon_0=-120.5 +x_0=399999.9999999999 +y_0=0 +ellps=GRS80 +units=ft +no_defs 
## data source : C:\Local_Working\dsm.tif 
## names       : dsm

plot(dem,col=bw,box=F,axes=F,main="DEM")

plot(dsm,col=bw,box=F,axes=F,main="DSM")

Categorical Rasters

Just like continuous variable rasters, we can also easily manipulate categorical rasters (such as the National Land Cover Database):

NLCD<-raster("C:/Local_Working/NLCD.tif")

plot(NLCD)

We can even extract certain values from the “matrix” of the NLCD raster and turn them into a new raster. This gets a little more complicated, but it is incredibly powerful:

NLCD_Trees<-raster(ifelse(as.matrix(NLCD) == 42, 1, 0),template=NLCD)
plot(NLCD_Trees)

Raster Stacks

R can also handle multi-band or “timecube” data by creating a “stack” of rasters. This dramatically increases performance and keeps many rasters together in a single manageable object. This is great for analysis, but also for loading multi-band rasters, such as an RGB image!

Rose_Gardens<-stack("C:/Local_Working/rgbi.tif")

Rose_Gardens
## class       : RasterStack 
## dimensions  : 948, 1005, 952740, 4  (nrow, ncol, ncell, nlayers)
## resolution  : 3, 3  (x, y)
## extent      : 744597.5, 747612.5, 1379506, 1382350  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=43 +lat_2=45.5 +lat_0=41.75 +lon_0=-120.5 +x_0=399999.9999999999 +y_0=0 +ellps=GRS80 +units=ft +no_defs 
## names       : rgbi.1, rgbi.2, rgbi.3, rgbi.4 
## min values  :      0,      0,      0,      0 
## max values  :    254,    254,    254,    254

Plotting each raster layer within the stack:

plot(Rose_Gardens,col=bw)

Plotting as a standard RGB:

plotRGB(Rose_Gardens,r=1,g=2,b=3)

Plotting as a false-color infrared:

plotRGB(Rose_Gardens,r=4,g=2,b=3)

Raster Functions

Matrix Math

Just like matrix objects, we can perform mathematical operations on raster objects! Let’s subtract the DEM from the DSM to create a feature-height dataset. Feature-height datasets are incredibly useful, as they show us the height above ground for each pixel:

Feature_Height<-dsm-dem

Feature_Height
## class       : RasterLayer 
## dimensions  : 948, 1005, 952740  (nrow, ncol, ncell)
## resolution  : 3, 3  (x, y)
## extent      : 744597.5, 747612.5, 1379506, 1382350  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=43 +lat_2=45.5 +lat_0=41.75 +lon_0=-120.5 +x_0=399999.9999999999 +y_0=0 +ellps=GRS80 +units=ft +no_defs 
## data source : in memory
## names       : layer 
## values      : 0, 203.74  (min, max)

par(mar=c(2,1,1,2))
plot(Feature_Height,col=bw,box=F,axes=F,main="Feature Height (Height Above Ground)")

NDVI

This type of raster math can be used to perform band math within an RGBI raster stack:

gNDVI<-(Rose_Gardens[[4]]-Rose_Gardens[[2]]) / (Rose_Gardens[[4]]+Rose_Gardens[[2]])

plot(gNDVI,col=rev(redredgreen))

Math with Multiple rasters

A simple definition of ‘tree canopy’: areas where gNDVI is at least 0.02 and the feature height is at least 10 ft.

tree_matrix<-ifelse(as.matrix(gNDVI) >= 0.02 & as.matrix(Feature_Height) >= 10, 1, 0)
trees_1m<-raster(tree_matrix,template=Feature_Height)
trees_1m[is.na(trees_1m)]<-0

plot(trees_1m,col=whigreen)

Plotting over the feature-height layer (first setting zeros to NA so only the canopy pixels draw):

trees_1m_NA<-trees_1m
trees_1m_NA[trees_1m_NA==0]<-NA # transparent wherever there is no canopy
plot(Feature_Height,col=bw)
plot(trees_1m_NA,col=greengreen,alpha=0.3,add=TRUE,legend=F)

Plotting over black and white imagery layer:

plot(Rose_Gardens[[1]],col=bw)
plot(trees_1m_NA,col=greengreen,alpha=0.4,add=TRUE,legend=F)

Rasters aren’t limited to simple mathematics either; we can get very advanced very quickly. Here is Feature_Height with a 9x9 and a 33x33 moving-window “mean” applied:

Moving_Window9x9<-focal(Feature_Height,w=matrix(1/81,ncol=9,nrow=9))
Moving_Window33x33<-focal(Feature_Height,w=matrix(1/1089,ncol=33,nrow=33))

par(mar=c(1,2,2,1))
plot(Moving_Window9x9,col=bw,box=F,axes=F,main="9x9 Moving Window (Mean)")

par(mar=c(1,2,2,1))
plot(Moving_Window33x33,col=bw,box=F,axes=F,main="33x33 Moving Window (Mean)")

We can even change the shape of the window, allowing for more complex analysis. Here we will ‘buffer’ the canopy dataset we created out to 50 meters (50 m ≈ 164 ft, or roughly 55 cells at the 3 ft resolution).

# focalWeight()'s weights sum to 1, so focal()'s default sum is already a mean
Trees_50m<-focal(trees_1m,focalWeight(trees_1m,55,type = "circle"))

Now, here is a raster whose pixels represent the percentage of the ground covered by canopy within 50 m:

plot(Trees_50m,col=whigreen)

Moving-window analyses have near-infinite possibilities in spatial applications, from site/situation analysis to statistical modeling to feature extraction!

Advanced Moving Window Analyses

Here’s a far, far more complex application of a moving-window analysis to get you thinking. The following function runs multiple moving-window analyses and places a weight on the direction that each window would ‘flow’. By combining all of them, we can find something called the “local maxima”.

Local Maxima Function:

library(foreach)

# Compare each cell to one neighbor: -1 where the cell is lower than that
# neighbor (on the inverted surface), NA otherwise
createDirectional<-function(directional,hm){
    return(ifelse(as.matrix(hm-focal(hm,directional,sum))>=0,NA,-1))
}
# Replace each -1 flag with that direction's byte weight; keep NA elsewhere
createWeight<-function(weight, Directional){
    return(ifelse(as.matrix(Directional)>=0,NA,weight))
}

localMaxima<-function(heightRaster){
    mList<-list(matrix(c(1,0,0,0,0,0,0,0,0),ncol=3,nrow=3),
        matrix(c(0,1,0,0,0,0,0,0,0),ncol=3,nrow=3),
        matrix(c(0,0,1,0,0,0,0,0,0),ncol=3,nrow=3),
        matrix(c(0,0,0,1,0,0,0,0,0),ncol=3,nrow=3),
        matrix(c(0,0,0,0,0,1,0,0,0),ncol=3,nrow=3),
        matrix(c(0,0,0,0,0,0,1,0,0),ncol=3,nrow=3),
        matrix(c(0,0,0,0,0,0,0,1,0),ncol=3,nrow=3),
        matrix(c(0,0,0,0,0,0,0,0,1),ncol=3,nrow=3))
    wList<-list(32,16,8,64,4,128,1,2)   # byte weight for each direction
    hm<-heightRaster*-1                 # invert so maxima become minima

    # Gaussian smoothing to suppress noisy, spurious maxima
    hm<-focal(hm,focalWeight(hm, 2.0, type='Gauss'))
    hm<-focal(hm,focalWeight(hm, 2.0, type='Gauss'))
    hm<-focal(hm,focalWeight(hm, 2.4, type='Gauss'))

    dirBool<-lapply(mList,createDirectional,hm=hm)
    byteWeight<-foreach(a=wList, b=dirBool) %do% createWeight(a,b)
    fMatrix<-Reduce('+', byteWeight)    # only cells flagged in all 8 directions sum to 255
    fMatrix<-ifelse(fMatrix==255,1,0)
    rPoints<-raster(fMatrix,template=heightRaster)
    fshp<-as.data.frame(rasterToPoints(rPoints)[,c(1,2)])
    fshp$Height<-extract(heightRaster,fshp[,c(1,2)])
    out<-SpatialPointsDataFrame(coords=fshp[,c(1,2)],proj4string=crs(heightRaster),data=fshp["Height"])
    return(out)
}

Advanced Moving Window Analyses

Let’s run that function:

local_maxima<-localMaxima(Feature_Height)

Advanced Moving Window Analyses

Now we have a local-maxima points dataset, so let’s write another function that keeps only the points 10 ft or taller where the gNDVI is above 0.02:

makeTrees<-function(lMaxima,ndvi, spectralThreshold,height){
    lMaxima$NDVI<-extract(ndvi,lMaxima)       # sample gNDVI at each point
    lMaxima<-lMaxima[!(is.na(lMaxima$NDVI)),] # drop points with no spectral data
    lMaxima<-lMaxima[lMaxima$NDVI > spectralThreshold,]
    lMaxima<-lMaxima[lMaxima$Height >= height,]
    return(lMaxima)
}

Advanced Moving Window Analyses

Now we can run our local_maxima dataset through our makeTrees() function:

tree_points<-makeTrees(local_maxima,gNDVI,0.02,10)

Advanced Moving Window Analyses

Now, let’s plot those trees over an image:

plot(Rose_Gardens[[2]],col=bw)
points(tree_points,col="green3",pch=19,cex=0.4)

Here is a detail:

Raster Correlation

Reading in the Data

UHI<-raster("I:/Research/Shares/hthp/CanopyLiDAR/voelkel/Working/UHI/ASTER/resamps/b_8_22_2014.tif")
ASTER<-raster("I:/Research/Shares/hthp/CanopyLiDAR/voelkel/Working/UHI/ASTER/8_22/8_22_2014_NULL.tif")

par(mfrow=c(1,2))
plot(ASTER,col=blured,box=F,axes=F,main="ASTER")
hist(ASTER,main="ASTER Histogram",breaks=50)

par(mfrow=c(1,2))
plot(UHI,col=blured,box=F,axes=F,main="UHI")
hist(UHI,main="UHI Histogram",breaks=50)

Clipping the ASTER tile

UHI2<-UHI
UHI2[!is.na(UHI2)]<-1 # a 1/NA mask of valid UHI cells
ASTER<-((UHI2 * ASTER) - 273.15) * 0.01

par(mfrow=c(1,2))
plot(UHI,col=blured,main="UHI Model")
plot(ASTER,col=blured,main="ASTER Data")
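
The multiplication above clips ASTER by borrowing the UHI layer’s NA pattern. The raster package’s mask() does the same job directly; a sketch (not the original workflow):

ASTER_masked<-mask(ASTER,UHI) # cells that are NA in UHI become NA in ASTER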

Assessing Correlation:

aster_data<-as.data.frame(ASTER)
aster_data<-aster_data[!is.na(aster_data),]

uhi_data<-as.data.frame(UHI)
uhi_data<-uhi_data[!is.na(uhi_data),]

mod<-lm(aster_data~uhi_data)

Assessing the Model

summary(mod)
## 
## Call:
## lm(formula = aster_data ~ uhi_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.32537 -0.19313  0.09273  0.30108  2.33576 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 18.805727   0.058772   320.0   <2e-16 ***
## uhi_data     0.292384   0.001895   154.3   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5048 on 34665 degrees of freedom
## Multiple R-squared:  0.4072, Adjusted R-squared:  0.4072 
## F-statistic: 2.381e+04 on 1 and 34665 DF,  p-value: < 2.2e-16

Assessing the Model

plot(mod,which=3,main="Heteroskedasticity Check")

Creating a Raster from the Residuals

error_raster<-UHI
error_raster[!is.na(error_raster)]<-mod$residuals*-1

Overprediction and Underprediction (Red and Blue, respectively) in °C

plot(error_raster,col=blured)

Rasters can also quickly be classified and resampled! Classification first:

classBreaks<-classIntervals(as.vector(error_raster), 5, style="quantile")
classBreaks$brks[1] <- classBreaks$brks[1] - 1
uhiClassified<-cut(error_raster,classBreaks$brks)
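
For resampling, a minimal sketch using raster’s aggregate() (my example, not part of the original analysis):

error_coarse<-aggregate(error_raster,fact=5,fun=mean) # 5x coarser grid, averaging each block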

Classified Residuals Raster

Raster Visualization

rasterVis Plotting

library(rasterVis)

The rasterVis package makes advanced plotting of rasters even easier.

Levelplot

levelplot(error_raster)

Vector Plot

vectorplot(dem)

Perspective Plots

persp(dem,phi=30,theta=30,r=1,expand=0.3,border=NA,shade=0.2,main="Rose Gardens - DEM",xlab="Longitude",ylab="Latitude",zlab="Elevation")

Webmapping

Leaflet

  • Open Source JavaScript library for webmapping

  • Used by: Pinterest, Facebook, Data.gov, flickr, NPR, The Washington Post, National Parks Service, and many, many more

Leaflet in R

  • R can create more-than-basic Leaflet webmaps…
  • … without any need to use JavaScript …
  • … but you should learn JavaScript anyway! (personal opinion)

A Basic Leaflet Map

library(leaflet)
basic_map<-leaflet() %>% addTiles() %>% addMarkers(lng=-122.65,lat=45.5,popup=paste0("This is a popup!"))

Leaflet map with marker:

Custom Leaflet in R

custom_map<-leaflet() %>% addTiles("http://{s}.sm.mapstack.stamen.com/((toner-background,$fff[@40],$7ab8b8[hsl-color]),(buildings,$fff[@10],$ff6600[hsl-color@90]))/{z}/{x}/{y}.png") %>% addMarkers(lng=-122.65,lat=45.5,popup=paste0("This is a popup!"))

Leaflet map with custom basemap:

Adding your own data to Leaflet in R

library(rgeos)
projCities<-spTransform(cities,CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"))

class <- classIntervals(cities@data$AREA, 15, style="hclust")
colcode <- findColours(class, redgreen)

popper<-paste0("City name:  <strong>",cities$CITYNAME,"</strong><br><br>City area:  <strong>",cities$AREA,"</strong>")

own_data<-leaflet() %>% addTiles("http://{s}.tile.stamen.com/toner-lite/{z}/{x}/{y}.png") %>% addPolygons(data=projCities,stroke = TRUE, 
    color = "black", weight = 1, opacity = 0.5, fill = TRUE, fillColor = colcode, 
    fillOpacity = 0.8,popup=popper)

Leaflet map with data:

Adding SMALL rasters

raster_map<-leaflet() %>% addTiles("https://www.portlandmaps.com/arcgis/rest/services/Public/Aerial_Photos_Summer_2015/MapServer/tile/{z}/{y}/{x}") %>% addRasterImage(error_raster,colors=blured, opacity=.4)

Leaflet map with a static raster:

Accessing the Web

Scraping Wikipedia

library(rvest)
url <- "https://en.wikipedia.org/wiki/List_of_tallest_buildings_in_Portland,_Oregon"
buildings <- url %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="mw-content-text"]/table[1]') %>%
  html_table()
buildings <- buildings[[1]]

Scraping Wikipedia

buildings<-buildings[,-c(3,7)]
head(buildings)
##   Rank                   Name Heightfeet / m Floors Year
## 1    1     Wells Fargo Center      546 / 166     41 1972
## 2    2     U.S. Bancorp Tower      536 / 163     43 1983
## 3    3            KOIN Center      509 / 155     35 1984
## 4    4 Park Avenue West Tower      502 / 154     30 2016
## 5    5         PacWest Center      418 / 127     29 1984
## 6    6              Fox Tower      372 / 113     27 2000

Geocoding with Google

library(ggmap)
LL<-geocode(paste0(buildings$Name," portland oregon"),output = "latlon")
buildings<-cbind(buildings,LL)
buildings<-buildings[!is.na(buildings$lat),]
b<-SpatialPointsDataFrame(data.frame(buildings[,c(6,7)]),data=buildings)

Putting it all together…

class <- classIntervals(b$Floors, 15, style="hclust")
colcode <- findColours(class, redgreen)

pop=paste0("Name:  <strong>",b$Name,"</strong><br><br>Rank:  <strong>",b$Rank,"</strong><br><br>Height (ft/m):  <strong>",b@data[,3],"</strong>")
geo_map<-leaflet() %>% addTiles("http://{s}.tile.stamen.com/toner-lite/{z}/{x}/{y}.png") %>% addCircleMarkers(data=b,popup=pop,fillColor=colcode,col=colcode,fillOpacity = 0.75,radius=6)

Scraped Wikipedia Data, Geocoded with Google!

One more, with mountains:

url <- "https://en.wikipedia.org/wiki/List_of_highest_mountains"
mountains <- url %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="mw-content-text"]/table[2]') %>%
  html_table(fill=T)
mountains <- mountains[[1]]
mt<-mountains[-1,c(1,2,3)]
mt$Mountain<-unlist(lapply(mt$Mountain,function(x) strsplit(x,"/")[[1]][1]))
coord1<-unlist(lapply(2:(nrow(mt)+1),function(x) gsub("°E","",gsub("°N","",strsplit(mountains[x,7], "/")[[1]][2]))))
mt$lat<-as.vector(unlist(lapply(coord1, function(x) strsplit(x, " ")[[1]][2])))
mt$lat<-as.numeric(iconv(mt$lat, "latin1", "ASCII", sub=""))
mt$lon<-as.vector(unlist(lapply(coord1, function(x) strsplit(x, " ")[[1]][3])))
mt$lon<-as.numeric(iconv(mt$lon, "latin1", "ASCII", sub=""))
mt$Rank<-as.numeric(mt$Rank)
rownames(mt)<-1:118
mt$Rank<-1:118

mt_points<-SpatialPointsDataFrame(data.frame(mt[,c(5,4)]),data=mt)
mt_col<-colorRampPalette(c("yellow2","purple"))(50)
class <- classIntervals(as.numeric(mt$Rank), 9, style="hclust")
colcode <- findColours(class, mt_col)
pop=paste0("Rank:  <strong>",mt$Rank,"</strong><br><br>Name:  <strong>",mt$Mountain,"</strong><br><br>Height (ft/m):  <strong>",mt$`Height[3]`,"</strong><br><br>Google Maps:  <strong><a href=", paste0(paste0('https://www.google.com/maps/place/' ,mt$lat,  ',',mt$lon, '/@', mt$lat,',', mt$lon, ',79m/data=!3m1!1e3')," target = '_blank'>Click Me!</a></strong>"))

mt_map<-leaflet() %>% addTiles("http://{s}.tile.stamen.com/terrain/{z}/{x}/{y}.png") %>% addCircleMarkers(data=mt_points,popup=pop,fillColor=colcode,col=colcode,opacity=1,weight=2.5,fillOpacity = 0.5,radius=8)

Scraped Wikipedia Data 2:

Advanced topics

Raster Modelling

Predictive Surfaces

Created from thousands of field observations, with 90 land cover attributes as independent variables and Random Forest machine learning.

Voelkel, J., V. Shandas, and B. Haggerty (2016). Developing High-Resolution Descriptions of Urban Heat Islands: A Public Health Imperative. Preventing Chronic Disease.

Classification

Created with ~900 spectral, structural, and trigonometric variables in an ensemble machine learning analysis.

Machine Learning-derived evergreen/deciduous classification (~90% accuracy). Collaboration with Metro. Open Data.

Parallel Computing

apply, sapply, lapply

Apply a function across a whole set of inputs without writing an explicit loop! (Note from the timing below that plain lapply still runs sequentially.)

system.time(lapply(1:3,function(x) Sys.sleep(x)))
##    user  system elapsed 
##    0.00    0.00    5.99

parApply, parSapply, parLapply

library(snow)
cl<-makeCluster(3,"SOCK")
system.time(parLapply(cl, 1:3,function(x) Sys.sleep(x)))
##    user  system elapsed 
##    0.00    0.00    2.99
stopCluster(cl)

Parallelize Objects!

cl<-makeCluster(3,"SOCK")
clusterExport(cl,list = c("focal"))
system.time(pr<-parLapply(cl, c(dem,dsm,Feature_Height),function(x) focal(x,w=matrix(1/1089,ncol=33,nrow=33))))
##    user  system elapsed 
##    0.02    0.09    3.26
stopCluster(cl)

Plotting the Parallel-Processed Rasters:

par(mfrow=c(1,3))
invisible(lapply(1:3,function(x) plot(pr[[x]])))

Beyond R

Reading External Data

lidar_specs<-read.csv("ftp://coast.noaa.gov/pub/DigitalCoast/lidar1_z/geoid12b/data/5037/pr2016_usace_ncmp_pr_m5037_minmax.csv")
head(lidar_specs)
##                                   Filename     min_x     max_x    min_y
## 1 ./20160516_ncmp_puertorico_20qkf2432.laz -65.61191 -65.61154 18.36845
## 2 ./20160516_ncmp_puertorico_20qkf1836.laz -65.66920 -65.65961 18.39535
## 3 ./20160516_ncmp_puertorico_20qkf1736.laz -65.67866 -65.66906 18.39522
## 4 ./20160516_ncmp_puertorico_20qkf1735.laz -65.67852 -65.66893 18.38619
## 5 ./20160516_ncmp_puertorico_20qkf1635.laz -65.68798 -65.67838 18.38605
## 6 ./20160516_ncmp_puertorico_19qha0442.laz -66.12191 -66.11342 18.44634
##      max_y  min_z  max_z
## 1 18.36905 -18.54 -17.98
## 2 18.40451 -26.21 -17.07
## 3 18.40438 -25.50 -16.41
## 4 18.39535 -24.10 -18.77
## 5 18.39521 -24.08 -20.63
## 6 18.45495 -10.33  -2.59

Calling Command Line Tools

  • Python (utilities, servers, ArcPy)
  • LAStools (lidar classification tool)

system('Command Line Argument')
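
For example, a minimal sketch (the script path is hypothetical):

output<-system("python C:/Local_Working/classify_tiles.py",intern=TRUE) # capture stdout as a character vector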

LAStools Classified Lidar Data

(No QA/QC)

Familiar GIS Packages

  • RSAGA (SAGA GIS)
  • rgrass7 (GRASS GIS)
  • RPyGeo (Yes, even ArcGIS)

Conclusion

Things I didn’t get to:

  • Fully Interactive Plots
  • Fully Interactive 3D Models
  • R Markdown
  • Shiny Applications
  • Time series 3D Raster Modeling (NetCDF)
  • Spatial Statistics (spdep package)
  • Census Data (tigris package)
  • Too many more to count!

How do I get started?

  • RStudio (free, easier to understand)
  • R package documentation - the best (and most frustrating) way to learn new tools.
  • Blogs, Stack Overflow
  • Google (The one R instructor to rule them all, and in the darkness cbind() them.)

Takeaway

  • Learn how processes actually work
  • Share with collaborators
  • Freedom!