library(tidyverse) # Tidyverse for Tidy Data
library(readxl)
library(tmap) # Thematic Mapping
library(tmaptools)
library(tigris) # Get Census Geography Poloygons
library(sf)
Thematic Mapping with tmap
Shapefiles as sf
Using the tigris
package, get Census Tiger shapefiles for census geographies. Tigris will return the shapefile in the sf
, or simple features, format.
us_geo <- tigris::states(class = "sf")
<- tigris::states(class = "sf") us_geo
Data Structure
The “Simple Features” (sf
) data structure can easily be viewed and manipulated as a rectangular data frame, before visualizing. As an historical note – an sf
predecessor – the sp
data structure uses @data
slots to hold data. We’ll focus on the sf
package. Below are two methods of viewing the structure of the downloaded shapefiles.
class(us_geo)
[1] "sf" "data.frame"
glimpse(us_geo)
Rows: 56
Columns: 15
$ REGION <chr> "3", "3", "2", "2", "3", "1", "4", "1", "3", "1", "1", "3", "…
$ DIVISION <chr> "5", "5", "3", "4", "5", "1", "8", "1", "5", "1", "1", "5", "…
$ STATEFP <chr> "54", "12", "17", "27", "24", "44", "16", "33", "37", "50", "…
$ STATENS <chr> "01779805", "00294478", "01779784", "00662849", "01714934", "…
$ GEOID <chr> "54", "12", "17", "27", "24", "44", "16", "33", "37", "50", "…
$ STUSPS <chr> "WV", "FL", "IL", "MN", "MD", "RI", "ID", "NH", "NC", "VT", "…
$ NAME <chr> "West Virginia", "Florida", "Illinois", "Minnesota", "Marylan…
$ LSAD <chr> "00", "00", "00", "00", "00", "00", "00", "00", "00", "00", "…
$ MTFCC <chr> "G4000", "G4000", "G4000", "G4000", "G4000", "G4000", "G4000"…
$ FUNCSTAT <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "…
$ ALAND <dbl> 62266298634, 138961722096, 143778561906, 206232627084, 251519…
$ AWATER <dbl> 489204185, 45972570361, 6216493488, 18949394733, 6979074857, …
$ INTPTLAT <chr> "+38.6472854", "+28.3989775", "+40.1028754", "+46.3159573", "…
$ INTPTLON <chr> "-080.6183274", "-082.5143005", "-089.1526108", "-094.1996043…
$ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-80.85847 3..., MULTIPOLYGON (((…
sf as data frame
And as noted, here’s the data frame view. Notice the geometry (polygon shape) in the far-right column of the data frame.
as_tibble(us_geo)
Quick Plotting
If you want to see a very quick view of your mapping data, you can plot the geometry data with the plot
function. In this case we use the sf::st_geometry()
function to plot only the geometry. You can quickly generate a faceted map by excluding the st_geometry
function: e.g. plot(us_geo)
but that will consume computation cycles (i.e. wait time). Therefore, I recommend trying the smaller layer for now.
Note: Census geography for the USA will span the globe in part becuase Region 9 includes a multitude of pacific islands. Later we will limit to simply the “lower 48” states.
plot(st_geometry(us_geo))
Get BLS data
I’ve already downloaded and stored some data from the Bureau of Labor Statistics. Those data are stored in an excel file in the data
directory of the repository: data/OES_Report.xlsx
. The goal is to attach this data to the previously downloaded shapefiles.
But you may be interested in how I gathered the data. below are some summary notes documenting my steps of gathering the data from the Bureau of Labor Statistics.
https://data.bls.gov/oes/#/occGeo/One%20occupation%20for%20multiple%20geographical%20areas
One occupation for multiple geographical areas
Mental Health and Substance Abuse Social Workers
State
All States in this list
Annual Mean wage
- Excel
Read the Data in with the RStudio “Import Dataset” wizard available in the Environment tab. This will generate the code below and ensure the import
- Skips the first 4 lines
- Coerces the 2nd column to numeric
<-
Salary4Helpers read_excel("data/OES_Report.xlsx",
col_types = c("text", "numeric"),
skip = 4)
Salary4Helpers
Wrangle the data
Before we join the BLS data to the shapefile we need to transform the structure of the downloaded BLS data
<- Salary4Helpers %>%
BlsWage_ToJoin rename(Area = "Area Name") %>%
rename(wages = "Annual mean wage(2)") %>%
mutate(State = gsub("\\(\\d{7}\\)", "", Area)) %>%
filter(wages != "NA_character_") %>%
select(State, wages)
#BlsWage_ToJoin
Join data
Use the dplyr::left_join()
function or the base-R merge()
function to append BLS data to the previously loaded shape object
<- left_join(us_geo, BlsWage_ToJoin, by = c("NAME" = "State"))
HelperShapeObject
as_tibble(HelperShapeObject)
Quick Thematic Map
qtm(HelperShapeObject, fill = "wages")
50 states
Filter to only the contiguous 50 states + D.C. Note that tigris::shift_geometry()
will shift and resize Alaska and Hawaii.
<- HelperShapeObject %>%
contiguous_states filter(REGION != 9) %>%
shift_geometry()
Make Choropleth
tm_shape(contiguous_states) +
tm_polygons("wages", id = "Name")
Projection
Mark likes the USA_Contiguous_Albers_Equal_Area_Conic_USGS_version projection for the continental US. EPSG:5070
%>%
contiguous_states st_transform(5070) %>%
tm_shape() +
tm_polygons("wages", id = "Name")
Alternative Syntax
tm_shape(contiguous_states, projection = 5070) +
tm_polygons("wages", id = "Name")
Explore tmap
syntax and functions
tm_shape(contiguous_states, projection = 5070) +
tm_borders(col = "black", alpha = 0.4) +
tm_fill(col = "REGION", alpha = 0.6) +
tm_layout(title = "Regions of the USA",
attr.color = "navy",
title.position = c("center", "top"),
title.bg.color = "lightblue")
End Notes
This session inspired by https://www.computerworld.com/article/3175623/data-analytics/mapping-in-r-just-got-a-whole-lot-easier.html