Thematic Mapping with tmap

library(tidyverse)       # Tidyverse for Tidy Data
library(readxl)
library(tmap)            # Thematic Mapping
library(tmaptools) 
library(tigris)          # Get Census Geography Poloygons
library(sf)

Shapefiles as sf

Using the tigris package, get Census Tiger shapefiles for census geographies. Tigris will return the shapefile in the sf, or simple features, format.

us_geo <- tigris::states(class = "sf") 
us_geo <- tigris::states(class = "sf") 

Data Structure

The “Simple Features” (sf) data structure can easily be viewed and manipulated as a rectangular data frame, before visualizing. As an historical note – an sf predecessor – the sp data structure uses @data slots to hold data. We’ll focus on the sf package. Below are two methods of viewing the structure of the downloaded shapefiles.

class(us_geo)
[1] "sf"         "data.frame"
glimpse(us_geo)
Rows: 56
Columns: 15
$ REGION   <chr> "3", "3", "2", "2", "3", "1", "4", "1", "3", "1", "1", "3", "…
$ DIVISION <chr> "5", "5", "3", "4", "5", "1", "8", "1", "5", "1", "1", "5", "…
$ STATEFP  <chr> "54", "12", "17", "27", "24", "44", "16", "33", "37", "50", "…
$ STATENS  <chr> "01779805", "00294478", "01779784", "00662849", "01714934", "…
$ GEOID    <chr> "54", "12", "17", "27", "24", "44", "16", "33", "37", "50", "…
$ STUSPS   <chr> "WV", "FL", "IL", "MN", "MD", "RI", "ID", "NH", "NC", "VT", "…
$ NAME     <chr> "West Virginia", "Florida", "Illinois", "Minnesota", "Marylan…
$ LSAD     <chr> "00", "00", "00", "00", "00", "00", "00", "00", "00", "00", "…
$ MTFCC    <chr> "G4000", "G4000", "G4000", "G4000", "G4000", "G4000", "G4000"…
$ FUNCSTAT <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "…
$ ALAND    <dbl> 62266298634, 138961722096, 143778561906, 206232627084, 251519…
$ AWATER   <dbl> 489204185, 45972570361, 6216493488, 18949394733, 6979074857, …
$ INTPTLAT <chr> "+38.6472854", "+28.3989775", "+40.1028754", "+46.3159573", "…
$ INTPTLON <chr> "-080.6183274", "-082.5143005", "-089.1526108", "-094.1996043…
$ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-80.85847 3..., MULTIPOLYGON (((…

sf as data frame

And as noted, here’s the data frame view. Notice the geometry (polygon shape) in the far-right column of the data frame.

as_tibble(us_geo)
ABCDEFGHIJ0123456789
REGION
<chr>
DIVISION
<chr>
STATEFP
<chr>
STATENS
<chr>
GEOID
<chr>
STUSPS
<chr>
35540177980554WV
35120029447812FL
23170177978417IL
24270066284927MN
35240171493424MD
11440121983544RI
48160177978316ID
11330177979433NH
35370102761637NC
11500177980250VT

Quick Plotting

If you want to see a very quick view of your mapping data, you can plot the geometry data with the plot function. In this case we use the sf::st_geometry() function to plot only the geometry. You can quickly generate a faceted map by excluding the st_geometry function: e.g. plot(us_geo) but that will consume computation cycles (i.e. wait time). Therefore, I recommend trying the smaller layer for now.

Note: Census geography for the USA will span the globe in part becuase Region 9 includes a multitude of pacific islands. Later we will limit to simply the “lower 48” states.

plot(st_geometry(us_geo))

Get BLS data

I’ve already downloaded and stored some data from the Bureau of Labor Statistics. Those data are stored in an excel file in the data directory of the repository: data/OES_Report.xlsx. The goal is to attach this data to the previously downloaded shapefiles.

But you may be interested in how I gathered the data. below are some summary notes documenting my steps of gathering the data from the Bureau of Labor Statistics.

https://data.bls.gov/oes/#/occGeo/One%20occupation%20for%20multiple%20geographical%20areas

  • One occupation for multiple geographical areas

    • Mental Health and Substance Abuse Social Workers

      • State

      • All States in this list

      • Annual Mean wage

        • Excel
  • Read the Data in with the RStudio “Import Dataset” wizard available in the Environment tab. This will generate the code below and ensure the import

    • Skips the first 4 lines
    • Coerces the 2nd column to numeric
Salary4Helpers <- 
  read_excel("data/OES_Report.xlsx",
             col_types = c("text", "numeric"), 
             skip = 4)

Salary4Helpers
ABCDEFGHIJ0123456789
Area Name
<chr>
Alabama(0100000)
Alaska(0200000)
Arizona(0400000)
Arkansas(0500000)
California(0600000)
Colorado(0800000)
Connecticut(0900000)
Delaware(1000000)
District of Columbia(1100000)
Florida(1200000)

Wrangle the data

Before we join the BLS data to the shapefile we need to transform the structure of the downloaded BLS data

BlsWage_ToJoin <- Salary4Helpers %>% 
  rename(Area = "Area Name") %>% 
  rename(wages = "Annual mean wage(2)") %>% 
  mutate(State = gsub("\\(\\d{7}\\)", "", Area)) %>% 
  filter(wages != "NA_character_") %>% 
  select(State, wages)

#BlsWage_ToJoin

Join data

Use the dplyr::left_join() function or the base-R merge() function to append BLS data to the previously loaded shape object

HelperShapeObject <- left_join(us_geo, BlsWage_ToJoin, by = c("NAME" = "State"))

as_tibble(HelperShapeObject)
ABCDEFGHIJ0123456789
REGION
<chr>
DIVISION
<chr>
STATEFP
<chr>
STATENS
<chr>
GEOID
<chr>
STUSPS
<chr>
35540177980554WV
35120029447812FL
23170177978417IL
24270066284927MN
35240171493424MD
11440121983544RI
48160177978316ID
11330177979433NH
35370102761637NC
11500177980250VT

Quick Thematic Map

qtm(HelperShapeObject, fill = "wages")

50 states

Filter to only the contiguous 50 states + D.C. Note that tigris::shift_geometry() will shift and resize Alaska and Hawaii.

contiguous_states <- HelperShapeObject %>% 
  filter(REGION != 9) %>% 
  shift_geometry()

Make Choropleth

tm_shape(contiguous_states) +
  tm_polygons("wages", id = "Name")

Projection

Mark likes the USA_Contiguous_Albers_Equal_Area_Conic_USGS_version projection for the continental US. EPSG:5070

contiguous_states %>% 
  st_transform(5070) %>% 
  tm_shape() +
  tm_polygons("wages", id = "Name") 

Alternative Syntax

tm_shape(contiguous_states, projection = 5070) +
  tm_polygons("wages", id = "Name")

Explore tmap syntax and functions

tm_shape(contiguous_states, projection = 5070) +
  tm_borders(col = "black", alpha = 0.4) +
  tm_fill(col = "REGION", alpha = 0.6) +
  tm_layout(title = "Regions of the USA", 
                  attr.color = "navy", 
                  title.position = c("center", "top"), 
                  title.bg.color = "lightblue")

End Notes

This session inspired by https://www.computerworld.com/article/3175623/data-analytics/mapping-in-r-just-got-a-whole-lot-easier.html