tidycensus

The tidycensus package, developed by Kyle Walker, is a convenient, easy-to-use package for making choropleth maps from U.S. Census Bureau data. tidycensus retrieves data from the decennial Census and the American Community Survey (ACS). The package makes it possible to gather Census variables and conveniently join them with “Census geography” (i.e., polygons, often distributed as “shapefiles”). Visualization can be done with separate packages such as mapview, leaflet, or ggplot2::geom_sf().

library(tidyverse)
library(sf)
library(tidycensus)
library(mapview)

Census API Key

You need a free Census API key. Kyle Walker’s “Basic usage of tidycensus” article documents this process.

census_api_key("YOUR API KEY GOES HERE")

.Renviron File

See also Kyle’s more detailed documentation for caching the API key in your R environment.
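As a minimal sketch of that caching step: `census_api_key()` accepts `install = TRUE`, which writes the key to `~/.Renviron` as `CENSUS_API_KEY` so later sessions pick it up automatically. The one-time setup lines are left commented out so the block is safe to run as-is, and `has_census_key()` is a hypothetical helper (not part of tidycensus) that only checks whether the current session can see a key.

```r
# One-time setup (uncomment, run once, then restart R or reload ~/.Renviron):
# tidycensus::census_api_key("YOUR API KEY GOES HERE", install = TRUE)
# readRenviron("~/.Renviron")

# Hypothetical helper: TRUE when a key is visible to the current R session.
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (!has_census_key()) {
  message("No CENSUS_API_KEY found; run census_api_key() first.")
}
```

With the key installed this way, the `census_api_key()` call no longer needs to appear in each script.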

TidyCensus – Get Data

Create a Simple Features (i.e. sf) dataframe using tidycensus::get_acs()

The Census population variable we’ll use is “B01003_001”. More information about identifying Census variables is available at the bottom of this page.

nc_pop <- 
  get_acs(geography = "county",
          variables = "B01003_001",
          state = "NC",
          geometry = TRUE)

Make Choropleth via mapview

Identify which variable will be used to shade the color ramp and pass its name to the zcol argument. Here we use estimate, the column of ACS estimates returned by the tidycensus::get_acs() function.

mapview(nc_pop, zcol = "estimate")
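For a static version of the same idea, a choropleth can also be drawn with ggplot2::geom_sf(). The sketch below runs offline and does not need an API key, because it uses a stand-in dataset: the North Carolina counties shapefile bundled with the sf package. Its `BIR74` column (births in 1974) plays the role that `estimate` plays in `nc_pop`. The object names `nc_demo` and `choropleth` are illustrative.

```r
library(sf)
library(ggplot2)

# Stand-in data: the North Carolina counties shapefile that ships with sf.
nc_demo <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Map the fill aesthetic to the column that drives the color ramp.
choropleth <- ggplot(nc_demo) +
  geom_sf(aes(fill = BIR74), color = "white") +
  scale_fill_viridis_c(name = "Births, 1974") +
  theme_void()

choropleth
```

To apply this to the Census data, swap `nc_demo` for `nc_pop` and `BIR74` for `estimate`.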

Add another layer

Now we’ll geolocate the Starbucks stores and add those locations as a layer over our choropleth. The Starbucks locations were generated and plotted in the previous exercise. Here we regenerate the starbucksNC object.

Load Lat/Long Data

starbucks <- read_csv("data/All_Starbucks_Locations_in_the_US_-_Map.csv",
                      show_col_types = FALSE)

Subset Starbucks Data to North Carolina

starbucksNC <- starbucks  %>% 
  filter(State == "NC")

Convert the starbucksNC data frame to a spatial (sf) object and assign it the same coordinate reference system (CRS) as the nc_pop spatial object.

starbucksNC <- st_as_sf(starbucksNC, coords = c("Longitude", "Latitude"),  crs = st_crs(nc_pop))
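The call above labels the coordinates with the CRS of nc_pop directly. An alternative sketch, assuming the CSV’s longitude/latitude values were recorded in WGS84 (EPSG:4326), is to declare that CRS first and then reproject with st_transform(). The store coordinates below are made up for illustration, and EPSG:4269 (NAD83) stands in for `st_crs(nc_pop)` so the block runs without a Census download.

```r
library(sf)

# Hypothetical store locations (illustrative coordinates, not real data)
stores <- data.frame(
  Name      = c("Store A", "Store B"),
  Longitude = c(-78.64, -80.84),
  Latitude  = c(35.78, 35.23)
)

# Step 1: declare the CRS the coordinates were recorded in (assumed WGS84)
stores_wgs84 <- st_as_sf(stores,
                         coords = c("Longitude", "Latitude"),
                         crs = 4326)

# Step 2: reproject to the polygon layer's CRS
# (EPSG:4269 stands in for st_crs(nc_pop) here)
stores_nad83 <- st_transform(stores_wgs84, crs = 4269)

# Confirm the result carries the target CRS
st_crs(stores_nad83) == st_crs(4269)
```

For point markers on a county-level map the two approaches look the same; the two-step form simply keeps the CRS metadata honest about where the coordinates came from.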

Generate the map with multiple layers. You can read more about arguments such as homebutton, legend, alpha, and cex in the mapview() documentation; the full documentation covers many more mapview functions.

library(leafem)
mymap <- mapview(nc_pop, 
                 zcol = "estimate", 
                 homebutton = FALSE) + 
  mapview(starbucksNC, 
          zcol = "Name", 
          legend = FALSE, 
          alpha = 0.5, cex = 3, 
          col.regions = "orange",
          homebutton = FALSE) 

addLogo(mymap, "images/Rfun3.png",
        position = "bottomright",
        offset.x = 8,
        offset.y = 38,
        width = 100,
        height = 100)

Alaska & Hawaii - Shift

Shift and re-scale Alaska and Hawaii for convenient cartographic display of the entire US.

population <- get_acs(geography = "state",
                variables = "B01003_001",
                geometry = TRUE,
                shift_geo = TRUE)

Getting data from the 2017-2021 5-year ACS
Warning: The `shift_geo` argument is deprecated and will be removed in a future
release. We recommend using `tigris::shift_geometry()` instead.
Using feature geometry obtained from the albersusa package
Please note: Alaska and Hawaii are being shifted and are not to scale.
old-style crs object detected; please recreate object with a recent sf::st_crs()

mapviewOptions(legend.pos = "bottomright")
mapviewOptions(leafletWidth = 800)
mapview(population, zcol = "estimate", native.crs = TRUE, crs = 5070)

Warning: sf layer is not long-lat data
Warning: sf layer has inconsistent datum (+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs).
Need '+proj=longlat +datum=WGS84'
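Since the deprecation warning recommends tigris::shift_geometry(), here is a minimal sketch of that replacement. The wrapper name `get_shifted_population()` and its `year` default are illustrative assumptions (2021 matches the 2017-2021 ACS reported in the output above). Calling the function needs a Census API key and network access; defining it does not.

```r
# Hypothetical wrapper: download state populations, then shift and
# rescale Alaska and Hawaii with tigris instead of shift_geo = TRUE.
get_shifted_population <- function(year = 2021) {
  states <- tidycensus::get_acs(
    geography = "state",
    variables = "B01003_001",
    year      = year,
    geometry  = TRUE
  )
  # shift_geometry() moves the shifted areas into view for national maps
  tigris::shift_geometry(states)
}
```

Usage would look like `population <- get_shifted_population()`. The result is in a projected CRS rather than longitude/latitude, so it can be drawn with ggplot2::geom_sf() or, as in the call above, with mapview() and `native.crs = TRUE`.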

Census

During the workshop I will discuss the following concepts in more detail.

  • ACS vs. Decennial
  • Variable Names / Numbers
  • More on Census Geography (shapefiles)

Variables

The Census is a very large collection of data. Many casual users of Census data are interested in a single data point, for example population by county. Given the complexity and richness of available Census data, finding the right variable can take quite a bit of work. The links below describe some methods one might use to identify the code name of a Census variable.
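One way to search from within R is tidycensus::load_variables(), which returns a table of variable codes (`name`) with their `label` and `concept` descriptions. The sketch below keeps the download step commented out so it runs offline; `search_variables()` is a hypothetical helper, not part of tidycensus, that filters such a table by keyword.

```r
# Hypothetical helper: keep rows whose label or concept matches a keyword.
# `vars` is expected to have `name`, `label`, and `concept` columns, as in
# the table returned by tidycensus::load_variables().
search_variables <- function(vars, pattern) {
  hit <- grepl(pattern, vars$label, ignore.case = TRUE) |
    grepl(pattern, vars$concept, ignore.case = TRUE)
  vars[hit, c("name", "label", "concept")]
}

# Download step (needs network access; cache = TRUE stores the table locally):
# acs_vars <- tidycensus::load_variables(2021, "acs5", cache = TRUE)
# search_variables(acs_vars, "total population")
```

The `name` column of any matching row is the code to pass to the `variables` argument of get_acs(), as “B01003_001” was earlier.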

Shapefiles

Shapefiles are an important GIS data standard used frequently in thematic mapping. Many other formats exist, but shapefiles have a very broad user base. If you need shapefiles for other geographies, please consult the guide to geospatial applications using the R programming language.

End Notes

This session based on