Mapping water insecurity in R with tidycensus
Using the `tidycensus` package to access and visualize American Community Survey (ACS) data.
![Banner that displays three choropleth maps displaying percent hispanic, median gross rent, and average household size using 2022 U.S. Census Bureau Data.](https://waterdata.usgs.gov/blog/static/acs-maps/tidycensus-intro-banner.png)
Water insecurity can be influenced by a number of social vulnerability indicators, from demographic characteristics to living conditions and socioeconomic status, that vary spatially across the U.S. This blog shows how the `tidycensus` package for R can be used to access U.S. Census Bureau data, including the American Community Survey, as featured in the “Unequal Access to Water” data visualization from the USGS Vizlab. It offers reproducible code examples demonstrating the use of `tidycensus` for easy exploration and visualization of social vulnerability indicators in the Western U.S.
Getting started with tidycensus
`tidycensus` is a package developed by Walker and Herman (2024) that helps R users access U.S. Census Bureau APIs in `tidyverse`-friendly formats and `sf`-compatible spatial layers. Before you can begin using `tidycensus`, you’ll need to install the package and set up an API key. Go to http://api.census.gov/data/key_signup.html for an API key. Then, run the following in the R console:
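A minimal sketch of that setup, assuming the package is installed from CRAN. `YOUR_API_KEY_HERE` is a placeholder for the key you receive after signing up, not a working key:

```r
# Install tidycensus from CRAN (only needed once)
# install.packages("tidycensus")

library(tidycensus)

# Register the key for the current R session.
# "YOUR_API_KEY_HERE" is a placeholder; replace it with your own key.
census_api_key("YOUR_API_KEY_HERE")
```

Calling `census_api_key()` with `install = TRUE` stores the key in your `.Renviron` file so that it is available in future sessions.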
Exploring U.S. Census Bureau variables
The U.S. Census Bureau collects data about the American people, such as population, age, education, and income. These data are available from a variety of sources, such as the Decennial Census, American Community Survey (ACS), and Household Pulse Survey. For example, the Decennial Census is a complete count of the U.S. population conducted every ten years to allocate congressional representation and federal funding, while the ACS is an ongoing annual survey that collects detailed demographic, social, and economic data from a sample of households.
The `load_variables()` function lets you browse the variables available in U.S. Census Bureau datasets such as the ACS and Decennial Census. ACS data are available for each year since the survey’s initiation in 2005. The ACS provides both 1-year estimates (for areas with populations of 65,000 or more) and 5-year estimates (for all areas, including smaller ones), which cover the following periods:
- 1-Year Estimates: Available annually from 2005 onwards.
- 5-Year Estimates: Available for the periods from 2005-2009 to the current year (e.g., the 2018-2022 estimates were released in December 2023).
Let’s begin by viewing the ACS 1-year estimates variables available for 2023:
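One way to do this is sketched below. The object name `acs_variables_2023` is chosen here for illustration, and the call needs an internet connection:

```r
library(tidycensus)

# List the variables in the 2023 ACS 1-year estimates.
# cache = TRUE stores the table locally so later calls are faster.
acs_variables_2023 <- load_variables(2023, "acs1", cache = TRUE)

# Each row describes one variable: its code (name), label, and table concept
head(acs_variables_2023)

# Narrow the list to table B25049 (Tenure by Plumbing Facilities)
acs_variables_2023[grepl("^B25049", acs_variables_2023$name), ]
```

In RStudio, `View(acs_variables_2023)` opens the same table in a searchable viewer.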
Once you have identified your variables of interest, you can begin pulling the associated data, specifying parameters such as the year, states of interest, and geography (county, census tract, census division, and more). Here, we will pull total counts of owner-occupied households lacking plumbing facilities (`B25049_004`) and total population (`B01003_001`) across the Western U.S. at the county level for 2022 and 2023. A household lacks plumbing facilities if it does not have complete plumbing facilities, which are typically defined as:
- Hot and cold running water
- A flush toilet
- A bathtub or shower
With the Census variables identified, we can now begin pulling Census data. To do this, we will write our own function. If you are unfamiliar with writing functions, see the “Writing functions that work for you” and “Writing better functions” blog posts to get you started.
You can use `get_acs()` to obtain data and feature geometry for the ACS. Similarly, you can use `get_decennial()` to obtain data and feature geometry for the Decennial Census. For this blog, we used a function to pull ACS data from the U.S. Census Bureau.
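The sketch below shows one way such a function could look. The function name `pull_acs_data()`, the list of states in `western_states`, and the use of `janitor::clean_names()` to produce lowercase column names are assumptions made for illustration and may differ from the code behind the published maps. It also assumes a Census API key has already been set.

```r
library(tidycensus)
library(janitor)

# Assumed set of Western states; adjust to your own study area
western_states <- c("AZ", "CA", "CO", "ID", "MT", "NM", "NV", "OR", "UT", "WA", "WY")

# Variable codes named in the text:
# owner-occupied households lacking plumbing, and total population
plumbing_vars <- c("B25049_004", "B01003_001")

# Hypothetical helper: pull county-level ACS 1-year estimates with geometry
pull_acs_data <- function(year, states, variables) {
  get_acs(
    geography = "county",
    variables = variables,
    state = states,
    year = year,
    survey = "acs1",
    geometry = TRUE   # return an sf object with county boundaries
  ) |>
    clean_names()     # e.g., GEOID becomes geoid and NAME becomes name
}

western_data_2023 <- pull_acs_data(2023, western_states, plumbing_vars)
western_data_2022 <- pull_acs_data(2022, western_states, plumbing_vars)
```

Because 1-year estimates are published only for areas with populations of 65,000 or more, smaller counties will not appear in the result.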
Processing U.S. Census Bureau data
The counts of owner-occupied households lacking plumbing facilities in counties across the Western U.S. for 2023 and 2022 are contained in the `western_data_2023` and `western_data_2022` data frames, respectively. Each row represents one variable for one county, with columns showing the county’s ID (`geoid`), county name, the type of data (e.g., total population or households lacking plumbing), the estimated value, the margin of error (`moe`), and the county’s map boundaries (`geometry`). Let’s now do some processing to get the data into a wide format, create new column names for the Census variables, and calculate the percent of each county’s total population lacking plumbing facilities.
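These steps can be sketched as a small function. The helper name `process_plumbing_data()`, the new column names, and the output names `western_wide_2023` and `western_wide_2022` are assumptions for illustration. The input is assumed to have the columns described above, with the type of data stored in a `variable` column and the estimated value in an `estimate` column.

```r
library(dplyr)
library(tidyr)
library(sf)  # provides dplyr/tidyr methods for spatial data frames

# Hypothetical helper: reshape long ACS output to one row per county
# and compute the percent lacking plumbing facilities
process_plumbing_data <- function(data) {
  data |>
    # Drop the margin of error so each county collapses to a single row
    select(-moe) |>
    pivot_wider(names_from = variable, values_from = estimate) |>
    # Replace Census variable codes with readable (assumed) column names
    rename(
      households_lacking_plumbing = B25049_004,
      total_population = B01003_001
    ) |>
    mutate(
      percent_lacking_plumbing = 100 * households_lacking_plumbing / total_population
    )
}

# Apply to the data frames described above, if they exist in the session
if (exists("western_data_2023")) {
  western_wide_2023 <- process_plumbing_data(western_data_2023)
}
if (exists("western_data_2022")) {
  western_wide_2022 <- process_plumbing_data(western_data_2022)
}
```

Dropping `moe` keeps the example short. To carry the margins of error through the reshape, pass both columns to `values_from` instead and adjust the renamed columns accordingly.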
Visualizing U.S. Census Bureau data with `tigris` and `sf`
Let’s plot the data! Using the `tigris` package, we can directly access TIGER/Line shapefiles from the U.S. Census Bureau for U.S. political boundaries:
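A rough sketch of how such a map could be drawn with `ggplot2` follows. The helper names `get_state_outlines()` and `plot_plumbing_map()` are hypothetical, the styling is a guess rather than the styling of the published figures, and `county_data` is assumed to be an `sf` object with one row per county and a `percent_lacking_plumbing` column.

```r
library(ggplot2)
library(sf)

# Hypothetical helper: download state outlines (cartographic boundary files)
# and keep only the requested states, identified by postal abbreviation
get_state_outlines <- function(states, year) {
  outlines <- tigris::states(cb = TRUE, year = year)
  outlines[outlines$STUSPS %in% states, ]
}

# Hypothetical helper: choropleth of percent lacking plumbing by county,
# with state outlines drawn on top for context
plot_plumbing_map <- function(county_data, state_outlines, map_title) {
  ggplot() +
    geom_sf(data = county_data, aes(fill = percent_lacking_plumbing), color = NA) +
    geom_sf(data = state_outlines, fill = NA, color = "grey30") +
    scale_fill_viridis_c(name = "Percent lacking\nplumbing", na.value = "grey90") +
    labs(title = map_title) +
    theme_void()
}
```

A call such as `plot_plumbing_map(county_data, get_state_outlines(c("AZ", "NM"), 2023), "Households lacking plumbing, 2023")` would then return a `ggplot` object that can be printed or saved with `ggsave()`.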
![Choropleth map of percent of households lacking plumbing facilities in 2023 across counties in the Western U.S. Counties with the greatest percent of lacking plumbing facilities include Apache County, AZ (3.9%), McKinley County, NM (2.3%), and Navajo County, AZ (1.9%).](https://waterdata.usgs.gov/blog/static/acs-maps/western_percent_plumbing_facilities_2023.png)
Choropleth map of percent of households lacking plumbing facilities in 2023 across counties in the Western U.S. Counties with the greatest percent of lacking plumbing facilities include Apache County, AZ (3.9%), McKinley County, NM (2.3%), and Navajo County, AZ (1.9%).
![Choropleth map of percent of households lacking plumbing facilities in 2022 across counties in the Western U.S. Counties with the greatest percent of lacking plumbing facilities include Apache County, AZ (3.7%), McKinley County, NM (2.8%), and Navajo County, AZ (1.6%).](https://waterdata.usgs.gov/blog/static/acs-maps/western_percent_plumbing_facilities_2022.png)
Choropleth map of percent of households lacking plumbing facilities in 2022 across counties in the Western U.S. Counties with the greatest percent of lacking plumbing facilities include Apache County, AZ (3.7%), McKinley County, NM (2.8%), and Navajo County, AZ (1.6%).
View differences in percent plumbing facilities for states of interest
Let’s now take a look at certain states in the Western U.S. and how the lack of complete indoor plumbing compares between the 2022 and 2023 Census data. These county-level figures display patterns and differences that provide deeper insight into infrastructure disparities and water insecurity across the region.
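One way to compute such a year-over-year difference is sketched below. The helper name `compare_years()` and its column names are hypothetical. Both inputs are assumed to have one row per county, a `geoid` column, and a `percent_lacking_plumbing` column.

```r
library(dplyr)

# Hypothetical helper: join two years of county data and compute the change
# in percent lacking plumbing (later year minus earlier year)
compare_years <- function(data_earlier, data_later) {
  # Keep only the ID and percent columns from the earlier year.
  # Converting to a plain data frame first lets select() drop any geometry.
  earlier <- data_earlier |>
    as.data.frame() |>
    select(geoid, percent_earlier = percent_lacking_plumbing)

  data_later |>
    rename(percent_later = percent_lacking_plumbing) |>
    inner_join(earlier, by = "geoid") |>
    # Difference in percentage points between the two years
    mutate(change_in_percent = percent_later - percent_earlier)
}
```

Counties that appear in only one of the two years are dropped by the inner join. The result can be limited to a single state before mapping, for example by keeping rows whose `geoid` starts with that state’s two-digit FIPS code.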
![](https://waterdata.usgs.gov/blog/static/acs-maps/nm_change_plumbing.png)
![](https://waterdata.usgs.gov/blog/static/acs-maps/az_change_plumbing.png)
Visualizing total population vs. percent lacking plumbing across counties in the Western U.S.
While the choropleth maps give us a spatial snapshot of where households lack plumbing facilities, they do not provide the full picture, such as how population size relates to incomplete plumbing. A low percentage of incomplete plumbing in a highly populated county could still affect many households, while a high percentage in a less populated county may represent fewer households in absolute terms.
To explore this, let’s plot total population against the percentage of households lacking plumbing for counties in the Western U.S. This scatter plot enables us to identify both highly populated areas that lack plumbing facilities and less populated counties with higher relative plumbing insecurity.
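A minimal sketch of such a scatter plot with `ggplot2` follows. The helper name `plot_population_vs_plumbing()` is hypothetical, and `county_data` is assumed to have `name`, `total_population`, and `percent_lacking_plumbing` columns. The log-scaled x-axis is a design choice made here because county populations span several orders of magnitude.

```r
library(ggplot2)

# Hypothetical helper: scatter plot of county population
# against percent lacking plumbing facilities
plot_population_vs_plumbing <- function(county_data) {
  ggplot(
    county_data,
    # `text` is not drawn by ggplot2 itself; it supplies the county name
    # for hover labels if the plot is later converted with plotly
    aes(x = total_population, y = percent_lacking_plumbing, text = name)
  ) +
    geom_point(alpha = 0.6) +
    scale_x_log10(labels = scales::label_comma()) +
    labs(
      x = "Total population (log scale)",
      y = "Percent lacking plumbing facilities"
    ) +
    theme_minimal()
}
```

Passing the returned plot to `plotly::ggplotly()` is one way to make it interactive, although this may not be how the interactive plot on the original page was built.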
Note: zoom in and explore the details of the interactive plot.
Highlighting use cases associated with water insecurity in the Western U.S.
Using `tidycensus`, we can use U.S. Census Bureau data to plot variables such as population, age, education, income, and more. With our recent release, “Unequal access to water: How societal factors shape vulnerability to water insecurity”, we highlight various vulnerability indicators across the Western states that are related to water insecurity using the `tidycensus` package. This website highlights recent research, “Social Vulnerability and Water Insecurity in the Western United States: A Systematic Review of Framings, Indicators, and Uncertainty”, which finds that certain demographic traits and socioeconomic circumstances, along with increased hazard exposures, make some people more susceptible to water insecurity than others. Specifically, indicators such as household size, Hispanic populations, disabled populations, income inequality, and renter disparities are all factors associated with water insecurity in the Western U.S. For example, the literature showed that household size, female-headed households, female population, and percentage of females in the labor force were all predominantly positively related to, and influential on, water insecurity conditions (Drakes et al., 2024).
![Choropleth map of average household size of occupied housing units at the county level across the contiguous U.S. The greatest average household sizes were in Oglala Lakota County, South Dakota (5), Madison County, Idaho (3.9), and Todd County, South Dakota (3.8). Data from U.S. Census Bureau, 2022.](https://waterdata.usgs.gov/blog/static/acs-maps/avg_household_size_2022.png)
Takeaways
`tidycensus` allows us to pull, process, and visualize census data from the U.S. Census Bureau’s API, including Decennial Census data and American Community Survey (ACS) data. The package works seamlessly with associated R packages such as `tidyverse`, `sf`, and `tigris`, as displayed above. By visualizing census data, we can create informative maps and charts that reveal patterns and disparities across different states and regions, helping to inform research, water-resource partners, and the public.
Additional resources
Visit the website of the `tidycensus` developers, Kyle Walker and Matt Herman, to learn more about additional package functionality and documentation. Check out the book “Analyzing US Census Data: Methods, Maps, and Models in R”, by Kyle Walker, for additional information on wrangling, modeling, and analyzing U.S. Census data. View our site to learn more about unequal access to water and how this research can inform more equitable water management practices. Additionally, view the code used to make our open source, reproducible website.
References
Azadpour E, Carr AN, Clarke A, Drakes O, Restrepo-Osorio DL, Nell C. 2024. Unequal access to water: How societal factors shape vulnerability to water insecurity. U.S. Geological Survey software release. Reston, VA. https://doi.org/10.5066/P19M9WYT.
Drakes, O., Restrepo-Osorio, D.L., Powlen, K.A. and Hines, M.K., 2024, Social vulnerability and water insecurity in the western United States: A systematic review of framings, indicators, and uncertainty, https://doi.org/10.1029/2023WR036284.
U.S. Census Bureau, 2022, “Tenure by Plumbing Facilities,” American Community Survey, 1-Year Estimates Detailed Tables, Table B25049, accessed on Nov 27, 2024, https://data.census.gov/table?q=B25049&y=2022.
U.S. Census Bureau, 2022, “Total Population,” American Community Survey, 1-Year Estimates Detailed Tables, Table B01003, accessed on Nov 27, 2024, https://data.census.gov/table?q=B01003&y=2022.
U.S. Census Bureau, 2023, “Tenure by Plumbing Facilities,” American Community Survey, 1-Year Estimates Detailed Tables, Table B25049, accessed on Nov 27, 2024, https://data.census.gov/table?q=B25049&y=2023.
U.S. Census Bureau, 2023, “Total Population,” American Community Survey, 1-Year Estimates Detailed Tables, Table B01003, accessed on Nov 27, 2024, https://data.census.gov/table?q=B01003&y=2023.
Walker K and Herman M. 2024. tidycensus: Load US Census Boundary and Attribute Data as “tidyverse” and “sf”-Ready Data Frames. R package version 1.6.5, https://walker-data.com/tidycensus/.
Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.