[Stata] getcensus package for American Community Survey datasets

Exciting news for Stata nerds! The American Community Survey (ACS) datasets can be retrieved not only in R (tidycensus) but also in Stata, using the getcensus package, which launched in October 2021. Here are the steps to use it.

Get your own API key

First, request an API key, which you need to retrieve Census data from Stata or R. Signing up is easy, and there is no waiting for approval.

Here is the link for signup: Key Signup (census.gov)

Shortly after signing up, you will receive an email containing your key. Copy the key and activate it as the email instructs.

Install the packages and set your API key

Stata
ssc install getcensus
ssc install jsonio
global censuskey putyourkeyhere // getcensus reads your key from this global macro; run this line again in each new Stata session

You only need to run the two ssc install commands the first time you use the getcensus package.
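A global macro disappears when Stata closes, so the key has to be defined again in every session. A minimal sketch of one way to handle this, assuming you keep a profile.do (a do-file that Stata runs automatically at startup if it finds one), followed by a quick check that everything is in place. The key value is a placeholder.

```stata
* In profile.do: define the key so that it exists in every new session
* (putyourkeyhere is a placeholder for your own key)
global censuskey putyourkeyhere

* Before calling getcensus: confirm both packages are installed
* (which stops with an error if a command cannot be found)
which getcensus
which jsonio

* Warn if the key macro is empty
if "$censuskey" == "" {
    display as error "censuskey is empty: define it before calling getcensus"
}
else {
    display as text "censuskey is set"
}
```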

Browse the Catalog

Use the catalog to browse the available variables. You can change the year and the sample (1-year, 3-year, or 5-year estimates; note that the Census Bureau discontinued 3-year estimates after the 2011-2013 release). The getcensus package supports four product types: DT, ST, DP, and CP. Please refer to the Census website for more information.

  • DT (default, if you do not specify the product type): Most detailed estimates on all topics for all geographies.
    • Detailed Tables are designed for advanced data users or those who want access to the most comprehensive ACS tables. Detailed Tables are also available through the ACS Summary File and Census Bureau Application Programming Interface.
    • ACS Detailed Tables begin with the letters “B” for base tables and “C” for collapsed tables.
      • The “collapsed” tables cover the same topics as the base table, but with fewer categories.
  • ST: A span of information on a particular ACS subject presented in the format of both estimates and percentages.
    • Subject Tables provide pretabulated estimates and percentages for a wide variety of topics (e.g., employment, education, and income), often available separately by age, sex, or race/ethnicity. ACS Subject Tables begin with the letter “S.”
  • DP: Broad social, economic, housing, and demographic information in a total of four profiles.
    • Data Profiles are a good place to start for novice data users, as they contain the most frequently requested social, economic, housing, and demographic data. Each of these four subject areas is a separate profile.
    • The Data Profiles summarize the data for a single geographic area, providing both estimates and percentages, to cover the most basic data on all ACS topics. ACS Data Profiles begin with the letters “DP.”
  • CP: Comparisons of ACS estimates over time in the same layout as the Data Profiles.
    • Comparison Profiles show ACS data side-by-side from different data releases, indicating where there is a statistically significant difference between estimates.
    • The 1-year Comparison Profiles show data side-by-side for 5 years, indicating where there is a statistically significant difference between the most current year compared to 4 prior years of data.
Stata
* Browse catalog
getcensus catalog, year(2020) sample(5) product(DT)
getcensus catalog, year(2020) sample(5) product(ST)
getcensus catalog, year(2020) sample(5) product(DP)
getcensus catalog, year(2020) sample(5) product(CP)

Then open the Stata Data Editor (Browse) to see the catalog.

It lists each table_id and variable_id along with its name. I recommend copying and pasting the list into an Excel sheet to browse it more easily (Ctrl+A, then Ctrl+C in Stata, then Ctrl+V in Excel).

You can also browse the variables on the Explore Census Data website. Here is an example: S1701: POVERTY STATUS IN THE PAST 12… – Census Bureau Table
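As an alternative to copying and pasting, the catalog can be written straight to a workbook with Stata's built-in export excel command. This is only a sketch: the file name acs_catalog_ST.xlsx is made up for illustration, and the call needs a working API key and internet connection.

```stata
* Load one catalog into memory, here the Subject Tables
clear
getcensus catalog, year(2020) sample(5) product(ST)

* Show the variable names that getcensus created
describe

* Write the catalog to an Excel file (hypothetical file name),
* using the variable names as the header row
export excel using "acs_catalog_ST.xlsx", firstrow(variables) replace
```

The workbook is saved in the current working directory, which you can check with pwd.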

How to use state- and county-level variables

Stata
* Get population by county: B01003 is the total population table
getcensus B01003, year(2021) sample(1) geography(county) clear // only one table per call
getcensus B01001A_001 B01001A_017 B25047_001, year(2021) sample(1) geography(county) clear // several variables per call, even from different tables

Once you have identified the table or variable you want to download, write the code to get it!

Put the ID of the table or variable after the getcensus command. You cannot download multiple tables at once, but you can download multiple variables from different tables at once. If you need variables from more than one table, I recommend requesting them by variable_id instead of table_id.

Then specify the year of the ACS dataset, the sample (1-, 3-, or 5-year estimates), and the geographic unit (state, county, metro, tract, zcta, and so on).

You can see the list of supported geographies here: getcensus – Supported Geographies (centeronbudget.github.io)
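One thing to keep in mind when choosing the sample: the Census Bureau publishes 1-year estimates only for areas with a population of 65,000 or more, so a county-level request with sample(1) returns only the larger counties. The sketch below reuses the options shown above to request the same table at the state level and then at the county level from the 5-year estimates, which cover all counties. It needs a working API key.

```stata
* State-level total population from the 2021 1-year estimates
getcensus B01003, year(2021) sample(1) geography(state) clear

* County-level total population from the 2020 5-year estimates,
* which include counties below the 65,000 population threshold
getcensus B01003, year(2020) sample(5) geography(county) clear

* Inspect what was downloaded: variable names and number of counties
describe
count
```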

How to use zip-code level variables

Stata
getcensus B01003, year(2020) sample(5) geography(zcta) clear 

Zip-code level (ZCTA, ZIP Code Tabulation Area) variables are available only in the 5-year estimates. As of this writing (November 2022), the latest 5-year release is 2020.

The package seems to have a glitch in retrieving the state name along with the zip code.

However, the estimates it returns for each zip code are still correct.

How to match zip code with state, county, and city names

How, then, can you add state, county, or city names to the zip-code level tables we just downloaded?

Edel Alon's blog provides an Excel sheet for this: Zipcode to City, State Excel Spreadsheet • Edel Alon (updated 2020)

  1. Download Zip-Codes-to-City-County-State-2020 (xlsx) – Updated April 2020.
  2. Convert the Excel sheet of zip codes into dta format for Stata.
  3. Match each zip code with its state, county, and city names using the merge command in Stata.

Alternatively, this website provides an updated (2022) Excel file of zip codes, counties, and states. You can download it for personal use (including research purposes).

ZIP Code Database – ZIP Code List (Updated for 2022) (unitedstateszipcodes.org)

Stata
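* zipcodetabulationarea is the zip code variable in the getcensus data; the using file must contain a variable with the same name and type
merge 1:1 zipcodetabulationarea using "zipcodematching data name.dta"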
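To make the convert-and-merge steps concrete, here is a self-contained sketch with toy data, so it runs without any download. Everything in it is illustrative: the variable names zip, state_abbr, county_name, and population are hypothetical, the population values are made up, and it assumes that getcensus stores zipcodetabulationarea as a string. It also shows a common pitfall: zip codes read from a spreadsheet as numbers lose their leading zeros and must be converted back before merging.

```stata
* Toy lookup table standing in for the converted spreadsheet
* (zip is numeric here, as it often is after importing from Excel)
clear
input long zip str2 state_abbr str20 county_name
601   "PR" "Adjuntas"
10001 "NY" "New York"
90210 "CA" "Los Angeles"
end

* Restore leading zeros and match the key name used in the census data
tostring zip, generate(zipcodetabulationarea) format(%05.0f)
drop zip
tempfile ziplookup
save `ziplookup'

* Toy data standing in for the getcensus download (made-up values)
clear
input str5 zipcodetabulationarea long population
"00601" 17000
"10001" 25000
"99999" 100
end

* Keep every zip code from the census data, whether or not it matched
merge 1:1 zipcodetabulationarea using `ziplookup', keep(master match)

* _merge == 3 marks matched zip codes; _merge == 1 marks unmatched ones
tabulate _merge
list
```

If the lookup file has more than one row per zip code (for example, one row per city), merge 1:1 will stop with an error; in that case the lookup file needs to be reduced to one row per zip code first.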

Useful Links

GitHub – CenterOnBudget/getcensus: Load American Community Survey data from the U.S. Census Bureau API into Stata

Documentation of getcensus Stata package (centeronbudget.github.io)

Understanding Geographic Identifiers (GEOIDs) (census.gov)

How to graph the variables using the maptile package

With user-written packages, you can easily map the data using state and county FIPS codes. Here is the post on how to use the maptile package.

  • November 3, 2022