[Stata] getcensus package for American Community Survey datasets
Exciting news for Stata nerds! A package for retrieving American Community Survey (ACS) datasets is available not only in R (tidycensus) but also in Stata (getcensus), which was launched in October 2021. Here are the steps to use it!
Get your own API key
First, you need to request an API key so you can retrieve Census data from Stata or R. Signing up is effortless, and there is no waiting for approval.
Here is the link for signup: Key Signup (census.gov)
Shortly after signing up, you will receive an email containing your key. Copy the key and activate it.
Install packages and put your API key
ssc install getcensus
ssc install jsonio
global censuskey putyourkeyhere // once the packages are installed, this is the only line you need to run in later sessions
Run the commands above to install the packages the first time you use the getcensus package.
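If you rerun the same do-file often, a small guard can skip the installation when the packages are already there. Here is a minimal sketch, assuming both packages install a command that Stata's built-in which can locate; the key value is the same placeholder as above, not a real key.

```stata
* Install getcensus and jsonio only if Stata cannot already find them
capture which getcensus
if _rc ssc install getcensus

capture which jsonio
if _rc ssc install jsonio

* Placeholder key: replace putyourkeyhere with the key from your Census email
global censuskey "putyourkeyhere"
```

If you would rather not set the key in every do-file, Stata runs a file named profile.do at startup, so the global line could live there instead.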
Browse the Catalog
To see which variables are available, browse the catalog. You can change the year and the sample (1-year, 3-year, and 5-year estimates are available, although the Census Bureau discontinued 3-year estimates after the 2011-2013 release). The getcensus package supports four product types: DT, ST, DP, and CP. Please refer to the Census website for more information.
- DT (default, if you do not specify the product type): Most detailed estimates on all topics for all geographies.
  - Detailed Tables are designed for advanced data users or those who want access to the most comprehensive ACS tables. Detailed Tables are also available through the ACS Summary File and Census Bureau Application Programming Interface.
  - ACS Detailed Tables begin with the letters “B” for base tables and “C” for collapsed tables.
  - The “collapsed” tables cover the same topics as the base table, but with fewer categories.
- ST: A span of information on a particular ACS subject presented in the format of both estimates and percentages.
  - Subject Tables provide pretabulated estimates and percentages for a wide variety of topics (e.g., employment, education, and income), often available separately by age, sex, or race/ethnicity. ACS Subject Tables begin with the letter “S.”
- DP: Broad social, economic, housing, and demographic information in a total of four profiles.
  - Data Profiles are a good place to start for novice data users, as they contain the most frequently requested social, economic, housing, and demographic data. Each of these four subject areas is a separate profile.
  - The Data Profiles summarize the data for a single geographic area, providing both estimates and percentages, to cover the most basic data on all ACS topics. ACS Data Profiles begin with the letters “DP.”
- CP: Comparisons of ACS estimates over time in the same layout as the Data Profiles.
  - Comparison Profiles show ACS data side-by-side from different data releases, indicating where there is a statistically significant difference between estimates.
  - The 1-year Comparison Profiles show data side-by-side for 5 years, indicating where there is a statistically significant difference between the most current year compared to 4 prior years of data.
* Browse catalog
getcensus catalog, year(2021) sample(5) product(DT)
getcensus catalog, year(2021) sample(5) product(ST)
getcensus catalog, year(2021) sample(5) product(DP)
getcensus catalog, year(2021) sample(5) product(CP)
Then open the Stata Data Editor (Browse).
It lists each table_id and variable_id along with its name. I recommend copying and pasting the list into an Excel sheet so you can browse it with ease (Ctrl+A, then Ctrl+C in Stata, then Ctrl+V in the Excel sheet).
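If you prefer not to copy and paste by hand, the catalog can also be written to a file from within Stata. This is a minimal sketch, assuming your censuskey global is already set and that the catalog sits in memory as a regular dataset after the getcensus catalog call. It uses Stata's built-in export excel command, and the file name is a placeholder of my own.

```stata
* Load the 5-year Subject Table catalog into memory
getcensus catalog, year(2021) sample(5) product(ST)

* Show the variable names the catalog returned
describe

* Write the catalog to an Excel workbook in the current working directory
* (the file name is a placeholder; the first row holds the variable names)
export excel using "acs_catalog_ST_2021.xlsx", firstrow(variables) replace
```

Change product(ST) to DT, DP, or CP and adjust the file name to export the other catalogs.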
You can also browse the variables on the Explore Census Data website. Here is an example: S1701: POVERTY STATUS IN THE PAST 12… – Census Bureau Table
How to use state- and county-level variables
*get population by county: B01003 is total population
getcensus B01003, year(2021) sample(1) geography(county) clear // you cannot get multiple tables at once
getcensus B01001A_001 B01001A_017 B25047_001, year(2021) sample(1) geography(county) clear // you can get multiple variables at once
Once you have figured out which table or variable you would love to download, let’s write the code to get it!
Put the ID of the table or variable after the getcensus command. For your information, you cannot download multiple tables at once. However, you can download multiple variables from different tables at once. I recommend downloading by variable_id instead of table_id if you need more than two variables across tables.
Then specify the year of the ACS dataset, the sample (1-, 3-, or 5-year estimates), and the geographic unit (state, county, metro, tract, zcta, and so on).
You can see the list of supported geographies here: getcensus – Supported Geographies (centeronbudget.github.io)
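To make the options concrete, here is a hedged sketch that pulls the same table at two geographic levels and saves each result for later use. It relies only on the options already shown in this post and assumes your censuskey global is set; the output file names are placeholders I made up. Keep in mind that ACS 1-year estimates are published only for areas with a population of 65,000 or more, so the county file will not cover every county.

```stata
* Download total population (table B01003) for each geography in the list
foreach geo in state county {
    getcensus B01003, year(2021) sample(1) geography(`geo') clear

    * Save each download under its own (placeholder) file name
    save "acs_b01003_`geo'_2021.dta", replace
}
```

Switching to sample(5) in the loop would be one way to get all counties, at the cost of estimates that pool five years of data.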
How to use zip-code level variables
getcensus B01003, year(2020) sample(5) geography(zcta) clear
ZIP-code-level variables (ZIP Code Tabulation Areas, or ZCTAs) are supported only in the 5-year estimates. The latest available year is 2020 as of this writing.
There seems to be a glitch in retrieving the state name along with the ZIP code.
However, the numbers reported for each ZIP code are still correct.
How to match zip code with state, county, and city names
So how can you add state, county, or city names to the ZIP-code-level tables we just downloaded?
Edel Alon's blog provides a helpful Excel sheet here: Zipcode to City, State Excel Spreadsheet • Edel Alon (updated 2020)
- Download the Zip-Codes-to-City-County-State-2020 (xlsx) – Updated April 2020
- Convert the Excel sheet with ZIP codes into dta format for Stata
- Match ZIP codes with state, county, and city names using the merge command in Stata.
Alternatively, this website provides an updated (2022) Excel file with ZIP code, county, and state. You can download it for your personal use (including research purposes).
ZIP Code Database – ZIP Code List (Updated for 2022) (unitedstateszipcodes.org)
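* the key variable (here zipcodetabulationarea) must have the same name and type in both datasets; replace the file name with your own
merge 1:1 zipcodetabulationarea using "zipcodematching data name.dta"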
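One thing that can trip up this merge is that ZIP codes lose their leading zeros when an Excel sheet stores them as numbers. Below is a minimal, self-contained sketch with toy data that rebuilds a five-character key before merging. The variable names (zcta, zip, city, state, population) and the population figures are made up for illustration; check the actual names in your own data with describe.

```stata
* Toy ACS extract keyed by a 5-character ZCTA string (population values are made up)
clear
input str5 zcta long population
"01001" 100
"10001" 200
"90210" 300
end
tempfile acs
save `acs'

* Toy crosswalk where the ZIP code was stored as a number (leading zero lost)
clear
input long zip str20 city str2 state
1001  "Agawam"        "MA"
10001 "New York"      "NY"
90210 "Beverly Hills" "CA"
end

* Rebuild a 5-character key with leading zeros so it matches the ACS side
gen str5 zcta = string(zip, "%05.0f")
drop zip

* Attach the ACS numbers to the crosswalk and check how many rows matched
merge 1:1 zcta using `acs'
tab _merge
```

If a real crosswalk lists the same ZIP code more than once (for example, one row per county), a 1:1 merge will stop with an error, and you would need to decide which row to keep or use m:1 from the crosswalk side instead.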
Useful Links
Documentation of getcensus STATA package (centeronbudget.github.io)
Understanding Geographic Identifiers (GEOIDs) (census.gov)
How to graph the variable: using the maptile package
Using user-developed packages, you can easily visualize your data by state and county FIPS codes. Here is the post on how to use the maptile package.