[Stata] Creating a Codebook (asdoc, htmlcb, wordcb, codebookout)

In this blog post, I will show you how to create a codebook for your Stata dataset using four user-written commands: asdoc, htmlcb, wordcb, and codebookout.

A codebook is a document that describes the variables in a dataset: their names, types, variable labels, and value labels. It can help you and others understand your data better. Here is an example of a codebook from the General Social Survey (GSS).
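A codebook can only report the labels that are actually stored in the dataset, so it is worth attaching them before exporting anything. Here is a minimal sketch using Stata's built-in label commands on the auto dataset that ships with Stata. The label text and the label name origin_lbl are my own placeholders, not part of any of the commands covered in this post.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Describe the dataset as a whole (placeholder text)
label data "1978 automobile data (example)"

* Variable label: a short description of what the variable measures
label variable mpg "Mileage (miles per gallon)"

* Value label: map the numeric codes to readable categories
* (origin_lbl is a hypothetical label name chosen for this sketch)
label define origin_lbl 0 "Domestic" 1 "Foreign", replace
label values foreign origin_lbl

* Check the result
label list origin_lbl
```

Every codebook command below should pick up these labels automatically.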

describe command

First, the built-in describe command lists the variables in the dataset along with each one's storage type, display format, value label name, and variable label. The command is really simple!

Stata
describe, detail 
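If you want a bit more than describe before installing anything, Stata's built-in codebook and label list commands cover a lot of ground. The sketch below assumes the auto example dataset, and the variables I picked are just for illustration.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Dataset-level summary only (number of observations and variables)
describe, short

* Detailed report for selected variables: type, range, missing values, labels
codebook foreign rep78

* One line per variable with basic summary statistics
codebook, compact

* All value labels defined in the dataset
label list
```

These print to the Results window only, which is why the user-written commands below are handy when you need a file to share.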

asdoc command: Word Format

Another amazing command, asdoc, can export the list of variables from the describe command (abbreviated as des below) to a Word document, including the type, value label name, and variable label. You can install asdoc with net install and then run it. It saves the file in your current working directory, which you can change with cd.

Stata
net install asdoc, from(http://fintechprofessor.com) replace
asdoc des, position type isnumeric format vallab replace

You can see the meaning of each option by typing help asdoc 🙂
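If you would rather choose the output file name yourself, asdoc has a save() option as far as I know. The sketch below assumes asdoc is already installed and uses the auto dataset. The file name codebook_asdoc.doc is a placeholder of mine, so please confirm the option in help asdoc for your version.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Show the current working directory, which is where the file will be written
pwd

* Export the describe table to a named Word file
* (save() is assumed to behave as described in help asdoc)
asdoc des, position type isnumeric format vallab save(codebook_asdoc.doc) replace
```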

htmlcb command: HTML Format

htmlcb is a user-written command that creates an HTML codebook for your Stata dataset. For each variable, the codebook includes the following information:

  • Name, label, type, format, and storage type
  • Number of valid and missing observations
  • Summary statistics (mean, standard deviation, minimum, maximum, etc.)
  • Frequency table
  • List of value labels (if any)

To create an HTML codebook for all the variables in your dataset, install htmlcb and run it:

Stata
ssc install htmlcb
htmlcb, saving(codebook.html) replace

The command writes an HTML-formatted codebook that you can open in any web browser. It includes descriptive statistics as well.

wordcb command: Word Format

Stata
ssc install wordcb
wordcb using "codebook"

The wordcb command is also a convenient way to share a codebook with other researchers in Word (.docx) format! For each variable, it lays out all the information you need in a beautiful format, including values, value labels, and descriptive statistics. Because the export can take a little time depending on the size of the dataset, it also shows a progress bar.

codebookout command: Excel Format

Stata
ssc install codebookout
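* To my knowledge codebookout needs the output file name; "codebook.xls" is a placeholder (see help codebookout)
codebookout "codebook.xls", replace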

The output is an .xls file that lists each variable's name, variable label, value label, values, and type. It is great for seeing all the labels and variable types at a glance 🙌
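If you cannot install user-written commands, a rough variable list can also be sent to Excel with built-in commands only. This is just a sketch of that idea, not how codebookout works internally. It assumes the auto dataset and a Stata version that has export excel, and the file name codebook_builtin.xlsx is a placeholder.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Keep a copy of the data, because describe, replace overwrites it
preserve

* Replace the data in memory with one row of metadata per variable
describe, replace clear

* Keep the columns that matter for a simple codebook
keep name varlab vallab type
order name varlab vallab type

* Write the table to Excel with the column names in the first row
export excel using "codebook_builtin.xlsx", firstrow(variables) replace

* Bring the original data back
restore
```

Unlike codebookout, this lists only the value label names, not the individual values and their labels.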

  • September 10, 2023