[Stata] Creating a Codebook (asdoc, htmlcb, wordcb, codebookout)
In this blog post, I will show you how to create a codebook for your Stata dataset using user-written commands: asdoc, htmlcb, wordcb, and codebookout.
A codebook is a document that describes the variables in a dataset, along with their variable labels and value labels. It can help you and others understand your data better. Here is an example of a codebook from the General Social Survey (GSS).
describe command
First, you can list the variables in the dataset along with each one's storage type, display format, value label name, and variable label. The command is really simple!
describe, detail
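As a quick way to try this out, here is a minimal sketch that uses only built-in Stata commands. It assumes the auto example dataset that ships with Stata (loaded with sysuse); that dataset and the variables make and foreign are stand-ins chosen for illustration, not something from the workflow above.

```stata
* Load an example dataset that ships with Stata (illustrative assumption)
sysuse auto, clear

* Print variable names, storage types, display formats,
* value label names, and variable labels
describe

* Built-in codebook report for two selected variables
codebook make foreign

* Show the contents of the value labels defined in the dataset
label list
```

The built-in output goes to the Results window only, which is why the user-written commands in the following sections are handy when you need a file to share.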
asdoc command: Word Format
Another amazing command, asdoc, can export the variable list produced by the describe command, including type, label name, and variable label. You can install asdoc with the net install command and then run it. It saves the file in your current working directory (which you can change with cd).
net install asdoc, from(http://fintechprofessor.com) replace
asdoc des, position type isnumeric format vallab replace
You can see the meaning of each option by typing help asdoc 🙂
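The option names used above (position, type, isnumeric, format, vallab) match the variable names that Stata's built-in describe, replace creates. The sketch below shows that built-in table so you can see what each column holds. That asdoc's options correspond to these columns one for one is my assumption, so check help asdoc to confirm. The auto dataset is again only an example.

```stata
* Load an example dataset that ships with Stata (illustrative assumption)
sysuse auto, clear

* Replace the data in memory with one row per variable; the new variables
* are position, name, type, isnumeric, format, vallab, and varlab
describe, replace clear

* Inspect the resulting variable-level table
list position name type isnumeric format vallab, noobs
```

Because describe, replace clear overwrites the data in memory, run it only after saving your work or on a copy of the dataset.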
htmlcb command: HTML Format
htmlcb is a user-written command that creates an HTML codebook for your Stata dataset. The codebook includes the following information for each variable:
- Name, label, type, format, and storage type
- Number of valid and missing observations
- Summary statistics (mean, standard deviation, minimum, maximum, etc.)
- Frequency table
- List of value labels (if any)
To create an HTML codebook for all the variables in your dataset, simply type htmlcb in Stata.
ssc install htmlcb
htmlcb, saving(codebook.html) replace
The command produces an HTML-formatted codebook, which you can open in your web browser. It includes descriptive statistics as well.
wordcb command: Word Format
ssc install wordcb
wordcb using "codebook"
The wordcb command is also a convenient way to share a codebook with other researchers in Word (docx) format! For each variable, it presents the values, labels, and descriptive statistics in a clean layout. It also shows a progress bar, since the export takes a little time depending on the size of the dataset.
codebookout command: Excel Format
ssc install codebookout
codebookout
The output is an xls file that lists each variable's name, variable label, value label, values, and type. It is a great way to see the labels and types of your variables at a glance 🙌