[Stata] Plotting trend line graph (twoway line) by subgroup (bytwoway)
twoway line is Stata's command for visualizing the relationship between two variables as a line plot. It lets you plot one or more y variables against a single x variable and customize the appearance of the lines. In this blog post, I will show you how to use twoway line and some of its options, using the uslifeexp dataset, which you can load with the webuse command.
The uslifeexp dataset contains data on life expectancy at birth for males and females in the United States from 1900 to 1999. To load the dataset, type:
webuse uslifeexp
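Before plotting, it can help to check what the dataset contains. The snippet below is a minimal sketch using Stata's built-in describe and summarize commands. The clear option is an addition here, included only so that the snippet can be rerun when other data are already in memory.

```stata
* Load the example dataset, replacing any data currently in memory
webuse uslifeexp, clear
* List variable names, storage types, and variable labels
describe
* Summary statistics for the variables used in this post
summarize year le le_male le_female
```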
To create a simple twoway line plot of life expectancy (le) against year, type:
twoway line le year
This produces the following graph. As you can see, the graph shows a clear upward trend in life expectancy over time.
To improve the graph, we can use some of the options available for twoway line. First, let’s plot life expectancy for males (le_male) and females (le_female) separately, using different colors and patterns for the lines. To do this, we can specify multiple y variables in the varlist, and use the lcolor() and lpattern() options to change the line colors and patterns.
To control the tick labels on the axes and add a graph title, we can use the xlabel(), ylabel(), and title() options. For example:
twoway line le_male le_female year, lcolor(blue red) lpattern(dash solid) ///
xlabel(1900(20)2000) ylabel(40(10)80) title("Life expectancy at birth in the United States")
This produces the following graph:
Now we can see that females have higher life expectancy than males throughout the period, and that the gap between them is wider at the end of the century than at the start. We can also see that there are some fluctuations in the trend, especially for males, most visibly the sharp drop around 1918.
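To look at the difference between the two series directly, one option is to compute the female-minus-male gap and plot it on its own. This is a small sketch; the variable name gap and the titles are illustrative choices, not part of the dataset.

```stata
webuse uslifeexp, clear
* Female-minus-male difference in life expectancy, in years
generate gap = le_female - le_male
label variable gap "Female minus male life expectancy (years)"
* Plot the gap over time
twoway line gap year, sort ytitle("Gap (years)") ///
    title("Sex gap in life expectancy at birth")
* Compare the gap in the first and last years of the series
list year le_male le_female gap if inlist(year, 1900, 1999)
```

Plotting the difference makes changes in the gap easier to read than comparing two nearly parallel lines by eye.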
To add a legend that identifies the lines, we can use the legend() option. For example:
twoway line le_male le_female year, lcolor(blue red) lpattern(dash solid) ///
xlabel(1900(20)2000) ylabel(40(10)80) title("Life expectancy at birth in the United States") ///
legend(order(1 "Males" 2 "Females"))
Now we have a complete graph that shows the relationship between life expectancy and year for males and females in the United States.
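There are many other options that you can use to customize your twoway line graphs. For example:
sort option: to sort the data by x values before plotting them
msymbol() option: to set the marker symbols for the points (this applies to plot types that draw markers, such as twoway connected or twoway scatter)
addplot() option: to add other plots such as scatterplots or fitted curves to your graph (for commands that accept it; within twoway itself, you overlay plots by listing them in parentheses or separating them with ||)
For more details and examples, you can type help twoway in Stata.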
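As a rough illustration of these options, the sketch below draws markers with twoway connected and then overlays a linear fit on the overall series. The marker styles, the marker size, and the choice of lfit for the fitted line are assumptions made for the example, not something the options above require.

```stata
webuse uslifeexp, clear
* Connected lines with marker symbols; sort orders the points by year
twoway connected le_male le_female year, sort ///
    msymbol(circle triangle) msize(small small)
* Overlay a fitted line on the raw series by listing two plots in parentheses
twoway (line le year, sort) (lfit le year), ///
    legend(order(1 "Life expectancy" 2 "Linear fit"))
```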
Other styles of the twoway graph
Scatter plot
twoway scatter le_male le_female year
Connected line graph
twoway connected le_male le_female year
Area-filled graph
twoway area le year
Bar graph
twoway bar le year
twoway graph by subgroup
First, you can use the by(groupvar) option to plot the twoway graph by subgroup.
twoway line yvar xvar, by(groupvar)
This draws a separate panel for each subgroup.
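The uslifeexp data are stored in wide form (one variable per sex), so there is no group variable to pass to by(). One way to try the option on the same data is to reshape to long form first. This is a sketch under that assumption; the names le1, le2, sex, and sexlbl are introduced here for illustration only.

```stata
webuse uslifeexp, clear
* Keep the two sex-specific series and give them a common stub name
keep year le_male le_female
rename le_male le1
rename le_female le2
* Wide to long: one row per year and sex (1 = males, 2 = females)
reshape long le, i(year) j(sex)
label define sexlbl 1 "Males" 2 "Females"
label values sex sexlbl
* One panel per level of sex
twoway line le year, sort by(sex)
```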
bytwoway for subgroup plotting
With the user-written bytwoway command, you can plot all subgroups in a single graph as follows. The syntax is the same, except that twoway is replaced by bytwoway 🪄
* Install the command before first use
net install bytwoway, from("https://raw.githubusercontent.com/matthieugomez/bytwoway/master/")
You can choose either line or scatter plot as follows 😊
bytwoway line yvar xvar, by(groupvar)
bytwoway scatter yvar xvar, by(groupvar)
Example 1. Line graph with different patterns
bytwoway line yvar xvar, by(groupvar) aes(color lpattern)
Example 2. Line graph with different symbols
bytwoway (scatter yvar xvar, connect(l)), by(groupvar) aes(color msymbol)
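For comparison, a similar single-panel graph can be built by hand with built-in twoway, overlaying one line plot per group with if conditions. The sketch below uses the grunfeld panel dataset from webuse as a stand-in, keeping three companies to keep the graph readable; the dataset choice, the line patterns, and the legend text are assumptions for illustration. The bytwoway call in the final comment simply applies the syntax shown above to these variables and is not run here.

```stata
* Panel data: investment by company and year
webuse grunfeld, clear
keep if company <= 3
* One line per company in a single panel, using built-in twoway only
twoway (line invest year if company == 1, sort) ///
       (line invest year if company == 2, sort lpattern(dash)) ///
       (line invest year if company == 3, sort lpattern(dot)), ///
       legend(order(1 "Company 1" 2 "Company 2" 3 "Company 3"))
* With bytwoway installed, the corresponding call would be (not run here):
* bytwoway line invest year, by(company) aes(color lpattern)
```

The manual version needs one plot per group, which is the repetition that bytwoway is meant to save.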
Reference
GitHub – matthieugomez/bytwoway: Quickly graph by groups in Stata
RPubs – Stata tutorial – Add 95% Confidence Intervals to Two-way Line Plot