[Stata] Using Macros: local, global, foreach, and program
Macros are one of Stata’s most powerful features, allowing you to store and manipulate text or numbers for later use. They help automate repetitive tasks and make code more readable. In Stata, a macro is simply a named container that holds text: variable names, numbers, commands, or any other string of characters. When Stata encounters a macro reference in your code, it replaces the reference with the macro’s contents before running the command. In this blog post, I’ll explain the three main types of macros in Stata: local, global, and program.
For our examples, we’ll use the NHANES II (Second National Health and Nutrition Examination Survey) dataset, which Stata makes available for download through the webuse command (an internet connection is required). This dataset contains health and nutrition information from a sample of the US population.
webuse nhanes2, clear
Local Macros
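Local macros are temporary and exist only within the do-file or program where they’re defined. They’re well suited to temporary storage and are cleared from memory when that do-file or program finishes. If you run a selection of lines from the Do-file Editor, the selection is treated as its own do-file, so a local defined in one selection is gone by the time you run the next.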
Basic Local Macro Usage
* Define a local macro for a variable name
local height "height"
* Use the local macro in a command
summarize `height'
* Define a local macro with a value from a calculation
summarize age
local mean_age = r(mean)
display "The mean age is `mean_age'"
* Use local macros to store multiple variable names
local demographics "age sex race"
summarize `demographics'
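Because a macro holds text that Stata substitutes before running a command, it matters whether you define it with or without an equals sign. The sketch below illustrates the difference; the macro names as_text and as_value are made up for this example.

```stata
* Without "=", the macro stores the text exactly as typed
local as_text 2 + 2
display "`as_text'"

* With "=", Stata evaluates the expression first and stores the result
local as_value = 2 + 2
display "`as_value'"

* Substitution happens before the command runs, so the next line
* is executed as: display 2 + 2 * 10
display `as_text' * 10
```

The first display shows the stored text 2 + 2, the second shows 4, and the last shows 22 rather than 40, because the text is pasted in before the arithmetic is done.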
Using Local Macros for Looping
Loops in Stata: Loops are programming structures that allow you to repeat a set of commands multiple times with different values. They are essential for automating repetitive tasks and are especially powerful when combined with macros. Stata offers several types of looping constructs, with foreach and forvalues being the most commonly used.
The foreach Loop: The foreach loop iterates through a list of items (which could be variable names, numbers, or any text values) and executes the same set of commands for each item. The basic syntax is:
foreach element_name in|of list {
    commands using `element_name'
}
Where:
- element_name is a name you choose for the current element in each iteration
- in|of specifies how the list is defined (explained below)
- list is the collection of items to iterate through; after of, it is preceded by a list type such as local, global, or varlist
- commands are the Stata commands to execute for each element
There are several variations of the foreach command:
- foreach x in list: The simplest form, where you directly list the elements
- foreach x of local macname: Iterates through elements stored in a local macro
- foreach x of global macname: Iterates through elements stored in a global macro
- foreach x of varlist varlist: Iterates through a list of variables
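As a rough illustration of the variations listed above, plus the forvalues loop mentioned earlier, here is a minimal sketch. The loop names stat, v, and i are arbitrary, and the wildcard assumes the blood pressure variables in nhanes2 start with bp, as bpsystol and bpdiast do.

```stata
webuse nhanes2, clear

* foreach ... in: the items are typed directly after the keyword
foreach stat in mean sd min max {
    display "statistic name: `stat'"
}

* foreach ... of varlist: Stata expands and checks the variable list,
* so ranges and wildcards are allowed
foreach v of varlist bp* {
    quietly summarize `v'
    display "`v': mean = " %8.2f r(mean)
}

* forvalues: steps through a numeric range
forvalues i = 1/3 {
    display "iteration `i'"
}
```

The of varlist form is usually the safer choice for variables, since Stata reports an error right away if a name is misspelled.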
* Summarize multiple variables with a foreach loop
local vars "age weight height bpsystol bpdiast"
foreach var of local vars {
    summarize `var'
    local mean_`var' = r(mean)
    display "Mean of `var' is `mean_`var''"
}
* Loop through all categorical values of a variable
levelsof region, local(regions)
foreach r of local regions {
    display "Region `r'"
    count if region == `r'
    display "Number of observations: `r(N)'"
}
This code:
- Uses the levelsof command to extract all unique values of the region variable and store them in a local macro called regions
- Loops through each region value
- Displays the region number
- Counts observations matching that region
- Displays the result using the r(N) return value
Global Macros
Unlike local macros, global macros persist throughout your Stata session until you explicitly clear them or exit Stata. They’re useful for values that need to be accessed across multiple do-files or programs.
Basic Global Macro Usage
* Define a global macro
global datapath "/path/to/your/data"
* Use the global macro
* Note the $ syntax instead of the ` ' syntax
display "$datapath"
* Storing analysis variables
global analysis_vars "age weight height bpsystol bpdiast"
summarize $analysis_vars
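Two details are worth sketching here: how to mark where a global's name ends when other text follows it, and how to clear a global explicitly. The name projdir and its path are placeholders I made up for this example.

```stata
* Hypothetical project folder; replace with a real path
global projdir "/path/to/project"

* Braces mark where the macro name ends when text follows immediately
display "${projdir}_backup"

* Without braces, Stata looks for a global named projdir_backup,
* which is undefined here and expands to nothing
display "$projdir_backup"

* Remove the global when it is no longer needed
macro drop projdir
display "[$projdir]"
```

After macro drop, the last line prints empty brackets, since an undefined macro expands to an empty string instead of raising an error.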
When to Use Global vs. Local Macros
As a general rule:
- Use locals for temporary values within a single do-file or program
- Use globals for values needed across multiple do-files or programs
* Using both types together
global dataset "nhanes2"
local year "1976-1980"
display "Analyzing data from $dataset, collected during `year'"
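To make the scope difference concrete, the following sketch defines a small program and calls it from the surrounding do-file. The names show_scope, outer, and shared are hypothetical and exist only for this demonstration.

```stata
* Drop any earlier copy so the example can be rerun
capture program drop show_scope
program define show_scope
    * The caller's local is not visible in here, so it expands to nothing
    display "local seen inside program:  [`outer']"
    * The global is visible everywhere in the session
    display "global seen inside program: [$shared]"
end

local outer "caller-only"
global shared "visible everywhere"
show_scope

* Clean up the global afterwards
macro drop shared
```

Inside the program the first line shows empty brackets, while the second shows the global's contents. This is also why globals deserve some care: any do-file or program can read or overwrite them.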
Program Macros
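Program macros are local macros that Stata creates inside a program to hold the arguments passed to it. The raw arguments are available as positional parameters (`1', `2', and so on, with `0' holding everything the user typed after the program name), and the syntax command parses them into named locals such as `varlist', `if', and `in'. Let’s start with a basic program that calculates and displays summary statistics: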
* Create a program to calculate and display summary statistics
* Drop any earlier definition so the do-file can be rerun
capture program drop sum_stats
program define sum_stats
    syntax varlist [if] [in]
    foreach var of varlist `varlist' {
        quietly summarize `var' `if' `in'
        display "Variable: `var'"
        display "Mean: " %9.4f r(mean) " SD: " %9.4f r(sd) " N: " r(N)
        display ""
    }
end
* Run the program on some variables
sum_stats age weight height
How This Program Works:
- Program Definition: program define sum_stats creates a new program named sum_stats
- Syntax Command: This powerful command processes the program’s arguments
  - varlist requires the user to provide at least one variable name
  - [if] and [in] are optional qualifiers (the square brackets indicate they’re optional)
- Accessing Arguments: Inside the program, varlist becomes a local macro containing the variables specified by the user
- Execution: The program loops through each variable and displays formatted statistics
- End Statement: end marks the end of the program definition
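For comparison, here is a minimal sketch of a program that reads its arguments by position instead of using syntax. The args command copies the first and second arguments into named locals. The program name mean_ratio and the local names numvar and denvar are my own, and the sketch does no input checking.

```stata
* Drop any earlier copy so the example can be rerun
capture program drop mean_ratio
program define mean_ratio, rclass
    * args assigns the first and second positional arguments to named locals
    args numvar denvar
    quietly summarize `numvar'
    local num = r(mean)
    quietly summarize `denvar'
    local den = r(mean)
    display "mean(`numvar') / mean(`denvar') = " %9.4f `num' / `den'
    * Hand the result back to the caller as r(ratio)
    return scalar ratio = `num' / `den'
end

webuse nhanes2, clear
mean_ratio weight height
display r(ratio)
```

Positional arguments are fine for quick helpers like this one, but syntax is usually the better choice once a program needs variable lists, if/in qualifiers, or options, because it validates what the user typed.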