[Stata] Using Macros: local, global, foreach, and program
Macros are one of Stata’s most powerful features, allowing you to store and manipulate text or numbers for later use. They help automate repetitive tasks and make code more readable. In this blog post, I’ll explain the three main types of macros in Stata—local, global, and program. In Stata, a macro is simply a named container that holds text. This text can be variable names, numbers, commands, or any other string of characters. When Stata encounters a macro in your code, it replaces the macro name with its contents.
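To make the substitution idea concrete, here is a minimal sketch. The macro name greeting and its contents are made up for illustration and do not come from the dataset used later in the post.

```stata
* Store some text in a local macro (the name "greeting" is illustrative)
local greeting "Hello from Stata"

* Stata substitutes the macro's contents before running the command,
* so this line is executed as: display "Hello from Stata"
display "`greeting'"
```

Note the punctuation: a local macro is referenced with a backtick before its name and a straight single quote after it.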
For our examples, we’ll use the NHANES II (Second National Health and Nutrition Examination Survey) dataset, one of Stata’s example datasets, which can be loaded with the webuse command. This dataset contains health and nutrition information from a sample of the US population.
webuse nhanes2, clear
Local Macros
Local macros are temporary and exist only within the do-file or program where they’re defined (or, when typed interactively, within the interactive session). They’re well suited to temporary storage and are cleared from memory when that do-file or program finishes.
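The scoping rule can be seen in a small sketch. The program name scope_demo and the macro names inside and outside are hypothetical, chosen only for this demonstration, and the lines are meant to be run together as one do-file.

```stata
* Hypothetical program used only to demonstrate scope
capture program drop scope_demo
program define scope_demo
    local inside "set inside the program"
    display "Inside the program: `inside'"
    * The caller's local is not visible here, so this expands to nothing
    display "Caller's local seen from inside: [`outside']"
end

local outside "set in the do-file"
scope_demo

* Back in the do-file: `inside' is undefined here and expands to nothing
display "Outside the program: [`inside']"
```

An undefined local macro does not raise an error; it simply expands to an empty string, which is why the brackets print with nothing between them.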
Basic Local Macro Usage
* Define a local macro for a variable name
local height "height"
* Use the local macro in a command
summarize `height'
* Define a local macro with a value from a calculation
summarize age
local mean_age = r(mean)
display "The mean age is `mean_age'"
* Use local macros to store multiple variable names
local demographics "age sex race"
summarize `demographics'
Using Local Macros for Looping
Loops in Stata: Loops are programming structures that allow you to repeat a set of commands multiple times with different values. They are essential for automating repetitive tasks and are especially powerful when combined with macros. Stata offers several types of looping constructs, with foreach and forvalues being the most commonly used.
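Since forvalues is mentioned above but not shown elsewhere in this post, here is a brief sketch of it. It assumes the nhanes2 data are in memory; the age cutoffs are arbitrary values picked for illustration.

```stata
* Repeat a command for the integers 1 through 3
forvalues i = 1/3 {
    display "Iteration `i'"
}

* Step through a range: 20, 30, ..., 70 (illustrative age cutoffs)
forvalues cutoff = 20(10)70 {
    quietly count if age >= `cutoff'
    display "Respondents aged `cutoff' or older: " r(N)
}
```

forvalues is the natural choice when the loop runs over a regular sequence of numbers; foreach, covered next, handles arbitrary lists.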
The foreach Loop: The foreach loop iterates through a list of items (which could be variable names, numbers, or any text values) and executes the same set of commands for each item. The basic syntax is:
foreach element_name in list {
    commands using `element_name'
}
foreach element_name of listtype list {
    commands using `element_name'
}
Where:
- element_name is a name you choose for the local macro that holds the current element in each iteration
- in or of specifies how the list is defined (explained below)
- listtype, used with of, is one of local, global, varlist, newlist, or numlist
- list is the collection of items to iterate through
- commands are the Stata commands to execute for each element
There are several variations of the foreach command:
- foreach x in list: The simplest form, where you directly list the elements
- foreach x of local macname: Iterates through elements stored in a local macro
- foreach x of global macname: Iterates through elements stored in a global macro
- foreach x of varlist varlist: Iterates through a list of variables
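The four variations can be compared side by side in the following sketch. It assumes the nhanes2 data are loaded; the macro names body and pressure are illustrative.

```stata
* foreach ... in: items are typed directly after "in"
foreach v in age weight height {
    display "in: `v'"
}

* foreach ... of local: give the macro NAME, without the quote characters
local body "weight height"
foreach v of local body {
    display "of local: `v'"
}

* foreach ... of global: give the macro NAME, without the $
global pressure "bpsystol bpdiast"
foreach v of global pressure {
    display "of global: `v'"
}

* foreach ... of varlist: abbreviations and wildcards are expanded
foreach v of varlist bp* {
    display "of varlist: `v'"
}

* Tidy up the illustrative global
macro drop pressure
```

The of varlist form checks that every item is an existing variable, which catches typos that the plain in form would pass through unchecked.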
* Summarize multiple variables with a foreach loop
local vars "age weight height bpsystol bpdiast"
foreach var of local vars {
    summarize `var'
    local mean_`var' = r(mean)
    display "Mean of `var' is `mean_`var''"
}
* Loop through all categorical values of a variable
levelsof region, local(regions)
foreach r of local regions {
    display "Region `r'"
    count if region == `r'
    display "Number of observations: `r(N)'"
}
This code:
- Uses the levelsof command to extract all unique values of the region variable and store them in a local macro called regions
- Loops through each region value
- Displays the region number
- Counts observations matching that region
- Displays the result using the r(N) return value
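A possible refinement of the loop above is to show each region’s value label next to its number. The sketch below uses the extended macro function label for that; it assumes region carries a value label in the loaded data, and the macro name region_name is illustrative.

```stata
* Look up the value label for each level of region
* (assumes a value label is attached to region)
levelsof region, local(regions)
foreach r of local regions {
    local region_name : label (region) `r'
    quietly count if region == `r'
    display "Region `r' (`region_name'): " r(N) " observations"
}
```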
Global Macros
Unlike local macros, global macros persist throughout your Stata session until you explicitly clear them or exit Stata. They’re useful for values that need to be accessed across multiple do-files or programs.
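Because globals linger, it helps to know how to inspect and remove them. The following is a small sketch using macro list and macro drop; the global name project_note is hypothetical. Globals are referenced with a leading $ rather than the backtick and quote used for locals.

```stata
* Define a global (the name "project_note" is illustrative)
global project_note "draft analysis"
display "$project_note"

* List the macros currently defined in the session
macro list

* Remove the global explicitly; afterwards it expands to nothing
macro drop project_note
display "[$project_note]"
```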
Basic Global Macro Usage
* Define a global macro
global datapath "/path/to/your/data"
* Use the global macro
* Note the $ syntax instead of the ` ' syntax
display "$datapath"
* Storing analysis variables
global analysis_vars "age weight height bpsystol bpdiast"
summarize $analysis_vars
When to Use Global vs. Local Macros
As a general rule:
- Use locals for temporary values within a single do-file or program
- Use globals for values needed across multiple do-files or programs
* Using both types together
global dataset "nhanes2"
local year "1976-1980"
display "Analyzing data from $dataset, collected during `year'"
Program Macros
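Program macros are local macros that Stata creates inside a program to hold the arguments passed to it. The raw arguments are available as positional parameters (`1', `2', and so on, with `0' holding everything the user typed), and the syntax command parses them into named locals such as `varlist', `if', and `in'. Let’s start with a basic program that calculates and displays summary statistics: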
* Create a program to calculate and display summary statistics
program define sum_stats
    syntax varlist [if] [in]
    foreach var of varlist `varlist' {
        quietly summarize `var' `if' `in'
        display "Variable: `var'"
        display "Mean: " %9.4f r(mean) " SD: " %9.4f r(sd) " N: " r(N)
        display ""
    }
end
* Run the program on some variables
sum_stats age weight height
How This Program Works:
- Program Definition: program define sum_stats creates a new program named sum_stats
- Syntax Command: This powerful command processes the program’s arguments
  - varlist requires the user to provide at least one variable name
  - [if] and [in] are optional qualifiers (the square brackets indicate they’re optional)
- Accessing Arguments: Inside the program, varlist becomes a local macro containing the variables specified by the user
- Execution: The program loops through each variable and displays formatted statistics
- End Statement: end marks the end of the program definition
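For comparison with the syntax-based program above, here is a sketch of a program that reads its arguments positionally. The program name show_corr and the macro names first and second are hypothetical. It assumes the nhanes2 data are in memory, and capture program drop is included so the definition can be re-run without an "already defined" error.

```stata
* Hypothetical program: report the correlation between two variables
capture program drop show_corr
program define show_corr
    * args copies the positional macros `1' and `2' into named locals
    args first second
    quietly correlate `first' `second'
    display "Correlation of `first' and `second': " %6.3f r(rho)
end

* The first word becomes `first', the second becomes `second'
show_corr height weight
```

Positional arguments are fine for short helper programs, but syntax is usually the safer choice because it validates variable names and handles if and in for you.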
