[Stata] Using Macros: local, global, foreach, and program

Macros are one of Stata’s most powerful features. A macro is simply a named container that holds text: variable names, numbers, commands, or any other string of characters. When Stata encounters a macro reference in your code, it replaces the reference with the macro’s contents before running the command. This helps automate repetitive tasks and makes code more readable. In this blog post, I’ll explain local macros, global macros, and the macros Stata creates inside programs, along with the foreach loop that puts them to work.

For our examples, we’ll use the NHANES II (Second National Health and Nutrition Examination Survey) dataset, which Stata makes available for download through the webuse command (an internet connection is required). This dataset contains health and nutrition information from a sample of the US population.

Stata
webuse nhanes2, clear

Local Macros

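Local macros are temporary and exist only within the do-file or program where they’re defined (or, if you type them at the command line, within the interactive session). They’re perfect for temporary storage and are cleared from memory when that do-file or program finishes. A loop or other braced block does not create a scope of its own: a local defined inside a loop is still available after the loop ends.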
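To make the scoping rule concrete, here’s a minimal sketch. The program name scope_demo and the macro names inside and outside are made up for illustration; run the whole block together as one do-file.

```stata
* Sketch: a local defined inside a program is not visible to the caller
* (scope_demo is a hypothetical name used only for this illustration)
capture program drop scope_demo
program define scope_demo
    local inside "set inside the program"
    display "Within scope_demo: `inside'"
end

local outside "set by the caller"
scope_demo

* The caller never sees the program's local, so the brackets print empty
display "Caller sees inside as: [`inside']"
* The caller's own local is untouched by the program call
display "Caller still has: `outside'"
```

The same rule explains a common surprise: if you highlight a few lines in the Do-file Editor and run only that selection, Stata treats the selection as its own small do-file, so locals defined in one selection are gone by the time you run the next.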

Basic Local Macro Usage

Stata
* Define a local macro for a variable name
local height "height"

* Use the local macro in a command
summarize `height'

* Define a local macro with a value from a calculation
summarize age
local mean_age = r(mean)
display "The mean age is `mean_age'"

* Use local macros to store multiple variable names
local demographics "age sex race"
summarize `demographics'
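One detail worth seeing in isolation is the difference between defining a local with and without the equals sign. The sketch below needs no data; the macro names sum_text and sum_value are just placeholders I picked.

```stata
* Without =, the text is stored exactly as typed
local sum_text 2 + 2
display "`sum_text'"

* With =, Stata evaluates the expression first and stores the result
local sum_value = 2 + 2
display "`sum_value'"

* Substitution happens before the command runs, so Stata sees: display 2 + 2
display `sum_text'
```

The first display prints the text 2 + 2, while the other two print 4. This is why local mean_age = r(mean) uses the equals sign: we want the number, not the text "r(mean)".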

Using Local Macros for Looping

Loops in Stata: Loops are programming structures that allow you to repeat a set of commands multiple times with different values. They are essential for automating repetitive tasks and are especially powerful when combined with macros. Stata offers several types of looping constructs, with foreach and forvalues being the most commonly used.
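Since the rest of this section focuses on foreach, here’s a quick sketch of forvalues for comparison. It steps through a numeric range and needs no data; the macro names i and total are arbitrary choices.

```stata
* forvalues steps through a numeric range: here 1, 2, 3, 4, 5
local total = 0
forvalues i = 1/5 {
    local total = `total' + `i'
    display "i = `i', running total = `total'"
}
display "Sum of 1 to 5: `total'"

* A step size goes in parentheses: 10(10)30 means 10, 20, 30
forvalues cutoff = 10(10)30 {
    display "cutoff = `cutoff'"
}
```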

The foreach Loop: The foreach loop iterates through a list of items (which could be variable names, numbers, or any text values) and executes the same set of commands for each item. The basic syntax is:

Stata
foreach element_name in|of list {
    commands using `element_name'
}

Where:

  • element_name is a name you choose for the current element in each iteration
  • in|of specifies how the list is defined: in takes the items exactly as typed, while of is followed by a list type (local, global, varlist, newlist, or numlist) and then the list
  • list is the collection of items to iterate through
  • commands are the Stata commands to execute for each element

There are several variations of the foreach command:

  1. foreach x in list: The simplest form, where you directly list the elements
  2. foreach x of local macname: Iterates through elements stored in a local macro
  3. foreach x of global macname: Iterates through elements stored in a global macro
  4. foreach x of varlist varlist: Iterates through a list of variables

Stata
* Summarize multiple variables with a foreach loop
local vars "age weight height bpsystol bpdiast"
foreach var of local vars {
    summarize `var'
    local mean_`var' = r(mean)
    display "Mean of `var' is `mean_`var''"
}

* Loop through each distinct value of a variable
* (the loop macro is named reg so it is not confused with the r() results)
levelsof region, local(regions)
foreach reg of local regions {
    display "Region `reg'"
    count if region == `reg'
    display "Number of observations: `r(N)'"
}

The second loop:

  1. Uses the levelsof command to extract all unique values of the region variable and store them in a local macro called regions
  2. Loops through each region value
  3. Displays the region number
  4. Counts observations matching that region
  5. Displays the result using the r(N) return value
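The four foreach variations listed earlier can also be tried side by side. This is only a sketch: it assumes nhanes2 can be loaded with webuse, and the macro names body and pressure are ones I made up for the illustration.

```stata
webuse nhanes2, clear

* 1. in: items are typed directly into the loop header
foreach v in age weight height {
    display "in list: `v'"
}

* 2. of local: items come from a local macro (name given without quotes)
local body "weight height"
foreach v of local body {
    display "of local: `v'"
}

* 3. of global: items come from a global macro (name given without the $)
global pressure "bpsystol bpdiast"
foreach v of global pressure {
    display "of global: `v'"
}

* 4. of varlist: Stata expands abbreviations and wildcards such as bp*
foreach v of varlist bp* {
    display "of varlist: `v'"
}
```

The practical difference is that of varlist checks that every item is an existing variable and expands wildcards, while in simply hands over each word as typed.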

Global Macros

Unlike local macros, global macros persist throughout your Stata session until you explicitly clear them or exit Stata. They’re useful for values that need to be accessed across multiple do-files or programs.
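Here’s a small sketch of that persistence, showing a global defined by the caller being read inside a program. The names survey and show_survey are hypothetical.

```stata
* Sketch: a global set by the caller is visible inside a program
global survey "NHANES II"

* (show_survey is a hypothetical name used only for this illustration)
capture program drop show_survey
program define show_survey
    display "Global seen inside the program: $survey"
end

show_survey
```

A local defined in the same place would not be visible inside show_survey, which is exactly the difference between the two macro types.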

Basic Global Macro Usage

Stata
* Define a global macro
global datapath "/path/to/your/data"

* Use the global macro
* Note the $ syntax instead of the ` ' syntax
display "$datapath"

* Storing analysis variables
global analysis_vars "age weight height bpsystol bpdiast"
summarize $analysis_vars
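Two more habits are handy with globals: wrapping the name in braces when other text follows it, and dropping the global once you’re done. In this sketch the file name nhanes2.dta under the path is hypothetical, and nothing is actually read from disk.

```stata
global datapath "/path/to/your/data"

* Braces mark where the macro name ends when text follows immediately.
* Without them, $datapath_backup would look for a macro named datapath_backup.
display "${datapath}/nhanes2.dta"
display "${datapath}_backup"

* Inspect the global, then remove it when it is no longer needed
macro list datapath
macro drop datapath
display "After drop: [$datapath]"
```

Because globals stay around for the whole session, dropping them at the end of a do-file helps avoid stale values leaking into later work.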

When to Use Global vs. Local Macros

As a general rule:

  • Use locals for temporary values within a single do-file or program
  • Use globals for values needed across multiple do-files or programs

Stata
* Using both types together
global dataset "nhanes2"
local year "1976-1980"

display "Analyzing data from $dataset, collected during `year'"

Program Macros

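Program macros are special local macros that Stata creates inside a program to hold the arguments passed to it. The raw arguments arrive as positional parameters (`0' for everything the caller typed, and `1', `2', and so on for each word), and the syntax command parses them into named local macros such as `varlist', `if', and `in'. Let’s start with a basic program that calculates and displays summary statistics: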

Stata
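* Create a program to calculate and display summary statistics
* Drop any earlier definition so the do-file can be rerun without an error
capture program drop sum_stats
program define sum_stats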
    syntax varlist [if] [in]
    
    foreach var of varlist `varlist' {
        quietly summarize `var' `if' `in'
        display "Variable: `var'"
        display "Mean: " %9.4f r(mean) "  SD: " %9.4f r(sd) "  N: " r(N)
        display ""
    }
end

* Run the program on some variables
sum_stats age weight height

* The optional if qualifier restricts the sample
sum_stats age weight if age >= 50

How This Program Works:

  1. Program Definition: program define sum_stats creates a new program named sum_stats
  2. Syntax Command: This powerful command processes the program’s arguments
    • varlist requires the user to provide at least one variable name
    • [if] and [in] are optional qualifiers (the square brackets indicate they’re optional)
  3. Accessing Arguments: Inside the program, `varlist' is a local macro containing the variables specified by the user; `if' and `in' hold the qualifiers, or are empty when none were given
  4. Execution: The program loops through each variable and displays formatted statistics
  5. End Statement: end marks the end of the program definition
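For simple programs that don’t need the full syntax command, the positional macros can be used directly. Here’s a minimal sketch; the program name show_args and the names first and second are made up for illustration.

```stata
* (show_args is a hypothetical name used only for this illustration)
capture program drop show_args
program define show_args
    * Local macro 0 holds everything the caller typed;
    * local macros 1, 2, ... hold the individual words
    display "Full argument string: `0'"
    display "First argument:  `1'"
    display "Second argument: `2'"

    * args copies the positional macros into named locals
    args first second
    display "Named copies: `first' and `second'"
end

show_args age weight
```

Positional macros are fine for a couple of fixed arguments, but syntax is usually the better choice once a program accepts variable lists, qualifiers, or options, because it validates the input for you.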

  • March 11, 2025