[Stata] Propensity Score Matching: psmatch2, teffects

Propensity score matching (PSM) is a statistical technique that allows us to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment. PSM is widely used in observational studies where random assignment to treatments is not feasible.


General steps for propensity score matching

Step 1: Identify the treatment and control groups

  • Determine which group received the treatment or intervention (treatment group) and which group did not (control group).

Step 2: Select confounding variables

  • Identify the variables that may influence both the treatment assignment and the outcome of interest. These are called confounding variables.

Step 3: Estimate propensity scores

  • Using logistic regression or other suitable methods, estimate the probability of each individual receiving the treatment based on their confounding variables. This probability is called the propensity score.
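As a minimal sketch of this step, the propensity score can be estimated by hand with a logit model followed by predict. The covariate subset and the variable name pscore are illustrative assumptions, not the specification used later in this post.

```stata
// Sketch: estimate propensity scores by hand (illustrative covariate subset)
webuse nlswork, clear

// Treatment model: probability of union membership given the covariates
logit union age ttl_exp tenure collgrad south

// Predicted probability of treatment = propensity score
// (pscore is a hypothetical variable name)
predict double pscore if e(sample), pr

// Compare the score distribution across treatment groups
tabstat pscore, by(union) statistics(n mean sd min max)
```

The matching commands discussed below estimate the score internally, so this manual step is mainly useful for inspecting the treatment model.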

Step 4: Match individuals based on propensity scores

  • Match each individual in the treatment group with one or more individuals in the control group who have similar propensity scores. Common approaches include:
      a. One-to-one matching: each treated individual is matched with the single control individual whose propensity score is closest.
      b. One-to-many matching: each treated individual is matched with several control individuals who have similar propensity scores.
      c. Caliper matching: a match is accepted only if the control individual's propensity score lies within a specified distance (the caliper) of the treated individual's score.
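The matching variants listed above could be requested in psmatch2 roughly as follows. This is a sketch under assumptions: the covariate subset, the choice of three neighbors, and the caliper width of 0.01 are illustrative only.

```stata
// Sketch: three matching variants with psmatch2 (user-written: ssc install psmatch2)
webuse nlswork, clear

// (a) One nearest neighbor per treated observation (the psmatch2 default)
psmatch2 union age ttl_exp tenure collgrad south, outcome(ln_wage) logit neighbor(1)

// (b) Three nearest neighbors per treated observation
psmatch2 union age ttl_exp tenure collgrad south, outcome(ln_wage) logit neighbor(3)

// (c) Nearest neighbor, accepted only within a caliper of 0.01 on the propensity score
psmatch2 union age ttl_exp tenure collgrad south, outcome(ln_wage) logit neighbor(1) caliper(0.01)
```

A tighter caliper generally improves match quality but leaves more treated observations unmatched, so the width is a judgment call.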

Step 5: Assess balance

  • Check if the matched treatment and control groups are balanced in terms of the confounding variables. This can be done by comparing the distributions of these variables between the two groups using statistical tests or graphical methods.
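One common balance measure is the standardized difference in means. The sketch below computes it by hand for a single covariate (age) in the unmatched data; the scalar names are hypothetical, and the same quantity computed on the matched sample shows whether matching helped.

```stata
// Sketch: standardized difference in means for one covariate, before matching
webuse nlswork, clear

quietly summarize age if union == 1
scalar m1 = r(mean)
scalar v1 = r(Var)

quietly summarize age if union == 0
scalar m0 = r(mean)
scalar v0 = r(Var)

// Difference in means as a percentage of the pooled standard deviation
display "Standardized difference in age (%): " %6.2f 100*(m1 - m0)/sqrt((v1 + v0)/2)
```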

Step 6: Estimate the treatment effect

  • After obtaining balanced groups, estimate the treatment effect by comparing the outcomes between the matched treatment and control groups. This can be done using appropriate statistical methods, such as t-tests, regression analysis, or survival analysis, depending on the nature of the outcome variable.

Stata Commands for matching

Differences between teffects, psmatch2, and kmatch:

  1. teffects is a built-in Stata command, while psmatch2 and kmatch are user-written commands.
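  2. teffects supports several estimators of treatment effects, including propensity score matching, nearest-neighbor matching, inverse-probability weighting, and regression adjustment. psmatch2 and kmatch focus on matching; besides propensity score matching, both can also match on the Mahalanobis distance between covariates.
  3. All three can estimate the average treatment effect (ATE) and the average treatment effect on the treated (ATT). teffects reports the ATE by default and the ATT (which it calls ATET) with the atet option. psmatch2 reports the ATT by default and adds the ATE with the ate option. kmatch has options for the ATE, the ATT, and the effect on the untreated.
  4. psmatch2 and kmatch come with tools for assessing balance and overlap, such as common support graphs and covariate balance tables. teffects offers comparable postestimation commands (tebalance and teffects overlap).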
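To illustrate the point about teffects covering more than matching, here is a small sketch of two of its other estimators on the nlswork data. The covariate subset is an assumption chosen for brevity.

```stata
// Sketch: non-matching estimators available under teffects
webuse nlswork, clear

// Regression adjustment: the outcome model goes in the first set of parentheses
teffects ra (ln_wage age ttl_exp tenure collgrad south) (union)

// Inverse-probability weighting: the treatment model goes in the second set
teffects ipw (ln_wage) (union age ttl_exp tenure collgrad south)
```

Both commands report the ATE by default, which makes them easy to compare with a matching estimate from the same covariates.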

In this blog post, we’ll walk through the steps of conducting PSM in Stata using the nlswork dataset, an extract from the National Longitudinal Survey of Young Working Women. First, load the dataset into Stata with the webuse command:

Stata
webuse nlswork, clear

Step 1: Specify the treatment, outcome, and confounding variables

Define your outcome variable, treatment variable, and confounders.

  • Treatment variable: union
  • Outcome variable: ln_wage
  • Confounding variables: age, race, msp, collgrad, not_smsa, c_city, south, occ_code, ttl_exp, tenure, hours
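Before matching, it can help to check how the treatment variable is coded and how many values are missing in the variables listed above, since observations with missing values drop out of the analysis. A minimal sketch:

```stata
// Sketch: inspect the treatment variable and missing values before matching
webuse nlswork, clear

// Distribution of the treatment indicator, including missing values
tabulate union, missing

// Count missing values in the outcome, treatment, and confounders
misstable summarize ln_wage union age race msp collgrad not_smsa c_city south occ_code ttl_exp tenure hours
```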

Step 2: Perform propensity score matching using the psmatch2 command

The psmatch2 command, a user-written package installed from SSC, estimates propensity scores and conducts the matching. Suppose we have a binary treatment variable treat and a set of covariates x1, x2, …, xn. The basic syntax is as follows:

Stata
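// Install psmatch2 from SSC (only needed once)
ssc install psmatch2
// Basic syntax: treatment variable first, then covariates;
// outcome() names the outcome, common imposes common support
psmatch2 treat x1 x2 x3 xn, outcome(outcome) common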

In our example, we can perform the matching using the code:

Stata
// PSM code - Outcome: Wage / Treatment: Union 
psmatch2 union age race msp collgrad not_smsa c_city south occ_code ttl_exp tenure hours, out(ln_wage) common logit

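By default, psmatch2 matches each treated observation (union member) to its nearest neighbor among the non-treated observations (non-union members), with replacement, based on the propensity score estimated from the specified confounders. The logit option fits the propensity score with a logit model instead of the default probit, and common restricts the analysis to the region of common support.

The estimated ATT is approximately 0.197. Because the outcome is the natural logarithm of wages, this is a difference of about 0.197 log points: after matching, union members earn roughly 20% more on average than comparable non-union members (exactly, exp(0.197) − 1 ≈ 21.8%).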
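Since ln_wage is a log outcome, the estimate is in log points rather than percent. The following sketch converts the reported value of 0.197 into a percentage difference in wages; the two differ noticeably once the effect gets this large (about 21.8% rather than 19.7%).

```stata
// Sketch: convert a log-point effect into a percentage change
// 0.197 is the ATT reported above
display "Approximation (100*b):      " %5.1f 100*0.197 "%"
display "Exact (100*(exp(b) - 1)):   " %5.1f 100*(exp(0.197) - 1) "%"
```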

Advanced: Average Treatment Effect on the Treated (ATET)

  • Average Treatment Effect (ATE): This measures the expected effect of the treatment across the entire population, regardless of whether they received the treatment or not. It answers the question, “What would be the average effect of the treatment if we were to apply it to the whole population?”
  • Average Treatment Effect on the Treated (ATET): This measures the effect of the treatment only on those who actually received the treatment. It answers the question, “What is the average effect of the treatment on those individuals who were actually treated?”

By default, the teffects psmatch command reports the average treatment effect (ATE). Adding the atet option makes it report the average treatment effect on the treated (ATET) instead:

Stata
// PSM code - Outcome: Wage / Treatment: Union 
teffects psmatch (ln_wage) (union age race msp collgrad not_smsa c_city south occ_code ttl_exp tenure hours), atet 

ATET for Union Membership:

  • The coefficient for union is 0.198, with a standard error of 0.01.
  • This suggests that union membership raises the natural logarithm of wages by about 0.198 for union members. In other words, their wages are roughly 20% higher (exactly, exp(0.198) − 1 ≈ 21.9%) than they would have been without union membership.
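To see how the estimand and the number of matches change the result, the ATE, the ATET, and an ATET with several neighbors can be run side by side. This is a sketch on an illustrative covariate subset; the choice of three neighbors is an assumption.

```stata
// Sketch: ATE vs. ATET with teffects psmatch
webuse nlswork, clear

// Default: average treatment effect (ATE), one nearest neighbor, logit treatment model
teffects psmatch (ln_wage) (union age ttl_exp tenure collgrad south)

// Average treatment effect on the treated (ATET)
teffects psmatch (ln_wage) (union age ttl_exp tenure collgrad south), atet

// ATET using three nearest neighbors per observation
teffects psmatch (ln_wage) (union age ttl_exp tenure collgrad south), atet nneighbor(3)
```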

Step 3: Assess the balance of confounding variables after matching

It’s important to assess the quality of the matching. After psmatch2, you can use the psgraph and pstest commands to check overlap and covariate balance. Both are installed together with psmatch2 and rely on the variables it creates, so the psmatch2 model from Step 2 must be run first.

The psgraph command plots a histogram of the propensity score by treatment status.

Stata
psgraph
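The graph is drawn from variables that psmatch2 adds to the dataset, such as _pscore, _treated, and _support. A sketch of inspecting them directly, using an illustrative covariate subset rather than the full model above:

```stata
// Sketch: inspect the variables psmatch2 leaves behind
// (psmatch2 is user-written: ssc install psmatch2)
webuse nlswork, clear
psmatch2 union age ttl_exp tenure collgrad south, outcome(ln_wage) common logit

// How many treated and control observations are on common support?
tabulate _treated _support

// Range of the estimated propensity score in each group
tabstat _pscore, by(_treated) statistics(n mean min max)
```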

The histogram gives a visual impression of overlap, but it does not tell us whether the covariates are balanced. For that purpose we use the pstest command, which provides a balance test after propensity score matching. It checks whether the covariates in the treated and control groups have similar distributions, which is crucial for unbiased estimation of treatment effects. Here is how to read its output:

Stata
pstest, graph

  • %bias: This column shows the percentage bias for each covariate between the treated and control groups. After matching, the biases should be lower, indicating better balance.
  • t-test: This tests whether the means of each covariate are statistically different between the treated and control groups. A high p-value (p>|t|) suggests no significant difference.
  • V(T)/V(C): The variance ratio compares the variances of each covariate in the treated and control groups. A ratio close to 1 indicates a similar variance.

From the output, it appears that the matching has improved the balance between the treated and control groups, as indicated by the reduced %bias across covariates. The variance ratios are also close to 1 for most covariates, except for a few marked with an asterisk (*), which indicates that the variance ratio is outside the acceptable range of [0.94; 1.06].

We focus on the t-tests between the treated and control groups. The south, ttl_exp, tenure, and race variables still differ significantly between the groups (p < .05). We might need to consider a different matching specification (e.g., adding a caliper or using kernel matching) or adjust the set of covariates to improve balance.
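psgraph and pstest only work after psmatch2. For a model fitted with teffects psmatch, the built-in postestimation commands give comparable diagnostics. A minimal sketch, again with an illustrative covariate subset:

```stata
// Sketch: balance and overlap diagnostics after teffects psmatch
webuse nlswork, clear
teffects psmatch (ln_wage) (union age ttl_exp tenure collgrad south), atet

// Standardized differences and variance ratios, raw vs. matched sample
tebalance summarize

// Estimated densities of the propensity score by treatment group
teffects overlap
```

In the tebalance summarize table, standardized differences near 0 and variance ratios near 1 in the matched column indicate good balance.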


  • April 11, 2024