Introduction

This tutorial focuses on Marginal Structural Models (MSM) using Inverse Probability Weighting (IPW) to estimate the average causal effects of always taking treatment compared to never taking treatment. MSMs provide an alternative to the g-formula for handling time-varying confounders and are particularly useful when modeling complex treatment-confounder relationships is challenging.

From our knowledge of the data-generating process, we know the true average causal effect to be 50.

Load Required Libraries

library(dplyr)
library(knitr)
library(kableExtra)
library(ggplot2)

Our Settings

The empirical setting is to treat HIV with a therapy regimen (A) in two time periods (t = 0, t = 1). We measure the time-varying confounder, HIV viral load (Z), at times t = 0 and t = 1. Our outcome is the CD4 count (cells/mm³) observed at t = 2.

Objective: Estimate the causal effect of always being treated vs. never being treated on HIV outcomes.

Variables:

  • A0, A1: Treatment at times 0 and 1 (0 = no treatment, 1 = treatment)
  • Z0 = 1 for all subjects (high viral load at baseline); 1 indicates worse baseline condition.
  • Z1: Time-varying confounder (HIV viral load at time 1; 0 = low, 1 = high)
  • Y: Outcome (CD4 count at time 2); higher values indicate better immune function.

Causal structure: Z0 → A0 → Z1 → A1 → Y

Target estimand: E[Y^{1,1}] - E[Y^{0,0}] (always treated vs. never treated)

Data Structure: Z₀ → A₀ → Z₁ → A₁ → Y

We assume Z₀ = 1 (high viral load) for all subjects at baseline.

Data Setup

# Data in Table 1 
# A0: Treatment at time 0 (for HIV)
# Z1: HIV viral load measurement prior to time 1 (Higher: Worse health condition)
# A1: Treatment at time 1
# Y: Mean of CD4 counts 
# N: Number of subjects 

a0 <- c(0, 0, 0, 0, 1, 1, 1, 1)
z1 <- c(0, 0, 1, 1, 0, 0, 1, 1)
a1 <- c(0, 1, 0, 1, 0, 1, 0, 1)
y <- c(87.288, 112.107, 119.654, 144.842, 105.282, 130.184, 137.720, 162.832)
n <- c(209271, 93779, 60657, 136293, 134781, 60789, 93903, 210527)

data <- data.frame(a0, z1, a1, y, n)

# Display the data
kable(data, caption = "Original Data Structure") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Original Data Structure
a0 z1 a1 y n
0 0 0 87.288 209271
0 0 1 112.107 93779
0 1 0 119.654 60657
0 1 1 144.842 136293
1 0 0 105.282 134781
1 0 1 130.184 60789
1 1 0 137.720 93903
1 1 1 162.832 210527

Marginal Structural Models Approach

Introduction to MSM

Marginal Structural Models map the marginal summary (e.g., average) of potential outcomes to treatment trajectories. Unlike the g-formula, MSMs require less modeling of time-varying variables by using inverse probability weighting to create a pseudo-population where treatment assignment is independent of confounders.

The MSM approach involves three key steps: 1. Model treatment probabilities at each time point 2. Calculate inverse probability weights 3. Fit outcome model using weighted data

Step 1: Expand Data to Individual Level

First, we need to expand our aggregated data to individual-level observations for MSM analysis.

# Expand data to individual level based on sample sizes
expanded_data <- data %>%
  rowwise() %>%
  do(data.frame(
    a0 = rep(.$a0, .$n),
    z1 = rep(.$z1, .$n),
    a1 = rep(.$a1, .$n),
    y = rep(.$y, .$n)
  )) %>%
  ungroup()

# Add individual IDs
expanded_data$id <- 1:nrow(expanded_data)

print(paste("Total individuals:", nrow(expanded_data)))
## [1] "Total individuals: 1000000"
print("Random sample of individuals:")
## [1] "Random sample of individuals:"
set.seed(123)  # For reproducible random sample
expanded_data %>%
  sample_n(10) %>%
  arrange(id) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
a0 z1 a1 y id
0 0 0 87.288 124022
0 0 0 87.288 134058
0 0 0 87.288 188942
0 0 0 87.288 193627
0 0 1 112.107 226318
0 1 1 144.842 365209
1 0 1 130.184 648795
1 0 1 130.184 685285
1 1 1 162.832 969167
1 1 1 162.832 985797

Step 2: Model Treatment Probabilities

We model the probability of receiving treatment at each time point conditional on past history.

# Model for A0: Treatment at time 0
# Since Z0 = 1 for everyone, we model A0 as depending on baseline characteristics
# For simplicity, we'll model P(A0 = 1) as a constant
prob_a0_1 <- mean(expanded_data$a0)
print(paste("Probability of A0 = 1:", round(prob_a0_1, 4)))
## [1] "Probability of A0 = 1: 0.5"
# Model for A1: Treatment at time 1 given Z1 and A0
a1_model <- glm(a1 ~ z1 + a0, family = binomial(), data = expanded_data)
summary(a1_model)
## 
## Call:
## glm(formula = a1 ~ z1 + a0, family = binomial(), data = expanded_data)
## 
## Coefficients:
##              Estimate Std. Error  z value Pr(>|z|)    
## (Intercept) -0.800985   0.003521 -227.469   <2e-16 ***
## z1           1.607930   0.004431  362.884   <2e-16 ***
## a0           0.002107   0.004431    0.476    0.634    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1386287  on 999999  degrees of freedom
## Residual deviance: 1236799  on 999997  degrees of freedom
## AIC: 1236805
## 
## Number of Fisher Scoring iterations: 4
# Predict treatment probabilities
expanded_data$prob_a0 <- ifelse(expanded_data$a0 == 1, prob_a0_1, 1 - prob_a0_1)
expanded_data$prob_a1_given_history <- predict(a1_model, type = "response")
expanded_data$prob_a1 <- ifelse(expanded_data$a1 == 1, 
                                expanded_data$prob_a1_given_history,
                                1 - expanded_data$prob_a1_given_history)

Step 3: Calculate Inverse Probability Weights

The inverse probability weights create a pseudo-population where treatment is independent of confounders.

# Calculate inverse probability weights
expanded_data$ipw <- 1 / (expanded_data$prob_a0 * expanded_data$prob_a1)

# Check weight distributions
print("Weight Summary Statistics:")
## [1] "Weight Summary Statistics:"
weight_summary <- expanded_data %>%
  summarise(
    Mean_IPW = mean(ipw),
    SD_IPW = sd(ipw),
    Min_IPW = min(ipw),
    Max_IPW = max(ipw)
  )

weight_summary %>%
  kable(digits = 4, caption = "Inverse Probability Weight Distribution Summary") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Inverse Probability Weight Distribution Summary
Mean_IPW SD_IPW Min_IPW Max_IPW
4 1.6521 2.8906 6.4916

Step 4: Visualize Weight Distribution

# Create weight distribution plot
ggplot(expanded_data, aes(x = ipw)) +
  geom_histogram(alpha = 0.7, bins = 50, fill = "steelblue") +
  theme_minimal() +
  labs(title = "Distribution of Inverse Probability Weights",
       x = "Weight Value",
       y = "Frequency")

Step 5: Fit Marginal Structural Model

Now we fit the outcome model using the weighted data to estimate causal effects.

# Fit MSM using inverse probability weights
# The model directly estimates marginal causal effects
msm_model <- lm(y ~ a0 + a1 + a0*a1, data = expanded_data, weights = ipw)

print("MSM Model Summary:")
## [1] "MSM Model Summary:"
summary(msm_model)
## 
## Call:
## lm(formula = y ~ a0 + a1 + a0 * a1, data = expanded_data, weights = ipw)
## 
## Weighted Residuals:
##    Min     1Q Median     3Q    Max 
## -50.42 -32.80  21.74  32.29  49.99 
## 
## Coefficients:
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept) 100.018850   0.031793 3145.988   <2e-16 ***
## a0           25.028755   0.044952  556.782   <2e-16 ***
## a1           24.997993   0.044961  555.988   <2e-16 ***
## a0:a1        -0.001744   0.063572   -0.027    0.978    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.79 on 999996 degrees of freedom
## Multiple R-squared:  0.5533, Adjusted R-squared:  0.5533 
## F-statistic: 4.128e+05 on 3 and 999996 DF,  p-value: < 2.2e-16
# Extract coefficients for interpretation
coef_summary <- broom::tidy(msm_model)
kable(coef_summary, digits = 4, caption = "MSM Model Coefficients") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
MSM Model Coefficients
term estimate std.error statistic p.value
(Intercept) 100.0188 0.0318 3145.9883 0.0000
a0 25.0288 0.0450 556.7824 0.0000
a1 24.9980 0.0450 555.9883 0.0000
a0:a1 -0.0017 0.0636 -0.0274 0.9781

Step 6: Estimate Causal Effects

Using the fitted MSM, we can estimate potential outcomes under different treatment regimens.

# Create data for prediction under different treatment scenarios
scenario_data <- data.frame(
  scenario = c("Never treated", "Always treated", "Treated at t=0 only", "Treated at t=1 only"),
  a0 = c(0, 1, 1, 0),
  a1 = c(0, 1, 0, 1)
)

# Predict potential outcomes
scenario_data$potential_outcome <- predict(msm_model, newdata = scenario_data)

# Calculate causal effect: Always treated vs Never treated
tau_msm <- scenario_data$potential_outcome[2] - scenario_data$potential_outcome[1]

print(paste("MSM Estimated Causal Effect (τ):", round(tau_msm, 4)))
## [1] "MSM Estimated Causal Effect (τ): 50.025"
print(paste("True Causal Effect:", 50))
## [1] "True Causal Effect: 50"
print(paste("Bias:", round(tau_msm - 50, 4)))
## [1] "Bias: 0.025"
# Display all scenarios
kable(scenario_data, digits = 4, 
      caption = "Potential Outcomes Under Different Treatment Scenarios") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Potential Outcomes Under Different Treatment Scenarios
scenario a0 a1 potential_outcome
Never treated 0 0 100.0188
Always treated 1 1 150.0439
Treated at t=0 only 1 0 125.0476
Treated at t=1 only 0 1 125.0168

Step 7: Check Balance After Weighting

One key advantage of IPW is that it should balance confounders across treatment groups in the weighted sample.

# Check balance of Z1 across A0 groups before and after weighting
balance_check <- expanded_data %>%
  group_by(a0) %>%
  summarise(
    n = n(),
    mean_z1_unweighted = mean(z1),
    mean_z1_weighted = weighted.mean(z1, ipw),
    .groups = 'drop'
  )

print("Balance Check - Z1 by A0:")
## [1] "Balance Check - Z1 by A0:"
kable(balance_check, digits = 4, 
      caption = "Balance of Z1 across A0 groups (Before and After Weighting)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Balance of Z1 across A0 groups (Before and After Weighting)
a0 n mean_z1_unweighted mean_z1_weighted
0 5e+05 0.3939 0.3939
1 5e+05 0.6089 0.6088
# Check balance of Z1 across A1 groups
balance_check_a1 <- expanded_data %>%
  group_by(a1) %>%
  summarise(
    n = n(),
    mean_z1_unweighted = mean(z1),
    mean_z1_weighted = weighted.mean(z1, ipw),
    .groups = 'drop'
  )

print("Balance Check - Z1 by A1:")
## [1] "Balance Check - Z1 by A1:"
kable(balance_check_a1, digits = 4,
      caption = "Balance of Z1 across A1 groups (Before and After Weighting)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Balance of Z1 across A1 groups (Before and After Weighting)
a1 n mean_z1_unweighted mean_z1_weighted
0 498612 0.3100 0.5014
1 501388 0.6917 0.5014
# Create comprehensive results table
results_summary <- data.frame(
  Method = c("True Effect", "MSM (Inverse Probability Weighting)"),
  Estimated_Effect = c(50, round(tau_msm, 4)),
  Bias = c(0, round(tau_msm - 50, 4))
)

kable(results_summary, 
      caption = "Summary of Causal Effect Estimates",
      digits = 4) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Summary of Causal Effect Estimates
Method Estimated_Effect Bias
True Effect 50.000 0.000
MSM (Inverse Probability Weighting) 50.025 0.025
cat("\n")
cat("Key Findings:\n")
## Key Findings:
cat("- MSM with inverse probability weighting provides a causal effect estimate\n")
## - MSM with inverse probability weighting provides a causal effect estimate
cat("- IPW successfully balances confounders in the weighted pseudo-population\n")
## - IPW successfully balances confounders in the weighted pseudo-population
cat("- MSM offers a flexible alternative to g-formula for time-varying confounders\n")
## - MSM offers a flexible alternative to g-formula for time-varying confounders
cat("- Less modeling required compared to g-formula approach\n")
## - Less modeling required compared to g-formula approach

Advantages and Limitations of MSM

Advantages:

  • Less modeling required: No need to model complex relationships for time-varying confounders
  • Intuitive interpretation: Direct estimation of marginal causal effects
  • Flexibility: Can handle complex treatment regimens
  • Balance checking: Easy to verify that weighting achieves balance

Limitations:

  • Positivity assumption: Requires overlap in treatment probabilities
  • Weight variability: Extreme weights can lead to instability
  • Model specification: Still requires correct specification of treatment models
  • Large sample requirement: Performance depends on adequate sample size

Conclusion

Marginal Structural Models with Inverse Probability Weighting provide a powerful alternative to the g-formula for causal inference with time-varying confounders. The method creates a pseudo-population where treatment assignment is independent of confounders, allowing for unbiased estimation of causal effects.

In our HIV treatment example, MSM successfully estimated the causal effect of always being treated versus never being treated, demonstrating the practical utility of this approach for longitudinal causal inference.