Introduction

This tutorial focuses on Marginal Structural Models (MSM) using Inverse Probability Weighting (IPW) to estimate the average causal effects of always taking treatment compared to never taking treatment. MSMs provide an alternative to the g-formula for handling time-varying confounders and are particularly useful when modeling complex treatment-confounder relationships is challenging.

From our knowledge of the data-generating process, we know the true average causal effect to be 50.

Load Required Libraries

library(dplyr)
library(knitr)
library(kableExtra)
library(ggplot2)

Our Settings

The empirical setting is to treat HIV with a therapy regimen (A) in two time periods (t = 0, t = 1). We measure the time-varying confounder, HIV viral load (Z), at times t = 0 and t = 1. Our outcome is the CD4 count (cells/mm³) observed at t = 2.

Objective: Estimate the causal effect of always being treated vs. never being treated on HIV outcomes.

Variables:

A0, A1: Treatment at times 0 and 1 (0 = no treatment, 1 = treatment)
Z0 = 1 for all subjects (high viral load at baseline); 1 indicates worse baseline condition.
Z1: Time-varying confounder (HIV viral load at time 1; 0 = low, 1 = high)
Y: Outcome (CD4 count at time 2); higher values indicate better immune function.

Causal structure: Z0 → A0 → Z1 → A1 → Y

Target estimand: E[Y^{1,1}] - E[Y^{0,0}] (always treated vs. never treated)

Data Structure: Z₀ → A₀ → Z₁ → A₁ → Y

We assume Z₀ = 1 (high viral load) for all subjects at baseline.

Data Setup

# Data in Table 1 
# A0: Treatment at time 0 (for HIV)
# Z1: HIV viral load measurement prior to time 1 (Higher: Worse health condition)
# A1: Treatment at time 1
# Y: Mean of CD4 counts 
# N: Number of subjects 

a0 <- c(0, 0, 0, 0, 1, 1, 1, 1)
z1 <- c(0, 0, 1, 1, 0, 0, 1, 1)
a1 <- c(0, 1, 0, 1, 0, 1, 0, 1)
y <- c(87.288, 112.107, 119.654, 144.842, 105.282, 130.184, 137.720, 162.832)
n <- c(209271, 93779, 60657, 136293, 134781, 60789, 93903, 210527)

data <- data.frame(a0, z1, a1, y, n)

# Display the data
kable(data, caption = "Original Data Structure") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Original Data Structure
a0	z1	a1	y	n
0	0	0	87.288	209271
0	0	1	112.107	93779
0	1	0	119.654	60657
0	1	1	144.842	136293
1	0	0	105.282	134781
1	0	1	130.184	60789
1	1	0	137.720	93903
1	1	1	162.832	210527

Marginal Structural Models Approach

Introduction to MSM

Marginal Structural Models map the marginal summary (e.g., average) of potential outcomes to treatment trajectories. Unlike the g-formula, MSMs require less modeling of time-varying variables by using inverse probability weighting to create a pseudo-population where treatment assignment is independent of confounders.

The MSM approach involves three key steps: 1. Model treatment probabilities at each time point 2. Calculate inverse probability weights 3. Fit outcome model using weighted data

Step 1: Expand Data to Individual Level

First, we need to expand our aggregated data to individual-level observations for MSM analysis.

# Expand data to individual level based on sample sizes
expanded_data <- data %>%
  rowwise() %>%
  do(data.frame(
    a0 = rep(.$a0, .$n),
    z1 = rep(.$z1, .$n),
    a1 = rep(.$a1, .$n),
    y = rep(.$y, .$n)
  )) %>%
  ungroup()

# Add individual IDs
expanded_data$id <- 1:nrow(expanded_data)

print(paste("Total individuals:", nrow(expanded_data)))

## [1] "Total individuals: 1000000"

print("Random sample of individuals:")

## [1] "Random sample of individuals:"

set.seed(123)  # For reproducible random sample
expanded_data %>%
  sample_n(10) %>%
  arrange(id) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

a0	z1	a1	y	id
0	0	0	87.288	124022
0	0	0	87.288	134058
0	0	0	87.288	188942
0	0	0	87.288	193627
0	0	1	112.107	226318
0	1	1	144.842	365209
1	0	1	130.184	648795
1	0	1	130.184	685285
1	1	1	162.832	969167
1	1	1	162.832	985797

Step 2: Model Treatment Probabilities

We model the probability of receiving treatment at each time point conditional on past history.

# Model for A0: Treatment at time 0
# Since Z0 = 1 for everyone, we model A0 as depending on baseline characteristics
# For simplicity, we'll model P(A0 = 1) as a constant
prob_a0_1 <- mean(expanded_data$a0)
print(paste("Probability of A0 = 1:", round(prob_a0_1, 4)))

## [1] "Probability of A0 = 1: 0.5"

# Model for A1: Treatment at time 1 given Z1 and A0
a1_model <- glm(a1 ~ z1 + a0, family = binomial(), data = expanded_data)
summary(a1_model)

## 
## Call:
## glm(formula = a1 ~ z1 + a0, family = binomial(), data = expanded_data)
## 
## Coefficients:
##              Estimate Std. Error  z value Pr(>|z|)    
## (Intercept) -0.800985   0.003521 -227.469   <2e-16 ***
## z1           1.607930   0.004431  362.884   <2e-16 ***
## a0           0.002107   0.004431    0.476    0.634    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1386287  on 999999  degrees of freedom
## Residual deviance: 1236799  on 999997  degrees of freedom
## AIC: 1236805
## 
## Number of Fisher Scoring iterations: 4

# Predict treatment probabilities
expanded_data$prob_a0 <- ifelse(expanded_data$a0 == 1, prob_a0_1, 1 - prob_a0_1)
expanded_data$prob_a1_given_history <- predict(a1_model, type = "response")
expanded_data$prob_a1 <- ifelse(expanded_data$a1 == 1, 
                                expanded_data$prob_a1_given_history,
                                1 - expanded_data$prob_a1_given_history)

Step 3: Calculate Inverse Probability Weights

The inverse probability weights create a pseudo-population where treatment is independent of confounders.

# Calculate inverse probability weights
expanded_data$ipw <- 1 / (expanded_data$prob_a0 * expanded_data$prob_a1)

# Check weight distributions
print("Weight Summary Statistics:")

## [1] "Weight Summary Statistics:"

weight_summary <- expanded_data %>%
  summarise(
    Mean_IPW = mean(ipw),
    SD_IPW = sd(ipw),
    Min_IPW = min(ipw),
    Max_IPW = max(ipw)
  )

weight_summary %>%
  kable(digits = 4, caption = "Inverse Probability Weight Distribution Summary") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Inverse Probability Weight Distribution Summary
Mean_IPW	SD_IPW	Min_IPW	Max_IPW
4	1.6521	2.8906	6.4916

Step 4: Visualize Weight Distribution

# Create weight distribution plot
ggplot(expanded_data, aes(x = ipw)) +
  geom_histogram(alpha = 0.7, bins = 50, fill = "steelblue") +
  theme_minimal() +
  labs(title = "Distribution of Inverse Probability Weights",
       x = "Weight Value",
       y = "Frequency")

Step 5: Fit Marginal Structural Model

Now we fit the outcome model using the weighted data to estimate causal effects.

# Fit MSM using inverse probability weights
# The model directly estimates marginal causal effects
msm_model <- lm(y ~ a0 + a1 + a0*a1, data = expanded_data, weights = ipw)

print("MSM Model Summary:")

## [1] "MSM Model Summary:"

summary(msm_model)

## 
## Call:
## lm(formula = y ~ a0 + a1 + a0 * a1, data = expanded_data, weights = ipw)
## 
## Weighted Residuals:
##    Min     1Q Median     3Q    Max 
## -50.42 -32.80  21.74  32.29  49.99 
## 
## Coefficients:
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept) 100.018850   0.031793 3145.988   <2e-16 ***
## a0           25.028755   0.044952  556.782   <2e-16 ***
## a1           24.997993   0.044961  555.988   <2e-16 ***
## a0:a1        -0.001744   0.063572   -0.027    0.978    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.79 on 999996 degrees of freedom
## Multiple R-squared:  0.5533, Adjusted R-squared:  0.5533 
## F-statistic: 4.128e+05 on 3 and 999996 DF,  p-value: < 2.2e-16

# Extract coefficients for interpretation
coef_summary <- broom::tidy(msm_model)
kable(coef_summary, digits = 4, caption = "MSM Model Coefficients") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

MSM Model Coefficients
term	estimate	std.error	statistic	p.value
(Intercept)	100.0188	0.0318	3145.9883	0.0000
a0	25.0288	0.0450	556.7824	0.0000
a1	24.9980	0.0450	555.9883	0.0000
a0:a1	-0.0017	0.0636	-0.0274	0.9781

Step 6: Estimate Causal Effects

Using the fitted MSM, we can estimate potential outcomes under different treatment regimens.

# Create data for prediction under different treatment scenarios
scenario_data <- data.frame(
  scenario = c("Never treated", "Always treated", "Treated at t=0 only", "Treated at t=1 only"),
  a0 = c(0, 1, 1, 0),
  a1 = c(0, 1, 0, 1)
)

# Predict potential outcomes
scenario_data$potential_outcome <- predict(msm_model, newdata = scenario_data)

# Calculate causal effect: Always treated vs Never treated
tau_msm <- scenario_data$potential_outcome[2] - scenario_data$potential_outcome[1]

print(paste("MSM Estimated Causal Effect (τ):", round(tau_msm, 4)))

## [1] "MSM Estimated Causal Effect (τ): 50.025"

print(paste("True Causal Effect:", 50))

## [1] "True Causal Effect: 50"

print(paste("Bias:", round(tau_msm - 50, 4)))

## [1] "Bias: 0.025"

# Display all scenarios
kable(scenario_data, digits = 4, 
      caption = "Potential Outcomes Under Different Treatment Scenarios") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Potential Outcomes Under Different Treatment Scenarios
scenario	a0	a1	potential_outcome
Never treated	0	0	100.0188
Always treated	1	1	150.0439
Treated at t=0 only	1	0	125.0476
Treated at t=1 only	0	1	125.0168

Step 7: Check Balance After Weighting

One key advantage of IPW is that it should balance confounders across treatment groups in the weighted sample.

# Check balance of Z1 across A0 groups before and after weighting
balance_check <- expanded_data %>%
  group_by(a0) %>%
  summarise(
    n = n(),
    mean_z1_unweighted = mean(z1),
    mean_z1_weighted = weighted.mean(z1, ipw),
    .groups = 'drop'
  )

print("Balance Check - Z1 by A0:")

## [1] "Balance Check - Z1 by A0:"

kable(balance_check, digits = 4, 
      caption = "Balance of Z1 across A0 groups (Before and After Weighting)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Balance of Z1 across A0 groups (Before and After Weighting)
a0	n	mean_z1_unweighted	mean_z1_weighted
0	5e+05	0.3939	0.3939
1	5e+05	0.6089	0.6088

# Check balance of Z1 across A1 groups
balance_check_a1 <- expanded_data %>%
  group_by(a1) %>%
  summarise(
    n = n(),
    mean_z1_unweighted = mean(z1),
    mean_z1_weighted = weighted.mean(z1, ipw),
    .groups = 'drop'
  )

print("Balance Check - Z1 by A1:")

## [1] "Balance Check - Z1 by A1:"

kable(balance_check_a1, digits = 4,
      caption = "Balance of Z1 across A1 groups (Before and After Weighting)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Balance of Z1 across A1 groups (Before and After Weighting)
a1	n	mean_z1_unweighted	mean_z1_weighted
0	498612	0.3100	0.5014
1	501388	0.6917	0.5014

# Create comprehensive results table
results_summary <- data.frame(
  Method = c("True Effect", "MSM (Inverse Probability Weighting)"),
  Estimated_Effect = c(50, round(tau_msm, 4)),
  Bias = c(0, round(tau_msm - 50, 4))
)

kable(results_summary, 
      caption = "Summary of Causal Effect Estimates",
      digits = 4) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Summary of Causal Effect Estimates
Method	Estimated_Effect	Bias
True Effect	50.000	0.000
MSM (Inverse Probability Weighting)	50.025	0.025

cat("\n")

cat("Key Findings:\n")

## Key Findings:

cat("- MSM with inverse probability weighting provides a causal effect estimate\n")

## - MSM with inverse probability weighting provides a causal effect estimate

cat("- IPW successfully balances confounders in the weighted pseudo-population\n")

## - IPW successfully balances confounders in the weighted pseudo-population

cat("- MSM offers a flexible alternative to g-formula for time-varying confounders\n")

## - MSM offers a flexible alternative to g-formula for time-varying confounders

cat("- Less modeling required compared to g-formula approach\n")

## - Less modeling required compared to g-formula approach

Advantages and Limitations of MSM

Advantages:

Less modeling required: No need to model complex relationships for time-varying confounders
Intuitive interpretation: Direct estimation of marginal causal effects
Flexibility: Can handle complex treatment regimens
Balance checking: Easy to verify that weighting achieves balance

Limitations:

Positivity assumption: Requires overlap in treatment probabilities
Weight variability: Extreme weights can lead to instability
Model specification: Still requires correct specification of treatment models
Large sample requirement: Performance depends on adequate sample size

Conclusion

Marginal Structural Models with Inverse Probability Weighting provide a powerful alternative to the g-formula for causal inference with time-varying confounders. The method creates a pseudo-population where treatment assignment is independent of confounders, allowing for unbiased estimation of causal effects.

In our HIV treatment example, MSM successfully estimated the causal effect of always being treated versus never being treated, demonstrating the practical utility of this approach for longitudinal causal inference.

Marginal Structural Models for Time-Varying Confounders

Nanum Jeon

2025-06-27