This tutorial focuses on Marginal Structural Models (MSM) using Inverse Probability Weighting (IPW) to estimate the average causal effects of always taking treatment compared to never taking treatment. MSMs provide an alternative to the g-formula for handling time-varying confounders and are particularly useful when modeling complex treatment-confounder relationships is challenging.
From our knowledge of the data-generating process, we know the true average causal effect to be 50.
library(dplyr)
library(knitr)
library(kableExtra)
library(ggplot2)
The empirical setting is to treat HIV with a therapy regimen (A) in two time periods (t = 0, t = 1). We measure the time-varying confounder, HIV viral load (Z), at times t = 0 and t = 1. Our outcome is the CD4 count (cells/mm³) observed at t = 2.
Objective: Estimate the causal effect of always being treated vs. never being treated on HIV outcomes.
Variables:
Causal structure: Z0 → A0 → Z1 → A1 → Y
Target estimand: E[Y^{1,1}] - E[Y^{0,0}] (always treated vs. never treated)
Data Structure: Z₀ → A₀ → Z₁ → A₁ → Y
We assume Z₀ = 1 (high viral load) for all subjects at baseline.
# Data in Table 1
# A0: Treatment at time 0 (for HIV)
# Z1: HIV viral load measurement prior to time 1 (Higher: Worse health condition)
# A1: Treatment at time 1
# Y: Mean of CD4 counts
# N: Number of subjects
<- c(0, 0, 0, 0, 1, 1, 1, 1)
a0 <- c(0, 0, 1, 1, 0, 0, 1, 1)
z1 <- c(0, 1, 0, 1, 0, 1, 0, 1)
a1 <- c(87.288, 112.107, 119.654, 144.842, 105.282, 130.184, 137.720, 162.832)
y <- c(209271, 93779, 60657, 136293, 134781, 60789, 93903, 210527)
n
<- data.frame(a0, z1, a1, y, n)
data
# Display the data
kable(data, caption = "Original Data Structure") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a0 | z1 | a1 | y | n |
---|---|---|---|---|
0 | 0 | 0 | 87.288 | 209271 |
0 | 0 | 1 | 112.107 | 93779 |
0 | 1 | 0 | 119.654 | 60657 |
0 | 1 | 1 | 144.842 | 136293 |
1 | 0 | 0 | 105.282 | 134781 |
1 | 0 | 1 | 130.184 | 60789 |
1 | 1 | 0 | 137.720 | 93903 |
1 | 1 | 1 | 162.832 | 210527 |
Marginal Structural Models map the marginal summary (e.g., average) of potential outcomes to treatment trajectories. Unlike the g-formula, MSMs require less modeling of time-varying variables by using inverse probability weighting to create a pseudo-population where treatment assignment is independent of confounders.
The MSM approach involves three key steps: 1. Model treatment probabilities at each time point 2. Calculate inverse probability weights 3. Fit outcome model using weighted data
First, we need to expand our aggregated data to individual-level observations for MSM analysis.
# Expand data to individual level based on sample sizes
<- data %>%
expanded_data rowwise() %>%
do(data.frame(
a0 = rep(.$a0, .$n),
z1 = rep(.$z1, .$n),
a1 = rep(.$a1, .$n),
y = rep(.$y, .$n)
%>%
)) ungroup()
# Add individual IDs
$id <- 1:nrow(expanded_data)
expanded_data
print(paste("Total individuals:", nrow(expanded_data)))
## [1] "Total individuals: 1000000"
print("Random sample of individuals:")
## [1] "Random sample of individuals:"
set.seed(123) # For reproducible random sample
%>%
expanded_data sample_n(10) %>%
arrange(id) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a0 | z1 | a1 | y | id |
---|---|---|---|---|
0 | 0 | 0 | 87.288 | 124022 |
0 | 0 | 0 | 87.288 | 134058 |
0 | 0 | 0 | 87.288 | 188942 |
0 | 0 | 0 | 87.288 | 193627 |
0 | 0 | 1 | 112.107 | 226318 |
0 | 1 | 1 | 144.842 | 365209 |
1 | 0 | 1 | 130.184 | 648795 |
1 | 0 | 1 | 130.184 | 685285 |
1 | 1 | 1 | 162.832 | 969167 |
1 | 1 | 1 | 162.832 | 985797 |
We model the probability of receiving treatment at each time point conditional on past history.
# Model for A0: Treatment at time 0
# Since Z0 = 1 for everyone, we model A0 as depending on baseline characteristics
# For simplicity, we'll model P(A0 = 1) as a constant
<- mean(expanded_data$a0)
prob_a0_1 print(paste("Probability of A0 = 1:", round(prob_a0_1, 4)))
## [1] "Probability of A0 = 1: 0.5"
# Model for A1: Treatment at time 1 given Z1 and A0
<- glm(a1 ~ z1 + a0, family = binomial(), data = expanded_data)
a1_model summary(a1_model)
##
## Call:
## glm(formula = a1 ~ z1 + a0, family = binomial(), data = expanded_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.5345 -0.8619 0.8583 0.8590 1.5309
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.800985 0.003521 -227.469 <2e-16 ***
## z1 1.607930 0.004431 362.884 <2e-16 ***
## a0 0.002107 0.004431 0.476 0.634
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1386287 on 999999 degrees of freedom
## Residual deviance: 1236799 on 999997 degrees of freedom
## AIC: 1236805
##
## Number of Fisher Scoring iterations: 4
# Predict treatment probabilities
$prob_a0 <- ifelse(expanded_data$a0 == 1, prob_a0_1, 1 - prob_a0_1)
expanded_data$prob_a1_given_history <- predict(a1_model, type = "response")
expanded_data$prob_a1 <- ifelse(expanded_data$a1 == 1,
expanded_data$prob_a1_given_history,
expanded_data1 - expanded_data$prob_a1_given_history)
The inverse probability weights create a pseudo-population where treatment is independent of confounders.
# Calculate inverse probability weights
$ipw <- 1 / (expanded_data$prob_a0 * expanded_data$prob_a1)
expanded_data
# Check weight distributions
print("Weight Summary Statistics:")
## [1] "Weight Summary Statistics:"
<- expanded_data %>%
weight_summary summarise(
Mean_IPW = mean(ipw),
SD_IPW = sd(ipw),
Min_IPW = min(ipw),
Max_IPW = max(ipw)
)
%>%
weight_summary kable(digits = 4, caption = "Inverse Probability Weight Distribution Summary") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
Mean_IPW | SD_IPW | Min_IPW | Max_IPW |
---|---|---|---|
4 | 1.6521 | 2.8906 | 6.4916 |
# Create weight distribution plot
ggplot(expanded_data, aes(x = ipw)) +
geom_histogram(alpha = 0.7, bins = 50, fill = "steelblue") +
theme_minimal() +
labs(title = "Distribution of Inverse Probability Weights",
x = "Weight Value",
y = "Frequency")
Now we fit the outcome model using the weighted data to estimate causal effects.
# Fit MSM using inverse probability weights
# The model directly estimates marginal causal effects
<- lm(y ~ a0 + a1, data = expanded_data, weights = ipw)
msm_model
print("MSM Model Summary:")
## [1] "MSM Model Summary:"
summary(msm_model)
##
## Call:
## lm(formula = y ~ a0 + a1, data = expanded_data, weights = ipw)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -50.42 -32.80 21.74 32.29 49.99
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 100.01929 0.02753 3632.9 <2e-16 ***
## a0 25.02788 0.03179 787.4 <2e-16 ***
## a1 24.99712 0.03179 786.4 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 31.79 on 999997 degrees of freedom
## Multiple R-squared: 0.5533, Adjusted R-squared: 0.5533
## F-statistic: 6.192e+05 on 2 and 999997 DF, p-value: < 2.2e-16
# Extract coefficients for interpretation
<- broom::tidy(msm_model)
coef_summary kable(coef_summary, digits = 4, caption = "MSM Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 100.0193 | 0.0275 | 3632.9316 | 0 |
a0 | 25.0279 | 0.0318 | 787.3820 | 0 |
a1 | 24.9971 | 0.0318 | 786.4142 | 0 |
Using the fitted MSM, we can estimate potential outcomes under different treatment regimens.
# Create data for prediction under different treatment scenarios
<- data.frame(
scenario_data scenario = c("Never treated", "Always treated", "Treated at t=0 only", "Treated at t=1 only"),
a0 = c(0, 1, 1, 0),
a1 = c(0, 1, 0, 1)
)
# Predict potential outcomes
$potential_outcome <- predict(msm_model, newdata = scenario_data)
scenario_data
# Calculate causal effect: Always treated vs Never treated
<- scenario_data$potential_outcome[2] - scenario_data$potential_outcome[1]
tau_msm
print(paste("MSM Estimated Causal Effect (τ):", round(tau_msm, 4)))
## [1] "MSM Estimated Causal Effect (τ): 50.025"
print(paste("True Causal Effect:", 50))
## [1] "True Causal Effect: 50"
print(paste("Bias:", round(tau_msm - 50, 4)))
## [1] "Bias: 0.025"
# Display all scenarios
kable(scenario_data, digits = 4,
caption = "Potential Outcomes Under Different Treatment Scenarios") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
scenario | a0 | a1 | potential_outcome |
---|---|---|---|
Never treated | 0 | 0 | 100.0193 |
Always treated | 1 | 1 | 150.0443 |
Treated at t=0 only | 1 | 0 | 125.0472 |
Treated at t=1 only | 0 | 1 | 125.0164 |
One key advantage of IPW is that it should balance confounders across treatment groups in the weighted sample.
# Check balance of Z1 across A1 groups
<- expanded_data %>%
balance_check_a1 group_by(a1) %>%
summarise(
n = n(),
mean_z1_unweighted = mean(z1),
mean_z1_weighted = weighted.mean(z1, ipw),
.groups = 'drop'
)
print("Balance Check - Z1 by A1:")
## [1] "Balance Check - Z1 by A1:"
kable(balance_check_a1, digits = 4,
caption = "Balance of Z1 across A1 groups (Before and After Weighting)") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a1 | n | mean_z1_unweighted | mean_z1_weighted |
---|---|---|---|
0 | 498612 | 0.3100 | 0.5014 |
1 | 501388 | 0.6917 | 0.5014 |
# Create comprehensive results table
<- data.frame(
results_summary Method = c("True Effect", "MSM (Inverse Probability Weighting)"),
Estimated_Effect = c(50, round(tau_msm, 4)),
Bias = c(0, round(tau_msm - 50, 4))
)
kable(results_summary,
caption = "Summary of Causal Effect Estimates",
digits = 4) %>%
kable_styling(bootstrap_options = c("striped", "hover"))
Method | Estimated_Effect | Bias |
---|---|---|
True Effect | 50.000 | 0.000 |
MSM (Inverse Probability Weighting) | 50.025 | 0.025 |
cat("\n")
cat("Key Findings:\n")
## Key Findings:
cat("- MSM with inverse probability weighting provides a causal effect estimate\n")
## - MSM with inverse probability weighting provides a causal effect estimate
cat("- IPW successfully balances confounders in the weighted pseudo-population\n")
## - IPW successfully balances confounders in the weighted pseudo-population
cat("- MSM offers a flexible alternative to g-formula for time-varying confounders\n")
## - MSM offers a flexible alternative to g-formula for time-varying confounders
cat("- Less modeling required compared to g-formula approach\n")
## - Less modeling required compared to g-formula approach
Marginal Structural Models with Inverse Probability Weighting is an alternative method to the g-formula for causal inference with time-varying confounders. The method creates a pseudo-population where treatment assignment is independent of confounders, allowing for unbiased estimation of causal effects.
Naimi, A. I., Cole, S. R., & Kennedy, E. H. (2017). An introduction to g methods. International Journal of Epidemiology, 46(2), 756-762.