This tutorial focuses on Marginal Structural Models (MSM) using Inverse Probability Weighting (IPW) to estimate the average causal effects of always taking treatment compared to never taking treatment. MSMs provide an alternative to the g-formula for handling time-varying confounders and are particularly useful when modeling complex treatment-confounder relationships is challenging.
From our knowledge of the data-generating process, we know the true average causal effect to be 50.
The empirical setting is to treat HIV with a therapy regimen (A) in two time periods (t = 0, t = 1). We measure the time-varying confounder, HIV viral load (Z), at times t = 0 and t = 1. Our outcome is the CD4 count (cells/mm³) observed at t = 2.
Objective: Estimate the causal effect of always being treated vs. never being treated on HIV outcomes.
Variables:
Causal structure: Z0 → A0 → Z1 → A1 → Y
Target estimand: E[Y^{1,1}] - E[Y^{0,0}] (always treated vs. never treated)
Data Structure: Z₀ → A₀ → Z₁ → A₁ → Y
We assume Z₀ = 1 (high viral load) for all subjects at baseline.
# Data in Table 1
# A0: Treatment at time 0 (for HIV)
# Z1: HIV viral load measurement prior to time 1 (Higher: Worse health condition)
# A1: Treatment at time 1
# Y: Mean of CD4 counts
# N: Number of subjects
a0 <- c(0, 0, 0, 0, 1, 1, 1, 1)
z1 <- c(0, 0, 1, 1, 0, 0, 1, 1)
a1 <- c(0, 1, 0, 1, 0, 1, 0, 1)
y <- c(87.288, 112.107, 119.654, 144.842, 105.282, 130.184, 137.720, 162.832)
n <- c(209271, 93779, 60657, 136293, 134781, 60789, 93903, 210527)
data <- data.frame(a0, z1, a1, y, n)
# Display the data
kable(data, caption = "Original Data Structure") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a0 | z1 | a1 | y | n |
---|---|---|---|---|
0 | 0 | 0 | 87.288 | 209271 |
0 | 0 | 1 | 112.107 | 93779 |
0 | 1 | 0 | 119.654 | 60657 |
0 | 1 | 1 | 144.842 | 136293 |
1 | 0 | 0 | 105.282 | 134781 |
1 | 0 | 1 | 130.184 | 60789 |
1 | 1 | 0 | 137.720 | 93903 |
1 | 1 | 1 | 162.832 | 210527 |
Marginal Structural Models map the marginal summary (e.g., average) of potential outcomes to treatment trajectories. Unlike the g-formula, MSMs require less modeling of time-varying variables by using inverse probability weighting to create a pseudo-population where treatment assignment is independent of confounders.
The MSM approach involves three key steps: 1. Model treatment probabilities at each time point 2. Calculate inverse probability weights 3. Fit outcome model using weighted data
First, we need to expand our aggregated data to individual-level observations for MSM analysis.
# Expand data to individual level based on sample sizes
expanded_data <- data %>%
rowwise() %>%
do(data.frame(
a0 = rep(.$a0, .$n),
z1 = rep(.$z1, .$n),
a1 = rep(.$a1, .$n),
y = rep(.$y, .$n)
)) %>%
ungroup()
# Add individual IDs
expanded_data$id <- 1:nrow(expanded_data)
print(paste("Total individuals:", nrow(expanded_data)))
## [1] "Total individuals: 1000000"
## [1] "Random sample of individuals:"
set.seed(123) # For reproducible random sample
expanded_data %>%
sample_n(10) %>%
arrange(id) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a0 | z1 | a1 | y | id |
---|---|---|---|---|
0 | 0 | 0 | 87.288 | 124022 |
0 | 0 | 0 | 87.288 | 134058 |
0 | 0 | 0 | 87.288 | 188942 |
0 | 0 | 0 | 87.288 | 193627 |
0 | 0 | 1 | 112.107 | 226318 |
0 | 1 | 1 | 144.842 | 365209 |
1 | 0 | 1 | 130.184 | 648795 |
1 | 0 | 1 | 130.184 | 685285 |
1 | 1 | 1 | 162.832 | 969167 |
1 | 1 | 1 | 162.832 | 985797 |
We model the probability of receiving treatment at each time point conditional on past history.
# Model for A0: Treatment at time 0
# Since Z0 = 1 for everyone, we model A0 as depending on baseline characteristics
# For simplicity, we'll model P(A0 = 1) as a constant
prob_a0_1 <- mean(expanded_data$a0)
print(paste("Probability of A0 = 1:", round(prob_a0_1, 4)))
## [1] "Probability of A0 = 1: 0.5"
# Model for A1: Treatment at time 1 given Z1 and A0
a1_model <- glm(a1 ~ z1 + a0, family = binomial(), data = expanded_data)
summary(a1_model)
##
## Call:
## glm(formula = a1 ~ z1 + a0, family = binomial(), data = expanded_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.800985 0.003521 -227.469 <2e-16 ***
## z1 1.607930 0.004431 362.884 <2e-16 ***
## a0 0.002107 0.004431 0.476 0.634
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1386287 on 999999 degrees of freedom
## Residual deviance: 1236799 on 999997 degrees of freedom
## AIC: 1236805
##
## Number of Fisher Scoring iterations: 4
# Predict treatment probabilities
expanded_data$prob_a0 <- ifelse(expanded_data$a0 == 1, prob_a0_1, 1 - prob_a0_1)
expanded_data$prob_a1_given_history <- predict(a1_model, type = "response")
expanded_data$prob_a1 <- ifelse(expanded_data$a1 == 1,
expanded_data$prob_a1_given_history,
1 - expanded_data$prob_a1_given_history)
The inverse probability weights create a pseudo-population where treatment is independent of confounders.
# Calculate inverse probability weights
expanded_data$ipw <- 1 / (expanded_data$prob_a0 * expanded_data$prob_a1)
# Check weight distributions
print("Weight Summary Statistics:")
## [1] "Weight Summary Statistics:"
weight_summary <- expanded_data %>%
summarise(
Mean_IPW = mean(ipw),
SD_IPW = sd(ipw),
Min_IPW = min(ipw),
Max_IPW = max(ipw)
)
weight_summary %>%
kable(digits = 4, caption = "Inverse Probability Weight Distribution Summary") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
Mean_IPW | SD_IPW | Min_IPW | Max_IPW |
---|---|---|---|
4 | 1.6521 | 2.8906 | 6.4916 |
# Create weight distribution plot
ggplot(expanded_data, aes(x = ipw)) +
geom_histogram(alpha = 0.7, bins = 50, fill = "steelblue") +
theme_minimal() +
labs(title = "Distribution of Inverse Probability Weights",
x = "Weight Value",
y = "Frequency")
Now we fit the outcome model using the weighted data to estimate causal effects.
# Fit MSM using inverse probability weights
# The model directly estimates marginal causal effects
msm_model <- lm(y ~ a0 + a1 + a0*a1, data = expanded_data, weights = ipw)
print("MSM Model Summary:")
## [1] "MSM Model Summary:"
##
## Call:
## lm(formula = y ~ a0 + a1 + a0 * a1, data = expanded_data, weights = ipw)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -50.42 -32.80 21.74 32.29 49.99
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 100.018850 0.031793 3145.988 <2e-16 ***
## a0 25.028755 0.044952 556.782 <2e-16 ***
## a1 24.997993 0.044961 555.988 <2e-16 ***
## a0:a1 -0.001744 0.063572 -0.027 0.978
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 31.79 on 999996 degrees of freedom
## Multiple R-squared: 0.5533, Adjusted R-squared: 0.5533
## F-statistic: 4.128e+05 on 3 and 999996 DF, p-value: < 2.2e-16
# Extract coefficients for interpretation
coef_summary <- broom::tidy(msm_model)
kable(coef_summary, digits = 4, caption = "MSM Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 100.0188 | 0.0318 | 3145.9883 | 0.0000 |
a0 | 25.0288 | 0.0450 | 556.7824 | 0.0000 |
a1 | 24.9980 | 0.0450 | 555.9883 | 0.0000 |
a0:a1 | -0.0017 | 0.0636 | -0.0274 | 0.9781 |
Using the fitted MSM, we can estimate potential outcomes under different treatment regimens.
# Create data for prediction under different treatment scenarios
scenario_data <- data.frame(
scenario = c("Never treated", "Always treated", "Treated at t=0 only", "Treated at t=1 only"),
a0 = c(0, 1, 1, 0),
a1 = c(0, 1, 0, 1)
)
# Predict potential outcomes
scenario_data$potential_outcome <- predict(msm_model, newdata = scenario_data)
# Calculate causal effect: Always treated vs Never treated
tau_msm <- scenario_data$potential_outcome[2] - scenario_data$potential_outcome[1]
print(paste("MSM Estimated Causal Effect (τ):", round(tau_msm, 4)))
## [1] "MSM Estimated Causal Effect (τ): 50.025"
## [1] "True Causal Effect: 50"
## [1] "Bias: 0.025"
# Display all scenarios
kable(scenario_data, digits = 4,
caption = "Potential Outcomes Under Different Treatment Scenarios") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
scenario | a0 | a1 | potential_outcome |
---|---|---|---|
Never treated | 0 | 0 | 100.0188 |
Always treated | 1 | 1 | 150.0439 |
Treated at t=0 only | 1 | 0 | 125.0476 |
Treated at t=1 only | 0 | 1 | 125.0168 |
One key advantage of IPW is that it should balance confounders across treatment groups in the weighted sample.
# Check balance of Z1 across A0 groups before and after weighting
balance_check <- expanded_data %>%
group_by(a0) %>%
summarise(
n = n(),
mean_z1_unweighted = mean(z1),
mean_z1_weighted = weighted.mean(z1, ipw),
.groups = 'drop'
)
print("Balance Check - Z1 by A0:")
## [1] "Balance Check - Z1 by A0:"
kable(balance_check, digits = 4,
caption = "Balance of Z1 across A0 groups (Before and After Weighting)") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a0 | n | mean_z1_unweighted | mean_z1_weighted |
---|---|---|---|
0 | 5e+05 | 0.3939 | 0.3939 |
1 | 5e+05 | 0.6089 | 0.6088 |
# Check balance of Z1 across A1 groups
balance_check_a1 <- expanded_data %>%
group_by(a1) %>%
summarise(
n = n(),
mean_z1_unweighted = mean(z1),
mean_z1_weighted = weighted.mean(z1, ipw),
.groups = 'drop'
)
print("Balance Check - Z1 by A1:")
## [1] "Balance Check - Z1 by A1:"
kable(balance_check_a1, digits = 4,
caption = "Balance of Z1 across A1 groups (Before and After Weighting)") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
a1 | n | mean_z1_unweighted | mean_z1_weighted |
---|---|---|---|
0 | 498612 | 0.3100 | 0.5014 |
1 | 501388 | 0.6917 | 0.5014 |
# Create comprehensive results table
results_summary <- data.frame(
Method = c("True Effect", "MSM (Inverse Probability Weighting)"),
Estimated_Effect = c(50, round(tau_msm, 4)),
Bias = c(0, round(tau_msm - 50, 4))
)
kable(results_summary,
caption = "Summary of Causal Effect Estimates",
digits = 4) %>%
kable_styling(bootstrap_options = c("striped", "hover"))
Method | Estimated_Effect | Bias |
---|---|---|
True Effect | 50.000 | 0.000 |
MSM (Inverse Probability Weighting) | 50.025 | 0.025 |
## Key Findings:
## - MSM with inverse probability weighting provides a causal effect estimate
## - IPW successfully balances confounders in the weighted pseudo-population
## - MSM offers a flexible alternative to g-formula for time-varying confounders
## - Less modeling required compared to g-formula approach
Marginal Structural Models with Inverse Probability Weighting provide a powerful alternative to the g-formula for causal inference with time-varying confounders. The method creates a pseudo-population where treatment assignment is independent of confounders, allowing for unbiased estimation of causal effects.
In our HIV treatment example, MSM successfully estimated the causal effect of always being treated versus never being treated, demonstrating the practical utility of this approach for longitudinal causal inference.