This chapter introduces inverse probability (IP) weighting, a method for estimating causal effects that creates a pseudo-population in which treatment is independent of measured confounders. IP weighting is used to fit marginal structural models, which provide a natural framework for estimating marginal causal effects when treatment and confounding vary over time.
We return to the NHEFS study to estimate the average causal effect of quitting smoking on weight gain.
Population: 1,566 cigarette smokers from NHEFS who had a baseline visit and were seen again approximately 10 years later.
Treatment: \(A = 1\) if quit smoking between visits, \(A = 0\) if continued smoking
Outcome: \(Y\) = weight change in kg between visits
Causal estimand: \[E[Y^{a=1}] - E[Y^{a=0}]\]
The average treatment effect of smoking cessation on weight gain.
We have measured baseline covariates \(L\) that may confound the relationship:
Assumption: Conditional exchangeability given \(L\): \[Y^a \perp\!\!\!\perp A \mid L\]
The core idea of IP weighting is to create a pseudo-population by weighting each individual by the inverse of their probability of receiving the treatment they actually received.
Definition 1 (Inverse Probability Weights) For individual \(i\), the IP weight is:
\[W^A_i = \frac{1}{f(A_i \mid L_i)}\]
where \(f(A_i \mid L_i) = \Pr[A = A_i \mid L = L_i]\) is the conditional probability of receiving the treatment actually received, given confounders. For a treated individual this equals the propensity score \(\Pr[A = 1 \mid L]\); for an untreated individual it equals \(1 - \Pr[A = 1 \mid L]\).
For a dichotomous treatment:
Step 1: Fit a model for \(\Pr[A = 1 \mid L]\)
For dichotomous treatment, use logistic regression:
\[\text{logit}\Pr[A = 1 \mid L] = \beta_0 + \beta_1 L_1 + \beta_2 L_2 + \ldots + \beta_p L_p\]
Step 2: Predict \(\hat{f}(A_i \mid L_i)\) for each individual
Step 3: Calculate IP weights
\[\hat{W}^A_i = \frac{1}{\hat{f}(A_i \mid L_i)}\]
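The three steps can be sketched in code. To keep the sketch self-contained it uses simulated data with a single binary confounder, so the propensity score can be estimated nonparametrically as a stratum proportion rather than by logistic regression; all variable names and numbers are illustrative, not NHEFS.

```python
import random

random.seed(0)

# Simulate a binary confounder L and a treatment A that is more likely
# when L = 1. (Illustrative data, not NHEFS.)
n = 10_000
L = [random.random() < 0.4 for _ in range(n)]
A = [random.random() < (0.6 if l else 0.2) for l in L]

# Steps 1-2: estimate f(A | L) nonparametrically as stratum proportions.
def prob_treated(stratum):
    a_in_stratum = [a for a, l in zip(A, L) if l == stratum]
    return sum(a_in_stratum) / len(a_in_stratum)

p1, p0 = prob_treated(True), prob_treated(False)

# Step 3: W_i = 1 / Pr[A = A_i | L = L_i]
def ip_weight(a, l):
    p = p1 if l else p0
    return 1 / p if a else 1 / (1 - p)

W = [ip_weight(a, l) for a, l in zip(A, L)]

# In the pseudo-population, treatment is independent of L: the weighted
# number of treated individuals in a stratum equals the stratum size.
treated_w = sum(w for w, a, l in zip(W, A, L) if l and a)
stratum_n = sum(L)
print(round(treated_w), stratum_n)  # equal up to rounding
```

Because each treated individual in a stratum receives weight \(1/\hat{\Pr}[A = 1 \mid L]\), the pseudo-population contains, in effect, a treated and an untreated copy of every stratum.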
In the NHEFS study:
Propensity score model: Logistic regression including sex, age, race, education, smoking intensity, smoking duration, exercise, weight, etc.
Typical weights:

- Range: approximately 1.05 to 16.7
- Mean: approximately 2.0 (each weight is at least 1 because \(\hat{f}(A_i \mid L_i) \le 1\), and the pseudo-population is about twice the size of the study population)
Some individuals have very large weights, indicating their treatment was unusual given their covariates.
Standard IP weights can have extreme values, leading to unstable estimates. Stabilized weights reduce variability.
Definition 2 (Stabilized IP Weights) \[SW^A = \frac{f(A)}{f(A \mid L)}\]
where \(f(A) = \Pr[A]\) is the marginal probability of treatment.
For dichotomous \(A\):

\[SW^A = \begin{cases} \Pr[A=1] / \Pr[A=1 \mid L] & \text{if } A = 1 \\ \Pr[A=0] / \Pr[A=0 \mid L] & \text{if } A = 0 \end{cases}\]
Advantages:

1. Mean is exactly 1.0
2. Smaller range than unstabilized weights
3. More stable variance estimates
4. Still create a pseudo-population with \(A \perp\!\!\!\perp L\)
Estimation:

- Numerator: fit a model for \(\Pr[A = 1]\) (intercept-only logistic regression)
- Denominator: same as for the unstabilized weights
Stabilized weights:

- Median: approximately 1.0
- Range: approximately 0.33 to 4.30 (compared with 1.05 to 16.7 for the unstabilized weights)
- Mean: exactly 1.0
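A minimal sketch of the stabilized weights under the same kind of simulated setup (a single binary confounder with nonparametric estimates; not NHEFS data):

```python
import random

random.seed(1)

# Stabilized weights SW = f(A) / f(A | L), estimated nonparametrically
# from simulated data with a binary confounder L.
n = 10_000
L = [random.random() < 0.4 for _ in range(n)]
A = [random.random() < (0.6 if l else 0.2) for l in L]

p_marg = sum(A) / n                      # numerator: Pr[A = 1]
strata = {}                              # denominator: Pr[A = 1 | L]
for s in (True, False):
    a_s = [a for a, l in zip(A, L) if l == s]
    strata[s] = sum(a_s) / len(a_s)

SW = [(p_marg if a else 1 - p_marg) / (strata[l] if a else 1 - strata[l])
      for a, l in zip(A, L)]

print(sum(SW) / n)  # mean of the stabilized weights is (numerically) 1.0
```

With nonparametric estimates of both numerator and denominator, the mean weight equals 1 exactly (up to floating point), matching advantage 1 above.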
IP weighting is used to fit marginal structural models - models for the marginal distribution of the potential outcomes.
Definition 3 (Marginal Structural Model) A marginal structural model (MSM) is a model for the marginal mean of the potential outcome \(Y^a\) as a function of treatment \(a\) (and possibly other variables):
\[E[Y^a] = \beta_0 + \beta_1 a\]
For dichotomous \(A\), parameter \(\beta_1\) equals the average causal effect:
\[\beta_1 = E[Y^{a=1}] - E[Y^{a=0}]\]
Procedure:

1. Estimate IP weights \(\hat{W}^A\) (or \(\hat{SW}^A\)) from a model for \(\Pr[A = 1 \mid L]\)
2. Fit the outcome model \(E[Y \mid A] = \beta_0 + \beta_1 A\) by weighted least squares, using the estimated weights
3. Compute confidence intervals with a robust (sandwich) variance estimator or the bootstrap, since standard model-based standard errors are not valid in the pseudo-population
Important: The model is fit using the observed data \((A, Y)\), but weighted by IP weights. This approximates what we would see if we fit an unweighted model in the pseudo-population.
MSM: \[E[Y^a] = \beta_0 + \beta_1 a\]
Weighted linear regression:
Results:

- \(\hat{\beta}_1 \approx 3.4\) kg (95% CI: 2.4, 4.5)
- Interpretation: quitting smoking causes an average weight gain of about 3.4 kg
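The weighted regression can be sketched end to end on simulated data. The true causal effect is set to 3.0; the closed-form weighted least squares slope for a single binary regressor stands in for a regression routine, and the naive unweighted slope is computed for comparison. All names and numbers are illustrative.

```python
import random

random.seed(2)

# Fit the MSM E[Y^a] = b0 + b1*a by IP weighted least squares.
# Simulated data with a binary confounder L; the true effect is 3.0.
n = 20_000
L = [random.random() < 0.5 for _ in range(n)]
A = [random.random() < (0.7 if l else 0.2) for l in L]
Y = [3.0 * a + 2.0 * l + random.gauss(0, 1) for a, l in zip(A, L)]

# IP weights from nonparametric propensity scores.
p = {s: sum(a for a, l in zip(A, L) if l == s) /
        sum(1 for l in L if l == s) for s in (True, False)}
W = [1 / (p[l] if a else 1 - p[l]) for a, l in zip(A, L)]

# Weighted least squares slope, closed form for one binary regressor.
sw = sum(W)
a_bar = sum(w * a for w, a in zip(W, A)) / sw
y_bar = sum(w * y for w, y in zip(W, Y)) / sw
b1 = (sum(w * (a - a_bar) * (y - y_bar) for w, a, y in zip(W, A, Y))
      / sum(w * (a - a_bar) ** 2 for w, a in zip(W, A)))

# Naive unweighted slope: biased upward, since L raises both A and Y.
a_bar0, y_bar0 = sum(A) / n, sum(Y) / n
b1_naive = (sum((a - a_bar0) * (y - y_bar0) for a, y in zip(A, Y))
            / sum((a - a_bar0) ** 2 for a in A))

print(round(b1, 2), round(b1_naive, 2))  # weighted near 3.0, naive near 4.0
```

The weighted slope recovers the causal effect, while the unweighted slope absorbs the confounding by \(L\).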
MSMs can model effect modification by including interactions with baseline covariates.
Definition 4 (MSM with Effect Modifier) To assess effect modification by variable \(V\):
\[E[Y^a \mid V] = \beta_0 + \beta_1 a + \beta_2 V + \beta_3 a \times V\]
where \(\beta_3\) quantifies effect modification by \(V\) on the additive scale: a nonzero \(\beta_3\) means the causal effect of treatment differs across levels of \(V\).
Procedure:

1. Estimate stabilized weights, optionally with numerator \(f(A \mid V)\) rather than \(f(A)\) to further reduce variability (\(V\) must be a component of \(L\))
2. Fit the weighted regression of \(Y\) on \(A\), \(V\), and \(A \times V\)
MSM: \[E[Y^a \mid \text{Sex}] = \beta_0 + \beta_1 a + \beta_2 \text{Sex} + \beta_3 a \times \text{Sex}\]
Results (hypothetical):

- \(\hat{\beta}_1 = 2.5\) kg (effect in men)
- \(\hat{\beta}_3 = 1.8\) kg (additional effect in women)
- Effect in women: \(2.5 + 1.8 = 4.3\) kg
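A sketch of the effect-modification analysis on simulated data, with true effects 2.5 (in \(V = 0\)) and 4.3 (in \(V = 1\)) chosen to mirror the hypothetical numbers above. Instead of fitting the interaction model, the sketch computes the IP weighted difference in mean outcomes within each level of \(V\), which estimates the same stratum-specific effects:

```python
import random

random.seed(3)

# Effect modification by a binary V: IP weighted effects within V strata.
# True effects: 2.5 when V = 0, and 2.5 + 1.8 = 4.3 when V = 1.
n = 40_000
V = [random.random() < 0.5 for _ in range(n)]
L = [random.random() < 0.5 for _ in range(n)]
A = [random.random() < (0.7 if l else 0.3) for l in L]
Y = [(2.5 + 1.8 * v) * a + 2.0 * l + random.gauss(0, 1)
     for v, l, a in zip(V, L, A)]

# Nonparametric propensity scores and IP weights (confounder is L only).
p = {s: sum(a for a, l in zip(A, L) if l == s) /
        sum(1 for l in L if l == s) for s in (True, False)}
W = [1 / (p[l] if a else 1 - p[l]) for a, l in zip(A, L)]

def effect(v):
    """IP weighted difference in mean Y between A = 1 and A = 0, given V = v."""
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for w, vv, a, y in zip(W, V, A, Y):
        if vv == v:
            num[a] += w * y
            den[a] += w
    return num[1] / den[1] - num[0] / den[0]

print(round(effect(0), 1), round(effect(1), 1))  # close to 2.5 and 4.3
```

The estimated difference between the two stratum-specific effects corresponds to \(\hat{\beta}_3\) in the interaction model.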
IP weighting can also handle censoring and missing data under appropriate assumptions.
Let \(C = 1\) if censored (data missing), \(C = 0\) if uncensored (data observed).
IP weight for censoring:
\[W^C = \frac{1}{\Pr[C = 0 \mid A, L]}\]
These weights create a pseudo-population of only uncensored individuals.
When we have both confounding and censoring:
\[W^{A,C} = W^A \times W^C = \frac{1}{\Pr[A \mid L]} \times \frac{1}{\Pr[C = 0 \mid A, L]}\]
Stabilized version:
\[SW^{A,C} = \frac{\Pr[A]}{\Pr[A \mid L]} \times \frac{\Pr[C = 0 \mid A]}{\Pr[C = 0 \mid A, L]}\]
Setting: some individuals were lost to follow-up by the second visit, so their weight change \(Y\) is missing
Assumption: Censoring is independent of potential outcomes given \((A, L)\):
\[C \perp\!\!\!\perp Y^a \mid A, L\]
Procedure:

1. Estimate \(W^A\) from a model for treatment given \(L\)
2. Estimate \(W^C\) from a model for remaining uncensored given \(A\) and \(L\)
3. Fit the weighted MSM among uncensored individuals (\(C = 0\)), using the product weights \(W^A \times W^C\)
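These steps can be sketched on simulated data in which censoring depends on both treatment and the confounder. Both weight components are estimated nonparametrically; the true causal effect is 3.0, and all variables are simulated, not NHEFS.

```python
import random

random.seed(4)

# Combined weights W^{A,C} = W^A * W^C with censoring C depending on (A, L).
# Only uncensored individuals (C = 0) enter the analysis. True effect: 3.0.
n = 40_000
L = [random.random() < 0.5 for _ in range(n)]
A = [random.random() < (0.7 if l else 0.3) for l in L]
C = [random.random() < (0.4 if (a and l) else 0.1) for a, l in zip(A, L)]
Y = [3.0 * a + 2.0 * l + random.gauss(0, 1) for a, l in zip(A, L)]

# Nonparametric Pr[A = 1 | L] and Pr[C = 0 | A, L].
pA = {s: sum(a for a, l in zip(A, L) if l == s) /
         sum(1 for l in L if l == s) for s in (True, False)}
pC = {}
for a_s in (True, False):
    for l_s in (True, False):
        cell = [c for a, l, c in zip(A, L, C) if a == a_s and l == l_s]
        pC[a_s, l_s] = 1 - sum(cell) / len(cell)  # Pr[C = 0 | A, L]

# W^{A,C} for the uncensored; censored individuals get weight 0.
def weight(a, l, c):
    if c:
        return 0.0
    wa = 1 / (pA[l] if a else 1 - pA[l])
    return wa / pC[a, l]

W = [weight(a, l, c) for a, l, c in zip(A, L, C)]

# IP weighted difference in means among the uncensored.
num = {0: 0.0, 1: 0.0}
den = {0: 0.0, 1: 0.0}
for w, a, y in zip(W, A, Y):
    num[a] += w * y
    den[a] += w
est = num[1] / den[1] - num[0] / den[0]
print(round(est, 2))  # close to the true effect of 3.0
```

The censoring weights up-weight uncensored individuals who were likely to be censored given their \((A, L)\), restoring the full pseudo-population.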
IP weighting can be viewed through the lens of likelihood theory.
The IP weighted estimator solves the weighted estimating equations:
\[\sum_{i=1}^n W^A_i \times \frac{\partial \log f(Y_i \mid A_i; \beta)}{\partial \beta} = 0\]
This is equivalent to maximizing a weighted likelihood:
\[L_W(\beta) = \prod_{i=1}^n [f(Y_i \mid A_i; \beta)]^{W^A_i}\]
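For a normal outcome model with constant variance, the weighted log-likelihood is \(-\frac{1}{2\sigma^2}\sum_i W_i (Y_i - \beta_0 - \beta_1 A_i)^2\) plus a constant, so the IP weighted MLE is exactly weighted least squares. A small sketch (simulated data, arbitrary positive weights) checks numerically that the weighted score equations vanish at the weighted least squares solution:

```python
import random

random.seed(5)

# For Y | A ~ N(b0 + b1*A, s^2), maximizing the weighted likelihood is
# minimizing the weighted sum of squares. Check that the weighted score
# sum_i W_i (Y_i - b0 - b1*A_i) * (1, A_i) is zero at the WLS solution.
n = 5_000
A = [random.random() < 0.5 for _ in range(n)]
Y = [2.0 * a + random.gauss(0, 1) for a in A]
W = [random.uniform(0.5, 3.0) for _ in range(n)]  # arbitrary positive weights

sw = sum(W)
a_bar = sum(w * a for w, a in zip(W, A)) / sw
y_bar = sum(w * y for w, y in zip(W, Y)) / sw
b1 = (sum(w * (a - a_bar) * (y - y_bar) for w, a, y in zip(W, A, Y))
      / sum(w * (a - a_bar) ** 2 for w, a in zip(W, A)))
b0 = y_bar - b1 * a_bar

# Weighted score components (should both be numerically zero).
r = [y - b0 - b1 * a for a, y in zip(A, Y)]
s0 = sum(w * ri for w, ri in zip(W, r))
s1 = sum(w * ri * a for w, ri, a in zip(W, r, A))
print(abs(s0) < 1e-6 * n, abs(s1) < 1e-6 * n)
```

This is the sense in which the weighted estimating equations above reduce to weighted least squares for a linear MSM with normal errors.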
Without confounding: Standard MLE of \(\beta\) in model \(E[Y \mid A] = g(A; \beta)\)
With confounding: IP weighted MLE of \(\beta\) in MSM \(E[Y^a] = g(a; \beta)\)
The IP weights “adjust” the likelihood to account for confounding.
Key concepts introduced:

- IP weights \(W^A = 1/f(A \mid L)\) and the pseudo-population they create
- Stabilized weights \(SW^A = f(A)/f(A \mid L)\)
- Marginal structural models and IP weighted estimation of their parameters
- Effect modification in MSMs
- Censoring weights \(W^C\) and combined weights \(W^{A,C}\)
Advantages of IP weighting:

- Natural for marginal effects
- Handles continuous confounders easily
- Extends naturally to time-varying treatments (Part III)
- Can combine treatment and censoring weights
Limitations:

- Requires correct specification of the treatment model
- Can be unstable with extreme weights
- Positivity violations lead to extreme weights
- Less efficient than outcome modeling (when the outcome model is correctly specified)