This chapter introduces G-estimation, a method for estimating the parameters of structural nested models (SNMs). IP weighting is built on a model for the treatment mechanism and standardization on a model for the outcome mechanism. G-estimation instead models the causal effect itself, so it does not require a correctly specified model for the full outcome regression \(E[Y \mid A, L]\).
Structural nested models directly parameterize the causal effect rather than the mean outcome or treatment probability.
Definition 1 (Structural Nested Mean Model) A structural nested mean model (SNMM) specifies how the mean of \(Y^a\) differs from the mean of \(Y^{a'}\) as a function of treatment and covariates:
\[E[Y^a - Y^{a'} \mid L] = \gamma(a, a'; \psi, L)\]
For dichotomous treatment with \(a = 1\) and \(a' = 0\):
\[E[Y^1 - Y^0 \mid L] = \psi_0 + \psi_1^{\top} L\]
where \(\psi = (\psi_0, \psi_1)\) are the parameters of interest.
Marginal structural model (IP weighting): \[E[Y^a] = \beta_0 + \beta_1 a\] Models the mean outcome under treatment \(a\).
Outcome regression (standardization): \[E[Y \mid A, L] = \beta_0 + \beta_1 A + \beta_2^{\top} L + \beta_3^{\top} (A \times L)\] Models the conditional mean outcome given treatment and confounders.
Structural nested model (G-estimation): \[E[Y^1 - Y^0 \mid L] = \psi_0 + \psi_1^{\top} L\] Models the conditional causal effect directly.
G-estimation is usually introduced under rank preservation: the assumption that treatment does not change how individuals are ordered by their outcome (though it may change each individual's outcome by a different amount).
Definition 2 (Rank Preservation) Rank preservation assumes that individuals are ranked the same way under treatment and under no treatment:
If \(Y_i^1 > Y_j^1\), then \(Y_i^0 > Y_j^0\) for all individuals \(i, j\).
Equivalently: Treatment does not reverse the ranking of any two individuals with respect to the outcome.
Allowed under rank preservation:
- Individual causal effects \(Y_i^1 - Y_i^0\) can differ across individuals
- Some individuals can have large effects, others small effects
- Effects can vary with covariates \(L\)

NOT allowed under rank preservation:
- Treatment helping some individuals (\(Y_i^1 > Y_i^0\)) and harming others (\(Y_j^1 < Y_j^0\)) to a degree that changes their order
- Qualitative interactions where treatment reverses rankings
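In the example of quitting and weight gain:

Rank preservation:
- Some people gain more weight than others when quitting
- But quitting does not change who weighs more than whom (for instance, everyone gains weight, by amounts that keep the ordering intact)

Violation:
- Some people gain weight when quitting while others lose weight, so that two individuals swap places in the ordering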
The key idea of G-estimation: if the causal effect is given by the SNMM with a specific parameter value \(\psi\), we can construct a pseudo-outcome that is independent of treatment given the confounders.
Definition 3 (G-Null Hypothesis) For a given parameter value \(\psi\), the G-null hypothesis \(H_0(\psi)\) states that the SNMM holds with that value:
\[H_0(\psi): E[Y^1 - Y^0 \mid L] = \psi_0 + \psi_1^{\top} L\]
Under the additive rank-preserving version of the model, the same equality holds for every individual, not just on average:
\[H_0(\psi): Y_i^1 - Y_i^0 = \psi_0 + \psi_1^{\top} L_i \text{ for all } i\]
Under \(H_0(\psi)\), we can construct:
\[H(\psi) = Y - A(\psi_0 + \psi_1^{\top} L)\]
Key property: If \(H_0(\psi)\) is true, then:
\[H(\psi) = Y^0 \text{ for all individuals}\]
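The key property follows in two lines. This short derivation assumes consistency, \(Y = Y^A\), which is not restated in this section, together with the individual-level form of \(H_0(\psi)\). Consistency lets us write the observed outcome in terms of the two potential outcomes:

\[Y = A Y^1 + (1 - A) Y^0 = Y^0 + A (Y^1 - Y^0)\]

Substituting \(Y^1 - Y^0 = \psi_0 + \psi_1^{\top} L\) from \(H_0(\psi)\) into the definition of the pseudo-outcome gives:

\[H(\psi) = Y - A(\psi_0 + \psi_1^{\top} L) = Y^0 + A(Y^1 - Y^0) - A(Y^1 - Y^0) = Y^0\]

So subtracting the hypothesized effect from the treated individuals' outcomes recovers the outcome they would have had without treatment.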
Since \(Y^0\) is the potential outcome under no treatment, conditional exchangeability implies that it is independent of the actual treatment \(A\) given the confounders \(L\):
\[H(\psi) \perp\!\!\!\perp A \mid L\]
G-estimation finds the value of \(\psi\) that makes \(H(\psi)\) independent of \(A\) conditional on \(L\).
Step 1: Specify a structural nested model \[E[Y^1 - Y^0 \mid L] = \psi_0 + \psi_1^{\top} L\]
Step 2: For a candidate value \(\psi\), compute pseudo-outcome \[H(\psi) = Y - A(\psi_0 + \psi_1^{\top} L)\]
Step 3: Test whether \(H(\psi) \perp\!\!\!\perp A \mid L\) by fitting \[E[H(\psi) \mid A, L] = \alpha_0 + \alpha_1 A + \alpha_2^{\top} L\]
Step 4: The correct \(\psi\) is the one that makes \(\alpha_1 = 0\)
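Step 5: In practice, solve the estimating equation: \[\sum_{i=1}^n (A_i - \hat{E}[A \mid L_i])[Y_i - A_i(\psi_0 + \psi_1^{\top} L_i)] = 0\] or more generally: \[\sum_{i=1}^n U_i(\psi)[Y_i - A_i(\psi_0 + \psi_1^{\top} L_i)] = 0\] where \(U_i(\psi)\) is a function of \(A_i\) and \(L_i\) with mean zero given \(L_i\) (often \(U_i = A_i - \hat{E}[A \mid L_i]\) for a single parameter, or \(U_i = (A_i - \hat{E}[A \mid L_i])(1, L_i^{\top})^{\top}\) when \(\psi_1\) is included). Centering \(A\) at its conditional mean is what ties the equation to Step 4: setting \(\alpha_1 = 0\) in the regression of Step 3 amounts to requiring that \(H(\psi)\) be uncorrelated with the part of \(A\) not explained by \(L\).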
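The steps above can be sketched numerically. This is a minimal illustration, not the chapter's own implementation: the simulated data, the true effect of 2.0, the candidate grid, and the helper name `alpha1` are all assumptions made for the example, and the SNMM is the one-parameter constant-effect model \(E[Y^1 - Y^0 \mid L] = \psi_0\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data (illustrative assumption): one confounder L, binary treatment A,
# and a constant causal effect of 2.0.
L = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-0.8 * L))
A = rng.binomial(1, p_treat)
Y = 1.0 + 2.0 * A + 1.5 * L + rng.normal(size=n)

# Design matrix for the Step 3 regression E[H | A, L] = a0 + a1*A + a2*L.
X = np.column_stack([np.ones(n), A, L])

def alpha1(psi0):
    """Step 2 and Step 3: build H(psi0), regress it on (1, A, L), return the A coefficient."""
    H = Y - A * psi0
    coef, *_ = np.linalg.lstsq(X, H, rcond=None)
    return coef[1]

# Step 4: scan candidate values and keep the one whose alpha_1 is closest to zero.
grid = np.linspace(0.0, 4.0, 401)
alphas = np.array([alpha1(p) for p in grid])
psi_hat = grid[np.argmin(np.abs(alphas))]

print("G-estimate of psi_0:", psi_hat)
```

Because \(H(\psi_0)\) is linear in \(\psi_0\), the grid search could be replaced by a closed-form solve; the grid only mirrors the "try candidate values" description in Steps 2 to 4.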
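Simplest case (no confounders, no effect modification). SNMM: \(E[Y^1 - Y^0] = \psi_0\) (constant effect)
Estimating equation, with treatment centered at its sample mean \(\bar{A} = n_1 / n\): \[\sum_{i=1}^n (A_i - \bar{A})(Y_i - A_i \psi_0) = 0\]
Solution: \[\hat{\psi}_0 = \frac{\sum_i (A_i - \bar{A}) Y_i}{\sum_i (A_i - \bar{A}) A_i} = \bar{Y}_{A=1} - \bar{Y}_{A=0}\] where \(n_1 = \sum_i A_i\) is the number of treated individuals and \(\bar{Y}_{A=a}\) is the mean outcome among individuals with \(A = a\).
This is the difference in mean outcomes between the treated and the untreated, the usual effect estimate when there is no confounding. Without the centering, the equation \(\sum_i A_i(Y_i - A_i \psi_0) = 0\) would return \(\sum_i A_i Y_i / n_1\), the mean outcome among the treated, which is not an effect estimate.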
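As a numerical check of the constant-effect case, here is a minimal sketch under assumptions that are not in the text: simulated randomized data with a true effect of 1.5, and an estimating function that centers treatment at its sample mean, \(U_i = A_i - \bar{A}\). With that centering, the solution of the estimating equation coincides with the difference in mean outcomes between treated and untreated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Illustrative data: randomized binary treatment, no confounders, true effect 1.5.
A = rng.binomial(1, 0.4, size=n)
Y = 3.0 + 1.5 * A + rng.normal(size=n)

# Estimating function: treatment centered at its sample mean (an assumption of this sketch).
U = A - A.mean()

# Closed-form root of sum_i U_i * (Y_i - A_i * psi0) = 0.
psi0_hat = np.sum(U * Y) / np.sum(U * A)

# Difference in mean outcomes, for comparison.
diff_means = Y[A == 1].mean() - Y[A == 0].mean()

print(psi0_hat, diff_means)
```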
G-estimation has robustness properties that differ from IP weighting and standardization.
When the SNMM is correctly specified:
- The standard G-estimator is consistent provided the treatment model \(E[A \mid L]\) used in the estimating function is also correct
- Only the effect \(E[Y^1 - Y^0 \mid L]\) needs to be modeled, not the full outcome model \(E[Y \mid A, L]\)

When the treatment model is correctly specified:
- G-estimation can remain interpretable under some misspecification of the effect model (for example, a constant-effect model fitted when the effect varies with \(L\) targets a weighted average of the conditional effects)
- The specific robustness depends on the choice of estimating function \(U(\psi)\)

Double robustness:
- Some G-estimators are doubly robust: given a correct SNMM, they are consistent if either the treatment model \(E[A \mid L]\) or a working model for \(E[H(\psi) \mid L]\) is correct
- This is similar to doubly robust IP weighted estimators
| Method | Requires Correct | Robust To |
|---|---|---|
| IP weighting | \(\Pr[A \mid L]\) | Outcome model misspec. |
| Standardization | \(E[Y \mid A, L]\) | Treatment model misspec. |
| G-estimation | \(E[Y^1 - Y^0 \mid L]\) and \(\Pr[A \mid L]\) | Full outcome model misspec. |
| Doubly robust | Either model | One model misspecification |
From the SNMM, we can compute the average causal effect.
SNMM: \(E[Y^1 - Y^0 \mid L] = \psi_0 + \psi_1^{\top} L\)
Average causal effect: \[E[Y^1 - Y^0] = E_L[E[Y^1 - Y^0 \mid L]] = E_L[\psi_0 + \psi_1^{\top} L] = \psi_0 + \psi_1^{\top} E[L]\]
Estimator: \[\widehat{E[Y^1 - Y^0]} = \hat{\psi}_0 + \hat{\psi}_1^{\top} \bar{L}\]
where \(\bar{L} = n^{-1} \sum_i L_i\) is the sample mean of \(L\).
Effect at \(L = \ell\): \[E[Y^1 - Y^0 \mid L = \ell] = \psi_0 + \psi_1^{\top} \ell\]
Effect in the treated (ATT): \[E[Y^1 - Y^0 \mid A = 1] = \psi_0 + \psi_1^{\top} E[L \mid A = 1]\]
Estimator for ATT: \[\widehat{E[Y^1 - Y^0 \mid A = 1]} = \hat{\psi}_0 + \hat{\psi}_1^{\top} \bar{L}_{A=1}\]
where \(\bar{L}_{A=1}\) is the mean of \(L\) among the treated.
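Once \(\hat{\psi}\) is available, the averaging formulas above are one line each. A small sketch follows; the function name `average_effects` and the toy numbers are hypothetical, and \(\hat{\psi}\) is simply assumed here rather than estimated.

```python
import numpy as np

def average_effects(psi0, psi1, L, A):
    """Return (average effect, effect in the treated) implied by a linear SNMM.

    psi0 : intercept of the effect model
    psi1 : coefficient(s) on the effect modifier(s) L
    L    : array of shape (n,) or (n, k)
    A    : binary treatment indicator of shape (n,)
    """
    L = np.asarray(L, dtype=float)
    if L.ndim == 1:
        L = L[:, None]
    A = np.asarray(A)
    psi1 = np.atleast_1d(np.asarray(psi1, dtype=float))

    ate = psi0 + psi1 @ L.mean(axis=0)          # psi0 + psi1' * mean(L)
    att = psi0 + psi1 @ L[A == 1].mean(axis=0)  # psi0 + psi1' * mean(L | A = 1)
    return float(ate), float(att)

# Toy numbers, chosen only so the arithmetic is easy to follow.
L = np.array([0.0, 1.0, 2.0, 3.0])
A = np.array([0, 0, 1, 1])
ate, att = average_effects(psi0=1.0, psi1=0.5, L=L, A=A)
print(ate, att)  # 1.75 2.25
```

Here the mean of \(L\) is 1.5 overall and 2.5 among the treated, so the two estimands differ only because the treated have higher values of the effect modifier.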
SNMMs can include multiple effect modifiers.
\[E[Y^1 - Y^0 \mid L] = \psi_0 + \psi_1 L_1 + \psi_2 L_2 + \psi_3 L_1 L_2 + \ldots\]
Parameters: \(\psi = (\psi_0, \psi_1, \psi_2, \psi_3, \ldots)\)
Estimating equations: Need as many equations as parameters
\[\sum_{i=1}^n U_{ij}(\psi)[Y_i - A_i(\psi_0 + \psi_1 L_{i1} + \psi_2 L_{i2} + \ldots)] = 0\]
for \(j = 1, 2, \ldots, p\) where \(p\) is the number of parameters.
Common choices for \(U_{ij}\):
The choice affects:
- Efficiency (variance of estimator)
- Robustness properties
- Computational complexity
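For a linear SNMM the \(p\) estimating equations are linear in \(\psi\) and can be solved directly. The sketch below assumes one possible choice, \(U_i = (A_i - \hat{\pi}(L_i))(1, L_i)^{\top}\), with \(\hat{\pi}(L)\) taken from a logistic model for treatment. The simulated data, the true values \(\psi = (1.0, 0.5)\), and the helper `fit_logistic` are illustrative assumptions, not part of the text.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logit Pr[y = 1 | X] by Newton-Raphson and return fitted probabilities."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X @ beta))

rng = np.random.default_rng(2)
n = 20000

# Illustrative data: effect of treatment is 1.0 + 0.5 * L.
L = rng.normal(size=n)
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * L)))
Y = 2.0 + L + A * (1.0 + 0.5 * L) + rng.normal(size=n)

Z = np.column_stack([np.ones(n), L])   # effect-model design (1, L)
pi_hat = fit_logistic(Z, A)            # estimated Pr[A = 1 | L]
U = (A - pi_hat)[:, None] * Z          # one column of U per parameter

# sum_i U_i * (Y_i - A_i * Z_i' psi) = 0  is the linear system  M psi = b.
M = U.T @ (A[:, None] * Z)
b = U.T @ Y
psi_hat = np.linalg.solve(M, b)

print("psi_hat:", psi_hat)
```

With two parameters there are two columns in `U`, matching the requirement of as many equations as parameters; adding an effect modifier means adding a column to `Z`.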
G-estimation extends to handle censoring and missing outcomes.
Let \(C = 1\) if censored, \(C = 0\) if observed.
Assumption: \(C \perp\!\!\!\perp Y^a \mid A, L\) (censoring independent of potential outcomes given treatment and covariates)
Weighted estimating equation:
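\[\sum_{i: C_i = 0} \frac{1}{\Pr[C_i = 0 \mid A_i, L_i]} U_i(\psi)[Y_i - \gamma(A_i, 0; \psi, L_i)] = 0\]
where \(\gamma(A_i, 0; \psi, L_i) = A_i(\psi_0 + \psi_1^{\top} L_i)\) for the dichotomous-treatment model above. This weights each uncensored observation by the inverse of its probability of being uncensored, so that the uncensored individuals stand in for the censored individuals with the same \(A\) and \(L\).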
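The weighted estimating equation can be sketched for the constant-effect model. Everything specific here is an assumption made for illustration: the simulated data with a true effect of 2.0, the logistic models for treatment and for remaining uncensored, the helper `fit_logistic`, and the choice \(U_i = A_i - \hat{\pi}(L_i)\).

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logit Pr[y = 1 | X] by Newton-Raphson and return fitted probabilities."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X @ beta))

rng = np.random.default_rng(3)
n = 20000

# Illustrative data: confounder L, treatment A, constant effect 2.0.
L = rng.normal(size=n)
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * L)))
Y = 1.0 + L + 2.0 * A + rng.normal(size=n)

# Censoring depends on A and L only, so C is independent of Y^a given (A, L).
p_uncensored = 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * A - 0.7 * L)))
C = 1 - rng.binomial(1, p_uncensored)

ones = np.ones(n)
pi_hat = fit_logistic(np.column_stack([ones, L]), A)          # Pr[A = 1 | L]
pu_hat = fit_logistic(np.column_stack([ones, A, L]), 1 - C)   # Pr[C = 0 | A, L]

# Only uncensored rows contribute; each is weighted by 1 / Pr[C = 0 | A, L].
obs = C == 0
W = 1.0 / pu_hat[obs]
U = A[obs] - pi_hat[obs]

# Root of sum_{C=0} W_i * U_i * (Y_i - A_i * psi0) = 0.
psi0_hat = np.sum(W * U * Y[obs]) / np.sum(W * U * A[obs])

print("psi0_hat:", psi0_hat)
```

Outcomes of censored individuals are never used: `Y` is indexed only by `obs`, as it would have to be with real data in which those outcomes are missing.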
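When we have both confounding and censoring, the two are handled by different parts of the same equation: the estimating function \(U_i(\psi)\), centered at \(\hat{E}[A \mid L_i]\), adjusts for confounding by \(L\), and the weights adjust for selection due to censoring:
\[\sum_{i: C_i = 0} W_i U_i(\psi)[Y_i - \gamma(A_i, 0; \psi, L_i)] = 0\]
where:
\[W_i = \frac{1}{\Pr[C_i = 0 \mid A_i, L_i]}\]
Stabilized weights can be used instead to reduce variability; if their numerator depends on \(A\), the treatment model that centers \(A\) should be fit in the weighted data.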
G-estimation naturally estimates conditional effects \(E[Y^1 - Y^0 \mid L]\). We can average to get marginal effects.
Marginal effect (population average): \[E[Y^1 - Y^0]\]
Conditional effect (within levels of \(L\)): \[E[Y^1 - Y^0 \mid L]\]
Individual effect: \[Y_i^1 - Y_i^0\]
| Method | Natural Estimand | To Get Other Estimands |
|---|---|---|
| IP weighting | Marginal effect | Model \(E[Y^a \mid V]\) for conditional |
| Standardization | Conditional effect | Average over \(L\) for marginal |
| G-estimation | Conditional effect | Average over \(L\) for marginal |
Advantage of SNMMs: By modeling \(E[Y^1 - Y^0 \mid L]\) directly, G-estimation provides natural inference for effect modification while still allowing marginal effect estimation.
Key concepts introduced:
Comparison of methods:
| Aspect | IP Weighting | Standardization | G-Estimation |
|---|---|---|---|
| Models | Treatment mechanism | Outcome mechanism | Causal effect |
| Estimand | Marginal effect | Conditional effect | Conditional effect |
| Assumptions | Conditional exchangeability | Conditional exchangeability | + Rank preservation |
| Relies on correct | Treatment model | Outcome model | Effect model |
Advantages of G-estimation:
Limitations: