Chapter 16: Instrumental Variable Estimation

This chapter introduces instrumental variable (IV) estimation, a method for identifying causal effects when there is unmeasured confounding. Unlike the methods in previous chapters, IV estimation does not rely on conditional exchangeability. Instead, it uses a special variable (the instrument) that affects treatment but not the outcome directly.

1 16.1 The Three Instrumental Conditions (pp. 227-231)

An instrumental variable \(Z\) must satisfy three conditions to identify causal effects.

Definition 1 (Instrumental Variable) A variable \(Z\) is an instrumental variable for the effect of \(A\) on \(Y\) if:

Relevance: \(Z\) is associated with \(A\)
Exchangeability: \(Z\) and \(Y\) share no causes, so \(Z\) is independent of the counterfactual outcomes (i.e., \(Y^{a,z} \perp\!\!\!\perp Z\) for all \(a, z\))
Exclusion restriction: \(Z\) affects \(Y\) only through \(A\) (i.e., \(Y^{a,z} = Y^a\) for all \(a, z\))

Where \(Y^{a,z}\) denotes the potential outcome under treatment \(A = a\) and instrument \(Z = z\).

Condition 1: Relevance

Statement: \(Z\) is associated with \(A\)

Meaning: The instrument must be associated with the treatment actually received, typically because it affects treatment uptake.

Example: Randomized encouragement

\(Z = 1\) if encouraged to take treatment, \(Z = 0\) if not encouraged
Relevance requires that encouragement increases the probability of treatment

Testing: Relevance can be tested empirically by checking \(\Pr[A = 1 \mid Z = 1] \neq \Pr[A = 1 \mid Z = 0]\)
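As a minimal sketch of this check, the two conditional proportions can be estimated directly from observed \(Z\) and \(A\). The function names and the toy data below are illustrative assumptions, not part of the source; in practice the difference would be reported with a confidence interval or a formal test.

```python
def treatment_rate(a, z, arm):
    """Proportion treated among individuals with Z == arm."""
    in_arm = [ai for ai, zi in zip(a, z) if zi == arm]
    return sum(in_arm) / len(in_arm)

def relevance_gap(a, z):
    """Estimate Pr[A = 1 | Z = 1] - Pr[A = 1 | Z = 0]."""
    return treatment_rate(a, z, 1) - treatment_rate(a, z, 0)

# Hypothetical data: 10 encouraged (6 treated) and 10 not encouraged (3 treated).
z = [1] * 10 + [0] * 10
a = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

gap = relevance_gap(a, z)
print(round(gap, 2))  # 0.3
```

A gap close to zero signals a weak instrument even when the difference is statistically distinguishable from zero.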

Condition 2: Exchangeability

Statement: \(Y^{a,z} \perp\!\!\!\perp Z\) (or \(Y^a \perp\!\!\!\perp Z\) under exclusion)

Meaning: The instrument is “as good as randomly assigned” with respect to potential outcomes.

Example: Randomized encouragement

If encouragement is randomized, exchangeability holds by design
\(Z \perp\!\!\!\perp U\) where \(U\) are unmeasured confounders

Testing: Exchangeability generally cannot be tested (involves unmeasured confounders)

Condition 3: Exclusion Restriction

Statement: \(Y^{a,z} = Y^a\) for all \(a, z\)

Meaning: The instrument affects the outcome ONLY through its effect on treatment.

Example: Randomized encouragement

Encouragement affects outcome only by changing treatment received
NOT through psychological effects, information effects, etc.

Testing: Exclusion generally cannot be tested (untestable assumption)

2 16.2 The Usual IV Estimand (pp. 231-234)

Under the three IV conditions, we can identify a causal effect.

IV Estimand for Binary \(Z\) and \(A\)

Setting: Binary instrument \(Z\), binary treatment \(A\), outcome \(Y\)

IV estimand:

\[\frac{\text{E}{\left[Y \mid Z = 1\right]} - \text{E}{\left[Y \mid Z = 0\right]}}{\text{E}{\left[A \mid Z = 1\right]} - \text{E}{\left[A \mid Z = 0\right]}}\]

This is the Wald estimator or ratio estimator.

Interpretation: The effect of \(Z\) on \(Y\), divided by the effect of \(Z\) on \(A\).

Why This Works

Numerator: \(\text{E}{\left[Y \mid Z = 1\right]} - \text{E}{\left[Y \mid Z = 0\right]}\)

By exchangeability: equals \(\text{E}{\left[Y^{z=1}\right]} - \text{E}{\left[Y^{z=0}\right]}\)
By exclusion: equals \(\text{E}{\left[Y^{A^{z=1}}\right]} - \text{E}{\left[Y^{A^{z=0}}\right]}\)

Denominator: \(\text{E}{\left[A \mid Z = 1\right]} - \text{E}{\left[A \mid Z = 0\right]}\)

By exchangeability: equals \(\Pr[A^{z=1} = 1] - \Pr[A^{z=0} = 1]\)

Ratio: Under additional assumptions (see next section), this estimates the average causal effect in a specific subgroup.
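The ratio can be computed from individual-level data in a few lines. The following is a sketch under the assumption of a binary instrument coded 0/1; the function name `wald_estimate` and the eight-person dataset are hypothetical.

```python
from statistics import mean

def wald_estimate(y, a, z):
    """Z-Y mean difference divided by Z-A mean difference, for binary Z."""
    y1 = mean(yi for yi, zi in zip(y, z) if zi == 1)
    y0 = mean(yi for yi, zi in zip(y, z) if zi == 0)
    a1 = mean(ai for ai, zi in zip(a, z) if zi == 1)
    a0 = mean(ai for ai, zi in zip(a, z) if zi == 0)
    denominator = a1 - a0
    if denominator == 0:
        # Relevance fails in this sample, so the ratio is undefined.
        raise ValueError("Z is not associated with A in this sample")
    return (y1 - y0) / denominator

# Hypothetical data: four individuals per arm.
z = [0, 0, 0, 0, 1, 1, 1, 1]
a = [0, 0, 0, 1, 0, 1, 1, 1]
y = [10, 12, 11, 19, 10, 18, 21, 19]

estimate = wald_estimate(y, a, z)  # (17 - 13) / (0.75 - 0.25)
```

Here the numerator is 4 and the denominator is 0.5, so the estimate is 8.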

Example: Randomized Encouragement

Design: Randomize individuals to receive encouragement to exercise (\(Z\))

Not everyone encouraged will exercise (\(A = 1\))
Some not encouraged will exercise anyway

IV estimate:

\[\frac{\text{Mean health in encouraged} - \text{Mean health in not encouraged}}{\Pr[\text{Exercise} \mid \text{Encouraged}] - \Pr[\text{Exercise} \mid \text{Not encouraged}]}\]

If 60% exercise when encouraged vs 30% when not, and mean health differs by 6 points:

\[\frac{6}{0.60 - 0.30} = \frac{6}{0.30} = 20\]

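The IV estimate of the effect of exercise on health is therefore 20 points. As the next section explains, this estimate applies to a subgroup rather than to the full population.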
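The same arithmetic can be written as a small helper that works from group-level summaries. This is an illustrative sketch; the function name and the `scale_up` variable are assumptions introduced here.

```python
def wald_from_summaries(mean_difference_in_y, p_treated_z1, p_treated_z0):
    """Wald ratio computed from group-level summaries instead of individual data."""
    return mean_difference_in_y / (p_treated_z1 - p_treated_z0)

# Figures from the encouragement example: 6-point difference, 60% vs 30% exercising.
estimate = wald_from_summaries(6.0, 0.60, 0.30)

# The effect of encouragement on health is multiplied by this factor.
scale_up = 1 / (0.60 - 0.30)
```

The estimate is about 20 because the 6-point effect of encouragement is scaled up by roughly 3.3, the inverse of the 30-percentage-point difference in exercise uptake.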

3 16.3 Instrumental Variable Estimation versus Randomized Experiments (pp. 234-237)

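IV estimation can be viewed as the analysis of a randomized experiment with imperfect compliance, in which the instrument \(Z\) plays the role of the randomized assignment.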

Perfect Compliance

If everyone complied with their assigned treatment (\(A = Z\)):

IV estimand = average causal effect in the full population
This is just a standard randomized experiment

Imperfect Compliance

When \(A \neq Z\) for some individuals:

Under monotonicity (defined below), the IV estimand equals the effect in a subgroup (compliers)
Not the average effect in the full population

Compliance Types

Definition 2 (Principal Strata) Individuals can be classified into principal strata based on potential treatments \(A^{z=1}\) and \(A^{z=0}\):

Compliers: \(A^{z=1} = 1, A^{z=0} = 0\) (take treatment if and only if \(Z = 1\))
Always-takers: \(A^{z=1} = 1, A^{z=0} = 1\) (always take treatment)
Never-takers: \(A^{z=1} = 0, A^{z=0} = 0\) (never take treatment)
Defiers: \(A^{z=1} = 0, A^{z=0} = 1\) (do opposite of instrument)

Monotonicity assumption: No defiers exist.
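Although individual membership in a stratum is not observable, the stratum proportions can be recovered from the two observed treatment rates once defiers are ruled out and \(Z\) is exchangeable. The sketch below assumes exactly that; the function name and the 60%/30% inputs (borrowed from the encouragement example) are illustrative.

```python
def strata_shares(p_treated_z1, p_treated_z0):
    """Principal-stratum proportions, assuming exchangeability of Z and no defiers."""
    if p_treated_z1 < p_treated_z0:
        # With no defiers, Pr[A=1 | Z=1] cannot be below Pr[A=1 | Z=0].
        raise ValueError("treatment rates are incompatible with 'no defiers'")
    return {
        "always_takers": p_treated_z0,        # treated even when Z = 0
        "never_takers": 1.0 - p_treated_z1,   # untreated even when Z = 1
        "compliers": p_treated_z1 - p_treated_z0,
    }

shares = strata_shares(0.60, 0.30)
```

With these inputs the shares are 30% always-takers, 40% never-takers, and 30% compliers, so the IV estimate would describe fewer than a third of the population.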

Local Average Treatment Effect (LATE)

Under IV conditions plus monotonicity:

\[\text{IV estimand} = \text{E}{\left[Y^{a=1} - Y^{a=0} \mid \text{Complier}\right]}\]

This is the average causal effect in compliers, not in the full population.

Interpretation: IV tells us the effect of treatment for those who would comply with the instrument.

Limitation: We don’t know who the compliers are (unobservable principal stratum).
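A sketch of why the ratio equals the complier effect, using only the conditions stated above. The labels AT, C, and D for always-takers, compliers, and defiers are shorthand introduced here. Always-takers and never-takers have \(A^{z=1} = A^{z=0}\), so they contribute nothing to the numerator:

\[\text{E}{\left[Y^{A^{z=1}} - Y^{A^{z=0}}\right]} = \text{E}{\left[Y^{a=1} - Y^{a=0} \mid \text{C}\right]} \Pr[\text{C}] - \text{E}{\left[Y^{a=1} - Y^{a=0} \mid \text{D}\right]} \Pr[\text{D}]\]

The denominator counts who is treated under each value of the instrument:

\[\Pr[A^{z=1} = 1] - \Pr[A^{z=0} = 1] = (\Pr[\text{AT}] + \Pr[\text{C}]) - (\Pr[\text{AT}] + \Pr[\text{D}]) = \Pr[\text{C}] - \Pr[\text{D}]\]

Under monotonicity \(\Pr[\text{D}] = 0\), so the ratio reduces to \(\text{E}{\left[Y^{a=1} - Y^{a=0} \mid \text{C}\right]}\), provided \(\Pr[\text{C}] > 0\), which is what relevance guarantees in this setting.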

4 16.4 Two-Stage Least Squares Estimation (pp. 237-240)

Two-stage least squares (2SLS) is the most common IV method for continuous outcomes.

2SLS Algorithm

Stage 1: Regress treatment on instrument (and covariates if present)

\[A_i = \alpha_0 + \alpha_1 Z_i + \epsilon_i\]

Obtain predicted treatment: \(\hat{A}_i = \hat{\alpha}_0 + \hat{\alpha}_1 Z_i\)

Stage 2: Regress outcome on predicted treatment

\[Y_i = \beta_0 + \beta_1 \hat{A}_i + \eta_i\]

The coefficient \(\hat{\beta}_1\) is the 2SLS estimate of the causal effect.

Why This Works

Intuition:

Stage 1 extracts the variation in \(A\) that is “caused by” \(Z\)
\(\hat{A}\) is the part of \(A\) that is free of confounding (provided \(Z\) is randomized or as good as randomized)
Stage 2 estimates the effect of this “clean” variation on \(Y\)

Mathematical equivalence: For binary \(Z\) and \(A\), 2SLS equals the Wald estimator.
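The two stages and their equivalence to the Wald ratio can be checked numerically. The following is a minimal sketch with no covariates, using hand-written simple regression; all function names and the toy data are assumptions made for illustration.

```python
from statistics import mean

def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def two_stage_least_squares(y, a, z):
    """Stage 1: regress A on Z. Stage 2: regress Y on the predicted A."""
    alpha0, alpha1 = simple_ols(z, a)
    a_hat = [alpha0 + alpha1 * zi for zi in z]
    _, beta1 = simple_ols(a_hat, y)
    return beta1

def wald(y, a, z):
    """Wald ratio for binary Z."""
    def arm_mean(values, arm):
        return mean(v for v, zi in zip(values, z) if zi == arm)
    return (arm_mean(y, 1) - arm_mean(y, 0)) / (arm_mean(a, 1) - arm_mean(a, 0))

# Hypothetical data: four individuals per arm.
z = [0, 0, 0, 0, 1, 1, 1, 1]
a = [0, 0, 0, 1, 0, 1, 1, 1]
y = [10, 12, 11, 19, 10, 18, 21, 19]

tsls = two_stage_least_squares(y, a, z)
ratio = wald(y, a, z)
```

Both quantities equal 8 for these data. Running the second stage by hand gives the right point estimate, but its reported standard errors ignore the fact that \(\hat{A}\) was itself estimated, so dedicated IV routines are preferable for inference.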

Including Covariates

With measured covariates \(L\) that confound \(A \to Y\) but are unrelated to \(Z\):

Stage 1: \[A_i = \alpha_0 + \alpha_1 Z_i + \alpha_2^{\top} L_i + \epsilon_i\]

Stage 2: \[Y_i = \beta_0 + \beta_1 \hat{A}_i + \beta_2^{\top} L_i + \eta_i\]

Including \(L\) can improve efficiency even if not necessary for identification.
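The two equations above can be sketched with a small least-squares solver. Everything here is an illustrative assumption: the helper names, the use of a continuous treatment, and the constructed data, in which the unmeasured variable `u` is exactly uncorrelated with \(Z\) and \(L\) in the sample and the true coefficient of \(A\) is 2.

```python
def solve(matrix, vector):
    """Solve matrix @ x = vector by Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def two_stage_with_covariates(z, l, a, y):
    """Return [beta0, beta1, beta2] from the second-stage regression."""
    stage1 = [[1.0, zi, li] for zi, li in zip(z, l)]
    alpha = ols(stage1, a)
    a_hat = [sum(c * x for c, x in zip(alpha, row)) for row in stage1]
    stage2 = [[1.0, ah, li] for ah, li in zip(a_hat, l)]
    return ols(stage2, y)

# Hypothetical full-factorial data: u confounds A and Y but is balanced over Z and L.
grid = [(z, l, u) for z in (0, 1) for l in (0, 1) for u in (-1, 1)]
z = [float(g[0]) for g in grid]
l = [float(g[1]) for g in grid]
u = [float(g[2]) for g in grid]
a = [0.5 * zi + 0.3 * li + 0.4 * ui for zi, li, ui in zip(z, l, u)]
y = [1 + 2 * ai + 3 * li + 5 * ui for ai, li, ui in zip(a, l, u)]

beta = two_stage_with_covariates(z, l, a, y)
naive = ols([[1.0, ai, li] for ai, li in zip(a, l)], y)
```

In this constructed example `beta[1]` recovers the coefficient 2, while the ordinary regression coefficient `naive[1]` is pulled away from 2 by the unmeasured `u`.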

5 16.5 Instrumental Variable Estimation with Measured Confounders (pp. 240-242)

IV estimation can be combined with adjustment for measured confounders.

Two Scenarios

Scenario 1: Confounders of \(A \to Y\) that don’t affect \(Z\)

Adjust by including \(L\) in both stages of 2SLS
Improves efficiency but not necessary for identification

Scenario 2: Confounders of \(Z \to Y\)

More problematic - threatens IV exchangeability assumption
Need \(Y^a \perp\!\!\!\perp Z \mid L\) (conditional exchangeability of instrument)
Use conditional IV methods

Conditional IV Estimation

Modified IV conditions:

Conditional relevance: \(Z \not\perp\!\!\!\perp A \mid L\)
Conditional exchangeability: \(Y^a \perp\!\!\!\perp Z \mid L\)
Exclusion restriction: \(Y^{a,z} = Y^a\) (still unconditional)

Estimation: Use 2SLS with \(L\) as covariates, then standardize over \(L\).
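For a discrete \(L\), one nonparametric reading of this recipe is to compute the Wald ratio within each level of \(L\) and then average the stratum-specific ratios, weighting each level by its share of the sample. The sketch below follows that reading rather than a covariate-adjusted 2SLS model; the function names and data are hypothetical, and relevance is assumed to hold within every stratum.

```python
from statistics import mean

def wald(rows):
    """Wald ratio from rows of (z, a, y) with binary z."""
    def arm_mean(column, arm):
        return mean(r[column] for r in rows if r[0] == arm)
    return (arm_mean(2, 1) - arm_mean(2, 0)) / (arm_mean(1, 1) - arm_mean(1, 0))

def standardized_wald(data):
    """Average the stratum-specific Wald ratios, weighted by the share of each L level."""
    n = len(data)
    levels = sorted({level for level, _, _, _ in data})
    total = 0.0
    for level in levels:
        stratum = [(z, a, y) for lv, z, a, y in data if lv == level]
        total += wald(stratum) * len(stratum) / n
    return total

# Hypothetical data as (l, z, a, y): the stratum-specific ratios are 10 and 30.
data = [
    (0, 0, 0, 1.0), (0, 0, 0, 1.0), (0, 1, 1, 11.0), (0, 1, 0, 1.0),
    (1, 0, 0, 2.0), (1, 0, 0, 2.0), (1, 1, 1, 32.0), (1, 1, 0, 2.0),
]

result = standardized_wald(data)
```

With equal-sized strata the standardized value is 20. How this average should be interpreted depends on the additional assumptions made about effect heterogeneity across levels of \(L\).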

6 16.6 Instrumental Variable Estimation versus Regression (pp. 242-244)

How do IV estimates compare to regression-based estimates?

Comparison

Regression (e.g., outcome regression or IP weighting):

Assumes no unmeasured confounding: \(Y^a \perp\!\!\!\perp A \mid L\)
Identifies average causal effect: \(\text{E}{\left[Y^{a=1}\right]} - \text{E}{\left[Y^{a=0}\right]}\)
Efficient when assumptions hold

IV estimation:

Allows unmeasured confounding of \(A \to Y\)
Assumes valid instrument with IV conditions
Identifies LATE: \(\text{E}{\left[Y^{a=1} - Y^{a=0} \mid \text{Complier}\right]}\)
Less efficient (larger standard errors)

When Estimates Differ

If IV and regression give different estimates:

Unmeasured confounding: Regression is biased, IV may be valid
Effect heterogeneity: IV estimates LATE, regression estimates ATE
IV violations: IV assumptions may be violated
Both wrong: Both methods could have issues

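Interpretation: A difference between the two estimates does not by itself show which one is right. It may reflect unmeasured confounding, effect heterogeneity, a violated IV condition, or several of these at once.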
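The first two explanations can be seen together in a small constructed population. In this sketch the principal stratum acts as the unmeasured confounder, \(Z\) is exactly balanced, and the IV conditions hold by construction; the stratum sizes and potential outcomes are invented for illustration.

```python
from statistics import mean

# Hypothetical strata: (name, A if Z=0, A if Z=1, Y under a=0, Y under a=1, units)
STRATA = [
    ("complier", 0, 1, 50.0, 70.0, 30),
    ("always-taker", 1, 1, 80.0, 85.0, 30),
    ("never-taker", 0, 0, 40.0, 40.0, 40),
]

def observed_rows():
    """Place one copy of every unit in each arm so that Z is exactly balanced."""
    rows = []
    for _, a_z0, a_z1, y_a0, y_a1, count in STRATA:
        for z in (0, 1):
            a = a_z1 if z == 1 else a_z0
            y = y_a1 if a == 1 else y_a0
            rows.extend([(z, a, y)] * count)
    return rows

def mean_where(rows, column, key, value):
    """Mean of one column among rows whose key column equals value (0=Z, 1=A, 2=Y)."""
    return mean(r[column] for r in rows if r[key] == value)

rows = observed_rows()

# IV (Wald) estimate: contrasts arms defined by Z.
iv = (mean_where(rows, 2, 0, 1) - mean_where(rows, 2, 0, 0)) / (
    mean_where(rows, 1, 0, 1) - mean_where(rows, 1, 0, 0)
)

# Unadjusted contrast: compares treated with untreated, ignoring the stratum.
naive = mean_where(rows, 2, 1, 1) - mean_where(rows, 2, 1, 0)

# True average causal effect in the whole population, known only by construction.
total = sum(s[5] for s in STRATA)
ate = sum(s[5] * (s[4] - s[3]) for s in STRATA) / total
```

Here the IV estimate is 20 (the complier effect), the population average effect is 7.5, and the unadjusted contrast is about 37.3. The IV estimate is valid for compliers yet differs from the average effect because effects are heterogeneous, while the unadjusted contrast is biased for both.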

7 16.7 The Survivor Average Causal Effect (pp. 244-246)

IV methods can be extended to handle survival outcomes and time-to-event data.

Challenges with Survival Outcomes

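Issue: When the outcome \(Y\) is only observed for individuals who survive (or remain uncensored), a comparison among observed survivors may compare different kinds of individuals across treatment groups.

Question: For which individuals is the effect of treatment on \(Y\) well defined?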

Survivor Average Causal Effect (SACE)

Definition 3 (Survivor Average Causal Effect) The survivor average causal effect (SACE) is:

\[\text{E}{\left[Y^{a=1} - Y^{a=0} \mid S^{a=1} = 1, S^{a=0} = 1\right]}\]

where \(S^a\) is an indicator for surviving (or remaining uncensored) under treatment \(a\).

This is the effect in always-survivors - those who would survive under both treatment and control.

Identification

Setting: Survival \(S\) is affected by treatment \(A\), and outcome \(Y\) is only observed if \(S = 1\).

IV approach: Under IV conditions plus additional assumptions (monotonicity for survival), IV can identify SACE.

Interpretation: Effect of treatment on the outcome for those who would survive regardless of treatment.
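The definition can be made concrete in a hypothetical population where both potential survival indicators are known, which is never the case in real data. The five individuals and the function names below are invented for illustration; the sketch only evaluates the definition and does not show how SACE would be identified.

```python
from statistics import mean

# Hypothetical units: (S under a=0, S under a=1, Y under a=0, Y under a=1).
# None marks an outcome that is undefined because the individual does not survive.
POPULATION = [
    (1, 1, 50.0, 60.0),   # always-survivor
    (1, 1, 55.0, 65.0),   # always-survivor
    (0, 1, None, 30.0),   # survives only if treated
    (0, 1, None, 35.0),   # survives only if treated
    (0, 0, None, None),   # never survives
]

def sace(population):
    """Average of Y(a=1) - Y(a=0) among those who survive under both treatments."""
    return mean(y1 - y0 for s0, s1, y0, y1 in population if s0 == 1 and s1 == 1)

def survivor_contrast(population):
    """Mean Y(a=1) among survivors under a=1 minus mean Y(a=0) among survivors under a=0."""
    treated = mean(y1 for s0, s1, y0, y1 in population if s1 == 1)
    control = mean(y0 for s0, s1, y0, y1 in population if s0 == 1)
    return treated - control

effect_in_always_survivors = sace(POPULATION)
contrast_among_survivors = survivor_contrast(POPULATION)
```

In this example SACE is 10, while the contrast among survivors is -5, because the treated survivors include individuals who would not have survived without treatment.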

8 Summary

Key concepts:

Instrumental variable: A variable \(Z\) that affects treatment but not the outcome directly
IV conditions: Relevance, exchangeability, exclusion restriction
Wald estimator: \(\frac{\text{E}{\left[Y \mid Z=1\right]} - \text{E}{\left[Y \mid Z=0\right]}}{\text{E}{\left[A \mid Z=1\right]} - \text{E}{\left[A \mid Z=0\right]}}\) for binary \(Z, A\)
Principal strata: Compliers, always-takers, never-takers, defiers
LATE: Local average treatment effect in compliers
2SLS: Two-stage least squares for continuous outcomes
SACE: Survivor average causal effect for survival outcomes

When to use IV methods:

Unmeasured confounding is a concern
Valid instrument is available (strong assumptions)
LATE is a meaningful estimand (compliers are of interest)
Efficiency loss is acceptable (IV estimates have larger SEs)

Common instruments:

Setting	Instrument	Treatment	Outcome
Randomized encouragement	Encouragement	Behavior change	Health
Geographic variation	Distance to facility	Healthcare use	Health
Mendelian randomization	Genetic variant	Biomarker	Disease
Draft lottery	Lottery number	Military service	Earnings
Physician preference	Physician tendency	Treatment choice	Outcome

Assumptions to check:

Relevance: Test empirically (\(Z\) associated with \(A\))
Exchangeability: Justify by design or argue plausibility
Exclusion: Requires subject-matter knowledge (cannot be tested)
Monotonicity: No defiers (often plausible, but not directly verifiable because principal strata are unobserved)

Advantages:

Allows causal inference with unmeasured confounding
Uses only treatment variation “caused by” instrument
Provides a different estimand than regression methods

Limitations:

Strong, untestable assumptions (especially exclusion)
Estimates LATE, not ATE (interpretation challenge)
Less efficient than regression (when regression assumptions hold)
Weak instruments lead to bias and large variance
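One way to see the weak-instrument bias is through the Wald ratio's denominator: any direct effect of \(Z\) on \(Y\) that bypasses \(A\) is added to the numerator and then divided by the difference in treatment rates. The sketch below assumes a constant additive direct effect; the function name and all numbers are hypothetical.

```python
def wald_bias(direct_effect_of_z, p_treated_z1, p_treated_z0):
    """Bias in the Wald ratio when Z shifts mean Y by a fixed amount outside of A."""
    return direct_effect_of_z / (p_treated_z1 - p_treated_z0)

# The same half-point violation of the exclusion restriction, two instrument strengths.
strong = wald_bias(0.5, 0.60, 0.30)  # 30-percentage-point difference in uptake
weak = wald_bias(0.5, 0.33, 0.30)    # 3-percentage-point difference in uptake
```

The same small violation produces a bias of about 1.7 with the stronger instrument and about 16.7 with the weaker one, a tenfold amplification.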

Hernán, Miguel A, and James M Robins. 2020. Causal Inference: What If. Chapman & Hall/CRC. https://miguelhernan.org/whatifbook.