In Chapter 3, we introduced exchangeability as a key identifiability condition. In Chapter 6, we learned to represent causal relationships using DAGs and introduced the backdoor criterion for identifying confounding. This chapter provides a detailed examination of confounding—the most common threat to validity in observational studies.
Confounding occurs when a common cause of treatment and outcome creates a non-causal association between them.
Definition 1 (Confounding Structure) A variable \(L\) is a confounder of the effect of \(A\) on \(Y\) if:
Causal diagram representation:
L → A → Y
L → Y
The path \(A \leftarrow L \rightarrow Y\) is a backdoor path that creates non-causal association.
Example 1: Healthy worker bias
Example 2: Confounding by indication
Confounding is equivalent to lack of (conditional) exchangeability.
No confounding means: \[Y^a \perp\!\!\!\perp A \quad \text{for all } a\]
This is marginal exchangeability: the counterfactual outcomes are independent of treatment.
Confounding means exchangeability fails for at least one treatment level: \[Y^a \not\perp\!\!\!\perp A \quad \text{for some } a\]
The treated and untreated differ with respect to their potential outcomes.
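A small simulation can make this concrete. The sketch below is illustrative only: the data-generating process, the probabilities, and the names `simulate`, `y1_treated`, `y1_untreated`, `naive`, and `true_effect` are assumptions introduced here, not values from the text. Because both potential outcomes are simulated for everyone, the distribution of \(Y^{a=1}\) can be compared directly between the treated and the untreated, which is never possible with real data.

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Simulate a binary common cause L, treatment A, and both potential outcomes."""
    rows = []
    for _ in range(n):
        L = int(random.random() < 0.5)
        # L raises the probability of treatment (L -> A); values are hypothetical
        A = int(random.random() < (0.8 if L else 0.2))
        # L raises the outcome risk under either treatment level (L -> Y)
        base = 0.4 if L else 0.1
        Y0 = int(random.random() < base)
        Y1 = int(random.random() < base - 0.05)  # treatment lowers risk by 0.05
        rows.append((L, A, Y0, Y1))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

rows = simulate()

# The same potential outcome Y^1, compared across observed treatment groups.
y1_treated = mean(y1 for l, a, y0, y1 in rows if a == 1)
y1_untreated = mean(y1 for l, a, y0, y1 in rows if a == 0)

# Average causal effect (uses both potential outcomes for everyone).
true_effect = mean(y1 - y0 for l, a, y0, y1 in rows)

# Associational contrast: the treated reveal Y^1, the untreated reveal Y^0.
naive = y1_treated - mean(y0 for l, a, y0, y1 in rows if a == 0)

print(f"E[Y^1 | A=1] = {y1_treated:.3f}, E[Y^1 | A=0] = {y1_untreated:.3f}")
print(f"true effect = {true_effect:.3f}, naive contrast = {naive:.3f}")
```

Under these assumed parameters, the mean of \(Y^{a=1}\) is clearly higher among the treated than among the untreated, so marginal exchangeability fails, and the naive contrast has the opposite sign from the true effect.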
Example 1 (Confounding and Exchangeability) Suppose exercise (\(A\)) affects heart disease (\(Y\)), and both are affected by age (\(L\)):
Without confounding:
With confounding:
Even when marginal exchangeability fails, we may achieve conditional exchangeability by adjusting for confounders:
\[Y^a \perp\!\!\!\perp A \mid L \quad \text{for all } a\]
Within levels of \(L\), the treated and untreated are exchangeable.
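To see why this condition is useful, the following sketch shows how it links the counterfactual mean to observed quantities when \(L\) is discrete. It additionally assumes consistency and positivity, which are not restated in this section; the annotations name the condition used at each step.

\[
\begin{aligned}
\mathrm{E}[Y^a] &= \sum_l \mathrm{E}[Y^a \mid L = l]\,\Pr[L = l] && \text{(law of total expectation)} \\
&= \sum_l \mathrm{E}[Y^a \mid A = a, L = l]\,\Pr[L = l] && \text{(conditional exchangeability, positivity)} \\
&= \sum_l \mathrm{E}[Y \mid A = a, L = l]\,\Pr[L = l] && \text{(consistency)}
\end{aligned}
\]

The last line involves only the observed variables \(A\), \(L\), and \(Y\).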
The backdoor criterion (Chapter 6) provides a graphical method for identifying confounding.
A backdoor path from \(A\) to \(Y\):
Confounding exists if backdoor paths are open (unblocked).
Example 2 (Identifying Confounders with the Backdoor Criterion) Diagram 1:
L → A → Y
L → Y
Backdoor path: \(A \leftarrow L \rightarrow Y\)
Confounders: \(L\)
Solution: Adjust for \(L\)
Diagram 2:
U → L → A → Y
L → Y
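Backdoor paths: \(A \leftarrow L \rightarrow Y\), and \(A \leftarrow L \leftarrow U \rightarrow Y\) if \(U\) also causes \(Y\)
Confounders: \(L\) (and \(U\) if it affects \(Y\))
Solution: Adjust for \(L\). Because \(L\) is a non-collider on both backdoor paths, conditioning on \(L\) alone blocks them, so \(U\) does not need to be measured.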
Diagram 3:
A → M → Y
L → A
L → Y
Backdoor path: \(A \leftarrow L \rightarrow Y\)
Confounders: \(L\)
Solution: Adjust for \(L\)
Do NOT adjust for \(M\): \(M\) is a mediator (on the causal path), not a confounder, and adjusting for it would block part of the effect of \(A\) on \(Y\)
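The three diagrams can also be checked mechanically. The sketch below is a minimal path-enumeration approach for small DAGs, not a general-purpose library: the edge-set encoding and the function names (`descendants`, `simple_paths`, `open_backdoor_paths`, `satisfies_backdoor`) are assumptions made for illustration, and `diagram2` includes the optional \(U \rightarrow Y\) arrow. A path is treated as blocked if it contains a non-collider in the adjustment set, or a collider that is neither in the set nor has a descendant in it.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows."""
    found, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for parent, child in edges:
            if parent == cur and child not in found:
                found.add(child)
                stack.append(child)
    return found

def simple_paths(edges, start, end):
    """All simple paths from start to end, ignoring arrow direction."""
    neighbours = {}
    for parent, child in edges:
        neighbours.setdefault(parent, set()).add(child)
        neighbours.setdefault(child, set()).add(parent)
    paths = []

    def walk(path):
        if path[-1] == end:
            paths.append(path)
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:
                walk(path + [nxt])

    walk([start])
    return paths

def is_blocked(edges, path, adjust):
    """Apply the d-separation blocking rules to one path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edges and (nxt, node) in edges
        if collider:
            if node not in adjust and not (descendants(edges, node) & adjust):
                return True
        elif node in adjust:
            return True
    return False

def open_backdoor_paths(edges, a, y, adjust=()):
    """Unblocked paths between a and y whose first edge points into a."""
    adjust = set(adjust)
    backdoor = [p for p in simple_paths(edges, a, y) if (p[1], a) in edges]
    return [p for p in backdoor if not is_blocked(edges, p, adjust)]

def satisfies_backdoor(edges, a, y, adjust):
    """No descendant of a in the set, and no open backdoor path remains."""
    adjust = set(adjust)
    if adjust & descendants(edges, a):
        return False
    return not open_backdoor_paths(edges, a, y, adjust)

diagram1 = {("L", "A"), ("A", "Y"), ("L", "Y")}
diagram2 = {("U", "L"), ("L", "A"), ("A", "Y"), ("L", "Y"), ("U", "Y")}
diagram3 = {("A", "M"), ("M", "Y"), ("L", "A"), ("L", "Y")}

for name, dag in [("1", diagram1), ("2", diagram2), ("3", diagram3)]:
    print(name, open_backdoor_paths(dag, "A", "Y"),
          satisfies_backdoor(dag, "A", "Y", {"L"}))
```

With these encodings, adjusting for \(L\) passes the check in all three diagrams, while `satisfies_backdoor(diagram3, "A", "Y", {"L", "M"})` fails because \(M\) is a descendant of \(A\).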
The traditional definition of “confounder” in epidemiology differs slightly from the causal DAG perspective.
Traditionally, a variable \(L\) is considered a confounder if:
1. \(L\) is associated with treatment \(A\)
2. \(L\) is associated with outcome \(Y\) (among the untreated)
3. \(L\) is not affected by treatment \(A\)
From the DAG perspective, \(L\) is a confounder if:
Single-World Intervention Graphs (SWIGs) are an extension of DAGs that explicitly represent interventions and counterfactual outcomes.
Once confounders are identified, several methods can adjust for them.
Stratification: Estimate effects within strata of \(L\), then combine (standardization)
Regression adjustment: Include \(L\) as covariates in a regression model
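Inverse probability weighting: Weight each individual by \(1/\Pr[A \mid L]\), the inverse of the probability of receiving the treatment they actually received given \(L\), to create a pseudo-population in which \(A\) and \(L\) are independent (Chapter 12)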
Matching: Match treated and untreated individuals on \(L\)
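As a rough illustration of the first and third methods, the sketch below applies standardization and IP weighting to simulated data with a single binary \(L\). Everything in it is an assumption made for illustration: the data-generating process, the true risk difference of -0.05, and the names `standardized_mean` and `ipw_mean`. Regression and matching are not shown.

```python
import random

random.seed(1)

# Hypothetical data-generating process: L -> A, L -> Y, A -> Y
data = []
for _ in range(100_000):
    L = int(random.random() < 0.5)
    A = int(random.random() < (0.8 if L else 0.2))
    risk = 0.1 + 0.3 * L - 0.05 * A  # assumed risk difference of -0.05 in each stratum
    Y = int(random.random() < risk)
    data.append((L, A, Y))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Unadjusted (associational) risk difference
crude = (mean(y for l, a, y in data if a == 1)
         - mean(y for l, a, y in data if a == 0))

# Standardization: sum over l of E[Y | A=a, L=l] * Pr[L=l]
def standardized_mean(a):
    total = 0.0
    for l in (0, 1):
        p_l = mean(int(li == l) for li, ai, y in data)
        mean_y = mean(y for li, ai, y in data if li == l and ai == a)
        total += mean_y * p_l
    return total

standardized = standardized_mean(1) - standardized_mean(0)

# IP weighting: estimate Pr[A=1 | L=l] nonparametrically, then weight by 1/Pr[A | L]
p_treated = {l: mean(a for li, a, y in data if li == l) for l in (0, 1)}

def ipw_mean(a):
    total = 0.0
    for li, ai, y in data:
        if ai == a:
            p = p_treated[li] if a == 1 else 1 - p_treated[li]
            total += y / p
    return total / len(data)

ipw = ipw_mean(1) - ipw_mean(0)

print(f"crude = {crude:.3f}, standardized = {standardized:.3f}, IP weighted = {ipw:.3f}")
```

Because both estimators here are nonparametric, with no modelling of \(L\), they give the same estimate up to floating-point error. With continuous or high-dimensional \(L\), models would be needed and the two estimates could differ.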
Example 3 (Comparing Adjustment Methods) Data: Effect of smoking (\(A\)) on lung cancer (\(Y\)), adjusting for age (\(L\))
Stratification:
Regression:
IP weighting (Chapter 12):
Matching:
This chapter provided a detailed examination of confounding.
Key concepts:
Confounding structure: Common causes of treatment and outcome create backdoor paths
Exchangeability: Confounding corresponds to a lack of marginal exchangeability; conditional exchangeability can be achieved by adjusting for confounders
Backdoor criterion: Provides a graphical method to identify which variables to adjust for
DAG vs. traditional definitions: DAG-based confounding identification is preferred over association-based criteria
Adjustment methods: Stratification, regression, IP weighting, and matching can all adjust for confounding
Critical assumptions: