So far, we have represented causal relationships using mathematical notation and counterfactual outcomes. This chapter introduces causal diagrams—visual tools for representing causal assumptions and determining which variables need to be adjusted for when estimating causal effects.
Causal diagrams are directed acyclic graphs (DAGs) that encode our knowledge about the causal structure of the problem. They provide an intuitive way to identify confounding, avoid selection bias, and select appropriate adjustment sets.
A causal diagram is a graph in which nodes represent variables and directed edges (arrows) represent direct causal effects.

Definition 1 (Directed Acyclic Graph (DAG)) A directed acyclic graph is a graph in which: 1. All edges are directed (have arrowheads indicating the direction of causation) 2. There are no cycles (you cannot start at a node, follow the arrows, and return to that node)
Consider a simple causal diagram:
L → A → Y
L → Y
This diagram represents three causal effects: \(L\) causes \(A\), \(A\) causes \(Y\), and \(L\) also causes \(Y\) directly (not through \(A\)).

Key terms: a node is a variable; an edge (arrow) is a direct causal effect; a path is any sequence of edges connecting two nodes, regardless of the direction of the arrows; the parents of a node are its direct causes, and its descendants are all variables it affects, directly or indirectly.
Example 1 (Simple Causal Diagram) Consider the causal relationships among a treatment \(A\), an outcome \(Y\), and a covariate \(L\).
Plausible causal diagram:
A → Y
L → Y
L → A
Interpretation: \(A\) has a direct causal effect on \(Y\), and \(L\) is a common cause of both \(A\) and \(Y\). As we will see, this makes \(L\) a confounder of the effect of \(A\) on \(Y\).
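This structure can be encoded and queried programmatically. A minimal sketch using the networkx Python library (the library choice is an assumption; the chapter does not prescribe any software):

```python
import networkx as nx

# Build the DAG for Example 1: A -> Y, L -> Y, L -> A
G = nx.DiGraph([("A", "Y"), ("L", "Y"), ("L", "A")])

# A causal diagram must be acyclic
assert nx.is_directed_acyclic_graph(G)

# Parents of a node are its direct causes
print(sorted(G.predecessors("Y")))      # ['A', 'L']

# Descendants of a node are every variable it can affect
print(sorted(nx.descendants(G, "L")))   # ['A', 'Y']
```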
Causal diagrams encode information about statistical independence relationships.
From a DAG, we can determine which variables are marginally independent (unconditionally).
Example 2 (Marginal Independence) Diagram 1: \(A \rightarrow Y\). \(A\) and \(Y\) are marginally associated, because \(A\) causes \(Y\).

Diagram 2: \(A \quad Y\) (no arrow). \(A\) and \(Y\) are marginally independent.

Diagram 3: \(L \rightarrow A \rightarrow Y\) (chain). \(A\) and \(Y\) are marginally associated, and so are \(L\) and \(Y\): association flows along the chain.

Diagram 4: \(A \rightarrow Y \leftarrow L\) (collider). \(A\) and \(L\) are marginally independent: a collider blocks the path between its two causes.
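The marginal (in)dependence claims above can be checked by simulation. A sketch with numpy, using linear structural equations as an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Diagram 2: no arrow between A and Y -> marginally independent
A2 = rng.normal(size=n)
Y2 = rng.normal(size=n)

# Diagram 3 (chain): L -> A -> Y; A and Y are marginally associated
L = rng.normal(size=n)
A3 = L + rng.normal(size=n)
Y3 = A3 + rng.normal(size=n)

print(f"Diagram 2 corr(A, Y): {np.corrcoef(A2, Y2)[0, 1]:+.3f}")  # near 0
print(f"Diagram 3 corr(A, Y): {np.corrcoef(A3, Y3)[0, 1]:+.3f}")  # clearly nonzero
```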
Conditioning on (stratifying by) a variable can change independence relationships.
There are three fundamental structures in causal diagrams: 1. Chains, \(A \rightarrow B \rightarrow Y\), where \(B\) transmits the effect of \(A\) to \(Y\) 2. Forks, \(A \leftarrow L \rightarrow Y\), where the common cause \(L\) induces association between \(A\) and \(Y\) 3. Colliders, \(A \rightarrow C \leftarrow B\), where the path is blocked unless we condition on the common effect \(C\) (or one of its descendants)
Example 3 (Collider Bias Example) Suppose \(A\) is natural athletic ability, \(E\) is training effort, and \(Y = 1\) indicates becoming a professional athlete.
Diagram: \(A \rightarrow Y \leftarrow E\)
Marginal: Among the general population, natural ability and training effort are independent (some people train hard, some don’t; some are naturally gifted, some aren’t; these are unrelated).
Conditional on \(Y=1\) (among professional athletes): Natural ability and training effort are negatively associated!
Why? Among those who made it to the pros, if someone has low natural ability, they must have compensated with high training effort. Conversely, high natural ability allows one to reach the pros with less effort.
This is collider stratification bias.
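A quick simulation makes the reversal concrete (the linear model and selection threshold below are illustrative assumptions, not part of the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# ability and effort are generated independently (marginal independence)
ability = rng.normal(size=n)
effort = rng.normal(size=n)

# becoming a pro (Y = 1) depends on both: a collider A -> Y <- E
pro = (ability + effort + rng.normal(scale=0.5, size=n)) > 2.0

# conditioning on the collider (restricting to pros) induces association
print(f"corr overall:    {np.corrcoef(ability, effort)[0, 1]:+.3f}")            # near 0
print(f"corr among pros: {np.corrcoef(ability[pro], effort[pro])[0, 1]:+.3f}")  # negative
```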
Causal diagrams help us understand the identifiability assumptions introduced in Chapter 3.
The positivity assumption requires that all levels of treatment occur at all levels of confounders: \[\Pr[A = a | L = l] > 0 \quad \text{for all } a, l\]
In a DAG, positivity violations can occur when the arrow from \(L\) to \(A\) is deterministic: some stratum of \(L\) receives one treatment level with probability 1 (for example, a contraindication that rules out treatment), so \(\Pr[A = a \mid L = l] = 0\) for some \(a, l\).
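A toy numpy check of positivity; the deterministic rule for \(L = 1\) is a hypothetical violation built into the simulation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

L = rng.integers(0, 2, n)
# structural positivity violation: individuals with L == 1 are never treated
A = np.where(L == 1, 0, rng.integers(0, 2, n))

# empirical check of Pr[A = a | L = l] > 0 for all a, l
for l in (0, 1):
    for a in (0, 1):
        p = np.mean(A[L == l] == a)
        flag = "" if p > 0 else "  <- positivity violation"
        print(f"P(A={a} | L={l}) = {p:.2f}{flag}")
```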
The consistency assumption requires well-defined interventions: \[Y = Y^a \quad \text{for every individual with } A = a\]
In DAGs, consistency requires: 1. No treatment variation: All individuals receiving \(A=a\) receive exactly the same version of treatment 2. No interference: One individual’s treatment doesn’t affect another’s outcome
Causal diagrams allow us to classify different types of bias based on graph structure.
Definition 2 (Confounding (DAG Definition)) Confounding occurs when there exists an open backdoor path from treatment \(A\) to outcome \(Y\).

A backdoor path is a path from \(A\) to \(Y\) that starts with an arrow pointing into \(A\) (i.e., \(\cdot \rightarrow A\)). Such a path is noncausal: any association it transmits between \(A\) and \(Y\) is not due to the effect of \(A\) on \(Y\).
Example:
L → A → Y
L → Y
The path \(A \leftarrow L \rightarrow Y\) is a backdoor path (it starts with an arrow into \(A\)). Therefore, \(L\) is a confounder.
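The backdoor-path search can be mechanized: enumerate simple paths in the undirected skeleton of the DAG and keep those whose first edge points into \(A\). A sketch with networkx (an assumed library choice):

```python
import networkx as nx

# the diagram above: L -> A, A -> Y, L -> Y
G = nx.DiGraph([("L", "A"), ("A", "Y"), ("L", "Y")])

def backdoor_paths(G, a, y):
    """Simple paths from a to y whose first edge points INTO a."""
    paths = nx.all_simple_paths(G.to_undirected(), a, y)
    return [p for p in paths if G.has_edge(p[1], a)]  # first edge is p[1] -> a

print(backdoor_paths(G, "A", "Y"))  # [['A', 'L', 'Y']]
```

The direct path \(A \rightarrow Y\) is excluded because its first edge leaves \(A\); only \(A \leftarrow L \rightarrow Y\) survives.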
Definition 3 (Backdoor Criterion) A set of variables \(L\) satisfies the backdoor criterion for the effect of \(A\) on \(Y\) if: 1. No variable in \(L\) is a descendant of \(A\) 2. \(L\) blocks every backdoor path from \(A\) to \(Y\)
If \(L\) satisfies the backdoor criterion, adjusting for \(L\) eliminates confounding.
Example 4 (Backdoor Criterion Example) Diagram:
U → L → A → Y
L → Y
Backdoor paths from \(A\) to \(Y\): the only backdoor path is \(A \leftarrow L \rightarrow Y\). (The segment \(A \leftarrow L \leftarrow U\) cannot reach \(Y\) except back through \(L\), so it yields no additional backdoor path.)
Does \(L\) satisfy the backdoor criterion? 1. \(L\) is not a descendant of \(A\) ✓ 2. \(L\) blocks the backdoor path \(A \leftarrow L \rightarrow Y\) ✓
Yes! Adjusting for \(L\) is sufficient to identify the causal effect.
Does \(U\) satisfy the backdoor criterion? No. Although \(U\) is not a descendant of \(A\), conditioning on \(U\) alone does not block the backdoor path \(A \leftarrow L \rightarrow Y\), so confounding would remain.
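Simulation can confirm that adjusting for \(L\) removes the bias while the unadjusted estimate does not. A sketch assuming linear structural equations (an illustrative assumption) and numpy:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# U -> L -> A -> Y, with L -> Y; the true effect of A on Y is 1.0
U = rng.normal(size=n)
L = U + rng.normal(size=n)
A = L + rng.normal(size=n)
Y = 1.0 * A + 2.0 * L + rng.normal(size=n)

# Regressing Y on A alone: confounded by the open backdoor path through L
naive = np.linalg.lstsq(np.column_stack([A, np.ones(n)]), Y, rcond=None)[0][0]

# Adjusting for L (which satisfies the backdoor criterion) recovers the effect
adj = np.linalg.lstsq(np.column_stack([A, L, np.ones(n)]), Y, rcond=None)[0][0]

print(f"naive:    {naive:.2f}")   # biased upward (about 2.33, not 1.0)
print(f"adjusted: {adj:.2f}")     # close to the true effect 1.0
```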
Selection bias occurs when we condition on a collider (or its descendant) that lies on a path from \(A\) to \(Y\).
Example:
A → S ← Y
If we restrict analysis to individuals with \(S = 1\), we induce spurious association between \(A\) and \(Y\) (collider bias).
This will be covered in detail in Chapter 8.
Measurement bias can be represented in DAGs by including separate nodes for both the true variable and its measured version.
Example:
A_true → Y
A_true → A_measured
If we use \(A_{\text{measured}}\) instead of \(A_{\text{true}}\), the estimated effect will be biased (Chapter 9).
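The direction of this bias can be illustrated with classical (random) measurement error in \(A\), which attenuates a linear effect toward zero. A numpy sketch under that illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

A_true = rng.normal(size=n)
Y = 1.0 * A_true + rng.normal(size=n)       # true effect = 1.0

# A_measured = A_true plus independent (classical) measurement error
A_meas = A_true + rng.normal(scale=1.0, size=n)

slope = lambda x, y: np.cov(x, y)[0, 1] / np.var(x)
print(f"using A_true:     {slope(A_true, Y):.2f}")  # near 1.0
print(f"using A_measured: {slope(A_meas, Y):.2f}")  # attenuated, near 0.5
```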
Effect modification can also be represented in causal diagrams.
Effect modification by \(V\) means the effect of \(A\) on \(Y\) differs across levels of \(V\). This can be represented by:
V → Y
A → Y
(with the understanding that the A→Y effect depends on V)
Some authors include an arrow \(V \rightarrow A \cdot Y\) to explicitly denote interaction, but this is not standard DAG notation.
A variable \(V\) can be a confounder only, an effect modifier only, or both. The following scenarios illustrate each case.
Example 5 (Confounder and Modifier in DAGs) Scenario 1: \(V\) is a confounder only
V → A → Y
V → Y
\(V\) opens the backdoor path \(A \leftarrow V \rightarrow Y\) (confounding). We must adjust for \(V\).
Scenario 2: \(V\) is a modifier only (in a randomized trial)
V → Y
A → Y
(A randomized, so no V → A arrow)
\(V\) modifies the effect of \(A\), but does not confound (no backdoor path). We should report stratum-specific effects but don’t need to adjust for \(V\) to eliminate bias.
Scenario 3: \(V\) is both
V → A → Y
V → Y
(and the A→Y effect varies by V)
We must adjust for \(V\) (confounding) AND report stratum-specific effects (modification).
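To make effect modification concrete, a simulation of Scenario 2 (randomized \(A\), so no adjustment is needed); the stratum-specific effects of 1.0 and 3.0 are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

V = rng.integers(0, 2, n)   # binary effect modifier
A = rng.integers(0, 2, n)   # randomized treatment (no V -> A arrow)

# effect of A is 1.0 when V = 0 and 3.0 when V = 1 (modification by V)
Y = A * (1.0 + 2.0 * V) + rng.normal(size=n)

# report stratum-specific effects rather than a single summary
for v in (0, 1):
    m = V == v
    effect = Y[m & (A == 1)].mean() - Y[m & (A == 0)].mean()
    print(f"effect of A in stratum V={v}: {effect:.2f}")
```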
This chapter introduced causal diagrams (DAGs) as tools for representing and reasoning about causal relationships.
Key concepts:
DAGs: Directed acyclic graphs with nodes (variables) and directed edges (causal effects)
Three basic structures: chains (\(A \rightarrow B \rightarrow Y\)), forks (\(A \leftarrow L \rightarrow Y\)), and colliders (\(A \rightarrow C \leftarrow B\))
Collider bias: Conditioning on a common effect of two variables induces spurious association
Backdoor paths: Non-causal paths from \(A\) to \(Y\) that begin with an arrow into \(A\)
Backdoor criterion: A set \(L\) satisfies the backdoor criterion if no variable in \(L\) is a descendant of \(A\) and \(L\) blocks every backdoor path from \(A\) to \(Y\)
Adjustment: If \(L\) satisfies the backdoor criterion, adjusting for \(L\) eliminates confounding