Compute Inverse Probability Weighting (IPW) Weights

Description

Computes stabilized inverse probability weights for each participant-arm-month row in long-format data, based on the predicted probability of receiving a screening mammogram. Weights are truncated at the 99th percentile computed separately within each arm.

Usage

compute_ipw_weights(
  long_data,
  pred_prob_col,
  arm_col = "arm",
  id_col = "id",
  month2_col = "month2",
  bc_month_col = "monthBC",
  scrmammo_col = "scrmammo",
  tslm_lag_col = "tslm_lag",
  grace_months = 11L
)

Arguments

long_data A data frame in long format (one row per participant-arm-month), as produced by expand_to_long() and augmented with mammogram and time-since-last-mammogram columns.
pred_prob_col Name of the column containing the model-predicted probability of a screening mammogram at each row.
arm_col Name of the trial arm column. Default: "arm".
id_col Name of the participant identifier column. Default: "id".
month2_col Name of the 0-indexed month-from-entry column. Default: "month2".
bc_month_col Name of the column containing the month2 at which breast cancer was diagnosed (NA if no diagnosis). Default: "monthBC".
scrmammo_col Name of the binary screening-mammogram indicator column. Default: "scrmammo".
tslm_lag_col Name of the lagged time-since-last-mammogram column (months since the most recent mammogram of any type, screening or diagnostic). Default: "tslm_lag".
grace_months Number of months from trial entry during which weights are held at 1 for the STOPBASE arm. Default: 11L.

Details

Separate weight models are used for the two trial arms:

  • STOPBASE: The weight tracks the probability of not receiving a screening mammogram. During the first grace_months months after trial entry the weight is held at 1. After the grace period, the denominator at each month is 1 - p_scrmammo (or 1 once a breast cancer diagnosis has occurred). When the last mammogram was 10 or fewer months ago (tslm_lag <= 10), the effective screening probability is set to 0, because no screening is expected that soon.

  • CONTINUE: The compliance window is reset by any mammogram (screening or diagnostic), as measured by tslm_lag. A weight update occurs at every month within the compliance window (tslm_lag = 11, 12, or 13). The numerator uses the conditional probability of the observed action under a discrete uniform distribution over months 11–13: 1/3 at month 11, 1/2 at month 12 (given no earlier screening), and 1 at month 13. The denominator is the model-predicted probability. Weights stop updating after a breast-cancer diagnosis.
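
The two rules above can be illustrated with a minimal sketch. The helpers stopbase_denominator() and uniform_hazard() are hypothetical and are not part of the package. The sketch also assumes that the grace period covers month2 < grace_months and that a diagnosis counts as having occurred once month2 >= monthBC; the package may draw these boundaries differently.

```r
# Hypothetical helpers for illustration only; not the package implementation.

# STOPBASE: denominator of the monthly weight factor.
# Assumptions: grace period is month2 < grace_months; a diagnosis has
# occurred once month2 >= bc_month.
stopbase_denominator <- function(p_scrmammo, tslm_lag, month2, bc_month,
                                 grace_months = 11L) {
  # No screening is expected within 10 months of the last mammogram.
  p_eff <- ifelse(tslm_lag <= 10, 0, p_scrmammo)
  in_grace <- month2 < grace_months
  diagnosed <- !is.na(bc_month) & month2 >= bc_month
  ifelse(in_grace | diagnosed, 1, 1 - p_eff)
}

# CONTINUE: probability of screening at tslm_lag = k given no screening
# earlier in the window, under a discrete uniform over months 11 to 13.
uniform_hazard <- function(tslm_lag) {
  ifelse(tslm_lag %in% 11:13, 1 / (14 - tslm_lag), NA_real_)
}

hazard <- uniform_hazard(11:13)  # 1/3, 1/2, 1

# Probability of reaching each month unscreened and then screening in it.
marginal <- hazard * cumprod(c(1, 1 - hazard[-3]))

# Four toy rows: inside the grace period (twice), after it, and after a
# breast cancer diagnosis.
den <- stopbase_denominator(
  p_scrmammo = c(0.3, 0.3, 0.3, 0.3),
  tslm_lag   = c(5, 12, 12, 12),
  month2     = c(3, 3, 14, 14),
  bc_month   = c(NA, NA, NA, 10)
)
```

Here marginal evaluates to 1/3 for each of months 11, 12, and 13, which confirms that the conditional probabilities 1/3, 1/2, and 1 correspond to a uniform distribution over the compliance window. Under the stated assumptions, den is 1 for every row except the third, where it is 0.7.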

Weights within each arm are cumulative products initialized at 1. The final column wp99 is the weight truncated at the 99th percentile of w computed separately within each arm.
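
The accumulation and truncation steps can be sketched in base R on toy data. The column name factor (the per-month weight factor) is hypothetical, the rows are assumed to be sorted by month within each participant and arm, and the use of quantile() with its default interpolation is an assumption; the package may compute the percentile differently.

```r
# Toy per-month weight factors; values are made up for illustration.
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2, 1, 1, 1),
  arm    = c(rep("CONTINUE", 6), rep("STOPBASE", 3)),
  month2 = c(0, 1, 2, 0, 1, 2, 0, 1, 2),
  factor = c(1, 2, 1.5, 1, 1, 1, 1, 1.25, 1.25)
)

# Cumulative product of the factors within each participant-arm sequence.
# Rows must already be in month order within each group.
toy$w <- ave(toy$factor, toy$id, toy$arm, FUN = cumprod)

# 99th percentile of w, computed separately within each arm.
cap <- ave(toy$w, toy$arm,
           FUN = function(x) quantile(x, 0.99, names = FALSE))

# Truncate: weights above the arm-specific cap are set to the cap.
toy$wp99 <- pmin(toy$w, cap)
```

In this toy data the largest CONTINUE weight (3) is pulled down to the arm-specific cap, while weights below the cap are unchanged.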

Value

long_data with two additional columns:

  • w: Cumulative IPW weight at each participant-arm-month.

  • wp99: IPW weight truncated at the 99th percentile of w within each arm separately.

References

García-Albéniz X, Hernán MA, Logan RW, Price M, Armstrong K, Hsu J. Continuation of Annual Screening Mammography and Breast Cancer Mortality in Women Older Than 70 Years: A Prospective Observational Study. Ann Intern Med. 2020;172(6):381–389. doi:10.7326/M18-1199

See Also

predict_survival_ipw(), expand_to_long()

Examples

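library("ettbc")

# cohort, screening_mammograms, and diagnostic_mammograms are assumed to be
# available already; they are not created in this example
cloned <- clone_censor(cohort, screening_mammograms, diagnostic_mammograms)
long_data <- expand_to_long(cloned)

# Placeholder inputs for illustration only
long_data$p_scrmammo <- 0.3       # constant predicted screening probability
long_data$monthBC <- NA_integer_  # no breast cancer diagnoses
long_data$scrmammo <- 0L          # no screening mammograms observed
long_data$tslm_lag <- 5L          # last mammogram always 5 months earlier

# With tslm_lag fixed at 5, no row triggers a weight update (see Details),
# so every weight stays at 1
result <- compute_ipw_weights(long_data, pred_prob_col = "p_scrmammo")
head(result[, c("id", "arm", "month2", "w", "wp99")])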
  id      arm month2 w wp99
1  1 STOPBASE      0 1    1
2  1 STOPBASE      1 1    1
3  1 STOPBASE      2 1    1
4  1 STOPBASE      3 1    1
5  1 STOPBASE      4 1    1
6  1 STOPBASE      5 1    1