Compute Inverse Probability Weights (IPW)
Description
Computes stabilized inverse probability weights for each participant-arm-month row in long-format data, based on the predicted probability of receiving a screening mammogram. Weights are truncated at the 99th percentile computed separately within each arm.
Arguments
long_data
A data frame in long format (one row per participant-arm-month), as produced by expand_to_long() and augmented with mammogram and time-since-last-mammogram columns.
pred_prob_col
Name of the column containing the model-predicted probability of a screening mammogram at each row.
arm_col
Name of the trial arm column. Default: "arm".
id_col
Name of the participant identifier column. Default: "id".
month2_col
Name of the 0-indexed month-from-entry column. Default: "month2".
bc_month_col
Name of the column containing the month2 at which breast cancer was diagnosed (NA if no diagnosis). Default: "monthBC".
scrmammo_col
Name of the binary screening-mammogram indicator column. Default: "scrmammo".
tslm_lag_col
Name of the lagged time-since-last-mammogram column (months since the most recent mammogram of any kind, screening or diagnostic). Default: "tslm_lag".
grace_months
Number of months from trial entry during which weights are held at 1 for the STOPBASE arm. Default: 11L.
Details
Separate weight models are used for the two trial arms:
STOPBASE: The weight tracks the probability of not receiving a screening mammogram. During the first grace_months months after trial entry the weight is held at 1. After the grace period, the denominator at each month is 1 - p_scrmammo, where p_scrmammo is the predicted probability in pred_prob_col (the denominator is 1 once a breast cancer diagnosis has occurred). If the last mammogram was 10 or fewer months ago (tslm_lag <= 10), the effective screening probability is set to 0, because no screening is expected that soon.
CONTINUE: The compliance window is reset by any mammogram (screening or diagnostic), as measured by tslm_lag. The weight is updated at every month within the compliance window (tslm_lag = 11, 12, or 13). The numerator is the conditional probability of the observed action under a discrete uniform distribution over tslm_lag values 11–13: a screening probability of 1/3 at 11, 1/2 at 12 (given no earlier screening), and 1 at 13. The denominator is the model-predicted probability. Weights stop updating after a breast cancer diagnosis.
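The per-month factors described for the two arms can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper names stopbase_factor() and continue_factor() are hypothetical, and several boundary conventions are assumptions (the grace period is taken to cover month2 = 0 to grace_months - 1, rows at or after the diagnosis month are treated as post-diagnosis, and in the CONTINUE arm the "observed action" is read as screened vs. not screened, so that not screening contributes 1 - q in the numerator and 1 - p in the denominator).

```r
# Hypothetical helper: per-month weight factor for the STOPBASE arm.
stopbase_factor <- function(p_scrmammo, month2, tslm_lag, bc_month,
                            grace_months = 11L) {
  # A mammogram 10 or fewer months ago sets the effective screening
  # probability to 0.
  p_eff <- ifelse(!is.na(tslm_lag) & tslm_lag <= 10, 0, p_scrmammo)
  # Assumption: month2 is 0-indexed, so the grace period is
  # month2 = 0, ..., grace_months - 1.
  in_grace <- month2 < grace_months
  # Assumption: rows at or after the diagnosis month use a denominator of 1.
  after_bc <- !is.na(bc_month) & month2 >= bc_month
  ifelse(in_grace | after_bc, 1, 1 / (1 - p_eff))
}

# Hypothetical helper: per-month weight factor for the CONTINUE arm.
continue_factor <- function(p_scrmammo, scrmammo, month2, tslm_lag, bc_month) {
  # Conditional screening probability under a discrete uniform distribution
  # over tslm_lag = 11, 12, 13; NA outside the compliance window.
  q <- c(1 / 3, 1 / 2, 1)[match(tslm_lag, 11:13)]
  in_window <- !is.na(q)
  after_bc <- !is.na(bc_month) & month2 >= bc_month
  # Assumption: numerator and denominator both refer to the observed action.
  num <- ifelse(scrmammo == 1, q, 1 - q)
  den <- ifelse(scrmammo == 1, p_scrmammo, 1 - p_scrmammo)
  # Outside the window, or after diagnosis, the weight is not updated.
  ifelse(in_window & !after_bc, num / den, 1)
}

# STOPBASE, past the grace period, last mammogram 12 months ago:
stopbase_factor(p_scrmammo = 0.2, month2 = 12, tslm_lag = 12, bc_month = NA)
# CONTINUE, screened at tslm_lag = 11 with predicted probability 0.5:
continue_factor(p_scrmammo = 0.5, scrmammo = 1, month2 = 11, tslm_lag = 11,
                bc_month = NA)
```

Under these assumptions, a CONTINUE-arm row with tslm_lag = 13 and no screening receives a factor of 0, which removes that participant's later rows from the weighted analysis.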
Weights within each arm are cumulative products initialized at 1. The final column wp99 is the weight truncated at the 99th percentile of w computed separately within each arm.
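The accumulation and truncation steps can be illustrated with a small base R sketch. The toy data and the per-month factor column w_factor are made up for illustration, and the sketch assumes rows are sorted by month within each participant-arm; it is not the function's actual code.

```r
# Toy long-format data with a hypothetical per-month factor column.
long_data <- data.frame(
  id       = c(1, 1, 1, 2, 2, 2),
  arm      = rep(c("STOPBASE", "CONTINUE"), each = 3),
  month2   = c(0, 1, 2, 0, 1, 2),
  w_factor = c(1, 1.25, 2, 1, 0.5, 2)
)

# cumprod() depends on row order, so sort by month within participant-arm.
long_data <- long_data[order(long_data$id, long_data$arm, long_data$month2), ]

# Cumulative product of the factors within each participant-arm.
# The first factor here is 1, so the weight starts at 1.
long_data$w <- ave(long_data$w_factor, long_data$id, long_data$arm,
                   FUN = cumprod)

# 99th percentile of w computed separately within each arm.
cap <- ave(long_data$w, long_data$arm,
           FUN = function(x) quantile(x, 0.99, names = FALSE))

# Truncate: weights above the arm-specific cap are set to the cap.
long_data$wp99 <- pmin(long_data$w, cap)
```

Computing the cap within each arm keeps a handful of extreme weights in one arm from determining the truncation threshold applied to the other.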
Value
long_data with two additional columns:
w: Cumulative IPW weight at each participant-arm-month.
wp99: IPW weight truncated at the 99th percentile of w within each arm separately.
References
García-Albéniz X, Hernán MA, Logan RW, Price M, Armstrong K, Hsu J. Continuation of Annual Screening Mammography and Breast Cancer Mortality in Women Older Than 70 Years: A Prospective Observational Study. Ann Intern Med. 2020;172(6):381–389. doi:10.7326/M18-1199