bootstrap_ci – ettbc

Bootstrap Confidence Intervals for Survival Difference

Description

Uses a nonparametric bootstrap, resampling participants, to compute 95% percentile confidence intervals for marginal survival in the CONTINUE and STOPBASE arms and for the difference between them, as estimated by predict_survival_ipw().

Usage

bootstrap_ci(
  long_data,
  pred_prob_col,
  covariate_cols = NULL,
  id_col = "id",
  outcome_col = "dead_t1",
  arm_col = "arm",
  month_col = "month2",
  max_month = 95L,
  rcs_knots = c(6, 48, 72),
  n_boot = 500L,
  seed = NULL,
  fail_threshold = 0.1
)

Arguments

long_data A data frame in long format (one row per participant-arm-month), as produced by expand_to_long() and augmented with the columns required by compute_ipw_weights().

pred_prob_col Name of the column containing the model-predicted probability of a screening mammogram, passed to compute_ipw_weights().

covariate_cols Character vector of covariate column names for predict_survival_ipw(). Set to NULL for unadjusted estimation. Default: NULL.

id_col Name of the participant identifier column. Default: "id".

outcome_col Name of the binary outcome column. Default: "dead_t1".

arm_col Name of the trial arm column. Default: "arm".

month_col Name of the 0-indexed month-from-entry column. Default: "month2".

max_month Last month (0-indexed) at which survival is predicted; the result has one row for each month from 0 through max_month. Default: 95L.

rcs_knots Numeric vector of at least 3 knots for the restricted cubic spline. The first and last elements are the left and right boundary knots; every element between them is an interior knot, so at least one interior knot is always required. Default: c(6, 48, 72) (one interior knot at month 48).

n_boot Number of bootstrap iterations. Default: 500L.

seed Integer seed for reproducibility. NULL means no seed is set. Default: NULL.

fail_threshold Maximum proportion of bootstrap iterations that may fail before a warning is issued. Default: 0.1 (warn if more than 10% of iterations fail).
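
The rcs_knots convention described above can be illustrated with a short sketch. It assumes a natural cubic spline basis from splines::ns() as a stand-in for the restricted cubic spline; this page does not state which basis function the package uses internally, so the call below is illustrative only.

```r
library(splines)

rcs_knots <- c(6, 48, 72)

# Documented requirement: at least 3 knots, i.e. at least one interior knot.
stopifnot(is.numeric(rcs_knots), length(rcs_knots) >= 3)

# First and last elements are the boundary knots; the rest are interior.
last <- length(rcs_knots)
boundary <- rcs_knots[c(1, last)]
interior <- rcs_knots[-c(1, last)]

# Assumption: splines::ns() stands in for the package's spline basis.
# Months outside the boundary knots are extrapolated linearly by ns().
months <- 0:95
basis <- ns(months, knots = interior, Boundary.knots = boundary)

dim(basis)  # one row per month, length(interior) + 1 basis columns
```

With the default knots, the basis has two columns, so the time trend costs two parameters in addition to the intercept.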

Details

At each bootstrap iteration, participant IDs are sampled with replacement. All long-format rows belonging to a sampled ID are included in the bootstrap dataset, with duplicate draws receiving distinct synthetic identifiers. IPW weights are recomputed on each bootstrap sample before estimating survival curves.
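
The resampling step can be sketched as follows. This is a minimal illustration rather than the package's implementation: resample_ids() is a hypothetical helper, and using the draw position as the synthetic identifier is an assumption.

```r
# Hypothetical helper: draw participant IDs with replacement and keep all
# long-format rows of each drawn participant. Each draw gets its own
# synthetic ID, so a participant drawn twice forms two distinct clusters.
resample_ids <- function(long_data, id_col = "id") {
  ids <- unique(long_data[[id_col]])
  drawn <- sample(ids, size = length(ids), replace = TRUE)

  # Row indices of each participant, keyed by the ID as a character string.
  rows_by_id <- split(seq_len(nrow(long_data)), long_data[[id_col]])

  pieces <- lapply(seq_along(drawn), function(k) {
    rows <- rows_by_id[[as.character(drawn[k])]]
    piece <- long_data[rows, , drop = FALSE]
    piece[[id_col]] <- k  # synthetic identifier (assumed: draw position)
    piece
  })
  do.call(rbind, pieces)
}

# Toy long-format data: 4 participants, 3 monthly rows each.
toy <- data.frame(id = rep(1:4, each = 3), month2 = rep(0:2, times = 4))

set.seed(1)  # fixing the seed makes the draw reproducible
boot <- resample_ids(toy)
```

Because the weights are recomputed on each resampled dataset, compute_ipw_weights() would be called on an object like boot before the survival curves are estimated.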

Point estimates are computed on the original (non-resampled) data. At each month, the confidence limits are the 2.5th and 97.5th percentiles of the bootstrap distribution.
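
The percentile step, together with the fail_threshold check, can be sketched as below. summarise_boot() is a hypothetical helper; storing a failed iteration as a row of NA and dropping such rows before taking percentiles are assumptions, not documented behaviour. The bootstrap replicates are simulated placeholders.

```r
# Hypothetical helper: turn bootstrap replicates of the survival difference
# into percentile limits. `boot_diff` is an n_boot x n_months matrix.
# Assumption: a failed iteration is recorded as a row of NA.
summarise_boot <- function(point_diff, boot_diff, fail_threshold = 0.1) {
  failed <- rowSums(is.na(boot_diff)) > 0
  fail_prop <- mean(failed)
  if (fail_prop > fail_threshold) {
    warning(sprintf("%.1f%% of bootstrap iterations failed", 100 * fail_prop),
            call. = FALSE)
  }

  # Percentiles are taken month by month over the successful iterations.
  ok <- boot_diff[!failed, , drop = FALSE]
  ci <- apply(ok, 2, quantile, probs = c(0.025, 0.975), names = FALSE)

  data.frame(
    month = seq_along(point_diff) - 1L,  # 0-indexed months
    diff = point_diff,                   # from the original data
    diff_lo = ci[1, ],
    diff_hi = ci[2, ]
  )
}

# Simulated placeholder replicates: 500 iterations, months 0 through 5.
set.seed(42)
months <- 0:5
boot_diff <- matrix(rnorm(500 * length(months), mean = 0.01, sd = 0.005),
                    nrow = 500)
boot_diff[c(3, 17), ] <- NA  # two failed iterations, well under 10%

res <- summarise_boot(rep(0.01, length(months)), boot_diff)
```

The same computation would apply to the arm-specific columns (s_continue_lo, s_continue_hi, s_stopbase_lo, s_stopbase_hi), using the replicates of each arm's survival curve in place of the differences.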

Value

A data frame with one row per month (0 through max_month), containing:

month: Month index (0-indexed from trial entry).
diff: Point estimate of s_continue - s_stopbase.
diff_lo: 2.5th percentile bootstrap estimate of the survival difference.
diff_hi: 97.5th percentile bootstrap estimate of the survival difference.
s_continue: Point estimate of survival in the CONTINUE arm.
s_stopbase: Point estimate of survival in the STOPBASE arm.
s_continue_lo: 2.5th percentile bootstrap estimate for CONTINUE survival.
s_continue_hi: 97.5th percentile bootstrap estimate for CONTINUE survival.
s_stopbase_lo: 2.5th percentile bootstrap estimate for STOPBASE survival.
s_stopbase_hi: 97.5th percentile bootstrap estimate for STOPBASE survival.

References

García-Albéniz X, Hernán MA, Logan RW, Price M, Armstrong K, Hsu J. Continuation of Annual Screening Mammography and Breast Cancer Mortality in Women Older Than 70 Years: A Prospective Observational Study. Ann Intern Med. 2020;172(6):381–389. doi:10.7326/M18-1199