Predict Baseline-Adjusted Survival Curves

Description

Fits a pooled logistic regression that includes baseline covariates alongside time-by-arm interaction terms, then uses g-computation (standardization) to generate marginal survival curves for each trial arm.

Usage

predict_survival_baseline_adjusted(
  long_data,
  covariate_cols = NULL,
  outcome_col = "dead_t1",
  arm_col = "arm",
  month_col = "month2",
  id_col = "id",
  max_month = 95L,
  rcs_knots = c(6, 48, 72)
)

Arguments

long_data A data frame in long format (one row per participant-arm-month), as produced by expand_to_long(). Must contain the columns named by outcome_col, arm_col, and month_col. Both arms (“STOPBASE” and “CONTINUE”) must be present; non-empty data containing only one arm raises an error.
covariate_cols Character vector of column names to include as additional baseline adjustment terms in the model. Set to NULL for no adjustment (equivalent to predict_survival_unadjusted()). Default: NULL.
outcome_col Name of the binary outcome column (0/1, NA for censored). Default: “dead_t1”.
arm_col Name of the trial arm column (“STOPBASE” / “CONTINUE”). Default: “arm”.
month_col Name of the time variable column (0-indexed month relative to trial entry). Default: “month2”.
id_col Name of the participant identifier column, used to deduplicate baseline rows during standardization. Default: “id”.
max_month Maximum month for survival prediction. Rows with month beyond this value are excluded from both model fitting and prediction. Default: 95.
rcs_knots Numeric vector of at least 3 knots for the restricted cubic spline. The first element is the left boundary knot, the last element is the right boundary knot, and the elements in between are interior knots, so at least one interior knot is required. Default: c(6, 48, 72) (one interior knot at month 48).
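
As an illustration of the rcs_knots layout described above, the sketch below splits a knot vector into boundary and interior knots and builds a natural (restricted) cubic spline basis with splines::ns(). Using ns() here is an assumption made for illustration; this page does not state which spline routine the function calls internally.

```r
library(splines)

rcs_knots <- c(6, 48, 72)
month <- 0:95

# Assumption: the first and last elements are boundary knots and everything
# in between is an interior knot, as described for rcs_knots.
boundary <- c(rcs_knots[1], rcs_knots[length(rcs_knots)])
interior <- rcs_knots[-c(1, length(rcs_knots))]

# Natural cubic spline basis: cubic between the boundary knots, linear beyond.
basis <- ns(month, knots = interior, Boundary.knots = boundary)
dim(basis)
```

With one interior knot the basis has two columns, and months outside the boundary knots are handled by linear extrapolation.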

Details

Extends predict_survival_unadjusted() by adding baseline covariate columns to the right-hand side of the regression formula. Standardization averages predicted survival over the empirical distribution of baseline covariates so that the returned curves are marginal (population-averaged) rather than conditional.
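
The approach described above can be sketched in base R on simulated data. This is a minimal illustration, not the package's implementation: the simulated data frame, the age covariate, the knot positions, and the use of splines::ns() are all assumptions made for the example.

```r
library(splines)
set.seed(1)

# Hypothetical stand-in for long_data: every id appears in both arms.
n <- 300
max_month <- 23
base <- data.frame(id = seq_len(n), age = rnorm(n, 75, 4))
grid <- merge(base, expand.grid(id = base$id,
                                arm = c("CONTINUE", "STOPBASE"),
                                month2 = 0:max_month,
                                stringsAsFactors = FALSE))
grid <- grid[order(grid$id, grid$arm, grid$month2), ]

# Simulate monthly events, then keep rows up to and including the first event.
long <- grid
haz <- plogis(-5 + 0.05 * (long$age - 75) + 0.3 * (long$arm == "STOPBASE"))
long$dead_t1 <- rbinom(nrow(long), 1, haz)
prior_events <- ave(long$dead_t1, long$id, long$arm,
                    FUN = function(x) cumsum(x) - x)
long <- long[prior_events == 0, ]

# Pooled logistic regression: arm-by-time interaction plus a baseline covariate.
fit <- glm(dead_t1 ~ arm * ns(month2, knots = 12, Boundary.knots = c(2, 20)) + age,
           family = binomial(), data = long)

# G-computation: predict the monthly hazard for every id under each arm,
# convert hazards to survival with a cumulative product, then average over ids.
grid$hazard <- predict(fit, newdata = grid, type = "response")
grid$surv <- ave(1 - grid$hazard, grid$id, grid$arm, FUN = cumprod)
m <- tapply(grid$surv, list(grid$month2, grid$arm), mean)

curves <- data.frame(month = 0:max_month,
                     s_continue = as.numeric(m[, "CONTINUE"]),
                     s_stopbase = as.numeric(m[, "STOPBASE"]))
head(curves)
```

Averaging the individual-level survival curves, rather than predicting at the mean covariate values, is what makes the result marginal rather than conditional.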

Value

A data frame with one row per month (0 through max_month), containing:

  • month: Month index (0-indexed from trial entry).

  • s_continue: Estimated marginal survival probability in the CONTINUE arm.

  • s_stopbase: Estimated marginal survival probability in the STOPBASE arm.
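
A data frame of this shape can be turned into cumulative risks and a risk difference with a few lines of base R. The sketch below reuses the first six rows printed in the Examples section; the risk_continue, risk_stopbase, and risk_diff column names are introduced here for illustration and are not returned by the function.

```r
# First six rows of the example output, entered by hand.
surv <- data.frame(
  month = 0:5,
  s_continue = c(0.9961485, 0.9924180, 0.9888043, 0.9853035, 0.9819117, 0.9786251),
  s_stopbase = c(0.9979071, 0.9957995, 0.9936770, 0.9915398, 0.9893876, 0.9872205)
)

# Cumulative risk is the complement of survival.
surv$risk_continue <- 1 - surv$s_continue
surv$risk_stopbase <- 1 - surv$s_stopbase

# Risk difference, STOPBASE minus CONTINUE, at each month.
surv$risk_diff <- surv$risk_stopbase - surv$risk_continue
surv[, c("month", "risk_diff")]
```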

References

García-Albéniz X, Uno H, Bhatt DL, McArdle PH, Joffe MM, Hernán MA. Continuation of Annual Screening Mammography and Breast Cancer Mortality in Women Older Than 70 Years: A Prospective Observational Study. Ann Intern Med. 2020;172(6):381–389. doi:10.7326/M18-1199

See Also

predict_survival_unadjusted(), predict_survival_ipw(), expand_to_long()

Examples

Code
library("ettbc")

cloned <- clone_censor(cohort, screening_mammograms, diagnostic_mammograms)
long_data <- expand_to_long(cloned)
surv <- predict_survival_baseline_adjusted(long_data)
head(surv)
#>   month s_continue s_stopbase
#> 1     0  0.9961485  0.9979071
#> 2     1  0.9924180  0.9957995
#> 3     2  0.9888043  0.9936770
#> 4     3  0.9853035  0.9915398
#> 5     4  0.9819117  0.9893876
#> 6     5  0.9786251  0.9872205