Expand Cloned Data to Long Format

Description

Converts the output of clone_censor() from one row per participant-arm to one row per participant-arm-month. The resulting dataset is in a format suitable for discrete-time survival analysis with inverse probability weighting (IPW).

Usage

expand_to_long(
  data,
  id_col = "id",
  arm_col = "arm",
  start_col = "start_month",
  end_col = "end_month",
  died_col = "died",
  bc_died_col = "bc_died",
  bc_month_col = "bc_month"
)

Arguments

data          Output from clone_censor(), or any data frame with the same structure. Must contain columns id, arm, start_month, end_month, died, bc_died, and bc_month (or as specified via *_col arguments).
id_col        Name of the participant ID column. Default: "id".
arm_col       Name of the trial arm column. Default: "arm".
start_col     Name of the start month column. Default: "start_month".
end_col       Name of the end month column. Default: "end_month".
died_col      Name of the overall death indicator column. Default: "died".
bc_died_col   Name of the breast cancer death indicator column. Default: "bc_died".
bc_month_col  Name of the breast cancer diagnosis month column. Default: "bc_month".

Details

The binary outcome variable dead_t1 encodes whether the participant died in the interval from month t to month t+1:

  • 0: alive at t+1 (confirmed)

  • 1: died by t+1

  • NA: censored at month t (follow-up ends, outcome unknown)

For participants who died:

  • Rows span from start_month to end_month - 1.

  • The last row (at end_month - 1) has dead_t1 = 1.

  • No row is created for end_month.
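The rules for a participant who died can be sketched in base R. This is a minimal illustration, not the package's implementation: rows_for_death() is a hypothetical helper, and it assumes end_month is strictly later than start_month.

```r
# Hypothetical helper for illustration only; not part of the package.
# Builds the month rows and dead_t1 values for one record that ended in death.
rows_for_death <- function(start_month, end_month) {
  # Assumption: death occurs after the entry month (same-month deaths excluded).
  stopifnot(end_month > start_month)
  month <- seq(start_month, end_month - 1)     # no row for end_month itself
  dead_t1 <- c(rep(0, length(month) - 1), 1)   # only the last row carries the event
  data.frame(month = month, dead_t1 = dead_t1)
}

# A participant who entered at month 26 and died at month 29.
died <- rows_for_death(start_month = 26, end_month = 29)
print(died)
```

The event is attached to the row for month 28 because dead_t1 describes what happens between month t and month t+1.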

For censored or surviving participants:

  • Rows span from start_month to end_month.

  • The last row (at end_month) has dead_t1 = NA.

  • All prior rows have dead_t1 = 0.
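The corresponding rules for a censored or surviving participant can be sketched the same way. rows_for_censored() is a hypothetical helper written for this illustration; it is not the package's code.

```r
# Hypothetical helper for illustration only; not part of the package.
# Builds the month rows and dead_t1 values for one record without a death.
rows_for_censored <- function(start_month, end_month) {
  stopifnot(end_month >= start_month)
  month <- seq(start_month, end_month)          # end_month gets its own row
  dead_t1 <- c(rep(0, length(month) - 1), NA)   # last row: outcome unknown
  data.frame(month = month, dead_t1 = dead_t1)
}

# A participant who entered at month 26 and was last followed at month 29.
censored <- rows_for_censored(start_month = 26, end_month = 29)
print(censored)
```

Compared with a death at the same end_month, this record contributes one extra row, and that final row has a missing outcome rather than a 1.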

If a participant died in the same month they entered the study (end_month == start_month and died == 1), dead_t1 is set to NA (outcome indeterminate within a single month).
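The same-month edge case can be illustrated with a small base R sketch. expand_one() is a hypothetical helper, and the choice to return a single row at start_month is an assumption made for this example; the text above only states that dead_t1 is NA.

```r
# Hypothetical helper for illustration only; not the package's code.
# Returns the month rows and dead_t1 values for one participant-arm record.
expand_one <- function(start_month, end_month, died) {
  stopifnot(end_month >= start_month)
  if (died == 1 && end_month == start_month) {
    # Edge case: death in the entry month. Assumed here to yield a single
    # row at start_month with an indeterminate outcome.
    return(data.frame(month = start_month, dead_t1 = NA_real_))
  }
  if (died == 1) {
    month <- seq(start_month, end_month - 1)   # death: stop before end_month
    last <- 1
  } else {
    month <- seq(start_month, end_month)       # censored: include end_month
    last <- NA_real_
  }
  data.frame(month = month, dead_t1 = c(rep(0, length(month) - 1), last))
}

same_month <- expand_one(start_month = 26, end_month = 26, died = 1)
later      <- expand_one(start_month = 26, end_month = 28, died = 1)
print(same_month)
print(later)
```

Without the special case, the rule for deaths would span start_month to end_month - 1, which is an empty range when the two months coincide.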

The breast-cancer-specific outcome bc_dead_t1 mirrors dead_t1 but is set to 0 at the event row when the death was not attributed to breast cancer (bc_died == 0).
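A minimal sketch of this relationship, assuming dead_t1 has already been built for one participant-arm record. bc_outcome() is a hypothetical helper used only for illustration.

```r
# Hypothetical helper for illustration only; not part of the package.
# Derives bc_dead_t1 from dead_t1 and the record-level bc_died flag.
bc_outcome <- function(dead_t1, bc_died) {
  bc_dead_t1 <- dead_t1
  # The event row is the one where a death is recorded (NA rows are skipped).
  event_row <- !is.na(dead_t1) & dead_t1 == 1
  if (bc_died == 0) {
    bc_dead_t1[event_row] <- 0   # death from another cause: not a BC event
  }
  bc_dead_t1
}

other_cause   <- bc_outcome(c(0, 0, 1), bc_died = 0)
breast_cancer <- bc_outcome(c(0, 0, 1), bc_died = 1)
censored      <- bc_outcome(c(0, NA), bc_died = 0)
```

Rows that are 0 or NA in dead_t1 are left unchanged in this sketch, so the two outcomes differ only at the event row of a non-breast-cancer death.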

Value

A data frame with one row per participant-arm-month, containing:

  • id: Participant identifier

  • arm: Trial arm (“STOPBASE” or “CONTINUE”)

  • month: Calendar month (integer, same scale as input)

  • month2: Month relative to trial entry, 0-indexed

  • dead_t1: Death in the next interval: 1 / 0 / NA

  • bc_dead_t1: Breast cancer death in the next interval

  • bc_long: Breast cancer diagnosis at this month (0/1)
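As a rough illustration of how these columns fit together, the sketch below builds a toy long-format data frame (made-up values, not package output), derives month2 as month minus each record's first month, and computes the raw proportion of deaths per month2 among rows with a known outcome. The month2 derivation is an assumption based on the description above, and the unweighted proportion is only a descriptive check, not the weighted analysis itself.

```r
# Toy long-format data with some of the documented columns (values are made up).
long_toy <- data.frame(
  id      = c(1, 1, 1, 2, 2, 2),
  arm     = "STOPBASE",
  month   = c(26, 27, 28, 30, 31, 32),
  dead_t1 = c(0, 0, 1, 0, 0, NA)
)

# Assumption: month2 counts months since the first row of each id-arm record.
first_month <- ave(long_toy$month, long_toy$id, long_toy$arm, FUN = min)
long_toy$month2 <- long_toy$month - first_month

# Rows with dead_t1 = NA have no observed outcome and are dropped here.
at_risk <- long_toy[!is.na(long_toy$dead_t1), ]

# Unweighted proportion of deaths in each interval, by month since entry.
hazard <- aggregate(dead_t1 ~ month2, data = at_risk, FUN = mean)
print(hazard)
```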

References

García-Albéniz X, Hernán MA, Logan RW, Price M, Armstrong K, Hsu J. Continuation of Annual Screening Mammography and Breast Cancer Mortality in Women Older Than 70 Years: A Prospective Observational Study. Ann Intern Med. 2020;172(6):381–389. doi:10.7326/M18-1199

See Also

clone_censor() for the preceding step.

Examples

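library("ettbc")

# cohort, screening_mammograms, and diagnostic_mammograms are the inputs to
# clone_censor(); they must already exist in the session.
cloned <- clone_censor(
  cohort,
  screening_mammograms,
  diagnostic_mammograms
)

# Expand to one row per participant-arm-month, using the default column names.
long_data <- expand_to_long(cloned)
head(long_data)
#>   id      arm month month2 dead_t1 bc_dead_t1 bc_long
#> 1  1 STOPBASE    26      0       0          0       0
#> 2  1 STOPBASE    27      1       0          0       0
#> 3  1 STOPBASE    28      2       0          0       0
#> 4  1 STOPBASE    29      3       0          0       0
#> 5  1 STOPBASE    30      4       0          0       0
#> 6  1 STOPBASE    31      5       0          0       0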