Creates 1:1 matched pairs of exposed ("cases") and unexposed ("controls") individuals. Uses a rolling cohort design in which each individual who becomes exposed is matched with an eligible control, defined as an individual who is unexposed and at risk at the time of matching.

Usage

match_rolling_cohort(
  data,
  outcome_time,
  exposure,
  exposure_time,
  matching_vars,
  id_name,
  replace = FALSE,
  seed = NULL
)

Arguments

data

A data frame with one row per individual containing the columns named in outcome_time, exposure, exposure_time, matching_vars and id_name. Missing values in any column except exposure_time are not allowed.

outcome_time

Name of the follow-up time for the outcome of interest, i.e. time to either the event or right-censoring, whichever occurs first. Time should be measured from a chosen time origin (e.g. study start, enrollment, or age).

exposure

Name of the exposure indicator. The underlying column should be numeric (1 = exposed during follow-up, 0 = never exposed during follow-up).

exposure_time

Name of the time to exposure, measured on the same time scale as that used for outcome_time. Must be a non-missing numeric value for exposed individuals and must be set to NA for unexposed individuals.

matching_vars

Character vector of variables to use for exact matching.

id_name

Name of the variable that uniquely identifies individuals.

replace

Logical. Should controls be reused? Default: FALSE. If TRUE, a control may be matched with exposed individuals at different timepoints, but not with more than one exposed individual at the same timepoint.

seed

Integer seed for reproducible matching results. Useful because a control is chosen at random when more than one eligible control is available. Default: NULL.
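As an illustration of the input layout described for data, exposure and exposure_time, the following sketch builds a small data frame and checks it against the documented requirements. The data frame toy and its values are hypothetical; the column names simply mirror those used in the Examples section. The checks are not part of the package and only restate the rules given above.

```r
# Hypothetical toy data: one row per individual.
toy <- data.frame(
  ID    = 1:6,
  x1    = c(0, 0, 1, 1, 0, 1),
  x2    = c("a", "a", "b", "b", "a", "b"),
  V     = c(1, 0, 1, 0, 0, 1),          # 1 = exposed during follow-up
  D_obs = c(10, NA, 25, NA, NA, 40),    # NA for never-exposed individuals
  Y     = c(80, 60, 90, 30, 100, 75)    # time to event or censoring
)

# Columns that must not contain missing values (all except exposure_time).
required <- c("ID", "x1", "x2", "V", "Y")

stopifnot(
  !any(is.na(toy[required])),           # no NA outside exposure_time
  all(toy$V %in% c(0, 1)),              # numeric 0/1 exposure indicator
  all(!is.na(toy$D_obs[toy$V == 1])),   # exposed: exposure time present
  all(is.na(toy$D_obs[toy$V == 0])),    # unexposed: exposure time is NA
  anyDuplicated(toy$ID) == 0            # identifier is unique
)
```

Running such checks before matching can make it easier to locate rows that violate the input requirements.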

Value

A list containing the following:

matched_data

Data frame of matched pairs with original variables plus:

  • match_index_time: Time at which individuals were matched

  • match_type: "case" or "control"

  • match_<exposure>: Exposure at time of matching

  • match_id: Pair identifier

The data frame contains the matched pairs and the matching information; the original data associated with each individual are otherwise unchanged.

n_unmatched_cases

Number of unmatched exposed individuals

discarded

Logical vector indicating which rows in the original data are excluded from the matched dataset

Details

For each exposure time, newly exposed individuals are matched to eligible controls using exact covariate matching. Controls are eligible if they are unexposed and event-free at the time of matching. Exposed individuals may appear in the final matched dataset twice: as a control (when they are not yet exposed) and as a case.
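The eligibility rule can be sketched in a few lines of base R. This is an illustration only and not the package's internal code: the helper eligible_controls and the data frame toy are hypothetical, and treating "event-free at the time of matching" as a follow-up time strictly greater than the matching time is an assumption.

```r
# Hypothetical toy data; column names mirror the Examples section.
toy <- data.frame(
  ID    = 1:6,
  x1    = c(0, 0, 1, 1, 0, 1),
  x2    = c("a", "a", "b", "b", "a", "b"),
  V     = c(1, 0, 1, 0, 0, 1),
  D_obs = c(10, NA, 25, NA, NA, 40),
  Y     = c(80, 60, 90, 30, 100, 75)
)

# Hypothetical helper: IDs of eligible controls for a single case row.
eligible_controls <- function(data, case_row, matching_vars,
                              id_name, outcome_time, exposure_time) {
  t <- case_row[[exposure_time]]

  # Unexposed at time t: never exposed (NA) or exposed later than t.
  unexposed <- is.na(data[[exposure_time]]) | data[[exposure_time]] > t

  # Event-free at time t (assumption: strictly longer follow-up than t).
  at_risk <- data[[outcome_time]] > t

  # Exact match on every matching variable.
  same_covs <- rep(TRUE, nrow(data))
  for (v in matching_vars) {
    same_covs <- same_covs & data[[v]] == case_row[[v]]
  }

  data[[id_name]][unexposed & at_risk & same_covs]
}

# Case with ID 3 becomes exposed at time 25.
ids <- eligible_controls(toy, toy[3, ], c("x1", "x2"), "ID", "Y", "D_obs")
ids  # IDs 4 and 6; ID 6 is itself exposed later, at time 40

# When several controls are eligible, one is drawn at random,
# so fixing a seed makes the draw reproducible.
set.seed(5678)
chosen <- ids[sample.int(length(ids), 1)]
```

In this toy example, individual 6 is eligible as a control at time 25 and becomes a case at time 40, which is how one individual can enter the matched dataset in both roles.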

Examples

matched_cohort <- match_rolling_cohort(
  data = simdata,
  outcome_time = "Y",
  exposure = "V",
  exposure_time = "D_obs",
  matching_vars = c("x1", "x2"),
  id_name = "ID",
  seed = 5678
)