Compute a matching-based estimator of VE with confidence intervals

This is the main function for computing a matching-based estimator of vaccine effectiveness (VE) from a matched cohort.

Usage

matching_ve(
  matched_data,
  outcome_time,
  outcome_status,
  exposure,
  exposure_time,
  tau,
  eval_times,
  effect = c("vaccine_effectiveness", "risk_ratio"),
  ci_type = c("wald", "percentile", "both"),
  boot_reps = 0,
  alpha = 0.05,
  keep_models = TRUE,
  keep_boot_samples = TRUE,
  n_cores = 1
)

Arguments

matched_data

A data frame containing the matched cohort.

outcome_time

Name of the time-to-event/censoring variable. Time should be measured from a given time origin (e.g. study start, enrollment, or age) for all individuals.

outcome_status

Name of the event indicator. The underlying column should be numeric (1 = event, 0 = censored).

exposure

Name of the exposure indicator. The underlying column should be numeric (1 = exposed during follow-up, 0 = never exposed during follow-up).

exposure_time

Name of the time to exposure, measured from the chosen time origin; use NA if not exposed. Time must be measured in the same units (e.g. days) as that used for outcome_time.
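As a minimal sketch of how the four variables above are coded, the toy data frame below uses hypothetical column names (t_event, event, vaccinated, t_vacc) and made-up values, with times in days since a common origin. It is not the output of a matching step; it only illustrates the expected encoding.

```r
# Hypothetical toy data; column names and values are illustrative assumptions.
toy <- data.frame(
  id         = 1:4,
  t_event    = c(120, 45, 200, 90),  # time to event or censoring (days since origin)
  event      = c(0, 1, 0, 1),        # 1 = event, 0 = censored
  vaccinated = c(1, 1, 0, 0),        # 1 = exposed during follow-up, 0 = never exposed
  t_vacc     = c(30, 10, NA, NA)     # time to exposure (days); NA if not exposed
)

# Consistency checks implied by the argument descriptions
stopifnot(
  all(toy$event %in% c(0, 1)),
  all(toy$vaccinated %in% c(0, 1)),
  all(is.na(toy$t_vacc) == (toy$vaccinated == 0))
)
```

With such a layout, the column names would be passed as strings, e.g. outcome_time = "t_event" and exposure_time = "t_vacc".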

tau

Non-negative numeric value specifying the time after exposure that should be excluded from the risk evaluation period. This argument is primarily intended for vaccination exposures, where it is common to exclude the time after vaccination when immunity is still building. Time must be measured in the same units as that used for outcome_time and exposure_time and should reflect the biological understanding of when vaccine-induced immunity develops (usually 1-2 weeks). For non-vaccine exposures, tau can be set to 0 (no delay period).

eval_times

Numeric vector specifying the timepoints at which to compute cumulative incidence and the derived effect measures. The timepoints should be expressed in terms of time since exposure. All values must be greater than tau and should correspond to clinically meaningful follow-up durations, such as 30, 60, or 90 days after exposure. A fine grid of timepoints (e.g., eval_times = (tau+1):100) can be provided if cumulative incidence curves over time are desired.
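The constraint that every evaluation timepoint exceeds tau can be checked before calling the function. The sketch below assumes a 14-day delay period purely for illustration; the value of tau and the chosen timepoints are not taken from the package.

```r
tau <- 14  # assumed delay period in days (illustrative)

# A few follow-up durations, in days since exposure
eval_times <- c(30, 60, 90)
stopifnot(is.numeric(eval_times), all(eval_times > tau))

# A fine grid, useful when plotting cumulative incidence curves over time
eval_grid <- (tau + 1):100
stopifnot(all(eval_grid > tau))
length(eval_grid)
```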

effect

Character. Type of effect measure to compute and return, based on the estimated cumulative incidences. Either "vaccine_effectiveness" (default) or "risk_ratio".
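For orientation, the two effect measures are conventionally related through the cumulative incidences in the exposed and unexposed groups. The sketch below assumes the usual definitions, risk ratio = cuminc_1 / cuminc_0 and VE = 1 - risk ratio; this page does not state the formulas, so treat them as an assumption. The numbers are made up.

```r
# Illustrative cumulative incidences at a single timepoint (made-up numbers)
cuminc_0 <- 0.020  # unexposed group
cuminc_1 <- 0.005  # exposed group

# Assumed conventional definitions
risk_ratio <- cuminc_1 / cuminc_0
vaccine_effectiveness <- 1 - risk_ratio

c(risk_ratio = risk_ratio, vaccine_effectiveness = vaccine_effectiveness)
```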

ci_type

Method for constructing bootstrap confidence intervals. One of "wald", "percentile", or "both".

"wald" (default): Computes Wald-style intervals using bootstrap standard errors. See Confidence intervals section for details.
"percentile": Computes percentile bootstrap intervals.
"both": Computes and returns both sets of intervals.
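The difference between the two interval types can be sketched with simulated bootstrap estimates. This is only an illustration on the untransformed scale: the function may apply a transformation before forming Wald intervals, and those details are not reproduced here. The vector boot_est is a simulated stand-in, not package output.

```r
set.seed(1)
alpha <- 0.05
point_est <- 0.75                                  # illustrative point estimate
boot_est  <- rnorm(1000, mean = 0.75, sd = 0.05)   # stand-in for bootstrap estimates

# Wald-style: point estimate +/- normal quantile * bootstrap standard error
z <- qnorm(1 - alpha / 2)
wald_ci <- point_est + c(-1, 1) * z * sd(boot_est, na.rm = TRUE)

# Percentile: empirical quantiles of the bootstrap distribution
pct_ci <- quantile(boot_est, probs = c(alpha / 2, 1 - alpha / 2), na.rm = TRUE)

rbind(wald = wald_ci, percentile = unname(pct_ci))
```

Wald-style intervals are symmetric around the point estimate by construction, whereas percentile intervals follow the shape of the bootstrap distribution.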

boot_reps

Number of bootstrap replicates for confidence intervals. At least 1000 replicates are recommended for publication-quality results; smaller values (e.g., 10-100) are suitable for initial exploration. Default: 0 (no bootstrapping).

alpha

Significance level for confidence intervals (Confidence level = 100*(1-alpha)%). Default: 0.05.

keep_models

Logical; return the two fitted hazard models used to compute cumulative incidences? Default: TRUE.

keep_boot_samples

Logical; return bootstrap samples? Default: TRUE. Must be set to TRUE if you plan to use add_simultaneous_ci() to obtain simultaneous confidence intervals.

n_cores

Integer; number of parallel cores to use for bootstrapping. Passed to parallel::mclapply as mc.cores. Values greater than 1 are supported on Unix-like operating systems only, not on Windows. Default: 1.
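A portable way to pick n_cores is to fall back to a single core on Windows. The sketch below calls parallel::mclapply directly on a stand-in resampling task; the task and the choice of two cores are illustrative assumptions and do not reflect the package's internal bootstrap.

```r
library(parallel)

# Use one core on Windows, where forking is unavailable; otherwise two.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Stand-in for a bootstrap loop: each replicate resamples x and takes the mean.
x <- c(2, 4, 6, 8, 10)
reps <- mclapply(
  1:20,
  function(b) mean(sample(x, replace = TRUE)),
  mc.cores = n_cores
)
length(reps)
```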

Value

A list containing the following:

estimates: A list of matrices of the estimates at each timepoint. The rows of each matrix are the terms "cuminc_0", "cuminc_1", and "vaccine_effectiveness"; the columns give the point estimate and confidence intervals at the specified timepoint.
eval_times: The timepoints at which VE was evaluated.
n_success_boot: A numeric vector of the number of successful bootstrap samples for each timepoint. (Successful bootstrap samples are those that result in non-missing, valid point estimates.)
boot_samples: If keep_boot_samples = TRUE, a list of matrices, one per term, containing the bootstrap estimates; rows are bootstrap iterations and columns are timepoints.
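To make the boot_samples layout concrete, the sketch below builds a mock object with the described shape (rows are bootstrap iterations, columns are timepoints) and counts non-missing estimates per timepoint. All values are invented, and counting non-missing entries is only an approximation of how the function might define a successful replicate.

```r
# Mock object mimicking the documented layout; values are invented.
eval_times <- c(30, 60, 90)
boot_ve <- matrix(
  c(0.70, 0.72, NA,
    0.68, 0.75, 0.71,
    0.74, NA,   0.69,
    0.71, 0.73, 0.70),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, as.character(eval_times))
)
boot_samples <- list(vaccine_effectiveness = boot_ve)

# Count non-missing bootstrap estimates per timepoint
n_success <- colSums(!is.na(boot_samples$vaccine_effectiveness))
n_success
```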