Title: | Nonparametric Assessment Between Competing Risks Hazard Ratios |
---|---|
Description: | Nonparametric cumulative-incidence based estimation of the ratios of sub-hazard ratios to cause-specific hazard ratios using the approach from Ng et al. (2020). |
Authors: | Daniel Antiporta <[email protected]>; Matthew Matheson <[email protected]>; Derek Ng <[email protected]>; Alvaro Munoz <[email protected]> |
Maintainer: | Daniel Antiporta <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2025-02-16 03:15:59 UTC |
Source: | https://github.com/antiportad/hrcomprisk |
Bootstrap 95% Confidence Intervals limits for estimated Ratios of sHR/csHR.
bootCRCumInc( df, exit, event, exposure, entry = NULL, weights = NULL, ipwvars = NULL, rep = 0, print.attr = T, seed = 54321 )
df |
A data frame containing, at a minimum, exit, event, and exposure. |
exit |
Name of the column in df containing times of event or censoring. |
event |
Name of the column in df containing codes for censoring (0) and event types (1-4). Analysis of more than 4 competing events is not supported by this function. |
exposure |
Name of the column in df containing a binary (0/1) exposure variable for stratification. |
entry |
Name of the column in df containing late entry times. |
weights |
Name of the column in df containing user-supplied weights. If ipwvars is utilized, this argument is ignored. |
ipwvars |
A vector of names of columns in 'df' containing predictor variables for building a propensity score model for exposure and creating standardized inverse probability weights using this model. Overrides the weights argument. |
rep |
Number of replicates for bootstrapping if confidence intervals for the sHR/csHR estimate are desired. See more details on bootstrapping below. |
print.attr |
A logical indicator for whether results should be returned in console. |
seed |
A seed number start for the bootstrap estimation. |
A data frame with the 95% confidence interval limits (upper and lower) for Sub-hazard ratio/Cause-specific hazard ratio for each event:
Lower limit of the 95%CI of the Sub-hazard ratio/Cause-specific hazard ratio for event 1 at time t
Upper limit of the 95%CI of the Sub-hazard ratio/Cause-specific hazard ratio for event 1 at time t
Lower limit of the 95%CI of the Sub-hazard ratio/Cause-specific hazard ratio for event 2 at time t
Upper limit of the 95%CI of the Sub-hazard ratio/Cause-specific hazard ratio for event 2 at time t
# Data from the package
data <- hrcomprisk::dat_ckid
# Obtain the 95% CI by bootstrapping
ciCIF <- bootCRCumInc(df=data, exit=exit, event=event, exposure=b1nb0, rep=10, print.attr=TRUE)
Estimation of Cumulative Incidence Functions (CIF) of competing events.
This function is based on the CIF estimated by the survival package.
CRCumInc( df, time, event, exposed, entry = NULL, weights = NULL, ipwvars = NULL, print.attr = T )
df |
A data frame containing, at a minimum, the time, event, and exposed columns. |
time |
Name of the column in df containing times of event or censoring. |
event |
Name of the column in df containing codes for censoring (0) and event types (1-4). Analysis of more than 4 competing events is not supported by this function. |
exposed |
Name of the column in df containing a binary (0/1) exposure variable for stratification. |
entry |
Name of the column in df containing late entry times. |
weights |
Name of the column in df containing user-supplied weights. If ipwvars is utilized, this argument is ignored. |
ipwvars |
A vector of names of columns in 'df' containing predictor variables for building a propensity score model for exposure and creating standardized inverse probability weights using this model. Overrides the weights argument. |
print.attr |
A logical indicator for whether results should be returned in console. |
A data frame with the following columns:
Type of event that occurs at the given time.
Exposure group in which the event happens.
Time of the event.
Value of the unexposed (denoted by “o”) composite cumulative incidence at the given time.
Value of the exposed (denoted by “x”) composite cumulative incidence at the given time.
Value of the unexposed cumulative incidence of event 1 at the given time.
Value of the exposed cumulative incidence of event 1 at the given time.
Sub-hazard ratio/Cause-specific hazard ratio for event 1.
Sub-hazard ratio/Cause-specific hazard ratio for event 2.
# Data from the package
data <- hrcomprisk::dat_ckid
# Estimate the Cumulative Incidence Functions and Ratios of sHR and csHR
mydat.CIF <- CRCumInc(df=data, time=exit, event=event, exposed=b1nb0, print.attr=TRUE)
A dataset containing time, socioeconomic, and outcome variables of 626 subjects from the Chronic Kidney Disease in Children (CKiD) Study.
dat_ckid
A data frame with 626 rows and 13 variables:
Binary indicator for race: black=1, non-black=0
Years since onset of chronic kidney disease at entry into study
Renal replacement therapy indicator: 0=none, 1=dialysis, 2=transplant
Years since onset of chronic kidney disease at event/censoring time
Binary indicator for use of food assistance
Years in study (=exit-entry)
Household income > $75,000 per year
Household income < $30,000 per year
Binary indicator of low birth weight, premature birth, or small for gestational age
Binary indicator for sex: male=1, female=0
Maternal education less than college
Binary indicator for private doctor
Binary indicator for public insurance
https://statepi.jhsph.edu/ckid/ckid.html
A function to ensure that the data frame fulfills the relevant variable content and type requirements.
datcheck(df, qexit, qevent, qexposure, qentry, qweights, qipwvars, eoi = -1)
df |
A data frame containing, at a minimum, exit, event, and exposure. |
qexit |
Name of the column in df containing times of event or censoring. |
qevent |
Name of the column in df containing codes for censoring (0) and event types (1-4). Analysis of more than 4 competing events is not supported by this function. |
qexposure |
Name of the column in df containing a binary (0/1) exposure variable for stratification. |
qentry |
Name of the column in df containing late entry times. |
qweights |
Name of the column in df containing user-supplied weights. If ipwvars is utilized, this argument is ignored. |
qipwvars |
A vector of names of columns in 'df' containing predictor variables for building a propensity score model for exposure and creating standardized inverse probability weights using this model. Overrides the weights argument. |
eoi |
Event number for the event of interest, useful when more than two events exist. If utilized, only two cumulative incidence curves will be plotted: one for the event of interest, and one for the composite of all competing events. Each event will still have its sHR/csHR ratio plotted. |
Check dataset
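The kind of content and type checks described above can be sketched in base R. This is a hypothetical validator written for illustration, not the package's datcheck() implementation: the function name check_cr_data is made up, and it takes column names as character strings, which is an assumption that differs from the unquoted names used in the package examples.

```r
# Hypothetical validator (NOT hrcomprisk::datcheck); column names are strings here.
check_cr_data <- function(df, exit, event, exposure) {
  # Stop early if any required column is absent
  missing_cols <- setdiff(c(exit, event, exposure), names(df))
  if (length(missing_cols) > 0) {
    return(paste("missing column(s):", paste(missing_cols, collapse = ", ")))
  }
  problems <- character(0)
  if (!is.numeric(df[[exit]])) {
    problems <- c(problems, "exit times must be numeric")
  }
  # Event codes: 0 = censored, 1-4 = event types (NA is flagged as invalid)
  if (!all(df[[event]] %in% 0:4)) {
    problems <- c(problems, "event codes must be 0 (censored) or 1-4")
  }
  # Exposure must be a binary 0/1 indicator
  if (!all(df[[exposure]] %in% c(0, 1))) {
    problems <- c(problems, "exposure must be coded 0/1")
  }
  problems
}

# Made-up data for illustration
good <- data.frame(exit = c(1.5, 2.0, 3.2), event = c(0, 1, 2), b1nb0 = c(0, 1, 1))
bad  <- data.frame(exit = c(1.5, 2.0, 3.2), event = c(0, 1, 7), b1nb0 = c(0, 1, 2))

check_cr_data(good, "exit", "event", "b1nb0")  # no problems reported
check_cr_data(bad,  "exit", "event", "b1nb0")  # reports the event and exposure coding
```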
Estimates the ratios of sub-hazard ratios to cause-specific hazard ratios nonparametrically from cumulative incidences, following Ng, Antiporta, Matheson and Muñoz (2020) [1], in order to compare the sub-hazard ratio (à la Fine and Gray; sHR) and cause-specific hazard ratio (csHR) approaches.
While either analysis on its own involves parametric or semi-parametric estimation, the derivatives of the cumulative incidences cancel when the ratio of the two quantities is taken, so this ratio can be characterized completely using nonparametric estimates of the event-specific cumulative incidences. This provides a useful diagnostic when both analyses are performed, as well as a method for estimating the sub-hazard ratios that is valid and free of the tethering assumptions characterized by Muñoz et al. [2].
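The cancellation can be made concrete with a small sketch. It assumes the standard definitions: within one exposure group, the subhazard of event k is f_k(t) / (1 - F_k(t)) and the cause-specific hazard is f_k(t) / (1 - F(t)), where F_k is the cumulative incidence of event k, F is the composite cumulative incidence of all events, and f_k is the derivative of F_k. The helper name shr_cshr_ratio and the input values are hypothetical, and the package's internal computation may differ in detail.

```r
# Hypothetical helper (not part of hrcomprisk): sHR/csHR ratio for one event at
# a single time point, computed from cumulative incidences only.
#   cif_k_x, cif_k_o     : cumulative incidence of event k (exposed, unexposed)
#   cif_all_x, cif_all_o : composite cumulative incidence (exposed, unexposed)
shr_cshr_ratio <- function(cif_k_x, cif_all_x, cif_k_o, cif_all_o) {
  # Within a group, subhazard / cause-specific hazard = (1 - F) / (1 - F_k):
  # the derivative f_k appears in both hazards and cancels.
  within_x <- (1 - cif_all_x) / (1 - cif_k_x)
  within_o <- (1 - cif_all_o) / (1 - cif_k_o)
  within_x / within_o
}

# Made-up cumulative incidences at one time point
r1 <- shr_cshr_ratio(cif_k_x = 0.30, cif_all_x = 0.50,
                     cif_k_o = 0.20, cif_all_o = 0.30)
r1
```

A value of 1 means the two approaches give the same hazard ratio for that event at that time; identical cumulative incidences in both exposure groups always give 1.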
1. Bootstrapped confidence intervals for the sHR/csHR quantities
If a positive number of bootstrap replicates is requested via the rep argument, the program will calculate and provide pointwise percentile-based bootstrap confidence intervals for the sHR/csHR ratios. The bootstrapping process uses two loops. In the first loop, rep bootstrap samples are taken stratified by exposure (so each sample has the same exposure prevalence as the original data) and all event-specific cumulative incidences are calculated and stored for each of them. In the second loop, for each event time (i.e., each change in any one of the cumulative incidence functions), the 2.5th and 97.5th percentiles of the bootstrap estimates of the rep sHR/csHR ratios are stored as the lower and upper confidence limits. These are not directly returned to the user, but are used in the plotting of the sHR/csHR ratios.
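The two-loop idea can be sketched in base R on toy data. This is an illustration under stated assumptions, not the package's code: the data are simulated, and the statistic is a simple stand-in (a ratio of crude event-1 proportions) rather than the sHR/csHR ratio, so that the example runs without fitting cumulative incidence functions.

```r
set.seed(54321)

# Toy data: binary exposure and an event code (0 = censored, 1 or 2 = event type)
toy <- data.frame(
  exposure = rep(c(0, 1), times = c(60, 40)),
  event    = sample(0:2, 100, replace = TRUE)
)

# Stand-in statistic (assumption for illustration only):
# crude proportion of event 1 among the exposed divided by that among the unexposed.
stat <- function(d) {
  mean(d$event[d$exposure == 1] == 1) / mean(d$event[d$exposure == 0] == 1)
}

# Loop 1: resample rows within each exposure stratum, so every replicate keeps
# the original exposure prevalence, and store the statistic for each replicate.
rep_n <- 200
boot_stats <- numeric(rep_n)
for (b in seq_len(rep_n)) {
  idx <- unlist(lapply(split(seq_len(nrow(toy)), toy$exposure),
                       function(i) i[sample.int(length(i), replace = TRUE)]))
  boot_stats[b] <- stat(toy[idx, ])
}

# Loop 2 (collapsed to a single time point here): percentile confidence limits
ci <- quantile(boot_stats, probs = c(0.025, 0.975), na.rm = TRUE)
ci
```

In the procedure described above, the second step is repeated at every event time, which yields pointwise limits along the whole ratio curve.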
2. User-supplied vs. program-generated weights
If confidence intervals are not desired, the user can supply a column of weights (e.g., inverse probability weights from a model predicting exposure), which will be used in the estimation of the cumulative incidences and the sHR/csHR ratios derived from them. However, bootstrapping confidence intervals as described above requires that such a model be refit for each bootstrap sample, generating new weights for new estimates of all quantities. If this is desired, the user should include all the predictor variables as columns in the data frame so that the appropriate model can be fit automatically using the ipwvars argument. With this method, the program uses a logistic regression model to calculate the probability of exposure, stabilizes the resulting weights to the sample size, and winsorizes weights that fall outside ±4 standard deviations on the log scale. Using this method can increase computation time, as the model must be refit on each bootstrap replicate.
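The weighting steps named above can be sketched in base R. This is one reading of the description, not the package's code. Two points are assumptions: "stabilizes to the sample size" is taken to mean rescaling the weights so they sum to the number of rows, and the ±4 standard deviation limits are taken around the mean of the log weights. The data and the variable names (age, male) are simulated for illustration.

```r
set.seed(1)
n <- 500

# Simulated predictors and a binary exposure that depends on them
toy <- data.frame(age = rnorm(n), male = rbinom(n, 1, 0.5))
toy$exposure <- rbinom(n, 1, plogis(-0.5 + 0.8 * toy$age + 0.4 * toy$male))

# Propensity model: logistic regression of exposure on the predictors
ps_fit <- glm(exposure ~ age + male, family = binomial(), data = toy)
ps <- fitted(ps_fit)

# Inverse probability of the exposure level actually observed
w_raw <- ifelse(toy$exposure == 1, 1 / ps, 1 / (1 - ps))

# Assumed stabilization: rescale so that the weights sum to the sample size
w_stab <- w_raw * n / sum(w_raw)

# Winsorize on the log scale: weights beyond mean +/- 4 SD are pulled to the limit
lw  <- log(w_stab)
lim <- mean(lw) + c(-4, 4) * sd(lw)
w_final <- exp(pmin(pmax(lw, lim[1]), lim[2]))

summary(w_final)
```

Within a bootstrap, these steps would be repeated on each resampled data set, which is why the ipwvars route costs more computation than a fixed weights column.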
3. Use of the nonparametric sHR/csHR ratio for calculation of sHR estimates
Simultaneous estimation of all subhazard ratios and cause-specific hazard ratios is often fraught with problematic results due to incompatible modeling assumptions (e.g., not all ratios can be proportional) and the tethering inherent in subhazard analysis of multiple events as described by Muñoz et al. [2]. Cause-specific hazard ratios are not subject to such tethering constraints, and will be admissible regardless of whether the model is misspecified. As such, the output of this function, valid nonparametric estimates of the sHR/csHR ratio, can be combined with (i.e., multiplied by) cause-specific hazard ratio estimates (e.g., from a proportional or loglinear cause-specific hazards model) to produce subhazard ratio estimates which do not violate the principles of tethering.
The hrcomprisk package provides 3 main functions and a wrapper function:
- npcrest : Main wrapper function.
- CRCumInc : Estimation of Cumulative Incidence Functions (CIF) of competing events.
- plotCIF : Plot Cumulative Incidence and Ratio of sHR/csHR.
- bootCRCumInc : Bootstrap 95% Confidence Intervals limits for estimated Ratios of sHR/csHR.
Maintainer: Daniel Antiporta <[email protected]>
Authors:
- Daniel Antiporta
- Matthew Matheson
- Derek Ng
- Alvaro Muñoz
1. Ng D, Antiporta DA, Matheson M, Muñoz A. Nonparametric assessment of differences between competing risks hazard ratios: application to racial differences in pediatric chronic kidney disease progression. Clinical Epidemiology, 2020 (in press).
2. Muñoz A, Abraham AG, Matheson M, Wada N. Non-proportionality of hazards in the competing risks framework. In: Lee MLT, Gail M, Pfeiffer R, Satten G, Cai T, Gandy A, editors. Risk Assessment and Evaluation of Predictions. New York: Springer; 2013. pp. 3–22.
Useful links:
https://github.com/AntiportaD/hrcomprisk
# Data from the package - see the individual functions for specific examples
data <- hrcomprisk::dat_ckid
A comprehensive wrapper function for implementing the competing risks diagnostic of Ng, Antiporta, Matheson and Muñoz (2020) to compare the sub-hazard ratio (à la Fine and Gray; sHR) and cause-specific hazard ratio (csHR) approaches. While either analysis on its own involves parametric or semi-parametric estimation, the derivatives of the cumulative incidences cancel when the ratio of the two quantities is taken, so this ratio can be characterized completely using nonparametric estimates of the event-specific cumulative incidences. This provides a useful diagnostic when both analyses are performed, as well as a method for estimating the sub-hazard ratios that is valid and free of the tethering assumptions characterized by Muñoz et al. (2013).
This function calls datcheck, CRCumInc, bootCRCumInc (if confidence intervals are requested), and plotCIF, all of which are also included in the hrcomprisk package. These functions should generally not be utilized directly.
npcrest( df, exit, event, exposure, entry = NULL, weights = NULL, ipwvars = NULL, maxtime = Inf, rep = NULL, eoi = -1, print.attr = T )
df |
A data frame containing, at a minimum, exit, event, and exposure. |
exit |
Name of the column in df containing times of event or censoring. |
event |
Name of the column in df containing codes for censoring (0) and event types (1-4). Analysis of more than 4 competing events is not supported by this function. |
exposure |
Name of the column in df containing a binary (0/1) exposure variable for stratification. |
entry |
Name of the column in df containing late entry times. |
weights |
Name of the column in df containing user-supplied weights. If ipwvars is utilized, this argument is ignored. |
ipwvars |
A vector of names of columns in 'df' containing predictor variables for building a propensity score model for exposure and creating standardized inverse probability weights using this model. Overrides the weights argument. |
maxtime |
Largest time to display on the x-axis of all output plots. As data can become sparse and thus more widely variable as times get large, this argument may be used to restrict plots to a range of the data that is judged to be more accurate and reliable. |
rep |
Number of replicates for bootstrapping if confidence intervals for the sHR/csHR estimate are desired. See more details on bootstrapping below. |
eoi |
Event number for the event of interest, useful when more than two events exist. If utilized, only two cumulative incidence curves will be plotted: one for the event of interest, and one for the composite of all competing events. Each event will still have its sHR/csHR ratio plotted. |
print.attr |
A logical indicator for whether results should be returned in console. |
1. Bootstrapped confidence intervals for the sHR/csHR quantities
If a positive number of bootstrap replicates is requested via the rep argument, the program will calculate and provide pointwise percentile-based bootstrap confidence intervals for the sHR/csHR ratios. The bootstrapping process uses two loops. In the first loop, rep bootstrap samples are taken stratified by exposure (so each sample has the same exposure prevalence as the original data) and all event-specific cumulative incidences are calculated and stored for each of them. In the second loop, for each event time (i.e., each change in any one of the cumulative incidence functions), the 2.5th and 97.5th percentiles of the bootstrap estimates of the rep sHR/csHR ratios are stored as the lower and upper confidence limits. These are not directly returned to the user, but are used in the plotting of the sHR/csHR ratios.
2. User-supplied vs. program-generated weights
If confidence intervals are not desired, the user can supply a column of weights (e.g., inverse probability weights from a model predicting exposure), which will be used in the estimation of the cumulative incidences and the sHR/csHR ratios derived from them. However, bootstrapping confidence intervals as described above requires that such a model be refit for each bootstrap sample, generating new weights for new estimates of all quantities. If this is desired, the user should include all the predictor variables as columns in the data frame so that the appropriate model can be fit automatically using the ipwvars argument. With this method, the program uses a logistic regression model to calculate the probability of exposure, stabilizes the resulting weights to the sample size, and winsorizes weights that fall outside ±4 standard deviations on the log scale. Using this method can increase computation time, as the model must be refit on each bootstrap replicate.
3. Use of the nonparametric sHR/csHR ratio for calculation of sHR estimates
Simultaneous estimation of all subhazard ratios and cause-specific hazard ratios is often fraught with problematic results due to incompatible modeling assumptions (e.g., not all ratios can be proportional) and the tethering inherent in subhazard analysis of multiple events as described by Muñoz et al. (2013). Cause-specific hazard ratios are not subject to such tethering constraints, and will be admissible regardless of whether the model is misspecified. As such, the output of this function, valid nonparametric estimates of the sHR/csHR ratio, can be combined with (i.e., multiplied by) cause-specific hazard ratio estimates (e.g., from a proportional or loglinear cause-specific hazards model) to produce subhazard ratio estimates which do not violate the principles of tethering. See below for an example of how to implement this.
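The multiplication described in item 3 amounts to the following arithmetic. All numbers here are hypothetical and chosen only for illustration; in practice the ratios would be read from the estimated sHR/csHR curves at a chosen time and the csHR values from a separately fitted cause-specific hazards model.

```r
# Hypothetical inputs, for illustration only
ratio_shr_cshr <- c(event1 = 0.85, event2 = 1.20)  # nonparametric sHR/csHR at a chosen time
cshr           <- c(event1 = 1.50, event2 = 0.90)  # e.g., from a cause-specific hazards model

# sHR = (sHR / csHR) * csHR, event by event
shr <- ratio_shr_cshr * cshr
shr
```

Because the nonparametric ratio varies over time, the resulting sHR estimates are time-specific even when the csHR is assumed constant.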
An object containing the plotted figures ($plots) and a data frame ($cuminc) with the following columns:
Type of event that occurs at the given time.
Exposure group in which the event happens.
Time of the event.
Value of the unexposed (denoted by “o”) composite cumulative incidence at the given time.
Value of the exposed (denoted by “x”) composite cumulative incidence at the given time.
Value of the unexposed cumulative incidence of event 1 at the given time.
Value of the exposed cumulative incidence of event 1 at the given time.
Sub-hazard ratio/Cause-specific hazard ratio for event 1.
Sub-hazard ratio/Cause-specific hazard ratio for event 2.
1. Ng D, Antiporta DA, Matheson M, Muñoz A. Nonparametric assessment of differences between competing risks hazard ratios: application to racial differences in pediatric chronic kidney disease progression. Clinical Epidemiology, 2020 (in press).
2. Muñoz A, Abraham AG, Matheson M, Wada N. Non-proportionality of hazards in the competing risks framework. In: Lee MLT, Gail M, Pfeiffer R, Satten G, Cai T, Gandy A, editors. Risk Assessment and Evaluation of Predictions. New York: Springer; 2013. pp. 3–22. https://link.springer.com/chapter/10.1007/978-1-4614-8981-8_1
# Data from the package
data <- hrcomprisk::dat_ckid
# Using the wrapper function
npcrest(df=data, exit=exit, event=event, exposure=b1nb0, rep=10, maxtime=20, print.attr=TRUE)
Plot Cumulative Incidence and Ratio of sHR/csHR.
plotCIF(cifobj, maxtime = Inf, ci = NULL, eoi = -1)
cifobj |
A dataframe containing the Cumulative Incidence of each competing event by exposure group. |
maxtime |
Largest time to display on the x-axis of all output plots. |
ci |
A data frame containing the 95% CI for each sHR/csHR ratio. |
eoi |
Event number for the event of interest, useful when more than two events exist. |
A large list containing 2 figures:
Plot the cumulative incidence of the composite event and of each event by exposure group.
Plot the ratio of Sub-hazard ratio and Cause-specific hazard ratio for each event i (Ri).
# Data from the package
data <- hrcomprisk::dat_ckid
# Estimate the Cumulative Incidence Functions and Ratios of sHR and csHR
mydat.CIF <- CRCumInc(df=data, time=exit, event=event, exposed=b1nb0, print.attr=FALSE)
# Plot the cumulative incidences and ratios estimated
plots <- plotCIF(cifobj=mydat.CIF, maxtime = 20, eoi = 1)