Package 'SimHaz'

Title: Simulated Survival and Hazard Analysis for Time-Dependent Exposure
Description: Generate power for the Cox proportional hazards model by simulating survival events data with time dependent exposure status for subjects. A dichotomous exposure variable is considered with a single transition from unexposed to exposed status during the subject's time on study.
Authors: Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>
License: GPL (>= 2)
Version: 0.2
Built: 2025-02-20 03:44:06 UTC

Help Index

Simulated Survival and Hazard Analysis for Time-Dependent Exposure


This package estimates power for the Cox proportional hazards model by simulating survival event data with time-dependent exposure status for subjects. A dichotomous exposure variable is considered, with a single transition from unexposed to exposed status during the subject's time in the study.


Package: SimHaz
Type: Package
Version: 0.2
Date: 2015-12-09
License: GPL-2
Depends: R (>= 3.1.1)
Imports: survival


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


# Simulate a dataset of 600 subjects with time-dependent exposure without
# considering minimum follow-up time or minimum post-exposure follow-up time.
# Specifically, set the duration of the study to be 24 months; the median time to
# event for control group to be 24 months; exposure effect to be 0.3; median time
# to censoring to be 14 months; and exposure proportion to be 20%.

df <- tdSim.method1(N = 600, duration = 24, lambda = log(2)/24, rho = 1, 
   beta = 0.3, rateC = log(2)/14, exp.prop = 0.2, 
   prop.fullexp  = 0, maxrelexptime = 1, min.futime = 0,
   min.postexp.futime = 0)
# We recommend setting nSim to at least 500. It is set to 10 in the example to
# reduce run time for CRAN submission.

ret <- getpower.method1(nSim = 10, N = 600, beta = 0.3, exp.prop = 0.2,
    type = "td", scenario = "scenario 1", maxrelexptime = 1/6, min.futime = 4,
    min.postexp.futime = 4, output.fn = "output.csv")
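
The examples above pass lambda = log(2)/24 and rateC = log(2)/14, which is the usual conversion from a median time to the rate of an exponential distribution (the rho = 1 case). A minimal base R sketch of that conversion follows; the helper name median_to_rate is hypothetical and is not part of SimHaz.

```r
# Hypothetical helper (not part of SimHaz): convert a median time into the
# rate of an exponential distribution, using median = log(2) / rate.
median_to_rate <- function(median_time) {
  stopifnot(is.numeric(median_time), all(median_time > 0))
  log(2) / median_time
}

lambda <- median_to_rate(24)  # event rate for a 24-month median
rateC  <- median_to_rate(14)  # censoring rate for a 14-month median

# Sanity check: the theoretical medians implied by these rates should
# recover the medians we started from.
median_event  <- qexp(0.5, rate = lambda)
median_censor <- qexp(0.5, rate = rateC)
print(c(median_event = median_event, median_censor = median_censor))
```

This only covers the exponential case; how the package scales lambda when rho differs from 1 is not stated here, so the sketch makes no claim about it.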

Calculate power and betahat for the Cox proportional hazards model in the case of exposure matching


This function runs nSim (number of simulations, specified by the user) Monte Carlo simulations, each time calling tdSim.method1 internally. It is used in the case of exposure matching, where each exposed subject is matched with non-exposed subjects according to the user-specified ratio. The user specifies which Cox model approach is used to analyze the data with clusters (matching sets in this case).

The function returns a data frame of scenario-specific parameters (including statistical power) and appends the output to a file whose name is specified in the input parameters. The user also has the option to produce an incidence plot.


getpower.exp.matching(nSim, N_match, duration = 24, med.TTE.Control = 24,
    rho = 1, med.TimeToCensor = 14, beta, matching.ratio, type, scenario,
    method, prop.fullexp = 0, maxrelexptime = 1, min.futime = 0,
    min.postexp.futime = 0, output.fn, simu.plot = FALSE)



Number of simulations.


Number of subjects to be screened.


Length of the study in months; the default value is 24 (months).


Median time to event for control group; the default value is 24 (months).


Shape parameter of the Weibull distribution. Default is 1, which will generate survival times by using the exponential distribution.


Median time to censoring for all subjects. The default value is 14 (months).


A numeric value that represents the exposure effect: the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event.


Matching ratio used in exposure matching. For 1:1 matching, specify 1. An input value of 3 corresponds to 1:3 (exposed : unexposed), and an input value of 0.25 corresponds to 4:1 (i.e., 1:0.25).


A text string indicating which type of dataset is of interest. Must be either "fixed" or "td".


Specifies which Cox model approach is used to analyze the data with clusters (matching sets in this case). Can be one of the following: 'frailty', 'fixed effects', 'strata', 'Model with Independence Assumption'.


Any text string supplied by the user to name the scenario being simulated. The user can simply pass " " to leave the scenario unnamed.


A numeric value in the interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study. Only applies when type is "td". When type is "fixed", the value is automatically 1.


A numeric value in the interval (0, 1] that represents the maximum relative exposure time. If this value is p, the exposure time for each subject is uniformly distributed from 0 to p times the subject's time in the study. The default value is 1, which means an exposed subject's exposure status transition can occur at any point during the subject's time in the study.


A numeric value that represents the minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study.


A numeric value that represents the minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study after their exposure.


Name of the .csv file to which the output is written. If the file does not exist, the function creates a new .csv file for the output.


A logical value indicating whether or not to output an incidence plot. The default value is FALSE.


The function calculates power based on the Cox regression model by calling the coxph function from the survival package on the simulated data from tdSim.method1.
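
The matching ratio convention described above (a value r is read as 1:r, exposed : unexposed, so values below 1 mean more exposed than unexposed subjects per set) can be sketched with a small base R helper. The function describe_ratio is hypothetical and not part of SimHaz; rounding 1/r for values below 1 is an assumption made here so that an input such as 0.333 is labeled 3:1.

```r
# Hypothetical helper (not part of SimHaz): turn a numeric matching ratio
# into an "exposed:unexposed" label.
describe_ratio <- function(r) {
  stopifnot(is.numeric(r), length(r) == 1, r > 0)
  if (r >= 1) {
    # r unexposed subjects per exposed subject
    paste0("1:", format(r))
  } else {
    # Assumption: round 1/r so that 0.25 -> 4:1 and 0.333 -> 3:1
    paste0(format(round(1 / r)), ":1")
  }
}

ratio_labels <- vapply(c(1, 0.25, 2, 3), describe_ratio, character(1))
print(ratio_labels)
```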


A data.frame object with columns corresponding to


Scenario name specified by the user


Dataset type specified by the user


Number of matching sets specified by the user


Matching ratio used in exposure matching specified by the user


Minimum follow-up time to be considered, specified by the user


Minimum post-exposure follow-up time to be considered, specified by the user


Value of the scale parameter of the Weibull distribution to generate survival times. Calculated from median time to event for control group, which is specified by the user.


User-specified value of the shape parameter of the Weibull distribution to generate survival times


Rate of the exponential distribution used to generate censoring times. Calculated from the median time to censoring, which is specified by the user.


i_beta: Input value of the regression coefficient beta (log hazard ratio).


Simulated number of evaluable subjects, which is the resulting number of subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time.


Simulated proportion of exposed subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time.


Simulated value of regression coefficient (log hazard ratio)


Simulated value of hazard ratio


Simulated number of events in total


Simulated number of events in control group


Simulated number of events in exposed group


Simulated median survival time in control group


Simulated median survival time in exposed group


Simulated statistical power from the Cox regression model on data with time-dependent exposure


Variance of the betahat from the simulations


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


Therneau T (2015). A Package for Survival Analysis in S. Version 2.38.


# We recommend setting nSim to at least 500. It is set to 10 in the example to
# reduce run time for CRAN submission.

# Run 10 simulations. Each time, simulate a dataset of 100 matching sets with
# time-dependent exposure, using a 1:3 (exposed : unexposed) matching ratio.
# No minimum follow-up time or minimum post-exposure follow-up time is
# imposed, and the maximum relative exposure time is set to 1.

ret <- getpower.exp.matching(nSim = 10, N_match = 100, duration = 24,
    med.TTE.Control = 24, rho = 1, med.TimeToCensor = 14, beta = 0.7,
    matching.ratio = 3, type = "td", scenario = "exposure_matching",
    method = "marginal", prop.fullexp = 0, maxrelexptime = 1,
    min.futime = 0, min.postexp.futime = 0, output.fn = "result_matching",
    simu.plot = FALSE)

Calculate betahat bias and variance for different matching ratios in the case of exposure matching in the Cox proportional hazards model


This function is used to explore which matching ratios should be used in the case of exposure matching. It simulates a large population (100000 subjects) in order to estimate a betahat value for that population. Then, in repeated simulations, a subset of that population is drawn and exposure matching is performed according to a user-specified vector of ratios. For each ratio, the function reports the bias of the betahat estimate relative to the betahat value estimated from the large population, as well as the variance of the betahat estimate.


getpower.exp.matching.opt(nSim, N, ratios = c(1, 0.25, 0.333, 0.5, 2, 3, 4, 5),
    duration = 24, med.TTE.Control = 24, rho = 1, med.TimeToCensor = 14, beta,
    exp.prop, type, scenario, method, prop.fullexp = 0, maxrelexptime = 1,
    min.futime = 0, min.postexp.futime = 0, output.fn = NULL, simu.plot = FALSE)



Number of simulations.


Number of subjects to be screened.


Matching ratios to evaluate, given as a numeric vector. For 1:1 matching, specify 1. An input value of 3 corresponds to 1:3 (exposed : unexposed), and an input value of 0.25 corresponds to 4:1 (i.e., 1:0.25). For example, c(1, 0.25, 2) corresponds to the matching ratios 1:1, 4:1, and 1:2.


Length of the study in months; the default value is 24 (months).


Median time to event for control group; the default value is 24 (months).


Shape parameter of the Weibull distribution. Default is 1, which will generate survival times by using the exponential distribution.


Median time to censoring for all subjects. The default value is 14 (months).


A numeric value that represents the exposure effect: the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event.




A text string indicating which type of dataset is of interest. Must be either "fixed" or "td".


Any text string supplied by the user to name the scenario being simulated. The user can simply pass " " to leave the scenario unnamed.


Specifies which Cox model approach is used to analyze the data with clusters (matching sets in this case). Can be one of the following: 'frailty', 'fixed effects', 'strata', 'Model with Independence Assumption'.


A numeric value in the interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study. Only applies when type is "td". When type is "fixed", the value is automatically 1.


A numeric value in the interval (0, 1] that represents the maximum relative exposure time. If this value is p, the exposure time for each subject is uniformly distributed from 0 to p times the subject's time in the study. The default value is 1, which means an exposed subject's exposure status transition can occur at any point during the subject's time in the study.


A numeric value that represents the minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study.


A numeric value that represents the minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study after their exposure.


Name of the .csv file to which the output is written. If the file does not exist, the function creates a new .csv file for the output.


A logical value indicating whether or not to output an incidence plot. The default value is FALSE.


The function calculates power based on the Cox regression model by calling the coxph function from the survival package on the simulated data from tdSim.method1.


A data.frame object with columns corresponding to


Scenario name specified by the user


Dataset type specified by the user


Minimum follow-up time to be considered, specified by the user


Minimum post-exposure follow-up time to be considered, specified by the user


Value of the scale parameter of the Weibull distribution to generate survival times. Calculated from median time to event for control group, which is specified by the user.


User-specified value of the shape parameter of the Weibull distribution to generate survival times


Rate of the exponential distribution used to generate censoring times. Calculated from the median time to censoring, which is specified by the user.


i_beta: Input value of the regression coefficient beta (log hazard ratio).


Number of matching sets


Number of exposed subjects


Number of unexposed subjects


Simulated value of regression coefficient (log hazard ratio)


Simulated statistical power from the Cox regression model on data with time-dependent exposure


Value of betahat based on the population of 100000 subjects.


Bias of betahat: betahat minus actual_betahat (the value approximated from the large population)


Variance of the betahat from the simulations


The variance of betahat of the ratio in that row divided by variance of betahat for the 1:1 matching


The variance of betahat of the ratio in that row divided by variance of the betahat for the closest ratio based on the exposure proportion in the population
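
The bias and variance columns listed above can be illustrated with a short base R sketch. The estimates below are drawn from rnorm purely as placeholders (they are not output from SimHaz or from any Cox fit), and the object and column names are hypothetical.

```r
set.seed(42)

# Stand-in for the betahat estimated from the large population.
actual_betahat <- 0.5

# Placeholder per-simulation estimates for three matching ratios. In practice
# these would be the betahat values from repeated Cox model fits; the means
# and standard deviations used here are arbitrary.
betahat_by_ratio <- list(
  "1:1" = rnorm(200, mean = 0.5, sd = 0.20),
  "1:2" = rnorm(200, mean = 0.5, sd = 0.17),
  "1:3" = rnorm(200, mean = 0.5, sd = 0.16)
)

summary_df <- data.frame(
  ratio = names(betahat_by_ratio),
  bias  = sapply(betahat_by_ratio, mean) - actual_betahat,  # mean betahat minus reference
  var   = sapply(betahat_by_ratio, var),                    # variance across simulations
  row.names = NULL
)

# Variance of each ratio relative to the variance under 1:1 matching.
summary_df$var_vs_1to1 <- summary_df$var / summary_df$var[summary_df$ratio == "1:1"]
print(summary_df)
```

A relative variance below 1 in the last column would indicate that the ratio in that row gives a more precise estimate than 1:1 matching.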


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


Therneau T (2015). A Package for Survival Analysis in S. Version 2.38.


# We recommend setting nSim to at least 500. It is set to 5 in the example to
# reduce run time for CRAN submission.

# Run 5 simulations. Each time, simulate a dataset of 400 subjects with
# time-dependent exposure and evaluate 1:1 exposure matching. No minimum
# follow-up time or minimum post-exposure follow-up time is imposed, and the
# maximum relative exposure time is set to 1.

ret <- getpower.exp.matching.opt(nSim = 5, N = 400, ratios = c(1), duration = 24,
    med.TTE.Control = 24, rho = 1, med.TimeToCensor = 14, beta = 0.5,
    exp.prop = 0.3, type = "td", scenario = "opt_exp_matching",
    method = "marginal", prop.fullexp = 0, maxrelexptime = 1, min.futime = 0,
    min.postexp.futime = 0, output.fn = "opt_matching", simu.plot = FALSE)

Calculate power for the Cox proportional hazards model with time-dependent exposure using method 1


This function runs nSim (number of simulations, specified by the user) Monte Carlo simulations, each time calling tdSim.method1 internally. The function returns a data frame of scenario-specific parameters (including statistical power) and appends the output to a file whose name is specified in the input parameters. The user also has the option to produce an incidence plot.


getpower.method1(nSim, N, duration = 24, med.TTE.Control = 24, rho = 1,
    med.TimeToCensor = 14, beta, exp.prop, type, scenario, prop.fullexp = 0,
    maxrelexptime = 1, min.futime = 0, min.postexp.futime = 0, output.fn, 
    simu.plot = FALSE)



Number of simulations.


Number of subjects to be screened.


Length of the study in months; the default value is 24 (months).


Median time to event for control group; the default value is 24 (months).


Shape parameter of the Weibull distribution. Default is 1, which will generate survival times by using the exponential distribution.


Median time to censoring for all subjects. The default value is 14 (months).


A numeric value that represents the exposure effect: the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event.


A numeric value strictly between 0 and 1 (0 and 1 excluded) that represents the proportion of subjects assigned an exposure.


A text string indicating which type of dataset is of interest. Must be either "fixed" or "td".


Any text string supplied by the user to name the scenario being simulated. The user can simply pass " " to leave the scenario unnamed.


A numeric value in the interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study. Only applies when type is "td". When type is "fixed", the value is automatically 1.


A numeric value in the interval (0, 1] that represents the maximum relative exposure time. If this value is p, the exposure time for each subject is uniformly distributed from 0 to p times the subject's time in the study. The default value is 1, which means an exposed subject's exposure status transition can occur at any point during the subject's time in the study.


A numeric value that represents the minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study.


A numeric value that represents the minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study after their exposure.


Name of the .csv file to which the output is written. If the file does not exist, the function creates a new .csv file for the output.


A logical value indicating whether or not to output an incidence plot. The default value is FALSE.


The function calculates power based on the Cox regression model by calling the coxph function from the survival package on the simulated data from tdSim.method1.
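
As a rough illustration of simulation-based power with coxph, the sketch below estimates power as the share of simulated datasets in which the exposure coefficient is significant. It is not the package's implementation: it uses a fixed (not time-dependent) exposure, exponential event and censoring times, and a significance level of 0.05, all of which are assumptions made here, and every object name is hypothetical.

```r
library(survival)

set.seed(123)
n_sim    <- 20           # kept small so the sketch runs quickly
n        <- 300          # subjects per simulated dataset
beta     <- 0.7          # assumed log hazard ratio for exposure
exp_prop <- 0.2          # proportion of exposed subjects
duration <- 24           # study length in months
lambda   <- log(2) / 24  # control-group event rate (24-month median)
rate_c   <- log(2) / 14  # censoring rate (14-month median)
alpha    <- 0.05         # assumed significance level

# Simulate one dataset, fit a Cox model, and return the exposure p-value.
one_p_value <- function() {
  x           <- rbinom(n, 1, exp_prop)
  event_time  <- rexp(n, rate = lambda * exp(beta * x))
  censor_time <- pmin(rexp(n, rate = rate_c), duration)  # administrative cutoff
  sim <- data.frame(
    time   = pmin(event_time, censor_time),
    status = as.integer(event_time <= censor_time),
    x      = x
  )
  fit <- coxph(Surv(time, status) ~ x, data = sim)
  summary(fit)$coefficients["x", "Pr(>|z|)"]
}

p_values <- replicate(n_sim, one_p_value())
power    <- mean(p_values < alpha)  # share of simulations rejecting beta = 0
print(power)
```

With only 20 simulations the estimate is very noisy, which is why at least 500 simulations are recommended for real use.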


A data.frame object with columns corresponding to


Scenario name specified by the user


Dataset type specified by the user


Number of subjects to be screened, specified by the user


Minimum follow-up time to be considered, specified by the user


Minimum post-exposure follow-up time to be considered, specified by the user


Exposure rate specified by the user


Value of the scale parameter of the Weibull distribution to generate survival times. Calculated from median time to event for control group, which is specified by the user.


User-specified value of the shape parameter of the Weibull distribution to generate survival times


Rate of the exponential distribution used to generate censoring times. Calculated from the median time to censoring, which is specified by the user.


i_beta: Input value of the regression coefficient (log hazard ratio).


Simulated number of evaluable subjects, which is the resulting number of subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time.


Simulated proportion of exposed subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time.


Simulated value of regression coefficient (log hazard ratio)


Simulated value of hazard ratio


Simulated number of events in total


Simulated number of events in control group


Simulated number of events in exposed group


Simulated median survival time in control group


Simulated median survival time in exposed group


Simulated statistical power from the Cox regression model on data with time-dependent exposure


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


Therneau T (2015). A Package for Survival Analysis in S. Version 2.38.


# We recommend setting nSim to at least 500. It is set to 10 in the example to
# reduce run time for CRAN submission.

# Run 10 simulations. Each time simulate a dataset of 600 subjects with
# time-dependent exposure with both minimum follow-up time (4 months) and
# minimum post-exposure follow-up time (4 months) imposed. Also consider a
# quick exposure after entering the study for each exposed subject. Set the
# maximum relative exposure time to be 1/6. 

# Set the duration of the study to be 24 months; the median time to event for
# control group to be 24 months; exposure effect to be 0.3; median time to
# censoring to be 14 months; and exposure proportion to be 20%.

ret <- getpower.method1(nSim = 10, N = 600, beta = 0.3, exp.prop = 0.2,
    type = "td", scenario = " ", maxrelexptime = 1/6, min.futime = 4,
    min.postexp.futime = 4, output.fn = "output.csv")
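
To make the maximum relative exposure time concrete, the following base R sketch draws an exposure time uniformly between 0 and p times each subject's time in the study (p = 1/6, as in the example above) and then splits each subject's follow-up into an unexposed and an exposed interval. The follow-up times, event indicators, and column names are hypothetical, and the (start, stop] layout is simply the usual counting-process format for time-dependent covariates, not necessarily the exact structure SimHaz produces.

```r
set.seed(1)

time_in_study <- c(6, 12, 24)  # hypothetical follow-up times in months
status        <- c(1, 0, 1)    # hypothetical event indicators at end of follow-up
p             <- 1 / 6         # maximum relative exposure time

# Exposure time is uniform on (0, p * time in study) for each subject.
exposure_time <- runif(length(time_in_study), min = 0, max = p * time_in_study)

# One row per interval: unexposed until exposure_time, exposed afterwards.
n <- length(time_in_study)
td_rows <- rbind(
  data.frame(id = seq_len(n), start = 0, stop = exposure_time,
             status = 0, x = 0),
  data.frame(id = seq_len(n), start = exposure_time, stop = time_in_study,
             status = status, x = 1)
)
td_rows <- td_rows[order(td_rows$id, td_rows$start), ]
print(td_rows)
```

With p = 1/6, every transition falls within the first sixth of follow-up, which is what "quick exposure after entering the study" means in the example comments.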

Calculate power for the Cox proportional hazards model with time-dependent exposure using method 2


This function runs nSim (number of simulations, specified by the user) Monte Carlo simulations, each time calling tdSim.method2 internally. The function returns a data frame of scenario-specific input parameters together with the simulated statistical power. The user has the option to append the output to a file whose name is specified in the input parameters.


getpower.method2(nSim = 500, N, duration = 24, scenario, lambda12,
  lambda23 = NULL, lambda13, HR = NULL, exp.prop, rateC, min.futime, 
  min.postexp.futime, output.fn, simu.plot = FALSE)



Number of simulations.


Number of subjects to be screened.


Length of the study in months; the default value is 24 (months).


Any text string supplied by the user to name the scenario being simulated. The user can simply pass " " to leave the scenario unnamed.


Lambda12 parameter to control time to exposure.


Lambda23 parameter to control time to event after exposure.


Lambda13 parameter to control time to event in the control group.


Hazard Ratio. This input is optional. If HR is set and lambda23 is not set, lambda23 = lambda13*HR.


A numeric value strictly between 0 and 1 (0 and 1 excluded) that represents the proportion of subjects assigned an exposure.


Rate of the exponential distribution to generate censoring times.


A numeric value that represents minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study.


A numeric value that represents the minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study after their exposure.


Name of the .csv file to which the output is written. If the file does not exist, the function creates a new .csv file for the output.


A logical value indicating whether or not to output an incidence plot. The default value is FALSE.


The function calculates power based on the Cox regression model by calling the coxph function from the survival package on the simulated data from tdSim.method2.
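
The three rate parameters can be read as transitions in a three-state model: unexposed to exposed (lambda12), unexposed to event (lambda13), and exposed to event (lambda23). The base R sketch below is one plausible reading of that setup using competing exponential times; it is an assumption for illustration, not the algorithm used by tdSim.method2, and it ignores exp.prop and censoring. The HR value is hypothetical; only the rule lambda23 = lambda13 * HR comes from the argument description above.

```r
set.seed(7)

n        <- 1000
lambda12 <- 1.3             # rate of moving from unexposed to exposed
lambda13 <- 0.03            # event rate while unexposed
HR       <- 1.5             # hypothetical hazard ratio
lambda23 <- lambda13 * HR   # rule used when HR is set and lambda23 is not

# Competing exponential clocks from the unexposed state.
t12 <- rexp(n, rate = lambda12)  # time to exposure
t13 <- rexp(n, rate = lambda13)  # time to event without exposure

# A subject becomes exposed if exposure happens before the event.
exposed <- t12 < t13

# After exposure, the remaining time to event follows the lambda23 rate.
t23        <- rexp(n, rate = lambda23)
event_time <- ifelse(exposed, t12 + t23, t13)

print(c(prop_exposed = mean(exposed), median_event_time = median(event_time)))
```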


A data.frame object with columns corresponding to


Scenario name specified by the user


Number of subjects to be screened, specified by the user


Minimum follow-up time to be considered, specified by the user


Minimum post-exposure follow-up time to be considered, specified by the user


Exposure rate specified by the user


Lambda12 parameter to control time to exposure


Lambda23 parameter to control time to event after exposure


Lambda13 parameter to control time to event in the control group


Rate of the exponential distribution used to generate censoring times, specified by the user


i_beta: Input value of the regression coefficient (log hazard ratio)


Simulated number of evaluable subjects, which is the resulting number of subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time


Simulated proportion of exposed subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time


Simulated value of regression coefficient (log hazard ratio)


Simulated value of hazard ratio


Simulated number of events in total


Simulated number of events in control group


Simulated number of events in exposed group


Simulated median survival time in control group


Simulated median survival time in exposed group


Simulated statistical power from the Cox regression model on data with time-dependent exposure


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


# We recommend setting nSim to at least 500. It is set to 10 in the example to
# reduce run time for CRAN submission.

# Run 10 simulations. Each time simulate a dataset of 600 subjects

ret <- getpower.method2(nSim=10, N=600, duration=24, scenario="test",
    lambda12=1.3, lambda23=0.04, lambda13=0.03, HR=NULL,exp.prop=0.2, rateC=0.05,
    min.futime=4, min.postexp.futime=4,output.fn="database.csv", simu.plot=FALSE)

Calculate power for the Cox proportional hazards model with time-dependent exposure and multiple centers using method 1


This function runs nSim (number of simulations, specified by the user) Monte Carlo simulations, each time calling tdSim.multicenter internally. The function returns a data frame of scenario-specific input parameters together with the simulated statistical power. The user has the option to append the output to a file whose name is specified in the input parameters.


getpower.multicenter(nSim, N, duration=24, rho=1, beta, med.TimeToCensor=14,
    df, dist=NULL, method, type, scenario, prop.fullexp=0, maxrelexptime=1,
    min.futime=0, min.postexp.futime=0, output.fn, simu.plot=FALSE)



Number of simulations.


Number of subjects to be screened.


Length of the study in months; the default value is 24 (months).


Shape parameter of the Weibull distribution. Default is 1, which will generate survival times by using the exponential distribution.


A numeric value that represents the exposure effect: the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event.


Median time to censoring for all subjects. The default value is 14 (months).


A user-specified n by 4 clustering data frame with columns cat_id (category id, which is the physician site id; it can be either text strings or integers), center.size (number of subjects within each center), cat_exp.prop (proportion of exposed subjects in each center), and med.TTE.Control (median time to event for the control group in each center). The n rows correspond to n different centers.


The distribution of the center effect across centers. Default is NULL. If dist = 'gamma', a random frailty effect from a gamma distribution with scale 0.5 and shape 2 is used.


Specifies which Cox model approach is used to analyze the data with centers. Can be one of the following: 'frailty', 'fixed effects', 'strata', 'Model with Independence Assumption'.


A text string indicating which type of dataset is of interest. Must be either "fixed" or "td".


Any text string supplied by the user to name the scenario being simulated. The user can simply pass " " to leave the scenario unnamed.


A numeric value in the interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study. Only applies when type is "td". When type is "fixed", the value is automatically 1.


A numeric value in the interval (0, 1] that represents the maximum relative exposure time. If this value is p, the exposure time for each subject is uniformly distributed from 0 to p times the subject's time in the study. The default value is 1, which means an exposed subject's exposure status transition can occur at any point during the subject's time in the study.


A numeric value that represents the minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study.


A numeric value that represents the minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. A positive value excludes subjects who spend only a short amount of time in the study after their exposure.


Name of the .csv file to which the output is written. If the file does not exist, the function creates a new .csv file for the output.


A logical value indicating whether or not to output an incidence plot. The default value is FALSE.


The function calculates power based on the Cox regression model by calling the coxph function from the survival package on the simulated data from tdSim.multicenter.
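
The gamma center effect mentioned for dist = 'gamma' (shape 2, scale 0.5, so mean 1) can be sketched in base R as follows. Applying the frailty as a multiplier on each center's baseline event rate is an assumption made for illustration; the source does not say how tdSim.multicenter applies it. The base_rate and center_rate columns are hypothetical names.

```r
set.seed(11)

# Clustering data frame in the documented layout (values are illustrative).
centers <- data.frame(
  cat_id          = c("low", "med", "high"),
  center.size     = rep(100, 3),
  cat_exp.prop    = rep(1 / 3, 3),
  med.TTE.Control = c(14, 20, 31)
)

# One frailty draw per center from a gamma distribution with shape 2 and
# scale 0.5; its mean is shape * scale = 1.
frailty <- rgamma(nrow(centers), shape = 2, scale = 0.5)

# Baseline exponential event rate implied by each center's control median.
centers$base_rate <- log(2) / centers$med.TTE.Control

# Assumption: the frailty multiplies the baseline rate within each center.
centers$center_rate <- centers$base_rate * frailty
print(centers)
```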


A data.frame object with columns corresponding to


Scenario name specified by the user


Dataset type specified by the user


Number of subjects to be screened, specified by the user


Minimum follow-up time to be considered, specified by the user


Minimum post-exposure follow-up time to be considered, specified by the user


Exposure rate specified by the user


Value of the scale parameter of the Weibull distribution to generate survival times. Calculated from median time to event for control group, which is specified by the user.


User-specified value of the shape parameter of the Weibull distribution to generate survival times


Rate of the exponential distribution used to generate censoring times. Calculated from the median time to censoring, which is specified by the user.


i_beta: Input value of the regression coefficient (log hazard ratio).


Simulated number of evaluable subjects, which is the resulting number of subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time.


Simulated proportion of exposed subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time.


Simulated value of regression coefficient (log hazard ratio)


Simulated value of hazard ratio


Simulated number of events in total


Simulated number of events in control group


Simulated number of events in exposed group


Simulated median survival time in control group


Simulated median survival time in exposed group


Simulated statistical power from the Cox regression model on data with time-dependent exposure


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


# We recommend setting nSim to at least 500. It is set to 10 in the example to
# reduce run time for CRAN submission.

# Run 10 simulations. Each time simulate a dataset of 300 subjects

input_df1 <- data.frame(cat_id = c("low","med","high"), center.size = rep(100,3), 
                        cat_exp.prop = rep(1/3, 3), med.TTE.Control=c(14,20,31))

df_strat <- getpower.multicenter(nSim = 10, N = 300, beta = 0.7, 
    df = input_df1,method="strata",  type = "td",  scenario = "strata", 
    maxrelexptime = 1/6, min.futime = 4, min.postexp.futime = 4, 
    output.fn = "output_mult1.csv")

Plot power curves for survival analysis with time-dependent exposure


This function plots a power curve on each call and returns the subset of the input data frame that matches the supplied parameters.


plot_power(table_df, N, type, exp.prop, min.futime, min.postexp.futime,
    show.plot = FALSE, newplot = FALSE, col = NULL, lty, lwd, pch)



A data frame read from a .csv file in the format written by the getpower.method1 function.


Number of subjects that need to be screened


A text string indicating which type of dataset is of interest. Must be either "fixed" or "td"


A numeric value strictly between 0 and 1 that represents the proportion of subjects that are assigned an exposure


A numeric value that represents minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study


A numeric value that represents minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study after their exposure


A logical value indicating whether to output a power curve or not. The default value is FALSE


A logical value indicating whether to create a new plot or add to an existing plot

col, lty, lwd, pch

Graphical parameters as in the regular plot function in R


This function lets the user check the plot against the values in the output data frame. It is also flexible: the user can choose to output only the data frame and then draw a custom graph from it (for example, with titles or legends). The user can also add any number of lines to an existing power curve plot in order to compare different scenarios.
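As a rough illustration of the "plot your own graph" workflow described above, the following base R sketch orders a returned table and wraps a custom plot in a helper. It is only a sketch: power_tbl is a hypothetical stand-in for the data frame returned by plot_power, the column name pow and the helper draw_power_curve are assumptions, and the power values are made up for illustration. Only i_beta is a column name taken from this manual.

```r
# Hypothetical stand-in for the data frame returned by plot_power().
# "i_beta" appears in this manual; "pow" is an assumed column name.
power_tbl <- data.frame(
  i_beta = c(0.5, 0.1, 0.3),
  pow    = c(0.93, 0.12, 0.58)  # made-up illustrative values
)

# Order by effect size so the curve is drawn from left to right.
power_tbl <- power_tbl[order(power_tbl$i_beta), ]

# Hypothetical helper that draws the curve with a custom title and legend.
draw_power_curve <- function(tbl, main = "Power by log hazard ratio") {
  plot(tbl$i_beta, tbl$pow, type = "b", ylim = c(0, 1),
       xlab = "Log hazard ratio", ylab = "Power", main = main)
  legend("bottomright", legend = "scenario A", lty = 1, pch = 1)
}
```

Calling draw_power_curve(power_tbl) would then draw the customized curve on the active graphics device.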


A data.frame object with columns corresponding to


Number of subjects that need to be screened, specified by the user


Simulated number of evaluable subjects, which is the resulting number of subjects with or without considering minimum follow-up time and/or minimum post-exposure follow-up time


Input value of regression coefficient (log hazard ratio)


Simulated statistical power from the Cox regression model on data with time-dependent exposure


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


# We recommend setting nSim to at least 500. It is set to 10 in the example to
# reduce run time for CRAN submission.

ret <- getpower.method1(nSim = 10, N = 600, b = 0.3, exp.prop = 0.2, 
    type = "td", scenario =  " ", maxrelexptime = 1/6, min.futime = 4, 
    min.postexp.futime = 4, output.fn = "output.csv")
ret2 <- getpower.method1(nSim = 10, N = 600, b = 0.3, exp.prop = 0.2, 
    type = "td", scenario = " ", maxrelexptime = 1/6, min.futime = 4, 
    min.postexp.futime = 0, output.fn ="output.csv")
# Read in .csv file as a data frame

tb <-  read.csv("output.csv", header = TRUE, sep = ",")

# Visualize the subsetted data frame of interest and create a new plot

visualize1 <- plot_power(table_df = tb, N = 600, type = "td", exp.prop = 0.2,
    min.futime = 4, min.postexp.futime = 4, show.plot = TRUE, newplot = TRUE,
    col = "red", lty = 1, lwd = 2, pch = 16)

# Add a different power curve to the previously created plot

visualize2 <- plot_power(table_df = tb, N = 600, type = "td", exp.prop=0.2, 
    min.futime = 4, min.postexp.futime = 0, show.plot = TRUE, newplot = FALSE,
    col = "blue", lty = 1, lwd = 2, pch = 16)

Make an incidence plot from simulated data.


Create an incidence plot, ordered by follow-up time, from simulated survival data.


plot_simuData(data, title="Sample Survival Data")



A dataframe of survival data containing the following columns: id, start, stop, status, x


Title of the graph


This function makes an incidence plot of the survival data in the data frame returned by the tdSim.method1 or tdSim.method2 function. More generally, it also works with any data frame of survival data that has the column names listed above.


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


dat <- tdSim.method2(500, duration=24,lambda12=1.3,lambda23=0.04, 
    lambda13=0.03, exp.prop=0.2,rateC=0.05, min.futime=4, min.postexp.futime=4)
plot_simuData(dat, title='method2_filter')

Simulate 1 dataframe (1 simulation) of time-dep exposure in the case of exposure matching


This function simulates survival data. It generates a simulated dataset with time-dependent exposure from a user-specified list of parameters, together with a matching id that can be used for exposure matching.


tdSim.exp.matching(N_match, duration=24,lambda, rho=1, beta,



Number of matching sets


Length of the study in Months. The default value is 24 (months)


Scale parameter of the Weibull distribution, which is calculated as log(2) / median time to event for control group


Shape parameter of the Weibull distribution. It defaults to 1, in which case survival times follow the exponential distribution


A numeric value that represents the exposure effect, which is the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event


Rate of the exponential distribution to generate censoring times, which is calculated as log(2) / median time to censoring


Matching ratio used in exposure matching. For 1:1 matching, the user should specify 1. An input value of 3 corresponds to 1:3 (exposed : unexposed). An input value of 0.25 corresponds to 4:1 (i.e., 1:0.25).


A numeric value in interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study


A numeric value in interval (0, 1] that represents the maximum relative exposure time. Suppose this value is p, the exposure time for each subject is then uniformly distributed from 0 to p*subject's time in the study. The default value is 1, which means all exposed subjects have an exposure status transition at any point during the time in study.


A numeric value that represents minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study


A numeric value that represents minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study after their exposure


Simulate a survival dataset using a modified version of the illness-death model controlled by lambda12, lambda23, and lambda13


A data.frame object with columns corresponding to


Integer that represents a subject's identification number


For counting process formulation. Represents the start of each time interval


For counting process formulation. Represents the end of each time interval


Indicator of event. status = 1 when event occurs and 0 otherwise


Indicator of exposure. x = 1 when exposed and 0 otherwise


Integer that represents a subject's matching set.


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


df = tdSim.exp.matching<-function(N_match, duration=24,lambda, rho=1, 
    beta, rateC,matching.ratio=3,  prop.fullexp=0,maxrelexptime=1,min.futime=0,

Simulate 1 dataframe (1 simulation) of time-dep exposure under method 1


This function generates a simulated dataset with time-dependent exposure under method 1 with a user-specified list of parameters as input. Survival times and censoring times are generated from the exponential distribution.


tdSim.method1(N, duration = 24, lambda, rho = 1, beta, rateC, exp.prop, 
    prop.fullexp  = 0, maxrelexptime = 1, min.futime = 0, min.postexp.futime = 0)



Number of subjects that need to be screened


Length of the study in Months. The default value is 24 (months)


Scale parameter of the Weibull distribution, which is calculated as log(2) / median time to event for control group


Shape parameter of the Weibull distribution. It defaults to 1, in which case survival times follow the exponential distribution


A numeric value that represents the exposure effect, which is the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event


Rate of the exponential distribution to generate censoring times, which is calculated as log(2) / median time to censoring


A numeric value strictly between 0 and 1 that represents the proportion of subjects that are assigned an exposure


A numeric value in interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study


A numeric value in interval (0, 1] that represents the maximum relative exposure time. Suppose this value is p, the exposure time for each subject is then uniformly distributed from 0 to p*subject's time in the study. The default value is 1, which means all exposed subjects have an exposure status transition at any point during the time in study.


A numeric value that represents minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study


A numeric value that represents minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study after their exposure


If neither a minimum follow-up time nor a minimum post-exposure follow-up time is imposed (min.futime = 0 and min.postexp.futime = 0), the output data frame will have N subjects. If a minimum follow-up time, a minimum post-exposure follow-up time, or both are imposed, the output data frame will have at most N subjects.
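The effect of a minimum follow-up time on the number of retained subjects can be sketched in base R. This is not the package's code: it assumes that follow-up is the minimum of the event time, the censoring time, and the study duration, and it ignores exposure entirely. The rates are derived from the medians as described for lambda and rateC.

```r
set.seed(123)

N        <- 600
duration <- 24
lambda   <- log(2) / 24  # event rate from a 24-month median time to event
rateC    <- log(2) / 14  # censoring rate from a 14-month median time to censoring

# Exponential event and censoring times (Weibull with shape rho = 1).
event_time  <- rexp(N, rate = lambda)
censor_time <- rexp(N, rate = rateC)

# Assumed definition of follow-up: whichever comes first.
followup <- pmin(event_time, censor_time, duration)

# Keep only subjects who meet the minimum follow-up time.
min.futime <- 4
kept <- followup[followup >= min.futime]
length(kept)  # never exceeds N
```

With min.futime set to 0 every subject is kept in this sketch, which mirrors the statement that the output then has N subjects.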


A data.frame object with columns corresponding to


Integer that represents a subject's identification number


For counting process formulation. Represents the start of each time interval


For counting process formulation. Represents the end of each time interval


Indicator of event. status = 1 when event occurs and 0 otherwise


Indicator of exposure. x = 1 when exposed and 0 otherwise
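To make the counting process layout concrete, here is a small hand-built data frame with the five columns listed above. The subjects and times are invented for illustration and are not output from tdSim.method1. An exposed subject contributes two rows, split at the exposure time.

```r
# Subject 1: never exposed, event at month 9 (one row).
# Subject 2: exposed at month 5, censored at month 20 (two rows).
cp <- data.frame(
  id     = c(1, 2, 2),
  start  = c(0, 0, 5),
  stop   = c(9, 5, 20),
  status = c(1, 0, 0),
  x      = c(0, 0, 1)
)

# Total follow-up per subject is the sum of the interval lengths.
followup <- tapply(cp$stop - cp$start, cp$id, sum)
followup
```

A data frame in this shape can be passed to survival::coxph with a Surv(start, stop, status) response and x as the covariate.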


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


T. Therneau and C. Crowson (2015). Using Time Dependent Covariates and Time Dependent Coefficients in the Cox Model.


# Simulate a dataset of 600 subjects with time-dependent exposure without
# considering minimum follow-up time or minimum post-exposure follow-up time.
# Specifically, set the duration of the study to be 24 months; the median time to
# event for control group to be 24 months; exposure effect to be 0.3; median time
# to censoring to be 14 months; and exposure proportion to be 20%.

df1 <- tdSim.method1(N = 600, duration = 24, lambda = log(2)/24, rho = 1, 
   beta = 0.3, rateC = log(2)/14, exp.prop = 0.2, prop.fullexp  = 0, 
   maxrelexptime = 1, min.futime = 0, min.postexp.futime = 0)

# Simulate a dataset of 600 subjects with time-dependent exposure with
# both minimum follow-up time (4 months) and minimum post-exposure
# follow-up time (4 months) imposed. Other parameters remain the same as
# in the first case.

df2 <- tdSim.method1(N = 600, duration = 24, lambda = log(2)/24, rho = 1, 
   beta = 0.3, rateC = log(2)/14, exp.prop = 0.2, prop.fullexp  = 0, 
   maxrelexptime = 1, min.futime = 4, min.postexp.futime = 4)

# Simulate a dataset of 600 subjects with time-dependent exposure with
# both minimum follow-up time (4 months) and minimum post-exposure
# follow-up time (4 months) imposed. Also consider a quick exposure after
# entering the study for each exposed subject. Set the maximum relative
# exposure time to be 1/6. Other parameters remain the same as in the first case.

df3 <- tdSim.method1(N = 600, duration = 24, lambda = log(2)/24, rho = 1, 
   beta = 0.3, rateC = log(2)/14, exp.prop = 0.2, prop.fullexp  = 0,
   maxrelexptime = 1/6, min.futime = 4, min.postexp.futime = 4)

Simulate 1 dataframe (1 simulation) of time-dep exposure under method 2


This function simulates survival data. It generates a simulated dataset with time-dependent exposure under method 2 from a user-specified list of parameters.


tdSim.method2(N,duration, lambda12, lambda23=NULL, lambda13, 
    HR=NULL, exp.prop,rateC, min.futime = 0, min.postexp.futime = 0)



Number of subjects


Duration of the study. This is used in censoring


Lambda12 parameter to control time to exposure


Lambda23 parameter to control time to event after exposure


Lambda13 parameter to control time to event in the control group


Hazard Ratio. This input is optional. If HR is set and lambda23 is not set, lambda23 = lambda13*HR


A numeric value strictly between 0 and 1 that represents the proportion of subjects that are assigned an exposure


Rate of the exponential distribution to generate censoring times


A numeric value that represents minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study


A numeric value that represents minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study after their exposure


Simulate a survival dataset using a modified version of the illness-death model controlled by lambda12, lambda23, and lambda13
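The roles of the three rates can be sketched with a plain illness-death simulation in base R. This is an illustrative assumption, not the package's modified model: it draws exponential transition times, treats a subject as exposed when the exposure time precedes the unexposed event time, and omits censoring, exp.prop, and the follow-up filters. The HR value is chosen here only so that the documented fallback lambda23 = lambda13*HR gives 0.04.

```r
set.seed(1)

n        <- 5
lambda12 <- 1.3    # rate of moving from unexposed to exposed
lambda13 <- 0.03   # event rate while unexposed
HR       <- 4 / 3  # illustrative hazard ratio
lambda23 <- lambda13 * HR  # documented fallback when lambda23 is not set

t12 <- rexp(n, rate = lambda12)  # candidate time to exposure
t13 <- rexp(n, rate = lambda13)  # candidate time to event without exposure
t23 <- rexp(n, rate = lambda23)  # time from exposure to event

# Assumed rule: exposure happens only if it comes before the event.
exposed    <- t12 < t13
event_time <- ifelse(exposed, t12 + t23, t13)

data.frame(exposed, event_time)
```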


A data.frame object with columns corresponding to


Integer that represents a subject's identification number


For counting process formulation. Represents the start of each time interval


For counting process formulation. Represents the end of each time interval


Indicator of event. status = 1 when event occurs and 0 otherwise


Indicator of exposure. x = 1 when exposed and 0 otherwise


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


sim_data <- tdSim.method2(500, duration=24,lambda12=1.3,lambda23=0.04, 
    lambda13=0.03, exp.prop=0.2,rateC=0.05, min.futime=4, min.postexp.futime=4)

Simulate 1 dataframe (1 simulation) of time-dependent exposure with multiple centers


This function allows the user to input a data frame with multi-center parameters and generates a simulated dataset with time-dependent exposure. In particular, the output dataset has a column corresponding to the center id, which will be used as a clustering variable in the Cox regression model in power calculation.


    min.postexp.futime=0, dist=NULL)



Number of subjects that need to be screened


Length of the study in Months. The default value is 24 (months)


Shape parameter of the Weibull distribution. It defaults to 1, in which case survival times follow the exponential distribution


A numeric value that represents the exposure effect, which is the regression coefficient (log hazard ratio) that represents the magnitude of the relationship between the exposure covariate and the risk of an event


Rate of the exponential distribution to generate censoring times, which is calculated as log(2) / median time to censoring


A user-specified n by 4 clustering data frame with columns cat_id (category id, which is the physician site id; it can be either text strings or integers), center.size (number of subjects within each center), cat_exp.prop (proportion of exposed subjects in each center), and med.TTE.Control (median time to event in the control group for each center). The n rows correspond to n different centers


A numeric value in interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study


A numeric value in interval (0, 1] that represents the maximum relative exposure time. Suppose this value is p, the exposure time for each subject is then uniformly distributed from 0 to p*subject's time in the study. The default value is 1, which means all exposed subjects have an exposure status transition at any point during the time in study.


A numeric value that represents minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study


A numeric value that represents minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. If it has a positive value, this argument will help exclude subjects that only spend a short amount of time in the study after their exposure


The distribution of the center effect across centers. Default is NULL. If dist= 'gamma', then a random frailty effect from a gamma distribution with scale 0.5 and shape 2 is used.


The current version of this function expects an input data frame with at least 3 categories of physician sites, because the function uses a multinomial distribution to assign subjects to each category according to the corresponding category proportion
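The multinomial assignment mentioned above can be sketched in base R with rmultinom. This is a sketch under an assumption: the category proportions are taken to be center.size divided by the total, which this manual does not state explicitly. The names centers, probs, counts, and site are illustrative.

```r
set.seed(42)

# Three centers of equal size, as in the examples in this manual.
centers <- data.frame(
  cat_id      = c("low", "med", "high"),
  center.size = rep(100, 3)
)

N <- 300

# Assumed category proportions: each center's share of the total size.
probs <- centers$center.size / sum(centers$center.size)

# One multinomial draw splits the N subjects across the centers.
counts <- as.vector(rmultinom(1, size = N, prob = probs))

# Per-subject center label, usable as a clustering variable.
site <- rep(centers$cat_id, times = counts)
table(site)
```

The realized counts vary from draw to draw around the expected center sizes, but always sum to N.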


A data.frame object with columns corresponding to


Integer that represents a subject's identification number


For counting process formulation. Represents the start of each time interval


For counting process formulation. Represents the end of each time interval


Indicator of event. status = 1 when event occurs and 0 otherwise


Indicator of exposure. x = 1 when exposed and 0 otherwise


For clustering in the Cox proportional hazard model. Represents label of each subject's corresponding physician site


Danyi Xiong, Teeranan Pokaprakarn, Hiroto Udagawa, Nusrat Rabbee
Maintainer: Nusrat Rabbee <[email protected]>


T. Therneau and C. Crowson (2015). Using Time Dependent Covariates and Time Dependent Coefficients in the Cox Model.


# Create a clustering data frame as input with 3 categories and a 20% weighted
# exposure proportion.
input_df1 <- data.frame(cat_id = c("low","med","high"), 
                        center.size = rep(100,3),  
                        cat_exp.prop = rep(1/3, 3), 
                        med.TTE.Control = c(14, 20, 31))

df_strat <- tdSim.multicenter(N = 300, duration =24, rateC = log(2)/14, beta = 0.7, 
    df = input_df1, maxrelexptime = 1/6, min.futime = 4, min.postexp.futime = 4)