--- title: "R Markdown version of lab 2" author: "Constantin T Yiannoutsos" date: "April 4, 2019" output: html_document --- ```{r setup, include = F} knitr::opts_chunk$set(warning = F) ``` In this lab, we are going to go over sample size calculation, which is required to design a clinical trial. \section{Sample size for a pre-specified precision} Using some notation, we want $$ \Pr(|\bar{X}-\mu|Translating medians to hazards and hazard ratios We begin by assuming that the time to death is distributed according to the exponential distribution for both groups. Denoting by $m_0$ and $m_{1}$ the median survival times for the standard treatment and the new treatment, respectively. Using some (very simple) calculous we can solve the equation with respect to the hazard $\lambda_{0}$ in the following formula $$ \int_{0}^{m_{0}}\lambda_0e^{-\lambda_0 t}dt = 0.5 $$ where $\lambda_0$ and $\mu_0$ are the hazard and the median survival time (i.e., the time point where 50\% of the subjects have experienced the event of interest) under the null hypothesis respectively. Running through the steps we get $\lambda_0 = - \dfrac{\log(0.5)}{m_0}$ and $\lambda_1 = - \dfrac{\log(0.5)}{m_1}$ in a similar manner, where $\lambda_1$ and $\mu_1$ are the hazard of failure and median survival time under the alternative hypothesis respectively. Now the hazard ratio is easily obtained. We point out that another nice property of the exponential distribution is that the ratio of hazards equals the inverse ratio of medians i.e., $$ HR=\frac{\lambda_1}{\lambda_0}=\frac{\mu_0}{\mu_1} $$ Note above that we have not specified a unit in time in any of the above calculations. You should be very careful to maintain the same unit of time (months, years, etc.) throughout your calculations. For example, the hazard ratio when the median survival times are $\mu_0=15$ and $\mu_1=25$ months (say) is ```{r ss.survival} # Median survival of standard therapy m0 = 15 lambda0 = - log(0.5)/m0 # Median survival of experimental therapy m1 = 25 lambda1 = - log(0.5)/m1 # Based on the properties of the exponetial distribution c(m0/m1,lambda1/lambda0) ``` 2. **Sample size when accrual time total follow-up duration are known** You have to always keep in mind that when dealing with survival analysis, you are concerned about events not patients. Thus, to obtain the desired level of power in a survival study, a certain number of events must occur, irrespective of the number of patients involved. A complicating factor is that there are different combinations of numbers of patients and length of follow-up that will result in the desired number of events. WE can use the function $\verb|nSurv|$ to help us investigate such combinations.\\[2ex] First we give some definitions: (1) Accrual time is the time it takes to enroll all the patients in the study (2) Follow-up time is the duration of the follow-up period after the last patient has been enrolled (3) Study duration is the sum of (1)+(2). The $\verb|nSurv|$ function has plenty of options. You are strongly advised to examine the options in depth. ``` {r ss.survival55} # Assume 55 months of accrual; 10 months of follow up afterwards # The function solves nSurv(lambdaC = lambda0, hr = lambda1/lambda0, hr0 = 1, R = 55, minfup = 10,T = 65, ratio = 1, alpha = 0.05, beta = 0.20, sided = 2) ``` In the above example * $\verb|lambdaC|$ is the event rate in the control group * $\verb|hr|$ is the hazard ratio under the alternative hypothesis * $\verb|hr0|$ is the hazard ratio under the null hypothesis * $\verb|R|$ is the accrual (recruitment) duration * $\verb|minfup|$ is the follow-up of the last patient enrolled * $\verb|T|$ is the maximum study duration In this example, the sample size needed to obtain the desired power is $n=173$, whereas the accrual rate should be 3.13 patients a month. Note that $3.1351\times 55= 172.4305$ so that the sample size will be accrued in the specified accrual period of 55 months. 3. **Sample size when the total follow-up time is known** Since we allow for the possibility that some patients might get lost while being under follow-up, it is obvious that we implicitly inflate the variance in our estimates, so a larger sample size should be required. Indeed we see that we now need 197 patients if the annual rate of loss to follow-up is $\eta=0.009$ (see below): ``` {r ss.survival55.ltu} # Assuming dropout rates nSurv(lambdaC = lambda0, hr = lambda1/lambda0, hr0 = 1, R = 55, minfup = 10,T = 65,eta = 0.009,etaE = 0.009, ratio = 1, alpha = 0.05, beta = 0.20, sided = 2) ``` We increase the follow-up period up to 100 months. Thus, patients would be under study for a longer period of time leading to a much greater probability of observing an event at a progressively higher proportion of them. Since the key quantity in determining the power in survival analysis is the number of events, we don't have to enroll so many patients as almost everybody would develop the event as follow-up time is extended. ``` {r ss.survival100} # Assuming a larger follow-up time nSurv(lambdaC = lambda0, hr = lambda1/lambda0, hr0 = 1, R = 55, minfup = 100,T = 155, ratio = 1, alpha = 0.05, beta = 0.20, sided = 2) ``` The expected sample size is now much smaller ($n=122$). It is in fact very close to the expected number of events, which means that almost all patients are going to experience the event within that period of time. 4. **Sample size when accrual rate and duration are known We specify the enrollment rate through $\verb|gamma=5|$ and the accrual duration through $\verb|R=35|$. The following code will produce a design under these criteria ``` {r ss.survival5.35} # Survival study design with known accrual rate and accrual duration nSurv(lambdaC = lambda0, hr = lambda1/lambda0, hr0 = 1, gamma = 5, R = 35, ratio = 1, alpha = 0.05, beta = 0.20, sided = 2) ``` Note that the expected number of events is the virtually the same as in the previous cases. We need $n=175$ subjects and the additional follow-up time required is approximately 16.5 months. On the following graph, we see that for a given recruitment duration of 5 months, as the follow up period increases the sample size decreases, and eventually converges to the expected number of events. The results are presented in Figure 2. ``` {r ss.survivalfig, fig.width=8, fig.height=5, fig.align='center', fig.cap="Figure 2. Sample size as the length of follow-up is extended."} # Increase follow up from 5 up to 100 months fp = seq(5,100,by=0.5) n = rep(NA,length(fp)) ar = rep(NA,length(fp)) for (i in 1:length(fp)) { x = nSurv(lambdaC = lambda0, hr = lambda1/lambda0, hr0 = 1, R = 5, minfup = fp[i], T = 5 + fp[i], ratio = 1, alpha = 0.05, beta = 0.20, sided = 2) n[i] = x$n ar[i] = x$gamma } plot(fp,n,ylim = c(80,510),yaxt = "n",ylab = "Sample size", xlab = "Follow up (months)",type = "l") axis(2,at = c(100,120,200,300,400,500),las = 1) abline(h = 119,col = "red") text(18,130,"Expected events") ```