---
title: "R Markdown version of lab 1"
author: "Constantin T Yiannoutsos"
date: "September 14, 2017"
output:
  html_document:
    fig_caption: true
---

# Introduction to clinical trials, Lab 1: Study designs

## Dose finding studies

We will go over an example based on artificial data from a Phase I study. We will do this with \verb|R|, but these data can be analyzed with any other statistical software as well. We proceed with the following steps. First, we import the data (\emph{trialPh1.csv}) into \verb|R|. By having a look at the data, we can see that each patient (\emph{id}) has been assigned to exactly one dose of the drug.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
setwd("H:/teaching/Indiana/PBHL B-582/labs/lab 1 (study designs)/data")
trialPh1 <- read.csv("trialPh1.csv")
library(knitr)
options(knitr.table.format = "html")
```

```{r h1data, echo=TRUE}
# (1): Artificial data from an old Phase I study of a cytotoxic drug
# 10 doses have been investigated; have a look at the data
trialPh1[1:10,]
```

Notice in the \verb|R| code above how we indexed the first 10 rows of data set \texttt{trialPh1}: \verb|R| keeps the data set as a two-dimensional grid \texttt{trialPh1[nr,nc]}, where \texttt{nr} (say) indexes the rows and \texttt{nc} (say) indexes the columns. Notice that the rows are indexed first and the columns second. Our index (included between the brackets) specified rows 1 through 10 (i.e., \texttt{1:10}; note the use of the colon to imply a range). Since we want all the columns, we left the index blank after the comma. This told \verb|R| to include all columns in the output.

The outcome variable (*toxicity*) indicates whether the subject developed toxicity or not. Thus, it is a binary (0/1) variable and a logistic regression model may be appropriate. We assume that the logit transformation of the probability of toxicity $p$ is
$$
\operatorname{logit}(p)=\log(p/(1-p))
$$
and that it depends linearly on the logarithm of the dose:
$$
\operatorname{logit}(p) = \beta_{0} + \beta_{1}\log(Dose)
$$
To fit such a model, one can use the \verb|glm| function in \verb|R|. To do so, we need to define the appropriate formula indicating the outcome variable (on the left) and the covariates (on the right). We also need to tell \verb|R| to use the binomial distribution using the argument \verb|family = "binomial"|, which assumes by default a logit link function.

```{r llmodel, echo=TRUE}
# Fitting a logistic model to estimate the dose associated with toxicity
# in 30% of the patients
fit = glm(toxicity ~ log(dose), family = "binomial", data = trialPh1)
summary(fit)
```

To get the coefficients we proceed as follows:

```{r llcoef, echo=TRUE}
# Get coefficients
betaHat = coef(fit)
exp(betaHat)[2]
```

Note the syntax of the logistic regression model. The model is included in the argument of the function \texttt{glm}, with the dependent variable (\texttt{toxicity}) and independent variable(s) (log \texttt{dose}) separated by a tilde. This is the general syntax of modeling functions in \verb|R|. Also note the option \texttt{family="binomial"}, which implies that a logistic regression model is to be run (the \texttt{glm} function fits an array of generalized linear models).

The output of the last command gives $e^{\hat\beta_1}=138.6054$, the multiplicative increase in the odds of toxicity for each one-unit increase in the log dose; equivalently, $\hat\beta_1=\log(138.6054)\approx 4.93$ is the increase in the log-odds of toxicity for each log-unit increase in the dose. Suppose now that we want to estimate the optimal dose, which is defined as the dose that results in no more than a 30\% probability of toxicity.
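Before solving for the optimal dose analytically, it can be helpful to inspect the fitted toxicity probabilities at the doses actually studied. A quick sketch using the standard \texttt{predict} method for \texttt{glm} objects (the chunk and the \texttt{newdat} object are our own additions):

```{r predtox, echo=TRUE}
# Fitted probability of toxicity at each distinct dose level
newdat <- data.frame(dose = sort(unique(trialPh1$dose)))
newdat$pToxicity <- predict(fit, newdata = newdat, type = "response")
newdat
```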
To estimate the optimal dose we simply need to solve the dose-toxicity equation above with respect to $Dose$, giving
\begin{align}
\widehat{D} = \exp\left\{(\operatorname{logit}(p) - \hat\beta_{0})/\hat\beta_{1}\right\} \nonumber
\end{align}
where $\hat D$ is the estimate of the (optimal) dose at $p=0.30$, and $\hat\beta_0$ and $\hat\beta_1$ are the logistic regression coefficients from the model above. Consequently, the estimated optimal dose is

```{r optdose, echo=TRUE}
# Estimate dose30
invlogit = function(x){exp(x)/(1+exp(x))}
logit = function(x){log(x/(1-x))}
Dose30 = exp((logit(0.30) - betaHat[1])/betaHat[2])
Dose30
```

Thus, the optimal dose, which is associated with a 30\% chance of toxicity, is 0.466. We can also produce a graph showing the predictions of the model.

```{r optdosegraph, fig.width=5.5, fig.height=4.5, fig.cap="Figure 1. Probability of toxicity based on a logistic regression model", fig.align="center"}
# Estimated probability of toxicity as a function of log-dose
curve(invlogit(betaHat[1]+betaHat[2]*x), from = min(log(trialPh1$dose)),
      to = max(log(trialPh1$dose)), ylim = c(0,0.60),
      xlab = "Logarithm of dose", ylab = "Probability of toxicity")
lines(c(-1.4,log(Dose30)), rep(0.3,2), col = "red", lty = 2)
lines(rep(log(Dose30),2), c(0,0.3), col = "red", lty = 2)
```

Notice that the x-axis is on the log scale, so the log-dose is $\log(0.466553)=-0.762$. Note the use of the function \texttt{curve}. You can see what this function does by typing ``\texttt{help(curve)}'' at the command line.

Although this approach can accurately estimate the optimal dose (the one corresponding to a certain probability of an event), it cannot be used in actual clinical trials for ethical reasons. In this artificial example, patients were exposed to doses that were certain to lead to serious toxicities, violating central tenets of ethics.

## Operating characteristics of the standard Phase I design

Let us consider the standard 3+3 design. The conditional probability of escalating past the $i$th dose, given that the $i$th dose has been reached, is
$$
P_{i} = b(0;3,p_{i}) + b(1;3,p_{i})\times b(0;3,p_{i}),
$$
where $b(k;n,p_i)={\rm Pr}(X=k; n,p_i)$ is the binomial probability of obtaining $k$ successes out of $n$ trials when the probability of success is $p_i$. There are two scenarios that can lead to escalating past the $i$th dose:

1. No toxicities are observed
2. One toxicity is seen but there are no additional ones in the expanded cohort.

Given the hypothetical probability of toxicity at the $i$th dose, $p_{i}$, the probability of seeing no toxicities (scenario 1) is $b(0;3,p_{i})$. The second scenario requires two events: the occurrence of one toxicity in the original cohort (which occurs with probability $b(1;3,p_{i})$) and the occurrence of no toxicity in the additional subjects (which occurs with probability $b(0;3,p_{i})$ conditional on the toxicity in the original cohort). By the multiplication law of probability, the probability of scenario 2 is the product of the two corresponding probabilities, $b(1;3,p_{i})\times b(0;3,p_{i})$. Overall, the conditional probability of passing the $i$th dose is the sum of the two probabilities
$$
b(0;3,p_{i}) + b(1;3,p_{i})\times b(0;3,p_{i})
$$
Then the unconditional probability of passing the $i$th dose is
$$
Q_{i}=\prod_{k=1}^{i}\left[b(0;3,p_{k}) + b(1;3,p_{k})\times b(0;3,p_{k})\right].
$$
By the multiplication law of probability\footnote{The multiplication law of probability states that if an event $A$ is independent of (conveys no information about) the event $B$, then the probability that both happen is $$P(A\cap B)=P(A)P(B).$$ Here the event that exactly one among the first three subjects experiences a serious treatment-related toxicity or death (event, say, $A$ with $P(A)=b(1;3,p)$) does not contain any information about the likelihood that no toxicities or deaths are seen among the next three subjects (event $B$ with $P(B)=b(0;3,p)$). Thus, the multiplication principle holds. In other words,
\begin{eqnarray*}
P(A\cap B) & = & P({\rm 1\,\,toxicity\,\,among\,\,3\,\,subjects\,\,AND\,\,0\,\,toxicities\,\,among\,\,3\,\,subjects})\\
& = & P(A)P(B)=b(1;3,p)\,b(0;3,p)
\end{eqnarray*}}, to obtain the unconditional probability of passing the $i$th dose, we need to multiply the conditional probabilities of escalating past each dose up to and including the $i$th. You may find these calculations similar to those involved in the relationship between the survival and hazard functions.

### Calculating the operating characteristics for both scenarios in \texttt{R}

You can do this by hand using the binomial probability mass function in \texttt{R}, \texttt{dbinom}, as follows:

1. Find $b(0;3,0.2)$, the probability of no toxicities at the first dose (where $p_1=0.20$). This is

```{r, dbinom01, echo=TRUE}
dbinom(x = 0, size = 3, prob = 0.20)
```

2. Find the probability of exactly one toxicity at the first dose

```{r binom11, echo=TRUE}
dbinom(x = 1, size = 3, prob = 0.20)
```

3. Find the same probabilities for the second dose (where $p_2=0.30$)

```{r dbinom02, echo=TRUE}
dbinom(x = 0, size = 3, prob = 0.30)
```

and the probability of one toxicity at the second dose

```{r dbinom12, echo=TRUE}
dbinom(x = 1, size = 3, prob = 0.30)
```

The probability of getting past the second dose is then
$$
Q_2=((0.512)+(0.512)\times (0.384))\times((0.343)+(0.343)\times (0.441))\approx 0.35
$$

You can also write a small \texttt{R} function that handles any $i$ as follows:

```{r rfun}
# (2)-b: Example of a three-up/three-down design
OC = function(p, dose = paste("dose",1:length(p),sep = "")) {
  # Conditional probabilities of escalating past the ith dose
  # given that the ith dose has been reached
  Pi = dbinom(x = 0,size = 3,prob = p)
  Pi = Pi + dbinom(x = 0,size = 3,prob = p)*dbinom(x = 1,size = 3,prob = p)
  # Unconditional probability of passing the ith dose
  Qi = cumprod(Pi)
  # Operating characteristic
  OC = 1 - Qi
  return(data.frame(dose = dose, prob = p, Pi = Pi, Qi = Qi, OC = OC))
}
```

Functions are elegant ways to avoid repeating \texttt{R} code. Function \texttt{OC} takes one required input argument: the vector of toxicity probabilities $p$. A useful feature of \texttt{R} is that the argument can be a vector of (multiple) values. Thus, we can, for example, input the proportions of treatment-related toxicity at the first and second dose levels, $p_1=0.20$ and $p_2=0.30$ respectively. The function then produces a data set, a ``data frame'' as it is known in \texttt{R}, with one row per dose (as many as the proportions entered in the input, \texttt{length(p)}, the length of the vector \texttt{p}). Its columns are the toxicity probability \texttt{prob} ($p$), the conditional probability $P_i$ of continuing past the current stage, and the unconditional probability $Q_i$ of continuing past all stages up to and including the current one. The final element in the data frame is the operating characteristic of the design, i.e., the probability of stopping by the current stage. The data frame is produced by the \texttt{return} command, which produces the only thing that will remain after the execution of the function.
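Incidentally, the vectorization that \texttt{OC} relies on can be seen directly: \texttt{dbinom} itself accepts a vector for its \texttt{prob} argument. A quick illustration (our own addition), reproducing two of the step-by-step values computed above in one call:

```{r dbinomvec, echo=TRUE}
# b(0;3,p) evaluated at p = 0.20 and p = 0.30 simultaneously
dbinom(x = 0, size = 3, prob = c(0.20, 0.30))
```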
Running function \texttt{OC} on the previous example we obtain

```{r ocex}
OC(p = c(0.2,0.3))
```

Here we have defined a vector in \texttt{R} by \emph{concatenating} two individual elements: 0.2 and 0.3. This is done by putting a list of elements, separated by commas, within parentheses following the character ``c". Here, the vector \texttt{p} is

```{r Rvec}
c(0.2, 0.3)
```

This is a vector of length two. Thus, there is a 35\% probability of continuing past the second dose, with 20\% toxicity at the first and 30\% at the second dose. Note that the command
\begin{center}
\texttt{paste("dose", 1:length(p), sep = "")}
\end{center}
concatenates a number from ``1" to the length of \texttt{p} (the vector of toxicity proportions) to the word ``dose" to create a unique name for $D_i$, where $i=1,\cdots,{\texttt{length(p)}}$.

### Example of OC for Phase I studies (Piantadosi, p. 235)

We will show the operating characteristics of two different situations (Piantadosi, p. 235). In the first (set $A$), significant toxicities arise only at the later doses, while in the second (set $B$) the toxicity probabilities increase linearly.

```{R OCplot, fig.cap="Figure 2: Hypothetical toxicity probabilities", fig.height=5, fig.width=8, fig.align="center"}
# (2)-c Piantadosi, 2005, pp. 235
ptrueB = seq(0.10,0.55,by = 0.05)
ptrueA = c(0.001,0.01,0.02,0.05,0.11,0.19,0.28,0.39,0.53,0.68)
# Plot probabilities of set A
plot(1:10,ptrueA, xlab = "Dose level",ylab = "Probability of toxicity",
     xaxt = "n",yaxt = "n",ylim = c(0,0.8),type = "b", pch = 21, lty = 3)
axis(1,1:10)
axis(2,seq(0,0.8,by = 0.1),las = 1)
# Add grid lines
abline(v = 1:10, col = "lightgray", lty = "dotted")
abline(h = seq(0,0.8,by = 0.1), col = "lightgray", lty = "dotted")
# Add probabilities of set B
lines(1:10,ptrueB,pch = 16,type = "b")
text(7,0.2,labels = "A")
text(4.5,0.35,labels = "B")
```

Assuming that the optimal dose is the one that gives a 30\% chance of toxicity, for set $A$ the trial should terminate between the 7th and 8th dose with high probability. For the second set (curve $B$), the trial should terminate near the 5th dose with high probability. The OCs are shown in Figure 3.

```{R OCcalc, fig.width=8, fig.height=5, fig.cap="Figure 3: Operating characteristics for designs $A$ and $B$", fig.align="center"}
# Calculate OCs
setAoc = OC(ptrueA)
setBoc = OC(ptrueB)
# Plot OC of set A
plot(1:10,setAoc$OC,xlab="Dose level",
     ylab="Cumulative probability of stopping",
     xaxt = "n",yaxt = "n",ylim = c(0,1),type = "b",pch = 21,lty = 3)
axis(1,1:10)
axis(2,seq(0,1,by = 0.1),las = 1)
# Add grid lines
abline(v = 1:10, col = "lightgray", lty = "dotted")
abline(h = seq(0,1,by = 0.1), col = "lightgray", lty = "dotted")
# Add OC of set B
lines(1:10,setBoc$OC,pch = 16,type = "b")
text(6.5,0.4,labels = "A")
text(3.15,0.65,labels = "B")
```

For set $A$, the expected stopping point is between the 6th and 7th dose (where the stopping probabilities are close to 50\%), and the design has a nearly 90\% chance of stopping at or before the 8th dose. For set $B$, the expected stopping point is the third dose, and the design has an 85\% chance of stopping at or before the 5th dose (the optimal dose). Thus, in these examples, the three-up/three-down design may stop before the optimal dose is visited.
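The expected stopping points quoted above can also be computed directly from the output of \texttt{OC}. A small sketch (the helper \texttt{expStop} is our own): the probability of stopping exactly at dose $i$ is $Q_{i-1}-Q_i$, with the convention $Q_0=1$, and any probability of escalating past the last dose is counted as stopping there.

```{r expstop}
# Expected dose at which the design stops, for each toxicity scenario
expStop <- function(oc){
  Qi <- oc$Qi
  pStop <- c(1, Qi[-length(Qi)]) - Qi              # Pr(stop exactly at dose i)
  pStop[length(pStop)] <- pStop[length(pStop)] + Qi[length(Qi)]
  sum(seq_along(pStop)*pStop)
}
c(setA = expStop(setAoc), setB = expStop(setBoc))
```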
## Phase II designs

### Single-stage designs

Let $p$ denote the response rate of the new therapy. The response rate of the standard treatment is around 15\%, whereas the new therapy is expected to be successful in about 40\% of the patients. The null hypothesis to be tested is
$$
H_0: p \leq 0.15
$$
versus the alternative
$$
H_1: p = 0.40
$$
at the $\alpha = 0.10$ significance level with power $1-\beta=0.8$. Suppose that the sample size is $n=16$ (we will see later how we arrived at this sample size). The number of responders $X$ among the $n=16$ patients follows the binomial distribution, $X|p\sim \operatorname{Binomial}(n=16,p)$. Common sense suggests that we should reject the null hypothesis in favor of the alternative when the number of responders exceeds some threshold $k$ ($k=0,1,\ldots,16$). We would also like this $k$ to satisfy both the type-I error requirement, i.e., that $\Pr(X>k|p=p_{0})\leq\alpha$, and the type-II error requirement, i.e., $\Pr(X\leq k|p=p_{1})\leq\beta$. We can easily get the binomial probabilities in \verb|R| through the function \verb|dbinom|,

```{R ph2comp}
# (3): Phase II non-comparative study
# Construct a test to investigate if the true resp. rate is at least 15%
a = 0.10 # (one-sided), assume sample size=16 and the resp. rate is 40% under the alternative
n = 16
k = 0:16
# The number of responders (X) out of n patients follows the binomial dist.
# The new therapy is worth further examination if X >= k; we need to determine k
round(dbinom(k, size = n, prob = 0.15),5)
```

Note that the function \texttt{dbinom} takes \texttt{k} $=0,\cdots,16$ as a vector and produces a vector of 17 (i.e., $n+1$) elements.

```{R bintest}
bintest = round(data.frame(k = k,dbinom(k,size = n,prob = 0.15),
                           pbinom(k,size = n,prob = 0.15,lower.tail = F),
                           dbinom(k,size = n,prob = 0.40),
                           pbinom(k,size = n,prob = 0.40,lower.tail = F)),5)
names(bintest) = c("k","Pr(X=k|p=0.15)","Pr(X>k|p=0.15)",
                   "Pr(X=k|p=0.40)","Pr(X>k|p=0.40)")
bintest
```

It is easy to see from the output above that the smallest $k$ such that $\Pr(X>k|p=p_{0})\leq\alpha$ is $k = 4$. Thus, we reject the null hypothesis when $X\geq 5$ responses are observed. In the above code, note that, again, we passed a vector \texttt{k} of length 17 ($=n+1$) to the functions \texttt{dbinom} and \texttt{pbinom}. Each call evaluates all 17 elements at once and produces, for example, the binomial probability $b(k;n,p)$ for $n=16$ and each $k=0,\cdots,16$. Consequently, no complicated code or loops are needed; \texttt{R} performs the calculation over all elements of the vector \texttt{k}.

Since we use the binomial distribution, we can get the exact significance level of the test by calculating $\Pr(X>4|p=0.15)$. Note that this does not necessarily correspond exactly to the nominal significance level $\alpha=0.10$. On the other hand, the power of the test is the probability of rejecting the null hypothesis under the alternative, i.e., $\Pr(X>4|p=0.40)$.

```{R binomexact}
# The null is rejected if X>4 or equivalently X>=5!
# Exact significance level and power of the test
k1 = which(bintest[,3] < 0.10)[1]
bintest[k1,c(1,3,5)]
```

Beautiful! To select the correct line of the \texttt{bintest} data set, we used the \texttt{which} command in \texttt{R} to search for the first row where the entry in the third column of the data is less than $\alpha=0.10$, and stored this row number in variable \texttt{k1}. Then we printed the \texttt{k1}$^{\rm th}$ row and the first, third and fifth columns of the data set (entered as the vector \texttt{c(1,3,5)}). Thus, the exact significance level and power of the test are 0.07905 and 0.83343, respectively.
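The search over the \texttt{bintest} table can be cross-checked with the binomial quantile function. A quick sketch (our own check, using base \texttt{R}'s \texttt{qbinom} and \texttt{pbinom}):

```{r qbinomcheck}
# Smallest k with Pr(X > k | p = 0.15) <= 0.10: should return 4
qbinom(0.10, size = 16, prob = 0.15, lower.tail = FALSE)
# Exact size and power of the test that rejects when X >= 5
pbinom(4, size = 16, prob = 0.15, lower.tail = FALSE)  # 0.07905
pbinom(4, size = 16, prob = 0.40, lower.tail = FALSE)  # 0.83343
```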
You have to keep in mind that, although you can set the type-I error at any level you want, you \textbf{cannot} minimize both the type-I and type-II errors for a given sample size. The only thing you can do in order to increase the power (or, equivalently, decrease the type-II error) is to increase the sample size.

### Sample size calculations

To find the sample size we need to find an $N$ which provides a combination of binomial probabilities satisfying the type-I and type-II error requirements. We can write a small program in \texttt{R} to do this:

```{R rsspow, echo=TRUE}
rsppow <- function(alpha, beta, p0, p1){
  if (p1 <= p0) {
    cat("You idiot! p1 must be greater than p0. Try again.")
  } else {
    # F is the sample size based on the normal approximation to the binomial (Fleming, 1982)
    F <- (qnorm(1-beta)*sqrt(p1*(1-p1))+qnorm(1-alpha)*sqrt(p0*(1-p0)))^2/(p1-p0)^2
    # We will search (just like PASS is doing) between 0.8F and 4F
    n <- round(0.8*F)
    found <- FALSE
    while(!found & n <= round(4*F)){
      # Probability of false rejection (pp0) and false non-rejection (pp1)
      pp0 <- 1.0
      pp1 <- 1.0
      j <- 0
      # Find the smallest cutoff j meeting both error requirements, if any
      while((pp0 > alpha | pp1 > beta) & j <= n){
        pp0 <- 1 - pbinom(q = j, size = n, prob = p0)  # Pr(X > j | p0)
        pp1 <- pbinom(q = j, size = n, prob = p1)      # Pr(X <= j | p1)
        if(pp0 > alpha | pp1 > beta) j <- j + 1
      }
      if(j > n){
        n <- n + 1   # no cutoff satisfies both error rates; try a larger sample
      } else {
        found <- TRUE
      }
    }
    if(found){
      cat("SAMPLE SIZE (N) = ", n, "\n")
      cat("P0 EXCLUDED IF X .ge. ", j+1, "\n")
      cat("COVERAGE= ", 1-pp0, "\n")
      cat("POWER = ", 1-pp1, "\n")
    } else cat("No suitable sample size found", "\n")
  }
}
```

The program loops from $0.8F$ to $4F$, where
$$
F=\left(\frac{z_{1-\beta}\sqrt{p_1(1-p_1)}+z_{1-\alpha}\sqrt{p_0(1-p_0)}}{p_1-p_0}\right)^2
$$
is the sample size based on the normal approximation to the binomial distribution\footnote{Fleming TR. One-sample multiple testing procedure for phase II clinical trials. \textit{Biometrics} 1982; \textbf{38}: 143--151.}. With $p_0=0.15$, $p_1=0.40$, $\alpha=0.1$ and $\beta=0.2$, $F\approx 12.1$, so the search starts at $n=10$. Performing this analysis produces the following results:

```{R rsppowanal}
rsppow(alpha=0.1, beta=0.2, p0=0.15, p1=0.4)
```

In the output above, ``coverage" is the attained $1-\alpha$. Note that \texttt{p0} must be strictly lower than \texttt{p1} or you will get a (rather rude) error message! In any case, the above output shows that for $N=16$ we obtain coverage $1-\alpha>0.92$ (or attained $\alpha<0.08$), which means that the attained significance level is below the desired 10\%. Also, $1-\beta>0.83$, which is greater than our desired power of 80\%.
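The \texttt{clinfun} package (used below for two-stage designs) also provides exact single-stage designs, which can serve as a cross-check of \texttt{rsppow}. A sketch, assuming the package is installed and that its \texttt{ph2single} function takes the arguments shown:

```{r ph2singlecheck, warning=FALSE}
library(clinfun)
# Exact single-stage phase II designs for the same hypotheses;
# nsoln asks for the five smallest admissible solutions
ph2single(pu = 0.15, pa = 0.40, ep1 = 0.10, ep2 = 0.20, nsoln = 5)
```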
### Two-stage designs

Two-stage designs usually involve cumbersome calculations, so we can use the function \verb|ph2simon| to do the job for us. Note that the available options are the \emph{minimax} and the \emph{optimal} (Simon's) two-stage designs. Two-stage designs enroll some patients in the first stage, where a first assessment of efficacy is performed. If the therapy seems unacceptable, the trial stops. If not, more patients are recruited in the second stage and the efficacy of the drug is investigated again. Type-I and type-II errors are calculated over both stages and involve the cumulative probability of the scenarios which result in the study stopping at the first stage and those which result in the study continuing to the second stage. The idea is similar to the operating characteristics of the three-up/three-down Phase I design.

```{R simon2stage, warning=FALSE}
# 4. Simon's two-stage design
# There are two alternatives in library clinfun:
# Optimal and Minimax 2-stage Phase II designs
# Assume p0 = 0.15, p1 = 0.40,
# type-I error alpha = 0.10 (one-sided), type-II error beta = 0.20
library(clinfun)
ph2simon(pu = 0.15, pa = 0.40, ep1 = 0.10, ep2 = 0.20, nmax = 100)
```

In the above output, note the invocation of package \texttt{clinfun}, which must be loaded or else \texttt{R} won't recognize the command \texttt{ph2simon}. For both designs, the hypotheses to be tested are $H_0:p\leq 0.15$ vs. $H_1:p\geq 0.40$ (one-sided). For the optimal (Simon's) design, we need to enroll $n=18$ patients, of whom $n_{1} = 7$ are involved in the first stage. If $r_{1} \leq 1$ of them respond, then the study will stop. If more than $r_1=1$ individuals respond, then $n_2 = 11$ additional subjects will be enrolled in the second stage. If, overall, $r \leq 4$ respond, the null hypothesis will not be rejected and the drug will not be considered effective. Otherwise, the null hypothesis will be rejected and the conclusion will be that the response rate $p > p_0 = 0.15$. Simon's design is ``optimal" in the sense that it exposes, on average, the minimum number of patients to the drug if the true probability of response is $p_0$; i.e., the average sample number
$$
ASN(p_0) = EN_0 = 10.1
$$
is minimized if $H_0$ is true (i.e., if $p \leq p_0$).

Regarding the minimax design, the above output implies that we need to enroll $n=16$ patients, of whom $n_{1} = 9$ are involved in the first stage. If $r_{1} \leq 1$ of them respond, then the study will stop. If more than $r_1=1$ individuals respond, then $n_2 = 7$ additional subjects will be enrolled in the second stage. If, overall, $r \leq 4$ respond, the null hypothesis will not be rejected and the drug will not be considered effective. Otherwise, the null hypothesis will be rejected and the conclusion will be that the response rate $p > p_0 = 0.15$. The minimax design minimizes the total sample size $n$ but may result in a larger average sample size under the null hypothesis compared with the optimal two-stage design; here, $ASN(p_0) = 11.8$ individuals if $H_0$ is true.
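The average sample numbers quoted above can be verified by hand from the binomial distribution. A short sketch (our own check): the probability of early termination under $H_0$, $PET(p_0)$, is the probability of at most $r_1=1$ responses among the $n_1$ first-stage patients, and $EN_0 = n_1 + (1-PET)\,n_2$.

```{r asncheck}
# Optimal design: n1 = 7, n2 = 11; stop after stage 1 if <= 1 responses
PETopt = pbinom(1, size = 7, prob = 0.15)
c(PET = PETopt, EN0 = 7 + (1 - PETopt)*11)   # EN0 is approximately 10.1
# Minimax design: n1 = 9, n2 = 7
PETmin = pbinom(1, size = 9, prob = 0.15)
c(PET = PETmin, EN0 = 9 + (1 - PETmin)*7)    # EN0 is approximately 11.8
```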
## Factorial designs

Factorial designs are efficient when combinations of various treatments are assessed.

### Evaluation of combination therapy for hypertension

We consider the following example\footnote{Hung HMJ. Evaluation of a combination drug with multiple doses in unbalanced factorial design clinical trials. \textit{Stat Med} 2000; \textbf{19}: 2079--2087.}. Various combinations of two anti-hypertensive drugs, four doses of an ACE inhibitor (drug $A$) and three doses of a diuretic (drug $B$), were considered. The results and sample sizes are given in the following table:

\begin{table}[h!]
\caption{\label{tab:sidbp} Table of the antihypertension study (Hung, 2000)}
\begin{center}
\begin{tabular}{ccccc||ccccc}
\hline
 & \multicolumn{4}{c}{Sample size} & & \multicolumn{4}{c}{Mean drop in SiDBP (mmHg)}\\
\hline
 & A0 & A1 & A2 & A3 & & A0 & A1 & A2 & A3\\
\hline
B0 & 75 & 75 & 74 & 48 & & 0 & 1.4 & 2.7 & 4.6 \\
B1 & 74 & 75 & 74 & 49 & & 1.8 & 2.8 & 5.7 & 8.2 \\
B2 & 48 & 50 & 48 & 48 & & 2.8 & 4.5 & 7.2 & 10.9 \\
\hline
\end{tabular}
\end{center}
\end{table}

We consider simulated data with cell means equal to those in the table above (sitting diastolic blood pressure, SiDBP, with common standard deviation $s=7.07$ mmHg). We focus here on the $2\times 2$ factorial formed by $A0$, $A1$, $B0$ and $B1$ (the top left-hand corner of the table). The data are in the file \texttt{hungab.csv} and are read into \verb|R| by the following code:

```{r hungabcode, eval=FALSE}
# input the data
hungab<-read.csv("H:/teaching/Indiana/PBHL B-582/labs/lab 1 (study designs)/data/hungab.csv")
```

The command `read.csv` reads in a comma-separated-value (CSV) file. There are options for the separation character if it is not a comma, but here the default will do just fine. A box plot of the four drug combinations is given in Figure 4.
![Figure 4: Box plot of Hung data](H:/teaching/Indiana/PBHL B-582/labs/lab 1 (study designs)/r markdown/box_plots_2x2_factorial.png){width=60%}
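Figure 4 was produced outside this document. A minimal sketch of how such a box plot could be generated (not run here, since the data are loaded further below), assuming, as in the models that follow, that the data set contains the response \texttt{y} and a four-level treatment-combination factor \texttt{AB}:

```{r boxplotsketch, eval=FALSE}
# Box plots of the SiDBP drop for the four treatment combinations
boxplot(y ~ AB, data = hungab,
        xlab = "Treatment combination", ylab = "SiDBP drop (mmHg)")
```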
The power of the $2\times 2$ factorial design lies in the fact that it combines both groups receiving $B1$ (i.e., $A1B1$, $A0B1$) and both groups not receiving $B1$ (i.e., $A1B0$, $A0B0$). This works \underline{only} when there is no interaction effect.

```{r inputhungab, echo=FALSE}
########### set up ##############
setwd("H:/teaching/Indiana/PBHL B-582/labs/lab 1 (study designs)/data")
hungab <- read.csv("hungab.csv")
```

We can assess the presence of interaction graphically (see Figure 5).

```{r intplot, fig.height=5, fig.width=8, fig.cap="Figure 5: Interaction plot of $A$ versus $B$", fig.align="center"}
# Interaction plot
with(hungab, interaction.plot(A, B, y, type="b", fun=mean,
     col=c(2:3), leg.bty="o", leg.bg="beige", lwd=2,
     pch=c(18,24),
     xlab="ACE inhibitor dose (A)",
     ylab="Mean SiDBP drop (mmHg)" #, main="Interaction Plot"
     ))
```

From Figure 5, it does not appear that a significant interaction effect is present. By the way, note in the R code that we can enter entire vectors for arguments such as color (\texttt{col}) or plotting symbol (\texttt{pch}); R does the rest.

We can run the $2\times 2$ factorial design without interaction as follows:

```{r twowayanova}
hungnoint<-lm(formula = y ~ A + B, data = hungab)
summary(hungnoint)
```

We can run the $2\times 2$ factorial design with interaction as follows:

```{r twowayanovaX}
hungint<-lm(formula = y ~ A * B, data = hungab)
summary(hungint)
```

We can also run this as a four-way comparison where each treatment combination is considered separately.

```{r onewayanova}
hung4arm<-lm(formula = y ~ AB, data = hungab)
summary(hung4arm)
```

The following remarks emanate directly from the previous output:

1. Note that the effect of $B$ (the diuretic) is almost statistically significant at the 5\% level and statistically significant at the 10\% level in the no-interaction model.
2. The interaction model is not statistically significant. Neither is the straight four-way treatment model.

The interaction model is \textit{identical} to the four-way comparison model (note the overall $F$ statistics in the two models). The p-values associated with the various effects are not the same, however, since the two models are parametrized differently. The interaction model is
$$
E(y_{i})=\beta_0+\beta_1X_{A}+\beta_2X_B+\gamma X_AX_B=\left \{
\begin{array}{l}
E(y_{A_0B_0}) = \beta_0\\
E(y_{A_1B_0}) = \beta_0+\beta_1\\
E(y_{A_0B_1}) = \beta_0+\beta_2\\
E(y_{A_1B_1}) = \beta_0+\beta_1+\beta_2+\gamma\\
\end{array}
\right .
$$
where $X_A=1$ corresponds to $A1$, $X_B=1$ to $B1$, and zero values denote $A0$ and $B0$, respectively. The four-way design is
$$
E(y_{ij})=\left \{
\begin{array}{l}
E(y_{A_0B_0}) = \zeta+\kappa\\
E(y_{A_1B_0}) = \zeta+\eta\\
E(y_{A_0B_1}) = \zeta+\theta\\
E(y_{A_1B_1}) = \zeta\\
\end{array}
\right .
$$
so that (see previous output) $\beta_0=\zeta+\kappa$, $\beta_0+\beta_1=\zeta+\eta$, etc.
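The lack of interaction suggested by Figure 5 and the model summaries can also be tested formally by comparing the two nested fits. A brief sketch using base \texttt{R}'s \texttt{anova} method (our own addition):

```{r anovacomp}
# F test for the A:B interaction: compare the additive and interaction models
anova(hungnoint, hungint)
```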
## Cross-over trials

In cross-over trials, each patient is assigned both treatments under comparison. Consider the following example.

### Example of cross-over trial design: Pronethalol for angina pectoris

The drug pronethalol was tested in angina pectoris. A cross-over design was used where patients were randomized to receive placebo or pronethalol. The numbers of angina episodes while receiving one or the other treatment were counted. The data can be stored in a long format (i.e., one row per patient per treatment) or a wide format (i.e., one row per patient, where separate columns store the responses to pronethalol and placebo). Here, the data (stored in the text file \texttt{angina.txt}) are in the wide format.

```{r cctrial}
# 5. Example of a Cross-over trial for angina pectoris
angina <- read.table("../data/angina.txt", header=TRUE, quote="\"")
head(angina)
```

We can check the normality assumption for the differences in angina attacks or for the actual measurements in the two groups.

```{r anginadiffhist, fig.width=5, fig.height=5, fig.align="center", fig.cap="Figure 6: Histogram of the differences in angina pectoris episodes between the pronethalol and placebo conditions."}
# (a) Check normality assumption for the difference in attacks of angina
library(lattice)
histogram( ~ Difference, data = angina)
```

We first turn the data into the long format.

```{r turnlong}
# Create independent-sample data
angina2 = data.frame(attacks = NA,
                     trt = rep(c("Pronethalol","Placebo"),each = nrow(angina)))
angina2[angina2$trt == "Pronethalol","attacks"] = angina$Pronethalol
angina2[angina2$trt == "Placebo","attacks"] = angina$Placebo
```

Now we can also produce a histogram of the raw numbers of angina events by treatment group.

```{r anginahist, fig.width=5, fig.align="center", fig.cap="Figure 7: Histograms of the number of angina pectoris episodes in the pronethalol and placebo conditions."}
histogram( ~ attacks|trt, data = angina2)
```

Note in the histogram the syntax \texttt{$\sim$ attacks|trt}, which results in a histogram of the number of attacks by treatment arm. This conditioning syntax is a common feature of various \verb|R| commands. We can also check the normality of the distribution of the events.

```{r qqplotevents, fig.width=5, fig.height=5, fig.cap="Figure 8: Q-Q plots of the number of angina pectoris episodes in the pronethalol and placebo conditions", fig.align="center"}
# Q-Q plot, normal quantiles
qqmath(~ attacks|trt, data = angina2,
       distribution = qnorm,
       prepanel = prepanel.qqmathline,
       panel = function(x, ...) {
         panel.qqmathline(x, ...)
         panel.qqmath(x, ...)
       })
```

The same can be done for the differences. The Q-Q plot of the differences in the number of angina episodes is given in Figure 9.

```{r qqplotdiff, fig.width=5, fig.height=5, fig.cap="Figure 9: Q-Q plot of the differences in the number of angina pectoris episodes between the pronethalol and placebo conditions.", fig.align="center"}
# Q-Q plot of the differences
qqmath(~ Difference, data = angina,
       distribution = qnorm,
       prepanel = prepanel.qqmathline,
       panel = function(x, ...) {
         panel.qqmathline(x, ...)
         panel.qqmath(x, ...)
       })
```

The distribution is far from normal, so we go on with nonparametric tests.

### Data analysis

First, we ignore the study design and assume that the data come from two independent populations. To do so, we have to use the data frame \emph{angina2}, which includes two rows per patient. Then, we perform a two-sample Wilcoxon test.

```{r naiveanalysis}
# (b) naive analysis
wilcox.test(attacks ~ trt, data = angina2)
```

This test suggests that there is no significant difference between the two groups with respect to the number of attacks of angina. Let's see what happens if we take into account the matched study design. We use the paired version of the Wilcoxon test (i.e., a nonparametric equivalent of the paired Student's t test).

```{r pairedwilcoxon, warning=FALSE}
# (c) Dependent 2-group Wilcoxon signed-rank test
wilcox.test(angina$Placebo, angina$Pronethalol, paired = T)
```

Now the treatment effect becomes significant!
This implies that when there is a matched study design, we \textbf{must} take the design into account in the analysis. Unfortunately, cross-over trials require more assumptions to be valid. For example, in this analysis we have assumed that there is no treatment$\times$period interaction. In other words, are the effects of pronethalol the same when it follows placebo as they are when pronethalol comes first? Other, more general prerequisites for the application of cross-over studies are given in the notes.
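For completeness, here is a sketch of how the treatment$\times$period assumption could be examined if the treatment sequence of each patient had been recorded. The column \texttt{sequence} below is hypothetical (it is not in \texttt{angina.txt}); the idea, following the classic two-sample approach, is that in the absence of carry-over the within-patient totals should not differ between the two sequence groups.

```{r periodcheck, eval=FALSE}
# Hypothetical check: 'sequence' would indicate the treatment order
# (e.g., "PlaceboFirst" vs "PronethalolFirst"); it is NOT in these data
angina$total <- angina$Placebo + angina$Pronethalol
wilcox.test(total ~ sequence, data = angina)  # test for a sequence/carry-over effect
```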