First, we will set the working directory to the location of our data file and then import the data (trialPh1.csv) into R.
# Set working directory to the location of data files
setwd("C:/Users/user/Desktop/Clinical_trials/Labs/Lab_2_Study_designs")
# Import the Phase 1 trial data
trialPh1 <- read.csv("data/trialPh1.csv")
By examining the data, we can see that each patient (id) has been assigned to exactly one dose of the drug. Ten doses have been investigated; let’s take a look at the data structure:
# Display the first few rows of the dataset
head(trialPh1)
## id toxicity dose
## 1 1 0 0.2500000
## 2 2 0 0.2750000
## 3 3 0 0.3025000
## 4 4 0 0.3327500
## 5 5 0 0.3660250
## 6 6 1 0.4026275
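If we want, we can also confirm the number of distinct dose levels directly (an optional check; output not shown here):
# Optional check: number of distinct dose levels and number of patients
length(unique(trialPh1$dose)) # should return 10 dose levels
nrow(trialPh1)                # total number of patients in the Phase 1 dataset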
The outcome variable (toxicity) is binary, making logistic regression appropriate. We assume that the logit transformation of the probability, \text{logit}(p)=\log\left(\frac{p}{1-p}\right), depends linearly on the logarithm of the dose:
\text{logit}(p)=\beta_0+\beta_1\log(\text{Dose})
To fit such a model, we can use glm in R. We need to define the appropriate formula, indicating the outcome variable (on the left) and the covariates (on the right). We also need to specify that R should use the binomial distribution by setting family = "binomial", which assumes a logit link function by default.
# Fit logistic regression model
fit <- glm(toxicity ~ log(dose), family = "binomial", data = trialPh1)
# Display the summary of the fitted model
summary(fit)
##
## Call:
## glm(formula = toxicity ~ log(dose), family = "binomial", data = trialPh1)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.9125 0.2341 12.44 <2e-16 ***
## log(dose) 4.9316 0.2890 17.07 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2050.5 on 1999 degrees of freedom
## Residual deviance: 1633.7 on 1998 degrees of freedom
## AIC: 1637.7
##
## Number of Fisher Scoring iterations: 5
The optimal dose is defined as the one that corresponds to a 30% probability of toxicity. Thus, we need to solve for the dose:
\text{Dose}=\exp\left(\frac{\text{logit}(p)-\beta_0}{\beta_1}\right)
To estimate this optimal dose, we calculate:
# Extract coefficients from the fitted model
beta_hat <- coef(fit)
# Define the inverse logit function
invlogit <- function(x) {
exp(x) / (1 + exp(x))
}
# Define the logit function
logit <- function(x) {
log(x / (1 - x))
}
# Calculate optimal dose (Dose30)
Dose30 <- exp((logit(0.30) - beta_hat[1]) / beta_hat[2])
Dose30
## (Intercept)
## 0.466553
Thus, the optimal dose, which is associated with a 30% chance of toxicity, is approximately 0.466.
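As a quick, optional sanity check, we can plug this estimate back into the fitted model; by construction, the predicted probability of toxicity at Dose30 should be 0.30:
# Sanity check: the fitted probability of toxicity at Dose30 should equal 0.30
unname(invlogit(beta_hat[1] + beta_hat[2] * log(Dose30)))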
We can produce a graph showing the predictions of the model:
# Create probability plot
curve(invlogit(beta_hat[1] + beta_hat[2] * x),
from = min(log(trialPh1$dose)),
to = max(log(trialPh1$dose)),
ylim = c(0,0.60),
xlab = "Logarithm of dose",
ylab = "Probability of toxicity",
main = "Probability of toxicity \nbased on a logistic regression model")
# Adding dashed red lines for Dose30 on the plot
lines(c(-1.4, log(Dose30)), rep(0.3, 2), col = "red", lty = 2) # Horizontal line at 30%
lines(rep(log(Dose30), 2), c(0, 0.3), col = "red", lty = 2) # Vertical line at Dose30
Figure 1. Probability of toxicity based on a logistic regression model
Note: The x-axis is on a logarithmic scale, so the log-dose is log(0.466553)≈−0.762.
Although this approach can accurately estimate the optimal dose, it cannot be used in modern clinical trials, primarily for ethical reasons. In this artificial example, patients have been exposed to doses that are likely to lead to serious toxicities; in practice, the goal is to expose as few patients as possible to a new experimental drug.
The conditional probability of escalating beyond the i-th dose is expressed as:
P_i = b(0;3,p_i)+b(1;3,p_i)\times b(0;3,p_i),
where b(k;n,p_i)=\Pr(X=k;n,p_i) denotes the binomial probability of achieving k successes out of n trials, with a success probability of p_i.
There are two scenarios that can result in escalation past the i-th dose. Given the hypothetical probability of toxicity at the i-th dose, denoted as p_i:
- The first scenario is that none of the original cohort of three subjects experiences a toxicity, which occurs with probability b(0;3,p_i).
- The second scenario involves two events: one toxicity in the original cohort (which occurs with probability b(1;3,p_i)) and no toxicity among the three additional subjects (which occurs with probability b(0;3,p_i), conditional on one toxicity in the original cohort).
Overall, the conditional probability of passing the i-th dose is given by:
P_i = b(0;3,p_i)+b(1;3,p_i)\times b(0;3,p_i).
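As a quick numerical illustration, b(k;3,p_i) corresponds to dbinom(k, size = 3, prob = p_i) in R, so for a dose with a true toxicity probability of 0.2 the conditional probability of escalation is:
# Conditional probability of escalating past a dose with true toxicity probability 0.2
dbinom(0, size = 3, prob = 0.2) + dbinom(1, size = 3, prob = 0.2) * dbinom(0, size = 3, prob = 0.2)
## [1] 0.708608
This is the same value that appears as Pi for the first dose in the example further below.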
Then, the unconditional probability of passing the i-th dose is:
Q_i=\prod_{k=1}^{i}\left[b(0;3,p_k)+b(1;3,p_k)\times b(0;3,p_k)\right].
This follows from the multiplication law of probability1:
Here, event A, in which exactly one of the first three subjects experiences serious treatment-related toxicity or death (with probability P(A)=b(1;3,p_i)), does not provide any information about event B, in which none of the three additional subjects experiences a toxicity or death (with probability P(B)=b(0;3,p_i)). Thus, we can apply this principle as follows:
P(A\cap B)=P(\text{1 toxicity among the first 3 subjects and 0 toxicities among the 3 additional subjects})=P(A)P(B)=b(1;3,p_i)\,b(0;3,p_i)
To obtain the unconditional probability of passing the i-th dose, we multiply these conditional probabilities over all earlier doses. You may find these calculations similar to those involved in understanding the relationship between survival and hazard functions.
# Function to implement the three-up/three-down design
OC <- function(p, dose = paste("dose", 1:length(p), sep = "")) {
  # Conditional probability of escalating past the i-th dose
  Pi <- dbinom(x = 0, size = 3, prob = p)  # no toxicities in the first cohort of 3
  Pi <- Pi + dbinom(x = 0, size = 3, prob = p) * dbinom(x = 1, size = 3, prob = p)  # plus: 1 toxicity in the first cohort and none among the 3 additional subjects
  # Unconditional probability of passing the i-th dose
  Qi <- cumprod(Pi)
  # Operating characteristic (OC): cumulative probability of stopping at or before the i-th dose
  OC <- 1 - Qi
  return(data.frame(dose = dose, prob = p, Pi = Pi, Qi = Qi, OC = OC))
}
# Example usage with hypothetical probabilities
OC_example <- OC(p = c(0.2, 0.3))
print(OC_example)
## dose prob Pi Qi OC
## 1 dose1 0.2 0.708608 0.7086080 0.2913920
## 2 dose2 0.3 0.494263 0.3502387 0.6497613
Now consider two assumed dose-toxicity scenarios, A and B, defined by the probability vectors below. First, let's plot the toxicity probabilities for both scenarios:
# Define probability scenarios
ptrueB <- seq(0.10, 0.55, by = 0.05) # Probabilities for scenario B
ptrueA <- c(0.001, 0.01, 0.02, 0.05, 0.11, 0.19, 0.28, 0.39, 0.53, 0.68) # Probabilities for scenario A
# Plotting probabilities of set A
plot(1:10, ptrueA,
xlab = "Dose level",
ylab = "Probability of toxicity",
xaxt = "n", yaxt = "n",
ylim = c(0, 0.8),
type = "b", pch = 21,
lty = 3,
main = "Toxicity Probabilities by Dose Level")
# Customizing axes
axis(1, 1:10)
axis(2, seq(0, 0.8, by = 0.1), las = 1)
# Adding grid lines to the plot
abline(v = 1:10, col = "lightgray", lty = "dotted")
abline(h = seq(0, 0.8, by = 0.1), col = "lightgray", lty = "dotted")
# Adding probabilities of set B to the same plot
lines(1:10, ptrueB, pch = 16, type = "b")
text(7, 0.2, label = "A") # Label for set A
text(4.5, 0.35,label = "B") # Label for set B
Now, let’s visualize the Operating Characteristics:
# Calculate operating characteristics (OCs) for both sets A and B
setAoc <- OC(ptrueA)
setBoc <- OC(ptrueB)
# Plotting OCs for set A
plot(1:10, setAoc$OC,
xlab = "Dose level",
ylab = "Cumulative probability of stopping",
xaxt = "n", yaxt = "n",
ylim = c(0, 1),
type = "b", pch = 21,
lty = 3,
main = "Operating Characteristics by Dose Level")
# Customizing axes for OC plot
axis(1, 1:10)
axis(2, seq(0, 1, 0.1), las = 1)
# Adding grid lines to the OC plot
abline(v = 1:10, col = "lightgray", lty = "dotted")
abline(h = seq(0, 0.9, 0.1), col = "lightgray", lty = "dotted")
# Adding OCs for set B to the same plot
lines(1:10, setBoc$OC, pch = 16, type = "b")
text(6.5, 0.4, label = "A") # Label for set A in OC plot
text(3.15, 0.65, label = "B") # Label for set B in OC plot
For scenario A:
- The expected stopping point is between doses 6 and 7 (stopping probabilities close to 50%).
- There is a ~90% probability of stopping at or before dose 8.
For scenario B:
- The expected stopping point is at dose 3.
- There is an 85% probability of stopping at or before dose 5 (the optimal dose).
- The design may stop before reaching the optimal dose.
This analysis suggests that while the three-up/three-down design is conservative from a safety perspective, it might be overly cautious and stop before reaching the optimal dose level.
Let p denote the response rate of the new therapy. The standard treatment has a response rate of approximately 15%, while the new therapy is expected to achieve success in about 40% of patients. We test the following hypotheses:
H_0: p\leq 0.15 \quad \text{vs.} \quad H_1: p=0.40
with significance level \alpha=0.10, power 1-\beta=0.8, and sample size n=16. The number of responders X follows a binomial distribution: X \mid p \sim \text{Binomial}(n=16,p).
Common sense suggests that we should reject the null hypothesis in favor of the alternative when the number of responders exceeds some threshold k (k=0,1,\ldots,16). We would also like this k to satisfy both the type-I error requirement, i.e., \Pr(X>k|p=p_{0})\leq\alpha, and the type-II error requirement, i.e., \Pr(X\leq k|p=p_{1})\leq\beta. We can easily get these binomial probabilities in R through the function dbinom:
# Calculate binomial probabilities
n <- 16 # Sample size
k <- 0:16 # Possible number of responders
# Calculate probabilities under null hypothesis
round(dbinom(k, size = n, prob = 0.15),5)
## [1] 0.07425 0.20965 0.27748 0.22851 0.13106 0.05551 0.01796 0.00453 0.00090
## [10] 0.00014 0.00002 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
Note that the function dbinom takes k = 0, 1, \ldots, 16 as a vector and produces a vector of 17 (i.e., n+1) elements.
Let’s create a comprehensive summary of the test:
# Create a data frame summarizing binomial test results
bintest <- round(data.frame(
k = k,
dbinom(k, size = n, prob = 0.15),
pbinom(k, size = n, prob = 0.15, lower.tail = FALSE),
dbinom(k, size = n, prob = 0.40),
pbinom(k, size = n, prob = 0.40, lower.tail = FALSE)),
digits = 5)
# Add column names
names(bintest) <- c("k","Pr(X=k|p=0.15)","Pr(X>k|p=0.15)",
"Pr(X=k|p=0.40)","Pr(X>k|p=0.40)")
print(bintest)
## k Pr(X=k|p=0.15) Pr(X>k|p=0.15) Pr(X=k|p=0.40) Pr(X>k|p=0.40)
## 1 0 0.07425 0.92575 0.00028 0.99972
## 2 1 0.20965 0.71610 0.00301 0.99671
## 3 2 0.27748 0.43862 0.01505 0.98166
## 4 3 0.22851 0.21011 0.04681 0.93485
## 5 4 0.13106 0.07905 0.10142 0.83343
## 6 5 0.05551 0.02354 0.16227 0.67116
## 7 6 0.01796 0.00559 0.19833 0.47283
## 8 7 0.00453 0.00106 0.18889 0.28394
## 9 8 0.00090 0.00016 0.14167 0.14227
## 10 9 0.00014 0.00002 0.08395 0.05832
## 11 10 0.00002 0.00000 0.03918 0.01914
## 12 11 0.00000 0.00000 0.01425 0.00490
## 13 12 0.00000 0.00000 0.00396 0.00094
## 14 13 0.00000 0.00000 0.00081 0.00013
## 15 14 0.00000 0.00000 0.00012 0.00001
## 16 15 0.00000 0.00000 0.00001 0.00000
## 17 16 0.00000 0.00000 0.00000 0.00000
It is easy to see from the output above that the smallest k satisfying \text{Pr}(X>k|p=p_0) \leq \alpha is k = 4. Therefore, we reject H_0 when we observe X \geq 5 responses.
# Find threshold k value for rejection
k1 <- which(bintest[,3] < 0.10)[1] # Find first k where Pr(X>k|p=0.15) < alpha level
result <- bintest[k1, c(1,3,5)] # Extract relevant results for decision making
print(result) # Display results for decision making
## k Pr(X>k|p=0.15) Pr(X>k|p=0.40)
## 5 4 0.07905 0.83343
Thus, the exact significance level and power of the test are 0.07905 and 0.83343, respectively. You have to keep in mind that although you can set the type-I error at any level you want, you cannot minimize both the type-I and type-II errors for a given sample size. The only thing you can do in order to increase the power (or decrease the Type-II error) is to increase the sample size.
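To illustrate this, the short sketch below picks, for each candidate sample size, the smallest cutoff that keeps the exact type-I error below 0.10 and then reports the corresponding exact power (power_for_n is a small helper written here for illustration; it is not part of the lab materials):
# Sketch: exact power of the single-stage design as the sample size grows.
# For each n, choose the smallest cutoff k with Pr(X > k | p0) <= alpha,
# then compute the exact power Pr(X > k | p1).
power_for_n <- function(n, p0 = 0.15, p1 = 0.40, alpha = 0.10) {
  k <- 0:n
  alphas <- pbinom(k, size = n, prob = p0, lower.tail = FALSE)  # Pr(X > k | p0)
  k_star <- k[which(alphas <= alpha)[1]]                        # smallest cutoff meeting the alpha requirement
  c(n = n, k = k_star,
    alpha = alphas[k_star + 1],                                 # exact significance level
    power = pbinom(k_star, size = n, prob = p1, lower.tail = FALSE))  # exact power
}
# Power increases with the sample size (the cutoff is re-chosen for each n)
round(t(sapply(c(16, 20, 25, 30), power_for_n)), 4)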
To verify our results, we can implement a sample size calculation function that determines the required sample size while controlling both type-I and type-II errors:
# Sample size calculation function
rsppow <- function(alpha, beta, p0, p1) {
  # Validate input
  if (p1 <= p0) {
    cat("Error: p1 must be greater than p0. Please try again.\n")
    return(NULL)
  }
  # Initial sample size based on the normal approximation (Fleming, 1982)
  F <- (qnorm(1 - beta) * sqrt(p1 * (1 - p1)) +
        qnorm(1 - alpha) * sqrt(p0 * (1 - p0)))^2 / (p1 - p0)^2
  # Search for the sample size between 0.8F and 4F
  n <- round(0.8 * F)
  found <- FALSE
  # Loop until a suitable sample size is found or the upper limit is reached
  while (n < round(4 * F) && !found) {
    # pp0: probability of false rejection, Pr(X > j | p0)
    # pp1: probability of false non-rejection, Pr(X <= j | p1)
    j <- 0
    pp0 <- 1.0
    pp1 <- 1.0
    # Increase the cutoff j until both error requirements are met or j exceeds n
    while ((pp0 > alpha || pp1 > beta) && j <= n) {
      pp0 <- 1 - pbinom(q = j, size = n, prob = p0)  # probability of false rejection
      pp1 <- pbinom(q = j, size = n, prob = p1)      # probability of false non-rejection
      j <- j + 1
    }
    # If no suitable cutoff was found for this n, increase n and continue searching
    if (j > n) {
      n <- n + 1
    } else {
      found <- TRUE
    }
  }
  # Output results if a suitable sample size was found
  # (the final decision rule is: reject H0, i.e. exclude p0, if X > j - 1)
  if (found) {
    cat("SAMPLE SIZE (N) =", n, "\n")
    cat("P0 EXCLUDED IF X >", j - 1, "\n")
    cat("COVERAGE =", 1 - pp0, "\n")
    cat("POWER =", 1 - pp1, "\n")
  } else {
    cat("No suitable sample size found.\n")
  }
}
The program searches from 0.8F to 4F, where F=\left (\frac{z_{1-\beta}\sqrt{p_1(1-p_1)}+z_{1-\alpha}\sqrt{p_0(1-p_0)}}{p_1-p_0}\right )^2 is the sample size based on the normal approximation to the binomial distribution2. With p_0=0.15, p_1=0.40, \alpha=0.1 and \beta=0.2, F\approx 12, so the search starts at n=\text{round}(0.8F)=10.
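As a quick check of this starting value:
# Normal-approximation sample size for p0 = 0.15, p1 = 0.40, alpha = 0.1, beta = 0.2
F <- (qnorm(0.8) * sqrt(0.40 * 0.60) + qnorm(0.9) * sqrt(0.15 * 0.85))^2 / (0.40 - 0.15)^2
F              # approximately 12.1
round(0.8 * F) # the search therefore starts at n = 10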
Let’s verify our calculations:
rsppow(alpha = 0.1, beta = 0.2, p0 = 0.15, p1 = 0.4)
## SAMPLE SIZE (N) = 16
## P0 EXCLUDED IF X > 4
## COVERAGE = 0.9209487
## POWER = 0.8334326
The results confirm our earlier findings:
- With N=16, we achieve coverage 1-\alpha > 0.92 (or \alpha < 0.08), meeting our desired significance level of 0.10.
- The power 1-\beta > 0.83 exceeds our desired power of 0.80.
- These results verify our exact binomial calculations using the rsppow program.
Two-stage designs typically involve complex calculations. We'll use the ph2simon function to implement both the minimax and the optimal (Simon's) two-stage designs. These designs assess efficacy in two stages:
1. First stage: initial efficacy assessment with potential early stopping.
2. Second stage: additional recruitment if results are promising.
Two-stage designs enroll some patients in the first stage, where an initial assessment of efficacy is performed. If the therapy seems unacceptable, the trial stops. If not, more patients are recruited in the second stage and the efficacy of the drug is investigated again. Type-I and type-II errors are calculated over both stages and involve the cumulative probabilities of the scenarios that result in the study stopping at the first stage and of those that result in the study continuing to the second stage. The idea is analogous to the operating characteristics of the three-up/three-down phase-1 design.
# Load required library
library(clinfun)
# Define parameters for the Simon's two-stage design
p0 <- 0.15 # Response rate under the null hypothesis
p1 <- 0.40 # Response rate under the alternative hypothesis
alpha <- 0.10 # Type I error rate (one-sided)
beta <- 0.20 # Type II error rate
# Calculate optimal and minimax designs
simon_results <- ph2simon(pu = p0, pa = p1, ep1 = alpha, ep2 = beta, nmax = 100)
# Print results of Simon's two-stage design
print(simon_results)
##
## Simon 2-stage Phase II design
##
## Unacceptable response rate: 0.15
## Desirable response rate: 0.4
## Error rates: alpha = 0.1 ; beta = 0.2
##
## r1 n1 r n EN(p0) PET(p0) qLo qHi
## Minimax 1 9 4 16 11.80 0.5995 0.457 1.000
## Optimal 1 7 4 18 10.12 0.7166 0.000 0.457
For both designs, the hypotheses to be tested are H_0:p\leq0.15 vs. H_1:p\geq0.40 (one-sided).
For the optimal design, we need to enroll n=18 patients, of whom n_1 = 7 are involved in the first stage. The decision rule (see the sketch below) is:
- If r_1 \leq 1 of them respond, then the study will stop.
- If r_1 > 1 individuals respond, then n_2 = 11 additional subjects will be enrolled in the second stage.
- If, over both stages, at most r = 4 of the n = 18 patients respond, the null hypothesis will not be rejected and the drug will not be considered effective.
- Otherwise, the null hypothesis will be rejected, and the conclusion will be that the response rate p > p_0 = 0.15.
Simon's optimal design is "optimal" in the sense that it exposes, on average, the minimum number of patients to the experimental drug when the true probability of response is p_0; that is, ASN(p_0) = EN(p_0) \approx 10.1 is minimized if H_0 is true (i.e., if p \leq p_0).
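A minimal sketch of the optimal design's stage-wise decision rule is given below; simon_decision is a hypothetical helper written here for illustration (it is not part of clinfun), with the cutoffs r_1 = 1 and r = 4 taken from the output above:
# Sketch of the stage-wise decision rule for the optimal design (r1 = 1, n1 = 7, r = 4, n = 18)
simon_decision <- function(x1, x_total, r1 = 1, r = 4) {
  if (x1 <= r1) {
    return("Stop after stage 1: drug not considered effective")
  }
  if (x_total <= r) {
    return("Complete stage 2, but do not reject H0")
  }
  "Reject H0: the response rate appears greater than p0 = 0.15"
}
simon_decision(x1 = 1, x_total = NA) # at most r1 responses in stage 1: stop early
simon_decision(x1 = 3, x_total = 6)  # more than r responses in total: reject H0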
For the minimax design, we need to enroll n=16 patients, of whom n_1 = 9 are involved in the first stage. The decision rule is analogous:
- If r_1 \leq 1 of them respond, then the study will stop.
- If r_1 > 1 individuals respond, then n_2 = 7 additional subjects will be enrolled in the second stage.
- If, over both stages, at most r = 4 of the n = 16 patients respond, the null hypothesis will not be rejected and the drug will not be considered effective.
- Otherwise, the null hypothesis will be rejected, and the conclusion will be that the response rate p > p_0 = 0.15.
The minimax design minimizes the total sample size n but may result in a larger average sample size under the null hypothesis compared with Simon's optimal design: here, ASN(p_0) = 11.8 individuals if H_0 is true, compared with about 10.1 for the optimal two-stage design.
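As a rough check (a sketch, not part of the lab output; check_design is a hypothetical helper), the probability of early termination, the expected sample size under H_0, and the overall error rates of both designs can be reproduced directly from the reported design parameters:
# PET(p0), EN(p0) and overall error rates of a two-stage design (x1 = stage-1 responses)
check_design <- function(r1, n1, r, n, p0 = 0.15, p1 = 0.40) {
  n2 <- n - n1
  PET <- pbinom(r1, size = n1, prob = p0)  # probability of early termination under p0
  EN <- n1 + (1 - PET) * n2                # expected sample size under p0
  x1 <- (r1 + 1):n1                        # stage-1 outcomes that lead to stage 2
  # H0 is rejected only if the trial continues AND the total number of responses exceeds r
  alpha <- sum(dbinom(x1, n1, p0) * pbinom(r - x1, n2, p0, lower.tail = FALSE))
  power <- sum(dbinom(x1, n1, p1) * pbinom(r - x1, n2, p1, lower.tail = FALSE))
  c(PET = PET, EN = EN, alpha = alpha, power = power)
}
# Values should agree (up to rounding) with PET(p0) and EN(p0) in the ph2simon output
# and respect the design error rates alpha = 0.10 and beta = 0.20
round(rbind(Minimax = check_design(r1 = 1, n1 = 9, r = 4, n = 16),
            Optimal = check_design(r1 = 1, n1 = 7, r = 4, n = 18)), 4)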
First, let’s import and examine the data:
# Import data
library(readr)
angina <- read_table("data/angina.txt") # Load angina data
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## Patient = col_double(),
## Placebo = col_double(),
## Pronethalol = col_double(),
## Difference = col_double()
## )
head(angina) # Display first few rows of the dataset
## # A tibble: 6 × 4
## Patient Placebo Pronethalol Difference
## <dbl> <dbl> <dbl> <dbl>
## 1 1 71 29 42
## 2 2 323 348 25
## 3 3 8 1 7
## 4 4 14 7 7
## 5 5 23 16 7
## 6 6 34 25 9
Let's assess normality through visualizations. We can check the normality assumption either for the differences in the number of angina attacks or for the raw counts in the two treatment groups.
# Histogram of differences
library(lattice)
histogram(~ Difference, data = angina,
type = "count",  # show counts so the y-axis matches the "Frequency" label
main = "Histogram of Differences in Angina Attacks",
xlab = "Difference in Attacks",
ylab = "Frequency")
Figure 1: Distribution of differences in angina attacks
We transform the data into a long format suitable for analysis.
# Transform data into long format suitable for analysis
angina_long <- data.frame(
attacks = NA,
trt = rep(c("Pronethalol", "Placebo"), each = nrow(angina))
)
angina_long[angina_long$trt == "Pronethalol", "attacks"] <- angina$Pronethalol
angina_long[angina_long$trt == "Placebo", "attacks"] <- angina$Placebo
Now we can produce a histogram of the raw number of angina events by treatment group.
# Histogram of Angina Events by Treatment Group
histogram(~ attacks | trt, data = angina_long,
type = "count",  # show counts so the y-axis matches the "Frequency" label
main = "Histogram of Angina Episodes by Treatment Group",
xlab = "Number of Angina Episodes",
ylab = "Frequency")
Figure 2: Distribution of angina episodes by treatment
Note that in the histogram the syntax ~ attacks | trt produces a separate histogram of the number of attacks for each treatment arm. This conditioning syntax is a common feature of lattice plotting functions.
We can also check the normality of the distribution of events using Q-Q plots.
# Q-Q plot for number of angina episodes between treatment groups
qqmath(~ attacks | trt, data = angina_long,
distribution = qnorm,
prepanel = prepanel.qqmathline,
panel = function(x, ...) {
panel.qqmathline(x, ...)
panel.qqmath(x, ...)
})
Figure 7: Q-Q plot of the number of angina pectoris episodes in the pronethalol and placebo groups
The same can be done for the differences. The Q-Q plot of the differences in the number of angina episodes is given in Figure 8.
# Q-Q plot of differences in angina episodes between treatment groups
qqmath(~ Difference, data = angina,
distribution = qnorm,
prepanel = prepanel.qqmathline,
panel = function(x, ...) {
panel.qqmathline(x, ...)
panel.qqmath(x, ...)
})
Figure 8: Q-Q plot of the differences in the number of angina pectoris episodes between the pronethalol and placebo groups.
The distributions are far from normal, so we proceed with nonparametric tests.
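If a formal test is preferred in addition to the graphical checks, the Shapiro-Wilk test can be applied to the paired differences (output not shown here):
# Shapiro-Wilk test of normality for the paired differences
shapiro.test(angina$Difference)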
Initially, we ignore the study design and assume that the data come from two independent populations. We will use the angina_long data frame, which includes two rows per patient (one per treatment), and perform a two-sample Wilcoxon (rank-sum) test:
# (b) Naive analysis using Wilcoxon test for independent samples
wilcox.test(attacks ~ trt, data = angina_long)
## Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
## compute exact p-value with ties
##
## Wilcoxon rank sum test with continuity correction
##
## data: attacks by trt
## W = 88, p-value = 0.3705
## alternative hypothesis: true location shift is not equal to 0
This test suggests that there is no significant difference between the two groups in the number of angina attacks. Let's see what happens when we take the matched study design into account by using the paired version of the Wilcoxon test (the signed-rank test, a nonparametric analogue of the paired t-test):
# Paired Wilcoxon test
wilcox.test(angina$Placebo, angina$Pronethalol, paired = TRUE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: angina$Placebo and angina$Pronethalol
## V = 67, p-value = 0.03066
## alternative hypothesis: true location shift is not equal to 0
Now the treatment effect becomes significant! This shows that when the data arise from a matched design, the analysis must account for the matching. However, cross-over trials require additional assumptions to be valid. For example, we assume that there is no treatment \times period interaction; in other words, the effect of pronethalol should not depend on whether it is given before or after placebo (a sketch of how this could be checked is given below). Additional prerequisites for applying crossover studies are provided in the supplementary notes.
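As a sketch only: the angina data as imported do not record the order in which the two treatments were given, so the commented code below uses a hypothetical sequence column to indicate how such an interaction could be examined in a 2x2 cross-over trial (by comparing within-patient totals between the two sequence groups):
# Hypothetical sketch: requires a 'sequence' column giving the order of administration,
# which is not available in the imported angina data
# angina$total <- angina$Placebo + angina$Pronethalol   # within-patient totals
# wilcox.test(total ~ sequence, data = angina)          # compare totals between sequence groups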