Problem 1

Re‐write the survival times in terms of patient time, and create a simple data set listing the survival time and censoring indicator for each patient.

# set up data
patient_id <- 1:5
surv_time <- c(5,5,4,3,1)
status <- c(0,0,1,1,1)  # 0 is censored, 1 is died

data.frame(id = patient_id, surv_time, status)
##   id surv_time status
## 1  1         5      0
## 2  2         5      0
## 3  3         4      1
## 4  4         3      1
## 5  5         1      1

Problem 2

Consider the survival data from problem 1. Assuming that these observations are from an exponential distribution, find the MLE estimate for Ī» and the standard error for this estimate.

Solution

The MLE estimate for \(\lambda\) using the exponential distribution is \(\hat{\lambda} = \frac{d}{V} = \frac{3}{18} \approx 0.1667\) with a standard error of \(\sqrt{d/V^2} = \sqrt{3/18^2} \approx 0.0962\)

Problem 3

Another parametric survival distribution is the log‐normal distribution. Use the density and cumulative distribution R functions ā€œdlnormā€ and ā€œplnormā€ to compute and plot the lognormal hazard functions with the parameter ā€œmeanlogā€ taking the values 0, 1, and 2, and with ā€œsdlogā€ fixed at 0.25.

mean=c(0,1,2)
lognormHaz <- function(x, meanlog, sdlog) dlnorm(x, meanlog = meanlog, sdlog = sdlog)/plnorm(x, meanlog=meanlog, sdlog=sdlog, lower.tail=F)

curve(lognormHaz(x, meanlog=mean[1],sdlog=0.25), from=0, to=80, ylab='Hazard', xlab='Time', col="red")
curve(lognormHaz(x, meanlog=mean[2],sdlog=0.25), from=0, to=80, ylab='Hazard', xlab='Time', col="blue",add = T)
curve(lognormHaz(x, meanlog=mean[3],sdlog=0.25), from=0, to=80, ylab='Hazard', xlab='Time', col="green",add = T)
legend("topright", legend=c("meanlog = 0", "meanlog = 1","meanlog = 2"), col=c("red","blue","green"), lwd=2)

Problem 4

Using the data from problem 1, obtain the Kaplan‐Meier estimates for the survival probabilities at each event time. Also, construct a 95% confidence interval for the survival probabilities. Then plot the survival function estimate with the 95% confidence interval limits.

library(survival)
## Warning: package 'survival' was built under R version 3.5.2
tt=c(5,5,4,3,1)
status=c(0,0,1,1,1)
Surv(tt,status)
## [1] 5+ 5+ 4  3  1
result.km=survfit(Surv(tt,status)~1,conf.type="log-log")
summary(result.km)
## Call: survfit(formula = Surv(tt, status) ~ 1, conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1      5       1      0.8   0.179        0.204        0.969
##     3      4       1      0.6   0.219        0.126        0.882
##     4      3       1      0.4   0.219        0.052        0.753
plot(result.km)

Problem 5

Repeat part (4) using the Nelson‐Aalen estimator. Based on this estimator, estimate the median survival time.

result.fh=survfit(Surv(tt,status)~1,type="fh",conf.type="log-log")
summary(result.fh)
## Call: survfit(formula = Surv(tt, status) ~ 1, type = "fh", conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1      5       1    0.819   0.164       0.2418        0.972
##     3      4       1    0.638   0.204       0.1629        0.894
##     4      3       1    0.457   0.211       0.0829        0.782
plot(result.fh)

The estimated median survival time is 4.

Problem 6

The remission times of 42 patients with acute leukemia were reported from a clinical trial undertaken to assess the ability of a drug called 6‐mercaptopurine (6‐MP) to maintain in remission. Each patient was randomized to receive either 6‐MP or a placebo. The following data were obtained (in weeks):

library(survival)
tt2 <- c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25,32,32,34,35,1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)
status2 <- c(rep(1,times = 9), rep(0, times = 12),rep(1,times =21))
treat <- c(rep(1,times =21),rep(0,times =21))
survdiff(Surv(tt2, status2) ~ treat)
## Call:
## survdiff(formula = Surv(tt2, status2) ~ treat)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## treat=0 21       21     10.7      9.77      16.8
## treat=1 21        9     19.3      5.46      16.8
## 
##  Chisq= 16.8  on 1 degrees of freedom, p= 4e-05
Part (a)

Use R to obtain the Kaplan‐Meier estimates and 95% CI for the survival probabilities at each event time for the 6‐MP data.

result.m6=survfit(Surv(tt2[treat==1],status2[treat==1])~1,conf.type="log-log")
summary(result.m6)
## Call: survfit(formula = Surv(tt2[treat == 1], status2[treat == 1]) ~ 
##     1, conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     6     21       3    0.857  0.0764        0.620        0.952
##     7     17       1    0.807  0.0869        0.563        0.923
##    10     15       1    0.753  0.0963        0.503        0.889
##    13     12       1    0.690  0.1068        0.432        0.849
##    16     11       1    0.627  0.1141        0.368        0.805
##    22      7       1    0.538  0.1282        0.268        0.747
##    23      6       1    0.448  0.1346        0.188        0.680
#plot(result.m6)
Part (b)

Repeat part (a) using the Placebo data.

result.pl=survfit(Surv(tt2[treat==0],status2[treat==0])~1,conf.type="log-log")
summary(result.pl)
## Call: survfit(formula = Surv(tt2[treat == 0], status2[treat == 0]) ~ 
##     1, conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1     21       2   0.9048  0.0641      0.67005        0.975
##     2     19       2   0.8095  0.0857      0.56891        0.924
##     3     17       1   0.7619  0.0929      0.51939        0.893
##     4     16       2   0.6667  0.1029      0.42535        0.825
##     5     14       2   0.5714  0.1080      0.33798        0.749
##     8     12       4   0.3810  0.1060      0.18307        0.578
##    11      8       2   0.2857  0.0986      0.11656        0.482
##    12      6       2   0.1905  0.0857      0.05948        0.377
##    15      4       1   0.1429  0.0764      0.03566        0.321
##    17      3       1   0.0952  0.0641      0.01626        0.261
##    22      2       1   0.0476  0.0465      0.00332        0.197
##    23      1       1   0.0000     NaN           NA           NA
#plot(result.m6)
Part (c)

Plot both Kaplan‐Meier curves in one graph.

plot(survfit(Surv(tt2, status2) ~ treat) , xlab ="Time", ylab ="Survival probability",col=c("blue","red"), lwd=2)
legend("topright", legend=c("M6", "Placebo"), col=c("blue","red"), lwd=2)

Or if you want to include the confidence limits,

plot(result.m6,col="blue")
lines(result.pl,col="red")
legend("topright", legend=c("M6", "Placebo"), col=c("blue","red"), lwd=2)

Part (d)

Compare the two groups using both the log‐rank test and the Prentice modification of the Gehan test.

survdiff(Surv(tt2, status2) ~ treat, rho=0)     ## Log-Rank Test
## Call:
## survdiff(formula = Surv(tt2, status2) ~ treat, rho = 0)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## treat=0 21       21     10.7      9.77      16.8
## treat=1 21        9     19.3      5.46      16.8
## 
##  Chisq= 16.8  on 1 degrees of freedom, p= 4e-05
survdiff(Surv(tt2, status2) ~ treat, rho=1)     ## Gehan Test
## Call:
## survdiff(formula = Surv(tt2, status2) ~ treat, rho = 1)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## treat=0 21    14.55     7.68      6.16      14.5
## treat=1 21     5.12    12.00      3.94      14.5
## 
##  Chisq= 14.5  on 1 degrees of freedom, p= 1e-04

In both cases, the resulting p-values were extremely small. Therefore, we have enough evidence to conclude that the treatment (M6) has a significant effect on the survival time.