Statistics and Population

Competing Risks

Home Parametric Non-Parametric Competing Risks Heterogeneity Multivariate

Competing Risk Simulation

Here is a simple demostration of the difference between the marginal distribution of a failure time and and the associated single-decrement function, in a competing risks framework with two causes of failure.

We will simulate two correlated standard log-normal random variables, with a user-supplied parameter rho. Call these t1 and t2. The overall survival time t is the miminum of the two. There is no censoring.
We calculate three Kaplan-Meier estimates: the joint survival, the marginal distribution of the (unobserved) time t1, and the survival-like function obtained by censoring observations that fail due to cause 2.

Stata
R

Here’s a Stata command an R function to simulate and plot the data:

. capture program drop simcomp

. program simcomp
  1.     args rho n
  2.     if "`n'" == "" local n 1000
  3.     clear
  4.     quietly {
  5.         set obs `n'
  6.         gen y1 = rnormal()
  7.         gen y2 = rnormal(`rho' * y1, sqrt(1-`rho'^2))
  8.         gen t1 = exp(y1)
  9.         gen t2 = exp(y2)
 10.         gen t = min(t1, t2)
 11.         gen j = (t1 < t2) + 1
 12.         stset t, fail(1)
 13.         gen tos = _t
 14.         sts gen os = s
 15.         stset t, fail(j == 1)
 16.         sts gen asd = s
 17.         gen tasf = _t
 18.         stset t1, fail(1)
 19.         sts gen ms = s
 20.         gen tms = _t
 21.     }
 22.     twoway line os tos if tos < 8, c(J) sort ///
>       || line asd tasf if tasf < 8, lc(red) c(J) sort ///
>       || line ms tms if tms < 8, lc(green) c(J) sort ///
>          title("Single-decrement, marginal and joint survival when rho=`rho'")
>  ///
>          legend(order(1 "joint" 2 "asd" 3 "marginal") )
 23. end

> simcomp <- function(rho, n=1000) {
+     meanlog = 0; sdlog = 1; tmax = 8
+     require(survival); require(dplyr); require(ggplot2)
+ 
+     # Simulate data
+     y1 <- rnorm(n, meanlog, sdlog)
+     y2 <- rnorm(n, meanlog + rho * (y1 - meanlog), sqrt(1 - rho^2) * sdlog )
+     t1 <- exp(y1)
+     t2 <- exp(y2)
+     t <- ifelse(t1 < t2, t1, t2)
+     j <- (t1 < t2) + 1
+ 
+     # Kaplan-Meiers
+     os  <- survfit(Surv(t, j > 0) ~ 1) # overall survival
+     asd <- survfit(Surv(t,  j == 1) ~ 1) # associated single decrement
+     ms  <- survfit(Surv(t1, j > 0 ) ~ 1) # marginal survival
+ 
+     # ggplot
+     tdf <- function(sf,name) {
+         data.frame(time=sf$time, surv=sf$surv, group=rep(name,length(sf$time)))
+     }
+     km <- filter(rbind(tdf(asd,"asd"), tdf(ms, "marginal"), 
+         tdf(os,"joint")), time <= 8)
+     ggplot(km, aes(time, surv, color=group)) + geom_step() + xlim(0, 8) +
+       scale_color_manual(name="group", 
+       values=c(asd="red", marginal="green",joint="black"))        
+ }

And here is the result when we try rho=0.5

. simcomp 0.5

. graph export simcomp.png, width(500) replace
file simcomp.png saved as PNG format

> library(ggplot2)
> simcomp(0.5)
> ggsave("simcompr.png", width=500/72, height=400/72, dpi=72)

What would you expect if the correlation is closer to zero? Closer to one? Try rho=0.2 and rho=0.8 to confirm your intuition.