The survival time we will work with is the time that heroin addicts
stay in a clinic for methadone maintenance treatment. The data were
first analyzed by Caplehorn and Bell (1991) and appeared in a handbook
of small datasets by Hand et al (1994). They have also been analyzed by
Rabe-Hesketh and Everitt (2006) in various editions of their Handbook of
Statistical Analyses using Stata. The data are available at
https://grodri.github.io/datasets/heroin.dta
.
The data come from two clinics coded 1 and 2; status
is
1 if the event occurred (the person left the clinic) and 0 otherwise,
and time
is in days. The other two variables are an
indicator of whether the person had a prison
record, and
the dose
of methadone in mg.
(a) Verify that we have 150 failures in a total of 95,812 days of exposure. It will be convenient to recon time in months, treating a month as 30 days. Assume the hazard is constant over time. What’s the event rate per month? What’s the probability that someone would still be in treatment after one, two and three years?
. use https://grodri.github.io/datasets/heroin, clear (Retention of Heroin Addicts in Methadone Treatment Program, Hand(1994)) . count if status==1 150 . scalar count = r(N) . quietly sum time . di r(sum) 95812 . gen months = time / 30 . stset months, fail(status == 1) id(id) Survival-time data settings ID variable: id Failure event: status==1 Observed time interval: (months[_n-1], months] Exit on or before: failure ────────────────────────────────────────────────────────────────────────── 238 total observations 0 exclusions ────────────────────────────────────────────────────────────────────────── 238 observations remaining, representing 238 subjects 150 failures in single-failure-per-subject data 3,193.733 total analysis time at risk and under observation At risk from t = 0 Earliest observed entry t = 0 Last observed exit t = 35.86666 . scalar hazard = count/r(sum) . di hazard .04696698 . di exp(-12*hazard), exp(-24*hazard), exp(-36*hazard) .56915429 .3239366 .18436991
> library(haven) > heroin <- read_dta("https://grodri.github.io/datasets/heroin.dta") > library(dplyr) > tally <- summarize(heroin, events=sum(status==1), days=sum(time), + exposure = days/30); tally # A tibble: 1 × 3 events days exposure <int> <dbl> <dbl> 1 150 95812 3194. > hazard <- tally$events/tally$exposure; hazard [1] 0.04696698 > exp(-hazard * c(12, 24, 36)) [1] 0.5691543 0.3239366 0.1843699
We have 150 failures in 95,812 days as expected. The monthly event rate is 0.047, so addicts leave treatment at the rate of 1 out of 20 per month. The probability of remaining in treatment is 57% after one year, 32% after two and 18% after three years.
(b) Split the dataset so that we have separate observations for 0-3, 3-6, 6-12, 12-18, 18-24 and 24-36 months. (In other words split the first year into two quarters and a semester, and the second year into two semesters.) There are no exits after 3 years. Fit a piece-wise exponential model and describe the shape of the hazard. Test the hypothesis that the hazard is constant. Estimate the probability that someone would be in treatment after one, two and three years.
. stsplit interval, at(3 6 12 18 24 36) (613 observations (episodes) created) . gen events = _d . gen exposure = _t - _t0 . poisson events i.interval, exposure(exposure) Iteration 0: log likelihood = -512.97744 Iteration 1: log likelihood = -512.89225 Iteration 2: log likelihood = -512.89205 Iteration 3: log likelihood = -512.89205 Poisson regression Number of obs = 851 LR chi2(5) = 10.01 Prob > chi2 = 0.0749 Log likelihood = -512.89205 Pseudo R2 = 0.0097 ─────────────┬──────────────────────────────────────────────────────────────── events │ Coefficient Std. err. z P>|z| [95% conf. interval] ─────────────┼──────────────────────────────────────────────────────────────── interval │ 3 │ .1847856 .291796 0.63 0.527 -.3871241 .7566953 6 │ .2764494 .2616835 1.06 0.291 -.2364409 .7893396 12 │ .3810534 .2752023 1.38 0.166 -.1583333 .92044 18 │ .45605 .3198465 1.43 0.154 -.1708377 1.082938 24 │ 1.056881 .3318809 3.18 0.001 .4064063 1.707355 │ _cons │ -3.364343 .2085144 -16.13 0.000 -3.773024 -2.955662 ln(exposure) │ 1 (exposure) ─────────────┴──────────────────────────────────────────────────────────────── . di e(chi2), e(df_m) 10.013372 5 . mat list e(b) e(b)[1,7] events: events: events: events: events: events: 0b. 3. 6. 12. 18. 24. interval interval interval interval interval interval y1 0 .18478562 .27644938 .38105336 .45604996 1.0568809 events: _cons y1 -3.3643429 . mata b = st_matrix("e(b)") . mata h = exp(b[7] :+ b[1..6]) . mata w = 3, 3, 6, 6, 6, 12 . mata S = exp(-runningsum(h :* w)) . mata S[(3, 5, 6)] 1 2 3 ┌───────────────────────────────────────────┐ 1 │ .6052257103 .3219616817 .09754079 │ └───────────────────────────────────────────┘
> library(survival) > heroin <- mutate(heroin, months = time/30, fail = status) # for clarity > breaks <- c(3, 6, 12, 18, 24, 36) > heroinx <- survSplit(heroin, cut = breaks, start = "start", end = "months", + event = "fail", episode = "interval") > heroinx <- mutate(heroinx, exposure = months - start, interval = + factor(interval, labels=paste("(",c(0,breaks[-6]),",",breaks,"]", sep = ""))) > nrow(heroinx) [1] 851 > pwe <- glm(fail ~ interval + offset(log(exposure)), data = heroinx, family = poisson) > summary(pwe)$coefficients Estimate Std. Error z value Pr(>|z|) (Intercept) -3.3643429 0.2085120 -16.1350089 1.447867e-58 interval(3,6] 0.1847856 0.2917936 0.6332751 5.265540e-01 interval(6,12] 0.2764494 0.2616729 1.0564692 2.907539e-01 interval(12,18] 0.3810534 0.2751989 1.3846471 1.661604e-01 interval(18,24] 0.4560500 0.3198416 1.4258618 1.539082e-01 interval(24,36] 1.0568809 0.3318793 3.1845343 1.449871e-03 > me <- glm(fail ~ offset(log(exposure)), data=heroinx, family=poisson); exp(coef(me)) (Intercept) 0.04696698 > deviance(me) - deviance(pwe) [1] 10.01337 > b <- coef(pwe) > h <- exp(b[1] + c(0,b[-1])) > w <- c(3, 3, 6, 6, 6, 12) > exp(-cumsum(h * w)) interval(3,6] interval(6,12] interval(12,18] interval(18,24] 0.90144685 0.79567346 0.60522571 0.44668129 0.32196168 interval(24,36] 0.09754079
The rate at which addicts leave the clinic increases substantially and steadily with duration of stay; at the one year mark it is almost 50% higher than in the first trimester, and in the third year it is almost three times what it was initially. To test the hypothesis of a constant hazard we compare this model with the null, which gives as a chi-squared of 10.01 on 5 d.f., a test that would accept at the conventional 5% level. The estimated survival probabilities are 61% after one year, 32% after two and just under 10% after three years.
(a) Introduce a dummy variable to identify clinic 1 and add it to the model. Interpret the exponentiated coefficient and test its significance. (Note that clinic is coded 1 and 2, we want to treat clinic 2 as the reference. You may want to save this model for part c.)
. gen clinic1 = clinic == 1 // or use ib2.clinic . poisson events i.interval clinic1, exposure(exposure) Iteration 0: log likelihood = -497.08351 Iteration 1: log likelihood = -497.03449 Iteration 2: log likelihood = -497.03443 Iteration 3: log likelihood = -497.03443 Poisson regression Number of obs = 851 LR chi2(6) = 41.73 Prob > chi2 = 0.0000 Log likelihood = -497.03443 Pseudo R2 = 0.0403 ─────────────┬──────────────────────────────────────────────────────────────── events │ Coefficient Std. err. z P>|z| [95% conf. interval] ─────────────┼──────────────────────────────────────────────────────────────── interval │ 3 │ .1932776 .2917984 0.66 0.508 -.3786367 .7651919 6 │ .3130266 .2617334 1.20 0.232 -.1999614 .8260146 12 │ .4588443 .2754258 1.67 0.096 -.0809804 .998669 18 │ .6027036 .3205791 1.88 0.060 -.0256198 1.231027 24 │ 1.336127 .3348078 3.99 0.000 .6799155 1.992338 │ clinic1 │ 1.085154 .2121442 5.12 0.000 .669359 1.500949 _cons │ -4.212952 .2776001 -15.18 0.000 -4.757038 -3.668866 ln(exposure) │ 1 (exposure) ─────────────┴──────────────────────────────────────────────────────────────── . estimates store clinic . di exp(_b[clinic1]) - 1 1.9598958
> heroinx <- mutate(heroinx, clinic1 = as.numeric(clinic == 1)) > pwec <- update(pwe, . ~ . + clinic1) > summary(pwec)$coefficient Estimate Std. Error z value Pr(>|z|) (Intercept) -4.2129522 0.2775893 -15.1769264 5.027959e-52 interval(3,6] 0.1932776 0.2917943 0.6623762 5.077302e-01 interval(6,12] 0.3130266 0.2617218 1.1960284 2.316855e-01 interval(12,18] 0.4588443 0.2754218 1.6659693 9.571952e-02 interval(18,24] 0.6027036 0.3205751 1.8800700 6.009854e-02 interval(24,36] 1.3361267 0.3348046 3.9907653 6.586044e-05 clinic1 1.0851541 0.2121327 5.1154493 3.129949e-07 > exp(coef(pwec)["clinic1"]) - 1 clinic1 1.959896
The departure rate is much higher in clinic 1 than clinic 2, in fact almost triple at the same duration of treatment, a highly significant effect with a Wald z-test of 5.12.
(b) Let us add an interaction between clinic and duration. To save d.f. we will group time in years for purposes of the interaction. For full credit present your parameter estimates in terms of the effect of clinic 1 compared to 2 in the first, second and third year of treatment. Comment on the point estimates.
To achieve this goal we create separate dummy variables to represent the contrast betweeen clinics 1 and 2 in each year and then fit a model.
. gen clinic1year1 = clinic1 * (interval < 12) . gen clinic1year2 = clinic1 * (interval >= 12 & interval < 24) . gen clinic1year3 = clinic1 * (interval >= 24) . poisson events i.interval clinic1year*, exposure(exposure) Iteration 0: log likelihood = -494.51171 Iteration 1: log likelihood = -491.19934 Iteration 2: log likelihood = -491.15455 Iteration 3: log likelihood = -491.15449 Iteration 4: log likelihood = -491.15449 Poisson regression Number of obs = 851 LR chi2(8) = 53.49 Prob > chi2 = 0.0000 Log likelihood = -491.15449 Pseudo R2 = 0.0516 ─────────────┬──────────────────────────────────────────────────────────────── events │ Coefficient Std. err. z P>|z| [95% conf. interval] ─────────────┼──────────────────────────────────────────────────────────────── interval │ 3 │ .1897187 .2918028 0.65 0.516 -.3822043 .7616417 6 │ .2975722 .2618246 1.14 0.256 -.2155945 .810739 12 │ -.3384617 .4922646 -0.69 0.492 -1.303283 .6263592 18 │ -.1761591 .5079844 -0.35 0.729 -1.17179 .819472 24 │ -.7497782 1.041067 -0.72 0.471 -2.790233 1.290676 │ clinic1year1 │ .5491248 .2549685 2.15 0.031 .0493957 1.048854 clinic1year2 │ 1.505471 .4095048 3.68 0.000 .7028567 2.308086 clinic1year3 │ 3.080406 1.035098 2.98 0.003 1.05165 5.109161 _cons │ -3.769108 .2895188 -13.02 0.000 -4.336554 -3.201661 ln(exposure) │ 1 (exposure) ─────────────┴──────────────────────────────────────────────────────────────── . mata exp(st_matrix("e(b)")[7..9]) 1 2 3 ┌───────────────────────────────────────────┐ 1 │ 1.731736713 4.506277465 21.76723125 │ └───────────────────────────────────────────┘ . estimates store model2b
> heroinx <- mutate(heroinx, code = as.numeric(interval), + clinic1year1 = clinic1 * (code <= 3), + clinic1year2 = clinic1 * (code > 3 & code < 6), + clinic1year3 = clinic1 * (code == 6)) > pwei <- update(pwe, . ~ . + clinic1year1 + clinic1year2 + clinic1year3) > summary(pwei)$coefficient Estimate Std. Error z value Pr(>|z|) (Intercept) -3.7691076 0.2895117 -13.0188447 9.560644e-39 interval(3,6] 0.1897187 0.2918003 0.6501662 5.155848e-01 interval(6,12] 0.2975722 0.2618153 1.1365732 2.557167e-01 interval(12,18] -0.3384617 0.4922541 -0.6875751 4.917204e-01 interval(18,24] -0.1761591 0.5079744 -0.3467875 7.287510e-01 interval(24,36] -0.7497782 1.0410633 -0.7202042 4.713993e-01 clinic1year1 0.5491248 0.2549585 2.1537813 3.125733e-02 clinic1year2 1.5054714 0.4094973 3.6763887 2.365590e-04 clinic1year3 3.0804057 1.0350963 2.9759606 2.920724e-03 > exp(coef(pwei)[7:9]) clinic1year1 clinic1year2 clinic1year3 1.731737 4.506277 21.767232
We see that the difference between the clinics increases substantially with duration. In the first year the exit rate in clinic 2 is 73% higher than in clinic 1, but in year 2 it is more than four times as high, and in year 3 it is a remarkable 22 times as high. (Turns out clinic 1 pretty much restricts the duration of stay to two years.)
(c) Test the hypothesis that the clinic effect does indeed vary by year using a likelihood ratio test and a Wald test.
. lrtest clinic . Likelihood-ratio test Assumption: clinic nested within model2b LR chi2(2) = 11.76 Prob > chi2 = 0.0028 . test clinic1year1 == clinic1year2 == clinic1year3 ( 1) [events]clinic1year1 - [events]clinic1year2 = 0 ( 2) [events]clinic1year1 - [events]clinic1year3 = 0 chi2( 2) = 8.51 Prob > chi2 = 0.0142
> deviance(pwec) - deviance(pwei) [1] 11.75989 > A <- matrix( c(-1, 1, 0, -1, 0, 1), 2, 3, byrow = TRUE) > b <- A %*% coef(pwei)[7:9] > V <- A %*% vcov(pwei)[7:9, 7:9] %*% t(A) > t(b) %*% solve(V) %*% b [,1] [1,] 8.514599
For the likelihood ratio test we compare the models with and without the interaction, obtaining a chi-squared of 11.76 on 2 d.f., which is significant at the 1% level. For the Wald test we simply test if the clinic effects are the same each year, obtaining 8.51 on the same d.f., with a P-value of 0.014. (I thought it would be instructive to do the calculation from first principles.)
(a) Starting from the model of part 2.b, where the effect of clinic varies by year, add a dummy variable for prison history and interpret the resulting estimate.
. poisson events i.interval clinic1year* prison, exposure(exposure) Iteration 0: log likelihood = -492.67668 Iteration 1: log likelihood = -489.51457 Iteration 2: log likelihood = -489.47392 Iteration 3: log likelihood = -489.47387 Poisson regression Number of obs = 851 LR chi2(9) = 56.85 Prob > chi2 = 0.0000 Log likelihood = -489.47387 Pseudo R2 = 0.0549 ─────────────┬──────────────────────────────────────────────────────────────── events │ Coefficient Std. err. z P>|z| [95% conf. interval] ─────────────┼──────────────────────────────────────────────────────────────── interval │ 3 │ .1918866 .2918031 0.66 0.511 -.380037 .7638103 6 │ .3192309 .2621104 1.22 0.223 -.194496 .8329578 12 │ -.3353302 .4926043 -0.68 0.496 -1.300817 .6301564 18 │ -.1773817 .5077096 -0.35 0.727 -1.172474 .8177109 24 │ -.7688194 1.04115 -0.74 0.460 -2.809436 1.271797 │ clinic1year1 │ .5555696 .255022 2.18 0.029 .0557356 1.055403 clinic1year2 │ 1.551366 .4103486 3.78 0.000 .7470973 2.355634 clinic1year3 │ 3.160547 1.036035 3.05 0.002 1.129956 5.191137 prison │ .3048991 .1653632 1.84 0.065 -.0192068 .629005 _cons │ -3.925787 .3036952 -12.93 0.000 -4.521018 -3.330555 ln(exposure) │ 1 (exposure) ─────────────┴──────────────────────────────────────────────────────────────── . di exp(_b[prison]) - 1 .35648811
> pweip <- update(pwei, . ~ . + prison) > summary(pweip)$coefficients Estimate Std. Error z value Pr(>|z|) (Intercept) -3.9257867 0.3036874 -12.9270634 3.166853e-38 interval(3,6] 0.1918866 0.2917996 0.6575973 5.107969e-01 interval(6,12] 0.3192309 0.2621010 1.2179691 2.232357e-01 interval(12,18] -0.3353302 0.4925842 -0.6807572 4.960251e-01 interval(18,24] -0.1773817 0.5076902 -0.3493897 7.267967e-01 interval(24,36] -0.7688194 1.0411455 -0.7384361 4.602495e-01 clinic1year1 0.5555696 0.2550123 2.1785989 2.936148e-02 clinic1year2 1.5513658 0.4103303 3.7807734 1.563419e-04 clinic1year3 3.1605431 1.0360323 3.0506221 2.283678e-03 prison 0.3048991 0.1653586 1.8438663 6.520262e-02 > exp(coef(pweip)["prison"]) - 1 prison 0.3564881
At any given duration of treatment in either clinic, the rate of departure is 36% higher for patients with a prison record than for those without.
(b) Add a linear effect of dose and interpret the estimate. The effect of 1 mg of methadone may not be of interest because doses vary by several mg. In fact the standard deviation is 14.45. What’s the effect on the hazard of increasing the dose by one standard deviation?
. poisson events i.interval clinic1year* prison dose, exposure(exposure) Iteration 0: log likelihood = -477.56757 Iteration 1: log likelihood = -474.29046 Iteration 2: log likelihood = -474.24806 Iteration 3: log likelihood = -474.248 Iteration 4: log likelihood = -474.248 Poisson regression Number of obs = 851 LR chi2(10) = 87.30 Prob > chi2 = 0.0000 Log likelihood = -474.248 Pseudo R2 = 0.0843 ─────────────┬──────────────────────────────────────────────────────────────── events │ Coefficient Std. err. z P>|z| [95% conf. interval] ─────────────┼──────────────────────────────────────────────────────────────── interval │ 3 │ .2392676 .291938 0.82 0.412 -.3329203 .8114555 6 │ .417069 .2626636 1.59 0.112 -.0977421 .9318802 12 │ -.2169191 .4935739 -0.44 0.660 -1.184306 .750468 18 │ .00888 .5075821 0.02 0.986 -.9859625 1.003723 24 │ -.3725992 1.0441 -0.36 0.721 -2.418998 1.6738 │ clinic1year1 │ .4582848 .2552573 1.80 0.073 -.0420102 .9585798 clinic1year2 │ 1.502897 .411861 3.65 0.000 .6956643 2.31013 clinic1year3 │ 2.997887 1.036895 2.89 0.004 .9656101 5.030165 prison │ .361036 .1668066 2.16 0.030 .0341009 .687971 dose │ -.0350766 .0063872 -5.49 0.000 -.0475952 -.022558 _cons │ -1.859145 .45847 -4.06 0.000 -2.75773 -.9605601 ln(exposure) │ 1 (exposure) ─────────────┴──────────────────────────────────────────────────────────────── . estimates store model3b . di exp(14.45 * _b[dose]) - 1 -.39761399
> pweipd <- update(pweip, . ~ . + dose) > summary(pweipd)$coefficients Estimate Std. Error z value Pr(>|z|) (Intercept) -1.859144804 0.458454212 -4.05524642 5.008148e-05 interval(3,6] 0.239267570 0.291934839 0.81959238 4.124485e-01 interval(6,12] 0.417069040 0.262655447 1.58789412 1.123103e-01 interval(12,18] -0.216919133 0.493543045 -0.43951411 6.602891e-01 interval(18,24] 0.008880051 0.507552344 0.01749583 9.860411e-01 interval(24,36] -0.372599175 1.044092942 -0.35686399 7.211936e-01 clinic1year1 0.458284775 0.255244021 1.79547702 7.257773e-02 clinic1year2 1.502897099 0.411831929 3.64929719 2.629588e-04 clinic1year3 2.997887375 1.036890505 2.89122850 3.837390e-03 prison 0.361035946 0.166801697 2.16446207 3.042890e-02 dose -0.035076598 0.006386991 -5.49188126 3.976747e-08 > exp(coef(pweipd)["dose"]) - 1 dose -0.03446854
The coefficient indicates that the hazard is 3.5% lower per mg of methadone. (The value is small enough that it can be interpreted directly as a percent.) A dose increase of one standard deviation or 14.45 mg is asociated with a 39.8% reduction in the risk of leaving treatment.
(c) The original paper by Caplehorn and Bell treated dose as a categorical variable with three levels: < 60, 60-79, and 80+. Compare this specification with the linear term in part b in terms of parsimony and goodness of fit.
. recode dose (min/59 = 1 "< 60") (60/79=2 "60-79") (80/max=3 "80+"), /// > gen(doseg) (851 differences between dose and doseg) . scalar aic1 = -2*e(ll) + 2*(e(df_m) + 1) . poisson events i.interval clinic1year* prison i.doseg, exposure(exposure) Iteration 0: log likelihood = -477.36196 Iteration 1: log likelihood = -474.03756 Iteration 2: log likelihood = -473.99224 Iteration 3: log likelihood = -473.99218 Iteration 4: log likelihood = -473.99218 Poisson regression Number of obs = 851 LR chi2(11) = 87.81 Prob > chi2 = 0.0000 Log likelihood = -473.99218 Pseudo R2 = 0.0848 ─────────────┬──────────────────────────────────────────────────────────────── events │ Coefficient Std. err. z P>|z| [95% conf. interval] ─────────────┼──────────────────────────────────────────────────────────────── interval │ 3 │ .2276247 .2918845 0.78 0.435 -.3444585 .7997079 6 │ .3914805 .262388 1.49 0.136 -.1227904 .9057515 12 │ -.1965312 .4931713 -0.40 0.690 -1.163129 .7700669 18 │ .0461413 .5087045 0.09 0.928 -.9509013 1.043184 24 │ -.2437024 1.046734 -0.23 0.816 -2.295263 1.807858 │ clinic1year1 │ .4298679 .256712 1.67 0.094 -.0732784 .9330143 clinic1year2 │ 1.432044 .4141453 3.46 0.001 .6203341 2.243754 clinic1year3 │ 2.858089 1.042954 2.74 0.006 .8139374 4.902241 prison │ .3278549 .1689952 1.94 0.052 -.0033695 .6590793 │ doseg │ 60-79 │ -.5739452 .1791532 -3.20 0.001 -.9250791 -.2228114 80+ │ -1.471106 .3014925 -4.88 0.000 -2.062021 -.8801918 │ _cons │ -3.438688 .3095259 -11.11 0.000 -4.045348 -2.832029 ln(exposure) │ 1 (exposure) ─────────────┴──────────────────────────────────────────────────────────────── . scalar aic2 = -2*e(ll) + 2*(e(df_m) + 1) . di aic1, aic2 970.49601 971.98437
> heroinx <- mutate(heroinx, doseq = cut(dose, c(0, 60, 80, Inf), right = FALSE)) > pweipq <- update(pweip, . ~ . + doseq) > summary(pweipq)$coefficients Estimate Std. Error z value Pr(>|z|) (Intercept) -3.43868815 0.3095178 -11.10982435 1.123804e-28 interval(3,6] 0.22762470 0.2918812 0.77985399 4.354768e-01 interval(6,12] 0.39148053 0.2623803 1.49203491 1.356900e-01 interval(12,18] -0.19653116 0.4931548 -0.39851820 6.902482e-01 interval(18,24] 0.04614126 0.5086887 0.09070627 9.277260e-01 interval(24,36] -0.24370241 1.0467283 -0.23282300 8.158988e-01 clinic1year1 0.42986790 0.2567020 1.67457952 9.401676e-02 clinic1year2 1.43204394 0.4141315 3.45794498 5.443125e-04 clinic1year3 2.85808901 1.0429504 2.74038830 6.136664e-03 prison 0.32785488 0.1689909 1.94007465 5.237062e-02 doseq[60,80) -0.57394526 0.1791488 -3.20373422 1.356576e-03 doseq[80,Inf) -1.47110621 0.3014871 -4.87949950 1.063554e-06 > c(AIC(pweipd), AIC(pweipq)) [1] 970.4960 971.9844
The model treating dose as a discrete factor has a slightly higher log-likelihood, but according to Akaike the improvement in fit is not worth the loss of parsimony incurred by adding two parameters.
(a) Use the piece-wise exponential model of part 3.b with effects of clinic, dose, and prison history, to predict the probability of remaining in treatment after one, two and three years for someone with no prison record receiving the average dose of 60.4 mg of methadone in clinic 1.
I will do the calculation from first principles. Because the effect of clinic 1 is not proportional we need to add it separately for each year:
. estimates restore model3b (results model3b are active now) . mata : ───────────────────────────────────────────────── mata (type end to exit) ────── : b = st_matrix("e(b)") : logh = b[12] :+ b[1..6] : c = b[(7, 7, 7, 8, 8, 9)] : h = exp(logh + c :+ (60.4 * b[11])) : H = runningsum(h :* w) : exp(-H)[(3, 5, 6)] 1 2 3 ┌───────────────────────────────────────────┐ 1 │ .6241612559 .2497116013 .0112137274 │ └───────────────────────────────────────────┘ : end ────────────────────────────────────────────────────────────────────────────────
> b <- coef(pweipd) > logh <- b[1] + c(0, b[2:6]) > xb <- b["dose"] * 60.4 > h <- exp(logh + b[c(7, 7, 7, 8, 8, 9)] + xb) > H <- cumsum(h * w) > exp(-H) interval(3,6] interval(6,12] interval(12,18] interval(18,24] 0.91498838 0.81733685 0.62416125 0.41565387 0.24971160 interval(24,36] 0.01121373
The probabilities are 62.4%, 25.0% and 1.1%.
(b) Repeat the calculations for someone with a prison record who receives 60.4 mg in clinic 1.
A prison record affects the hazard at every duration proportionally, so the calculation is simple
. mata: ───────────────────────────────────────────────── mata (type end to exit) ────── : hp = h :* exp(b[10]) : Hp = runningsum(hp :* w) : exp(-Hp)[(3, 5, 6)] 1 2 3 ┌───────────────────────────────────────────┐ 1 │ .50849745 .1365953803 .0015912996 │ └───────────────────────────────────────────┘ : end ────────────────────────────────────────────────────────────────────────────────
> Hp <- H * exp(b["prison"]) > exp(-Hp) interval(3,6] interval(6,12] interval(12,18] interval(18,24] 0.8803158 0.7487068 0.5084974 0.2837597 0.1365954 interval(24,36] 0.0015913
The probabilities are 50.8%, 13.7% and 0.2%.
(c) Finally, repeat the calculations for someone without a prison record who receives the average dose of 60.4 mg, but in clinic 2.
Without a prison record and clinic 2 all we need is the baseline hazard and the dose effect
. mata: ───────────────────────────────────────────────── mata (type end to exit) ────── : h2 = exp(logh :+ (60.4 * b[11])) : H2 = runningsum(h2 :* w) : exp(-H2)[(3, 5, 6)] 1 2 3 ┌───────────────────────────────────────────┐ 1 │ .7422537173 .6053897439 .5185560265 │ └───────────────────────────────────────────┘ : end ────────────────────────────────────────────────────────────────────────────────
> h2 <- exp(logh + xb) > H2 <- cumsum(h2 * w) > exp(-H2) interval(3,6] interval(6,12] interval(12,18] interval(18,24] 0.9453671 0.8802485 0.7422537 0.6780620 0.6053897 interval(24,36] 0.5185560
The probabilities are 74.2%, 60.5% and 51.9%.
To present the results in a single table we combine all the results
. mata HA = H \ Hp \ H2 . mata exp(-HA)[ ,(3, 5, 6)] 1 2 3 ┌───────────────────────────────────────────┐ 1 │ .6241612559 .2497116013 .0112137274 │ 2 │ .50849745 .1365953803 .0015912996 │ 3 │ .7422537173 .6053897439 .5185560265 │ └───────────────────────────────────────────┘
> exp(-cbind(H, Hp, H2))[c(3, 5, 6), ] H Hp H2 interval(6,12] 0.62416125 0.5084974 0.7422537 interval(18,24] 0.24971160 0.1365954 0.6053897 interval(24,36] 0.01121373 0.0015913 0.5185560
There are fairly substantial survival differences among these groups. With an average dose of methadone and no prison record, we estimate that more than half the addicts in clinic 2 would still be under treatment after three years, compared to just about one percent of those in clinic 1, a figure that is further reduced to less than one-fifth of one percent for addicts in clinic 1 and with a prison record.