## Methodological details

Case-base sampling was proposed by Hanley and Miettinen (2009) as a way to fit smooth-in-time parametric hazard functions via logistic regression. The main idea, which was first proposed by Mantel (1973) and later developed by Efron (1977), is to sample person-moments, i.e., discrete time points along a subject's follow-up time, in order to construct a base series against which the case series can be compared.
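The sampling step can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: the toy follow-up data, the variable names, and the choice of 100 base moments per case are all hypothetical. The offset `log(B / b)`, where `B` is the total person-time and `b` is the size of the base series, accounts for the sampling fraction of the base series.

```r
# Toy follow-up data (hypothetical values, for illustration only)
ftime  <- c(5, 12, 20, 33, 47, 60, 72, 90)  # follow-up time per subject
status <- c(1,  1,  0,  1,  1,  0,  1,  1)  # 1 = event, 0 = censored

set.seed(1)
B <- sum(ftime)         # total person-time in the study base
b <- 100 * sum(status)  # size of the base series (assumed ratio of 100 per case)

# Base series: pick subjects with probability proportional to their
# follow-up time, then draw a uniform moment within that follow-up
subj <- sample(seq_along(ftime), size = b, replace = TRUE, prob = ftime)
base_series <- data.frame(time = runif(b, min = 0, max = ftime[subj]), y = 0)

# Case series: the person-moments at which events occurred
case_series <- data.frame(time = ftime[status == 1], y = 1)

# Stack both series and attach the offset log(B / b)
cb <- rbind(case_series, base_series)
cb$log_offset <- log(B / b)

# Intercept-only logistic regression: a constant (exponential) hazard
fit <- glm(y ~ 1 + offset(log_offset), data = cb, family = binomial)
hazard_hat <- unname(exp(coef(fit)))
```

Because this model contains only an intercept, `hazard_hat` equals the number of events divided by the total person-time, whichever moments happen to be drawn. The sampled `time` column only starts to matter once time enters the linear predictor.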

This approach allows the time variable to be included explicitly in the model, which enables the user to fit a wide class of parametric hazard functions. For example, including time linearly recovers the Gompertz hazard, whereas including time *logarithmically* recovers the Weibull hazard; not including time at all corresponds to the exponential hazard.
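To see why, it may help to write the three cases on the log-hazard scale. The symbols $\beta_0$ and $\beta_1$ are introduced here for illustration and stand for the intercept and the coefficient of the time term; covariates are omitted for brevity.

$$
\begin{aligned}
\log h(t) &= \beta_0 &&\Rightarrow\quad h(t) = e^{\beta_0} &&\text{(exponential: constant hazard)} \\
\log h(t) &= \beta_0 + \beta_1 t &&\Rightarrow\quad h(t) = e^{\beta_0} e^{\beta_1 t} &&\text{(Gompertz: hazard exponential in } t\text{)} \\
\log h(t) &= \beta_0 + \beta_1 \log t &&\Rightarrow\quad h(t) = e^{\beta_0} t^{\beta_1} &&\text{(Weibull: hazard a power of } t\text{)}
\end{aligned}
$$

In the Weibull case, $\beta_1 + 1$ plays the role of the usual shape parameter, so $\beta_1 = 0$ reduces to the exponential hazard.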

The theoretical properties of this approach have been studied in Saarela and Arjas (2015) and Saarela (2015).

## First example

The first example we discuss uses the well-known `veteran` dataset, which is part of the `survival` package. As we can see below, there is almost no censoring, and therefore we can get a good visual representation of the survival function:

```
set.seed(12345)
library(survival)
data(veteran)
table(veteran$status)
```

```
## 
##   0   1 
##   9 128
```

```
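# Event times only (status == 1 marks an observed death)
evtimes <- veteran$time[veteran$status == 1]
# Histogram of the event times on the density scale
hist(evtimes, nclass = 30, main = '', xlab = 'Survival time (days)',
     col = 'gray90', probability = TRUE)
# Overlay the exponential density with rate 1 / (mean event time)
tgrid <- seq(0, 1000, by = 10)
lines(tgrid, dexp(tgrid, rate = 1.0/mean(evtimes)),
      lwd = 2, lty = 2, col = 'red')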
```