Drake is a workflow manager for R. When it runs a project, it automatically builds missing and outdated results while skipping output that is already up to date. This automation and reproducibility are important for data analysis workflows, especially large projects under heavy development.
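The build-or-skip behavior described above can be illustrated with a toy sketch in base R. This is not drake's implementation or API: `cache` and `build_target()` are hypothetical names invented here, and the sketch only compares the text of a command, whereas a real build system also has to track dependencies between results.

```r
# Toy illustration only: NOT drake's implementation or API.
# `cache` and `build_target()` are hypothetical names for this sketch.
cache <- new.env()

build_target <- function(name, command) {
  # Store the command as text so a later call can detect a change.
  code <- paste(deparse(command), collapse = "\n")
  old <- cache[[name]]
  if (!is.null(old) && identical(old$code, code)) {
    message("skip ", name)   # up to date: reuse the stored value
    return(invisible(old$value))
  }
  message("build ", name)    # missing or outdated: run the command
  value <- eval(command, envir = globalenv())
  cache[[name]] <- list(code = code, value = value)
  invisible(value)
}

build_target("total", quote(sum(1:10)))  # first run: built
build_target("total", quote(sum(1:10)))  # unchanged command: skipped
build_target("total", quote(sum(1:20)))  # changed command: rebuilt
```

The second call does no work because the stored command text matches, which is the time saving that matters when individual results take minutes or hours to compute.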
The original idea of a time-saving reproducible build system extends back decades to GNU Make, which today helps data scientists as well as the original user base of compiled-language programmers. More recently, Rich FitzJohn created remake, a breakthrough reimagining of Make for R and the most important inspiration for drake. Drake is a fresh reinterpretation of some of remake's pioneering fundamental concepts, scaled up for computationally demanding workflows. Relative to remake, some of drake's most prominent distinguishing features at the time of writing this document are
parallel::mclapply(). (The user can choose either.)
Thanks also to Kirill Müller and Daniel Falster. They contributed code patches and enhancement ideas to my parallelRemake and remakeGenerator packages, which I have now subsumed into drake.
mclapply() as one of two single-session parallel computing backends. Unfortunately, mclapply() cannot run multiple parallel jobs on Windows, so Windows users should set parallelism = "parLapply" rather than parallelism = "mclapply" inside make() (already the Windows default). For true distributed parallel computing over multiple R sessions, Windows users need to download and install Rtools. This is because drake runs Makefiles with
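The difference between the two backends named above can be seen directly with the base `parallel` package. The following is a minimal sketch, not drake code: `choose_backend()` and `square()` are hypothetical helpers, and the worker count of 2 is an arbitrary assumption.

```r
library(parallel)

# Hypothetical helper for this sketch; not part of drake.
# Returns the backend name that is usable on the current platform.
choose_backend <- function() {
  if (.Platform$OS.type == "windows") "parLapply" else "mclapply"
}

square <- function(x) x^2
backend <- choose_backend()

if (backend == "parLapply") {
  cl <- makeCluster(2)                  # socket cluster of 2 worker processes
  result <- parLapply(cl, 1:4, square)
  stopCluster(cl)                       # always release the workers
} else {
  # Forked workers; forking is not available on Windows.
  result <- mclapply(1:4, square, mc.cores = 2)
}

print(unlist(result))
# In drake, the same string would be supplied as the parallelism
# argument of make(), e.g. parallelism = backend.
```

Both branches produce the same result. Only the mechanism for starting the workers differs, which is why the choice can be reduced to a single string.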
The CRAN page links to multiple tutorials and vignettes. With drake installed, you can load any of the vignettes in an R session.
vignette(package = "drake") # List the vignettes.
vignette("drake") # High-level intro.
vignette("quickstart") # Walk through a simple example.
vignette("caution") # Drake is not perfect. Read this to be safe.
Drake has small self-contained built-in examples. To see the names of the available examples, use
##  "basic"
Use example_drake() to write the files for the example to your working directory.
Step through the code files to get started.
Drake tries to reproducibly track everything and make other obviously good decisions, but there are limitations. For example, in some edge cases, it is possible to trick drake into ignoring dependencies. Please read the “caution” vignette to use drake safely (vignette("caution"), also linked from the CRAN page under “vignettes”).
For troubleshooting, please refer to TROUBLESHOOTING.md on the GitHub page for instructions.