Managing checkpoint snapshot archives

Andrie de Vries

2018-09-07

The checkpoint() function enables reproducible research by managing your R package versions. These packages are downloaded into a local .checkpoint folder.

If you use checkpoint() for many projects, these local packages can consume some storage space, so the package also exposes functions to manage your snapshots.

In this vignette:

1 Setting up an example project:

For illustration, set up a script referencing a single package:

# Example from ?darts
library(darts)
x = c(12,16,19,3,17,1,25,19,17,50,18,1,3,17,2,2,13,18,16,2,25,5,5,
      1,5,4,17,25,25,50,3,7,17,17,3,3,3,7,11,10,25,1,19,15,4,1,5,12,17,16,
      50,20,20,20,25,50,2,17,3,20,20,20,5,1,18,15,2,3,25,12,9,3,3,19,16,20,
      5,5,1,4,15,16,5,20,16,2,25,6,12,25,11,25,7,2,5,19,17,17,2,12)
mod = simpleEM(x, niter=100)
e = simpleExpScores(mod$s.final)
oldpar <- par(mfrow=c(1, 2))
drawHeatmap(e)
drawBoard(new=TRUE)
drawAimSpot(e, cex = 5)
par(oldpar)

Next, create the checkpoint:

dir.create(file.path(tempdir(), ".checkpoint"), recursive = TRUE, showWarnings = FALSE)
options(install.packages.compile.from.source = "no")
oldLibPaths <- .libPaths()

## Create a checkpoint by specifying a snapshot date
library(checkpoint)
checkpoint("2015-04-26", project = tempdir(), checkpointLocation = tempdir())

2 Working with checkpoint archive snapshots

You can query the available snapshots on disk using the checkpointArchives() function. This returns a vector of snapshot folders.

# List checkpoint archives on disk.
checkpointArchives(tempdir())
## [1] "2015-04-26"

You can get the full paths by including the argument full.names=TRUE:

checkpointArchives(tempdir(), full.names = TRUE)
## [1] "C:/tmp/RtmpOIq91l/.checkpoint/2015-04-26"

2.1 Working with access dates

Every time you use checkpoint() the function places a small marker in the snapshot archive with the access date. In this way you can track when was the last time you actually used the snapshot archive.

# Returns the date the snapshot was last accessed.
getAccessDate(tempdir())
## C:/tmp/RtmpOIq91l/.checkpoint/2015-04-26 
##                             "2018-09-07"

2.2 Removing a snapshot from local disk

Since the date of last access is tracked, you can use this to manage your checkpoint archives.

The function checkpointRemove() will delete archives from disk. You can use this function in multiple ways. For example, specify a specific archive to remove:

# Remove singe checkpoint archive from disk.
checkpointRemove("2015-04-26")

You can also remove a range of snapshot archives older (or more recent) than a snapshot date

# Remove range of checkpoint archives from disk.
checkpointRemove("2015-04-26", allSinceSnapshot = TRUE)
checkpointRemove("2015-04-26", allUntilSnapshot =  = TRUE)

Finally, you can remove all snapshot archives that have not been accessed since a given date:

# Remove snapshot archives that have not been used recently
checkpointRemove("2015-04-26", notUsedSince = TRUE)

3 Reading the checkpoint log file

One of the side effects of checkpoint() is to create a log file that contains information about packages that get downloaded, as well as the download size.

This file is stored in the checkpoint root folder, and is a csv file with column names, so you can read this with your favourite R function or other tools.

dir(file.path(tempdir(), ".checkpoint"))
## [1] "2015-04-26"         "R-3.5.1"            "checkpoint_log.csv"

Inspect the log file:

log_file <- file.path(tempdir(), ".checkpoint", "checkpoint_log.csv")
log <- read.csv(log_file)
head(log)
##             timestamp snapshotDate   pkg bytes
## 1 2018-09-07 14:06:40   2015-04-26 darts  7011

4 Resetting the checkpoint

In older versions of checkpoint() the only way to reset the effect of checkpoint() was to restart your R session.

In v0.3.20 and above, you can use the function unCheckpoint(). This will reset you .libPaths to the user folder.

.libPaths()
## [1] "C:/tmp/RtmpOIq91l/.checkpoint/2015-04-26/lib/x86_64-w64-mingw32/3.5.1"
## [2] "C:/tmp/RtmpOIq91l/.checkpoint/R-3.5.1"                                
## [3] "C:/PROGRA~1/R/R-35~1.1/library"

Now use unCheckpoint() to reset your library paths

# Note this is still experimental
unCheckpoint(oldLibPaths)
.libPaths()
## [1] "C:/tmp/Rtmp69rJVn/Rinst1e3c1470709e"          
## [2] "C:/Users/richcala/Documents/R/win-library/3.5"
## [3] "C:/Program Files/R/R-3.5.1/library"
## cleanup
unlink("manifest.R")
unlink(file.path(tempdir(), "managing_checkpoint_example_code.R"))
unlink(file.path(tempdir(), ".checkpoint"), recursive = TRUE)