cut2 in LaTeX

A minor thing, but one that caused significant headaches, involves the wonderful cut2() function from Hmisc. cut2() is a more flexible boundary-generating function than base cut() and produces (IMO) better intervals.

The cut2 way ensures there are fewer allocation issues due to rounding with a large number of decimal places. Unfortunately, when you try to put these interval labels into a table in LaTeX, it thinks you’re trying to go into “maths mode”.
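To make the difference concrete, here is a small sketch comparing the labels from base cut() with those from Hmisc::cut2(). The vector x and the cut points are made up for illustration, and the cut2() part only runs if Hmisc is installed.

```r
# Illustrative data and cut points (assumptions, not from the original post)
x <- c(1, 2, 3.5, 5)

# Base R: intervals are right-closed by default, so labels look like "(1,2]"
# and the lowest value falls outside the first interval unless
# include.lowest = TRUE is set.
base_bins <- cut(x, breaks = c(1, 2, 4, 5))
print(levels(base_bins))

# Hmisc::cut2(): intervals are left-closed, so the labels begin with "[",
# which is the character that later upsets LaTeX in a table.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  hmisc_bins <- Hmisc::cut2(x, cuts = c(2, 4))
  print(levels(hmisc_bins))
}
```

Note that with the base defaults the value 1 ends up as NA, which is one of the allocation surprises that left-closed intervals avoid.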

To avoid this I wrote a sanitise() function to replace the sanitize() function from knitr so that it will escape any [ characters in table output from R. Whilst I was giving it a UK name (to avoid function masking issues), I also extended it to correctly handle the GBP symbol (£, i.e. \u00a3, as in £10), since that was also an issue for us.
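As a rough illustration of the idea (not optiRum's actual sanitise() code), the sketch below escapes the two troublesome characters with plain gsub(). The function name escape_for_latex() is hypothetical, and wrapping [ in braces and using \pounds{} are just one common LaTeX idiom for each case.

```r
# Hypothetical helper, for illustration only: this is NOT optiRum::sanitise().
escape_for_latex <- function(x) {
  x <- as.character(x)
  # Wrap "[" in braces so LaTeX cannot read it as the start of an
  # optional argument after a table row break (\\).
  x <- gsub("[", "{[}", x, fixed = TRUE)
  # Replace the pound sign with the LaTeX command that typesets it.
  x <- gsub("\u00a3", "\\pounds{}", x, fixed = TRUE)
  x
}

labels <- c("[ 1, 3)", "\u00a310")
print(escape_for_latex(labels))
```

Using fixed = TRUE keeps both the pattern and the replacement literal, so neither the bracket nor the backslash needs regex escaping.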


Opinion: if you’re using R, you should be using ggplot2 for charts. Life is much easier when using it. Get a very quick primer in my note about building charts using it.

optiRum contains a theme_optimum() (named after the company, not the adjective) that changes the default ggplot2 chart look. The ggplot2 themes are good to begin with, and others have extended them further, e.g. ggthemes, but we wanted a standard within our company that we could use for charts. There were some considerations:

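          # Start of listing lost; geom_point() and the first two panels are assumed
          basicplot <- ggplot(iris, aes(x=Sepal.Width, y=Sepal.Length, colour=Species)) + geom_point()
          multiplot(basicplot+ggtitle("ggplot2 default"), basicplot+theme_bw()+ggtitle("theme_bw()"),
          basicplot+theme_optimum()+ggtitle("theme_optimum()"), layout=matrix(1:3,nrow=3))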

Generate PDFs

Using knitr to integrate analysis and content via LaTeX is fantastic. Unfortunately, because our documents are produced using LaTeX via the R package knitr, we can end up with some pretty complex documents that need multiple passthroughs to fully typeset.

Unfortunately, the handy “Compile PDF” button in RStudio, whilst great for simple documents, could end up not doing enough passthroughs. At the same time, it would generate lots of extra files, and you’d have no control over the name of the generated file.

To get around this, I first used build scripts that did a lot of compensating work around the knit2pdf() function to handle the various bits and pieces. These became relatively standard, but there would be exceptions, or different people would write date suffixes differently. generatePDF() is my answer to the various issues we encountered, including some pesky environment interactions with data.table.
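For a flavour of what such a build script has to do, here is a minimal sketch under stated assumptions. It is not the generatePDF() implementation: dated_pdf_name() and build_pdf() are hypothetical helpers, the date format is an arbitrary choice, and the sketch assumes knitr and a working LaTeX installation. It knits in a scratch directory so auxiliary files stay away from the source, lets tools::texi2pdf() rerun LaTeX as often as needed, and copies the result out under a consistent dated name.

```r
# Hypothetical helper: one agreed way to build a dated output file name.
dated_pdf_name <- function(stem, date = Sys.Date()) {
  paste0(stem, "-", format(date, "%Y%m%d"), ".pdf")
}

# Hypothetical helper: knit an .Rnw file and typeset it in a scratch directory.
build_pdf <- function(rnw, out_dir = ".", date = Sys.Date()) {
  stopifnot(file.exists(rnw), dir.exists(out_dir))
  rnw <- normalizePath(rnw)
  out_dir <- normalizePath(out_dir)
  stem <- tools::file_path_sans_ext(basename(rnw))

  # Work in a throwaway directory so .aux, .log, etc. do not pile up
  # next to the source document.
  scratch <- tempfile("build")
  dir.create(scratch)
  old <- setwd(scratch)
  on.exit(setwd(old), add = TRUE)

  # Evaluate chunks in a fresh environment rather than the caller's.
  tex <- knitr::knit(rnw, output = paste0(stem, ".tex"),
                     envir = new.env(parent = globalenv()), quiet = TRUE)

  # texi2pdf() reruns LaTeX (and bibtex/makeindex) until references resolve.
  tools::texi2pdf(tex, clean = FALSE)

  target <- file.path(out_dir, dated_pdf_name(stem, date))
  file.copy(paste0(stem, ".pdf"), target, overwrite = TRUE)
  target
}

print(dated_pdf_name("report", as.Date("2024-01-31")))
```

Keeping the naming rule in its own small function is what removes the "everyone writes date suffixes differently" problem, since every script calls the same helper.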

generatePDF() takes the following arguments (lots have defaults):

The best place to check out examples of generatePDF() is the unit tests.