Numbers in engineering format

Introduction

Packages used in this vignette.

library(tidyverse)
library(knitr)
library(docxtools)

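This vignette demonstrates the use of two functions from the docxtools package: format_engr(), which formats numbers in engineering notation, and align_pander(), which prints a table with aligned columns using pander.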

The primary goal of format_engr() is to present numeric variables in a data frame in engineering format, that is, scientific notation with exponents that are multiples of 3. Compare:

syntax expression
computer \(1.011E+5\)
mathematical \(1.011\times10^{5}\)
engineering \(101.1\times10^{3}\)
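
The arithmetic behind the engineering row can be sketched in base R. This is a minimal illustration, not the docxtools implementation: engr_parts() is a hypothetical helper introduced here, and it only splits a number into a coefficient and an exponent that is a multiple of 3.

```r
# Hypothetical helper (not part of docxtools): split a number into an
# engineering-format coefficient and an exponent that is a multiple of 3.
engr_parts <- function(x) {
  # log10(0) is -Inf, so zero is assigned exponent 0 explicitly
  exponent <- ifelse(x == 0, 0, 3 * floor(log10(abs(x)) / 3))
  data.frame(coef = x / 10^exponent, exponent = exponent)
}

parts <- engr_parts(c(101100, 1.197976, 0.00045))
parts
# exponents are 3, 0, -6; coefficients are approximately 101.1, 1.197976, 450
```

Rounding the exponent down to the nearest multiple of 3 keeps the coefficient between 1 and 1000, which is why 1.011E+5 becomes 101.1 times 10^3.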

format_engr()

This example uses a small data set, density, included with docxtools, with temperature in K, pressure in Pa, the gas constant in J kg\(^{-1}\) K\(^{-1}\), and density in kg m\(^{-3}\).

density
#>         date trial    T_K   p_Pa   R  density
#> 1 2018-06-12     a 294.05 101100 287 1.197976
#> 2 2018-06-13     b 294.15 101000 287 1.196384
#> 3 2018-06-14     c 294.65 101100 287 1.195536
#> 4 2018-06-15     d 293.35 101000 287 1.199647
#> 5 2018-06-16     e 293.85 101100 287 1.198791

Four of the variables are numeric. The date variable is of type “double” but class “Date”, so it is not reformatted.

map_chr(density, class)
#>        date       trial         T_K        p_Pa           R     density 
#>      "Date" "character"   "numeric"   "numeric"   "numeric"   "numeric"
map_chr(density, typeof)
#>        date       trial         T_K        p_Pa           R     density 
#>    "double" "character"    "double"    "double"    "double"    "double"
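
The difference between type and class matters when picking out columns to reformat. The sketch below shows one way to separate them in base R; it is an assumption about how such a check could be written, not a description of how format_engr() does it internally. The data frame df is a made-up single row.

```r
d <- as.Date("2018-06-12")
typeof(d)      # "double": the underlying storage
class(d)       # "Date": the class attribute
is.numeric(d)  # FALSE: base R does not treat a Date as numeric

# A one-row example with the same kinds of columns as density
df <- data.frame(date = d, trial = "a", T_K = 294.05)
numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
numeric_cols   # "T_K"
```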

Usage is format_engr(x, sigdig = NULL, ambig_0_adj = FALSE). The function returns a data frame with all numeric values reformatted as character strings in engineering format with math delimiters $...$.

density_engr <- format_engr(density)
density_engr
#>         date trial     T_K                    p_Pa       R density
#> 1 2018-06-12     a $294.0$ ${101.1}\\times 10^{3}$ $287.0$ $1.198$
#> 2 2018-06-13     b $294.2$ ${101.0}\\times 10^{3}$ $287.0$ $1.196$
#> 3 2018-06-14     c $294.6$ ${101.1}\\times 10^{3}$ $287.0$ $1.196$
#> 4 2018-06-15     d $293.4$ ${101.0}\\times 10^{3}$ $287.0$ $1.200$
#> 5 2018-06-16     e $293.8$ ${101.1}\\times 10^{3}$ $287.0$ $1.199$

The formerly numeric variables are now characters. Non-numeric variables are returned unaltered.

map_chr(density_engr, class)
#>        date       trial         T_K        p_Pa           R     density 
#>      "Date" "character" "character" "character" "character" "character"

The math formatting is applied when the data frame is printed in the output document. For example, we can use knitr::kable() to print the formatted data.

kable(density_engr)
date trial T_K p_Pa R density
2018-06-12 a \(294.0\) \({101.1}\times 10^{3}\) \(287.0\) \(1.198\)
2018-06-13 b \(294.2\) \({101.0}\times 10^{3}\) \(287.0\) \(1.196\)
2018-06-14 c \(294.6\) \({101.1}\times 10^{3}\) \(287.0\) \(1.196\)
2018-06-15 d \(293.4\) \({101.0}\times 10^{3}\) \(287.0\) \(1.200\)
2018-06-16 e \(293.8\) \({101.1}\times 10^{3}\) \(287.0\) \(1.199\)

The function is compatible with the pipe operator.

density_engr <- density %>%
  format_engr()
kable(density_engr)
date trial T_K p_Pa R density
2018-06-12 a \(294.0\) \({101.1}\times 10^{3}\) \(287.0\) \(1.198\)
2018-06-13 b \(294.2\) \({101.0}\times 10^{3}\) \(287.0\) \(1.196\)
2018-06-14 c \(294.6\) \({101.1}\times 10^{3}\) \(287.0\) \(1.196\)
2018-06-15 d \(293.4\) \({101.0}\times 10^{3}\) \(287.0\) \(1.200\)
2018-06-16 e \(293.8\) \({101.1}\times 10^{3}\) \(287.0\) \(1.199\)

Comments:

Significant digits

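format_engr() has three arguments: x, the data frame; sigdig, the number of significant digits; and ambig_0_adj, a logical that controls the adjustment of ambiguous trailing zeros.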

The sigdig argument can be a single value, applied to all numeric columns.

density_engr <- format_engr(density, sigdig = 3)
kable(density_engr)
date trial T_K p_Pa R density
2018-06-12 a \(294\) \({101}\times 10^{3}\) \(287\) \(1.20\)
2018-06-13 b \(294\) \({101}\times 10^{3}\) \(287\) \(1.20\)
2018-06-14 c \(295\) \({101}\times 10^{3}\) \(287\) \(1.20\)
2018-06-15 d \(293\) \({101}\times 10^{3}\) \(287\) \(1.20\)
2018-06-16 e \(294\) \({101}\times 10^{3}\) \(287\) \(1.20\)

Alternatively, sigdig can be a vector that assigns significant digits to each numeric column in turn. A zero returns that variable in its original form.

density_engr <- format_engr(density, sigdig = c(5, 4, 0, 5))
kable(density_engr)
date trial T_K p_Pa R density
2018-06-12 a \(294.05\) \({101.1}\times 10^{3}\) \(287\) \(1.1980\)
2018-06-13 b \(294.15\) \({101.0}\times 10^{3}\) \(287\) \(1.1964\)
2018-06-14 c \(294.65\) \({101.1}\times 10^{3}\) \(287\) \(1.1955\)
2018-06-15 d \(293.35\) \({101.0}\times 10^{3}\) \(287\) \(1.1996\)
2018-06-16 e \(293.85\) \({101.1}\times 10^{3}\) \(287\) \(1.1988\)
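
The pairing of digits to columns in the table above can be mimicked in base R with signif(). This is a rough sketch under the assumption that the vector is matched to the numeric columns in order and that 0 means "leave unchanged"; it reproduces the rounding only, not the string formatting that format_engr() performs.

```r
# First row of the density data, numeric columns only
row1 <- c(T_K = 294.05, p_Pa = 101100, R = 287, density = 1.197976)
sigdig <- c(5, 4, 0, 5)

# Round each value to its own digit count; 0 leaves the value untouched
rounded <- mapply(
  function(value, digits) if (digits == 0) value else signif(value, digits),
  row1, sigdig
)
rounded
```

signif() returns numbers, so the trailing zero in 1.1980 is lost here. Keeping it requires converting to character, which is why format_engr() returns strings.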

Ambiguous trailing zeros

Subset the data to look at just the numerical variables.

x <- density %>%
  select(T_K, p_Pa, R, density)

Print the data with incrementally decreasing significant digits.

kable(format_engr(x, sigdig = 4), caption = "4 digits")
4 digits
T_K p_Pa R density
\(294.0\) \({101.1}\times 10^{3}\) \(287.0\) \(1.198\)
\(294.2\) \({101.0}\times 10^{3}\) \(287.0\) \(1.196\)
\(294.6\) \({101.1}\times 10^{3}\) \(287.0\) \(1.196\)
\(293.4\) \({101.0}\times 10^{3}\) \(287.0\) \(1.200\)
\(293.8\) \({101.1}\times 10^{3}\) \(287.0\) \(1.199\)

Three digits create no ambiguity.

kable(format_engr(x, sigdig = 3), caption = "3 digits")
3 digits
T_K p_Pa R density
\(294\) \({101}\times 10^{3}\) \(287\) \(1.20\)
\(294\) \({101}\times 10^{3}\) \(287\) \(1.20\)
\(295\) \({101}\times 10^{3}\) \(287\) \(1.20\)
\(293\) \({101}\times 10^{3}\) \(287\) \(1.20\)
\(294\) \({101}\times 10^{3}\) \(287\) \(1.20\)

With 2 digits, we have three columns with ambiguous trailing zeros.

kable(format_engr(x, sigdig = 2), caption = "2 digits")
2 digits
T_K p_Pa R density
\(290\) \({100}\times 10^{3}\) \(290\) \(1.2\)
\(290\) \({100}\times 10^{3}\) \(290\) \(1.2\)
\(290\) \({100}\times 10^{3}\) \(290\) \(1.2\)
\(290\) \({100}\times 10^{3}\) \(290\) \(1.2\)
\(290\) \({100}\times 10^{3}\) \(290\) \(1.2\)
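
One way to see why these zeros are ambiguous is to ask where the last significant digit sits. If it sits in the tens place or higher, the zeros to its right are only placeholders. The helper below is a hypothetical heuristic written for illustration, not the rule docxtools applies internally.

```r
# Hypothetical check (not part of docxtools): TRUE when the last significant
# digit of a coefficient lies in the tens place or higher, so that trailing
# zeros before the decimal point are placeholders.
has_ambiguous_zeros <- function(coef, sigdig) {
  last_digit_place <- floor(log10(abs(coef))) - (sigdig - 1)
  last_digit_place >= 1
}

# Coefficients from the 2-digit table: T_K, p_Pa, R, density
two_digits <- has_ambiguous_zeros(c(290, 100, 290, 1.2), sigdig = 2)
# Coefficients from the 3-digit table
three_digits <- has_ambiguous_zeros(c(294, 101, 287, 1.20), sigdig = 3)
two_digits    # TRUE TRUE TRUE FALSE
three_digits  # FALSE FALSE FALSE FALSE
```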

By setting the ambig_0_adj argument to TRUE, the values are rewritten with a higher power of ten, which removes the ambiguity.

kable(format_engr(x, sigdig = 2, ambig_0_adj = TRUE), caption = "Removing ambiguity")
Removing ambiguity
T_K p_Pa R density
\({0.29}\times 10^{3}\) \({0.10}\times 10^{6}\) \({0.29}\times 10^{3}\) \(1.2\)
\({0.29}\times 10^{3}\) \({0.10}\times 10^{6}\) \({0.29}\times 10^{3}\) \(1.2\)
\({0.29}\times 10^{3}\) \({0.10}\times 10^{6}\) \({0.29}\times 10^{3}\) \(1.2\)
\({0.29}\times 10^{3}\) \({0.10}\times 10^{6}\) \({0.29}\times 10^{3}\) \(1.2\)
\({0.29}\times 10^{3}\) \({0.10}\times 10^{6}\) \({0.29}\times 10^{3}\) \(1.2\)
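
The adjustment seen in this table amounts to dividing the coefficient by 1000 and adding 3 to the exponent, so that every digit after the decimal point is significant. The sketch below reproduces that arithmetic in base R for two of the values; shift_power() is a hypothetical name, and the code is an illustration rather than the docxtools implementation.

```r
# Hypothetical helper (not part of docxtools): move a coefficient with
# placeholder zeros to the next power of 10^3 and format it with
# `sigdig` decimal places.
shift_power <- function(coef, exponent, sigdig) {
  new_coef <- formatC(coef / 1000, digits = sigdig, format = "f")
  paste0("{", new_coef, "}\\times 10^{", exponent + 3, "}")
}

# 290 (times 10^0) and 100 times 10^3, both known to 2 digits
adjusted <- shift_power(coef = c(290, 100), exponent = c(0, 3), sigdig = 2)
cat(adjusted, sep = "\n")
```

Because the coefficient now lies between 0.1 and 1, the number of decimal places equals the number of significant digits, and the zero in 0.10 is unambiguous.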

The ambiguous trailing zero adjustment is applied only to those variables for which the condition exists. For example, if the pressure were known to only 2 digits, it would be the only variable with ambiguous trailing zeros.

kable(format_engr(x, sigdig = c(4, 2, 0, 3), ambig_0_adj = FALSE))
T_K p_Pa R density
\(294.0\) \({100}\times 10^{3}\) \(287\) \(1.20\)
\(294.2\) \({100}\times 10^{3}\) \(287\) \(1.20\)
\(294.6\) \({100}\times 10^{3}\) \(287\) \(1.20\)
\(293.4\) \({100}\times 10^{3}\) \(287\) \(1.20\)
\(293.8\) \({100}\times 10^{3}\) \(287\) \(1.20\)

With ambig_0_adj = TRUE, only the pressure variable has a reformatted power of ten.

kable(format_engr(x, sigdig = c(4, 2, 0, 3), ambig_0_adj = TRUE))
T_K p_Pa R density
\(294.0\) \({0.10}\times 10^{6}\) \(287\) \(1.20\)
\(294.2\) \({0.10}\times 10^{6}\) \(287\) \(1.20\)
\(294.6\) \({0.10}\times 10^{6}\) \(287\) \(1.20\)
\(293.4\) \({0.10}\times 10^{6}\) \(287\) \(1.20\)
\(293.8\) \({0.10}\times 10^{6}\) \(287\) \(1.20\)

align_pander()

This function uses pander() to print a table and panderOptions('table.alignment.default') to align columns. Usage is: align_pander(x, align_idx = NULL, caption = NULL)

align_pander(density_engr)

Finally, the column headings can be edited for presentation.

names(density_engr) <- c(
  "Date",
  "Trial",
  "Temp (K)",
  "Press (Pa)",
  "R (J kg^-1^K^-1^)",
  "$\\rho$ (kg/m^3^)"
)
align_pander(density_engr, "lccccc", caption = "Air density measurements")
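
The string "lccccc" reads as one letter per column. Assuming the letters stand for left, center, and right (an assumption made here, not confirmed by the text above), the expansion can be sketched in base R. expand_alignment() is a hypothetical helper, and how align_pander() hands the result to pander is not shown.

```r
# Hypothetical helper (not part of docxtools or pander): expand a compact
# alignment string into one alignment word per column.
expand_alignment <- function(align_idx) {
  key <- c(l = "left", c = "center", r = "right")
  letters_in <- strsplit(align_idx, "")[[1]]
  # Reject any letter other than l, c, or r
  stopifnot(all(letters_in %in% names(key)))
  unname(key[letters_in])
}

alignment <- expand_alignment("lccccc")
alignment
# "left" followed by five "center" entries, one per column of density_engr
```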

Conclusion

These two functions provide the means for consistently rendering numbers with the desired number of significant digits, including trailing zeros, and for aligning them in output tables without affecting character data in the same data frame.