# Numbers in engineering format

#### 2017-03-09

This vignette demonstrates the use of two functions from the docxtools package:

• format_engr() for formatting numbers in engineering notation
• align_pander() for aligning table columns using a “simple” pander table style

## format_engr()

The primary goal of format_engr() is to present numeric variables in a data frame in engineering format, that is, scientific notation with exponents that are multiples of 3. Compare:

Syntax Expression
conventional computer syntax $$1.011e+5$$
mathematical syntax $$1.011\times10^{5}$$
engineering format $$101.1\times10^{3}$$

This example uses a small temperature-pressure data set to compute air density and display the results in a table. Density is computed using the ideal gas law, $$\rho = p / (RT)$$.

library(knitr)
opts_knit$set(root.dir = "../") suppressPackageStartupMessages(library(dplyr)) library(docxtools) Create some data for the example, # temperature in K T_K <- c(294.05, 294.15, 294.65, 293.35, 293.85) # convert pressure from hPa to Pa p_Pa <- c(1011, 1010, 1011, 1010, 1011) * 100 # gas constant in J / (kg K) R <- 287 # density in kg / m^3 density_data <- data.frame(T_K, p_Pa, R) Compute air density for each observation, density_data <- mutate(density_data, density = p_Pa / (R * T_K)) knitr::kable(density_data) T_K p_Pa R density 294.05 101100 287 1.197976 294.15 101000 287 1.196384 294.65 101100 287 1.195536 293.35 101000 287 1.199647 293.85 101100 287 1.198791 The first argument of format_engr() is the data frame, the second is the array of significant digits in column order. engr_density_data <- format_engr(density_data, sigdig = c(5, 4, 0, 5)) Data frame values are returned as character strings with math formatting. By setting sigdig = 0 for R, the third column will be displayed in its original form. str(engr_density_data) ## 'data.frame': 5 obs. of 4 variables: ##$ T_K    : chr  "$294.05$" "$294.15$" "$294.65$" "$293.35$" ...
##  $p_Pa : chr "${101.1}\\times 10^{3}$" "${101.0}\\times 10^{3}$" "${101.1}\\times 10^{3}$" "${101.0}\\times 10^{3}$" ... ##$ R      : chr  "$287$" "$287$" "$287$" "$287$" ...
##  $density: chr "$1.1980$" "$1.1964$" "$1.1955$" "$1.1996" ... The math formatting is applied when the data frame is printed in the output document. knitr::kable(engr_density_data) T_K p_Pa R density $$294.05$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1980$$ $$294.15$$ $${101.0}\times 10^{3}$$ $$287$$ $$1.1964$$ $$294.65$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1955$$ $$293.35$$ $${101.0}\times 10^{3}$$ $$287$$ $$1.1996$$ $$293.85$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1988$$ Comments: • scientific notation is omitted when the exponent is 0, 1, or 2, hence temperature T_K is reported with 5 digits but without scientific notation • pressure data p_Pa is correctly shown in engineering format • trailing zeros are significant ## align_pander() This function uses pander() to print a table and panderOptions('table.alignment.default') to align columns. Usage is: align_pander(x, align_idx = NULL, caption = NULL) • x is the data frame • align_idx is a string comprised of any combination of “r”, “l”, and “c” The default alignments are numeric right and everything else left • caption is an optional string used as the pander() caption argument align_pander(engr_density_data, align_idx = "cccc") T_K p_Pa R density $$294.05$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1980$$ $$294.15$$ $${101.0}\times 10^{3}$$ $$287$$ $$1.1964$$ $$294.65$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1955$$ $$293.35$$ $${101.0}\times 10^{3}$$ $$287$$ $$1.1996$$ $$293.85$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1988$$ Finally, the heading can be edited for presentation. The heading row, like the numeric variables, are formatted in R Markdown math format. names(engr_density_data) <- c("T\\text{ (K)}$" , "$p\\text{ (Pa)}$" , "$R\\text{ (J kg}^{-1}\\text{ K}^{-1}\\text{)}$" , "$\\rho\\text{ (kg/m}^{3}\\text{)}" ) • backslashes have to be escaped, hence \\text{} and \\rho • \text{} produces “roman” (non-italic) type inside the math environment align_pander(engr_density_data, "cccc", caption = "Air density measurements") Air density measurements $$T\text{ (K)}$$ $$p\text{ (Pa)}$$ $$R\text{ (J kg}^{-1}\text{ K}^{-1}\text{)}$$ $$\rho\text{ (kg/m}^{3}\text{)}$$ $$294.05$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1980$$ $$294.15$$ $${101.0}\times 10^{3}$$ $$287$$ $$1.1964$$ $$294.65$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1955$$ $$293.35$$ $${101.0}\times 10^{3}$$ $$287$$ $$1.1996$$ $$293.85$$ $${101.1}\times 10^{3}$$ $$287$$ $$1.1988$$ ## non-numeric variables Create some alphanumeric test data, # create test input arguments set.seed(20161221) n <- 5 a <- sample(letters, n) b <- sample(letters, n) x <- runif(n, min = -5, max = 50) * 1e+5 y <- runif(n, min = -25, max = 40) / 1e+3 z <- runif(n, min = -5, max = 100) alpha_num <- data.frame(z, b, y, a, x, stringsAsFactors = FALSE) Format the entire data frame with the default 4 significant digits. engr_alpha_num <- format_engr(alpha_num) align_pander(engr_alpha_num, "rcrcr") z b y a x $$6.501$$ c $${1.052}\times 10^{-3}$$ q $${2.847}\times 10^{6}$$ $$28.37$$ o $${347.6}\times 10^{-6}$$ y $${4.874}\times 10^{6}$$ $$-3.850$$ i $${4.600}\times 10^{-3}$$ g $${-111.7}\times 10^{3}$$ $$44.50$$ a $${-3.045}\times 10^{-3}$$ a $${1.315}\times 10^{6}$$ $$92.41$$ x $${-1.069}\times 10^{-3}$$ i $${417.4}\times 10^{3}$$ • character variables b and a are unaffected by engr_format() • coefficients have floating decimal points Variables can be re-ordered in the usual way, e.g., alpha_num <- select(alpha_num, a, b, x, y, z) engr_alpha_num <- format_engr(alpha_num) align_pander(engr_alpha_num, "ccrrr") a b x y z q c $${2.847}\times 10^{6}$$ $${1.052}\times 10^{-3}$$ $$6.501$$ y o $${4.874}\times 10^{6}$$ $${347.6}\times 10^{-6}$$ $$28.37$$ g i $${-111.7}\times 10^{3}$$ $${4.600}\times 10^{-3}$$ $$-3.850$$ a a $${1.315}\times 10^{6}$$ $${-3.045}\times 10^{-3}$$ $$44.50$$ i x $${417.4}\times 10^{3}$$ $${-1.069}\times 10^{-3}$$ $$92.41$$ ## significant zeros Leading zeros are not significant. Trailing zeros generally should be significant. For example, isolate a number from the y column: y2 <- alpha_numy[2]
y2
## [1] 0.000347614

Formatting y2 with different significant digits using format_engr(y2, sigdig) yields the table below. With 3 digits, y2 has 3 unambiguous significant digits. However, reducing the number of digits to 2 would produce a coefficient of $$350$$ with an ambiguous zero before the decimal point. In such cases, format_engr() changes the exponent to produce $${0.35}\times 10^{-3}$$ with two unambiguous significant digits.

sigdig $$y_2$$
4 $${347.6}\times 10^{-6}$$
3 $${348}\times 10^{-6}$$
2 $${0.35}\times 10^{-3}$$
1 $${0.3}\times 10^{-3}$$

Exceptions to the significant trailing zero rule. Consider $$z_2$$ from the data frame,

z2 <- alpha_num$z[2] z2 ## [1] 28.37409 For numbers like z2 that in scientific notation would have exponents = 0, 1, or 2, format_engr() foregoes powers of ten notation. sigdig $$z_2$$ 4 $$28.37$$ 3 $$28.4$$ 2 $$28$$ 1 $$30$$ As we reduce the number of significant digits, we can eventually obtain a trailing zero whose significance is ambiguous, as in $$z_2 =$$ $$30$$. In such cases, format_engr() leaves the ambiguous significant zero instead of imposing scientific notation that the audience might find distracting. ## scalars and vectors are returned as data frames The preferred input to format_engr() is a data frame. If the input is a numeric vector, it will be formatted and returned as a data frame with the variable name value. For example, using y2 again, str(y2) ## num 0.000348 Now formatting the number returns a data frame with one row and column. engr_y2 <- format_engr(y2, 4) str(engr_y2) ## 'data.frame': 1 obs. of 1 variable: ##$ value: chr "${347.6}\\times 10^{-6}$"

which can be used in an inline code chunk, e.g.,

$y_2 =$ r engr_y2$value to produce $$y_2 =$$ $${347.6}\times 10^{-6}$$. A numeric vector is also returned as a data frame. # the x array cat(x, sep = "\n") ## 2846529 ## 4874357 ## -111651.4 ## 1314716 ## 417385 class(x) ## [1] "numeric" # formatted engr_x <- format_engr(x, 3) class(engr_x) ## [1] "data.frame" engr_x ## value ## 1${2.85}\\times 10^{6}$## 2${4.87}\\times 10^{6}$## 3${-112}\\times 10^{3}$## 4${1.31}\\times 10^{6}$## 5${417}\\times 10^{3}\$

The input to format_engr() must have at least one numeric variable, or an error is thrown, e.g., running format_engr(a) produces the error

Error: m_numeric_cols > 0 is not TRUE Execution halted.

## conclusion

These two functions provide the means for consistently rendering numbers with the desired number of significant digits, including trailing zeros, and align them in output tables without affecting character data in the same data frame.