Creating, updating & maintaining the statquotes data base


Primary format for quotes file

The quotes are stored in the data-raw/quotes_raw.txt file. The script data-raw/convert_quotes_to_rda.R can be used to read these quotes and save them to data/quotes.rda, which is the main data file used in the package.

The quotes_raw.txt file uses the following format for each quotation. Lines beginning with “%” are comments and ignored. Other lines contain a “key:value” pair. The key is used to identify the right column when building the quotes data.frame:

% Comment
quo:This is a quotation.
src:Person or persons who said or wrote the quote.
cit:Citation for the original quote.
url:URL where the quote can be found (such as journal articles).
tag:Comma-separated tags to categorize the quote.
tex:TeX-formatted citation

Here is an example:

quo:A judicious man looks at Statistics, not to get knowledge, but to save himself from having ignorance foisted on him.
src:Thomas Carlyle
cit:Chartism, 1840, Chapter II, Statistics
tag:data visualization

LaTeX format for quotes file

The statquotes package originally arose from a LaTeX file, that Michael Friendly used to collect interesting quotations related to statistics, data visualization, history, software and other topics. This was designed to be a collection that a person could search, then copy/paste an appropriate one into a working LaTeX document. The format of quotes was designed to use the LaTeX epigraph package:

\epigraph{You can see a lot, just by looking.}{Yogi Berra}
\epigraph{Every picture tells a story.}{Rod Stewart, 1971}
\epigraph{A picture is worth a thousand words.}{F. Barnard, 1927}

Each quote has some text and a source attribution, and so could be displayed in a document something like

You can see a lot, just by looking. — Yogi Berra

Some of the quotes have manually-added tags to classify the quotes into groups. As many tags as desired can be added to a quote.