Evaluator Workflow

David F. Severski

2017-11-20

Background

The first iterations of Evaluator were created as part of a major healthcare organization’s decision to shift its already mature risk assessment program from reliance on qualitative labels to a quantitative model that would support more precise comparison of competing projects. This organization was able to use statistical sampling to gain greater insight into its information risks, to meet HIPAA compliance obligations, and to provide business leaders, from managers up to the board, with the data needed to drive decision making.

Since its creation, versions of Evaluator have been deployed both inside and outside the healthcare field.

How to Use

The Evaluator toolkit consists of a series of processes implemented in the R language. Starting from an Excel workbook, risk data is imported and run through a simulation model to estimate the expected losses for each scenario. The results of these simulations are used to create a detailed analysis and a formal risk report. A starter analysis report, overview dashboard and sample Shiny application are all included in the toolkit.

Evaluator takes a domain-driven and framework-independent approach to strategic security risk analysis. It is compatible with ISO, COBIT, HITRUST CSF, PCI-DSS or any other model used for organizing an information security program. If you are able to describe the domains of your program and the controls and threat scenarios applicable to each domain, you will be able to use Evaluator!

Instructions

This README does not define terms commonly used in an OpenFAIR analysis. While not a prerequisite, a review of OpenFAIR methodology and terminology is highly recommended. Familiarity with the R language is also very helpful.

Follow these seven steps to run the toolkit:

  1. Define your controls and risk scenarios
  2. Import the scenarios
  3. Validate that the data is ready for simulation
  4. Encode the qualitative labels into quantitative parameters
  5. Run the simulations
  6. Summarize the simulation outputs
  7. Analyze the results

Don’t be intimidated by the process. Evaluator is with you at every step!

Prepare the Environment

A working R interpreter is required. Evaluator should work on any current version of R (v3.4.1 as of this writing) and on any supported platform (Windows, macOS, or Linux). This vignette assumes the use of the RStudio IDE for some presentation details, but the core Evaluator functions may be used with any IDE.

Obtain the Evaluator toolkit either from CRAN via install.packages("evaluator") or the current development version via devtools::install_github('davidski/evaluator').

If you would like to work with the supplied sample data files, execute the following code:

library(dplyr)     # piping
library(readr)     # better CSV handling
library(evaluator) # core evaluator toolkit

# create default directories
input_directory <- "~/evaluator/data"
results_directory <- "~/evaluator/results"
if (!dir.exists(input_directory)) dir.create(input_directory, recursive = TRUE)
if (!dir.exists(results_directory)) dir.create(results_directory, recursive = TRUE)

# copy sample files
file.copy(system.file("extdata", "domains.csv", package = "evaluator"), 
          input_directory)
file.copy(system.file("extdata", "qualitative_mappings.csv", package = "evaluator"),
          input_directory)
file.copy(system.file("extdata", "risk_tolerances.csv", package = "evaluator"), 
          input_directory)
file.copy(system.file("survey", "survey.xlsx", package = "evaluator"),
          input_directory)

Define Your Controls and Security Domains

Evaluator needs to know the domains of your security program. These are the major buckets into which you subdivide your program, typically including areas such as Physical Security, Strategy, Policy, Business Continuity/Disaster Recovery, Technical Security, etc. Out of the box, Evaluator comes with a demonstration model based upon the HITRUST CSF. If you have a different domain structure (e.g. ISO2700x, NIST CSF, or COBIT), you will need to edit the data/domains.csv file to include the domain names and the domain IDs, and a shorthand abbreviation for the domain (such as POL for the Policy domain).
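As an illustration, a custom domain file could be drafted in base R along the following lines. The column names domain_id and domain and the three rows are assumptions made for this example; compare against the sample domains.csv shipped with Evaluator for the exact layout it expects.

```r
# Hypothetical domain structure; column names and rows are illustrative only
custom_domains <- data.frame(
  domain_id = c("POL", "PHY", "BCP"),
  domain = c("Policy", "Physical Security", "Business Continuity"),
  stringsAsFactors = FALSE
)

# Workbook tabs are named after the domain IDs, so the IDs must be unique
stopifnot(!anyDuplicated(custom_domains$domain_id))

# Write to a temporary location rather than overwriting the sample file
domains_file <- file.path(tempdir(), "domains.csv")
write.csv(custom_domains, domains_file, row.names = FALSE)
```

Once the draft looks right, it can be copied over the domains.csv in your input directory.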

Identifying the controls (or capabilities) and risk scenarios associated with each of your domains is critical to the final analysis. The elements are documented in an Excel workbook. The workbook includes one tab per domain, with each tab named after the domain IDs defined in the previous step. Each tab consists of a controls table and a threats table.

Controls Table

The key objectives of each domain are defined in the domain controls table. While the specific controls will be unique to each organization, the sample spreadsheet included in Evaluator may be used as a model. It is best to avoid copying every technical control from, for example, ISO 27001 or COBIT, since most control frameworks are too fine-grained to provide the high-level overview that Evaluator delivers. In practice, 50 controls or fewer will usually be sufficient to describe organizations of up to one to two billion USD in size. Each control must have a unique ID and should be assigned a difficulty (DIFF) score that ranks the maturity of the control on a CMM scale from Initial (lowest score) to Optimized (best of class).
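The two requirements above (unique IDs and a maturity ranking) can be sketched in base R. The control names, the column names, and the three intermediate CMM level names are assumptions for this example; the labels Evaluator itself accepts are the ones in the sample workbook and mappings file.

```r
# Common CMM level names, lowest to highest; Evaluator's own labels may differ
cmm_levels <- c("Initial", "Repeatable", "Defined", "Managed", "Optimized")

# Hypothetical controls for a single domain
controls <- data.frame(
  id = c("CAP-01", "CAP-02", "CAP-03"),
  control = c("Maintain security policy",
              "Review policy exceptions",
              "Train workforce on policy"),
  diff = factor(c("Defined", "Initial", "Managed"),
                levels = cmm_levels, ordered = TRUE),
  stringsAsFactors = FALSE
)

# Every control needs a unique ID
stopifnot(!anyDuplicated(controls$id))

# An ordered factor lets controls be ranked from least to most mature
weakest_first <- controls[order(controls$diff), c("id", "diff")]
```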

Threats Table

The threats table consists of the potential loss scenarios addressed by each domain of your security program. Each scenario is made up of a descriptive field that describes who did what to whom, the threat community that executed the attack (e.g. external hacktivist, internal workforce member, third party vendor), how often the threat actor acts upon your assets (TEF), the strength with which they act against your assets (TCap), the losses incurred (LM) and a comma-separated list of the control IDs that prevent the scenario.

Import the Scenarios

To extract the spreadsheet into data files for further analysis, run import_spreadsheet("PATH", domains), where "PATH" is the location of your workbook. Evaluator will extract the data and save the results as two comma-separated files in the data directory.

domains <-  readr::read_csv(file.path(input_directory, "domains.csv"))
system.file("survey", "survey.xlsx", package = "evaluator") %>% 
  import_spreadsheet(., domains, output_dir = input_directory)

Validate the Data

Run the command validate_scenarios(). If there are data validation errors, the process will abort and an error message will be displayed. Correct the errors displayed, reimport, and repeat the validation process until there are no errors reported.

mappings <-  readr::read_csv(file.path(input_directory, "qualitative_mappings.csv"))
qualitative_scenarios <- readr::read_csv(file.path(input_directory, "qualitative_scenarios.csv"))
capabilities <- readr::read_csv(file.path(input_directory, "capabilities.csv"))
validate_scenarios(qualitative_scenarios, capabilities, domains, mappings)
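To illustrate one kind of problem a validation pass can catch, the sketch below checks that every control ID referenced by a scenario exists in the controls table. It is a hand-rolled illustration, not the logic of validate_scenarios(); the helper find_unknown_controls and the sample IDs are hypothetical.

```r
# Hypothetical helper (not part of Evaluator): return control IDs that
# scenarios reference but that are missing from the controls table
find_unknown_controls <- function(scenario_controls, capability_ids) {
  # Split each comma-separated list and trim surrounding whitespace
  referenced <- trimws(unlist(strsplit(scenario_controls, ",", fixed = TRUE)))
  setdiff(referenced, capability_ids)
}

# Illustrative data: the second scenario cites an ID that does not exist
example_controls <- c("CAP-01, CAP-02", "CAP-02,CAP-09")
example_capability_ids <- c("CAP-01", "CAP-02", "CAP-03")

unknown <- find_unknown_controls(example_controls, example_capability_ids)
if (length(unknown) > 0) {
  message("Unknown control IDs: ", paste(unknown, collapse = ", "))
}
```

A check like this is handy while editing the workbook, since a mistyped control ID is an easy error to introduce in a comma-separated cell.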

Encode the Data

quantitative_scenarios <- encode_scenarios(qualitative_scenarios, 
                                           capabilities, mappings)
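Conceptually, encoding replaces each qualitative label with numeric parameters drawn from the mappings file. A minimal sketch of that lookup in base R follows; the column names, labels, and parameter values are assumptions for illustration and do not reproduce the layout of qualitative_mappings.csv or the internals of encode_scenarios().

```r
# Hypothetical mapping table: each label maps to a range of numeric values
example_mappings <- data.frame(
  label = c("Low", "Medium", "High"),
  low = c(1, 5, 20),
  most_likely = c(2, 10, 50),
  high = c(5, 20, 100),
  stringsAsFactors = FALSE
)

# Hypothetical scenarios rated with qualitative frequency labels
example_scenarios <- data.frame(
  scenario_id = 1:3,
  tef_label = c("High", "Low", "Medium"),
  stringsAsFactors = FALSE
)

# Look up each label's parameters; merge() acts as a join on the label
encoded <- merge(example_scenarios, example_mappings,
                 by.x = "tef_label", by.y = "label")

# merge() sorts by the join key, so restore the scenario order
encoded <- encoded[order(encoded$scenario_id), ]
```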

Run the Simulations

Once the data is translated into quantitative scenarios and ready for simulation, run run_simulations(quantitative_scenarios). By default, Evaluator puts each scenario through 10,000 individual simulated years, modelling how often the threat actor will come into contact with your assets, the strength of the threat actor, the strength of your controls, and the losses involved in any situation where the threat strength exceeds your control strength. This simulation process can be computationally intense. The sample data set takes approximately five minutes on my primary development machines (last generation Windows-based platforms).

simulation_results <- run_simulations(quantitative_scenarios, 
                                      simulation_count = 10000L)
save(simulation_results, file = file.path(results_directory, "simulation_results.rda"))
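For readers who want to see the shape of the simulation described above, here is a toy version in base R. It is a sketch under stated assumptions, not the code behind run_simulations(): the helper simulate_years is hypothetical, and the Poisson, uniform, and lognormal distributions and their parameters were chosen only to keep the example short.

```r
set.seed(42)  # reproducible draws

# Hypothetical helper: simulate total annual losses for one scenario.
# Distributions and parameters are illustrative assumptions.
simulate_years <- function(n_years = 1000L, tef_mean = 3) {
  vapply(seq_len(n_years), function(year) {
    # How many times the threat actor contacts the asset this year
    n_events <- rpois(1, lambda = tef_mean)
    if (n_events == 0) return(0)
    # Threat strength and control strength per contact, on a 0-1 scale
    threat_strength <- runif(n_events)
    control_strength <- runif(n_events, min = 0.3, max = 1)
    # A loss occurs only when the threat overcomes the control
    n_losses <- sum(threat_strength > control_strength)
    if (n_losses == 0) return(0)
    # Total loss magnitude across this year's successful events
    sum(rlnorm(n_losses, meanlog = log(10000), sdlog = 1))
  }, numeric(1))
}

annual_losses <- simulate_years()
```

Each element of annual_losses is one simulated year, which is why run time grows with both the number of scenarios and the simulation count.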

Analyze the Results

Several analysis functions are provided, including a template for a technical risk report. Assuming that the input files have been placed in a data directory and the simulation results and summary files in a results directory, the risk report can be generated via generate_report(input_directory, results_directory). This will create a pre-populated risk report that identifies key scenarios and generates initial plots to be used in creating a final analysis report. Other report tools include risk_dashboard().

For interactive exploration, run explore_scenarios(input_directory, results_directory) to launch a local copy of the Scenario Explorer application. The Scenario Explorer app may be used to view information about the individual scenarios and provides a sample overview of the entire program.

For more in-depth analysis, the following data files may be used directly from R or from external business intelligence programs such as Tableau:

# summarize
scenario_summary <- summarize_scenarios(simulation_results)
domain_summary <- summarize_domains(simulation_results, domains)

# or to save the summary files directly to disk
summarize_to_disk(simulation_results = simulation_results, domains = domains, results_directory)

# define risk tolerances
risk_tolerances <- system.file("extdata", "risk_tolerances.csv", 
                               package="evaluator") %>% read_csv()

# Explorer
explore_scenarios(input_directory, results_directory)

# Sample Report
generate_report(input_directory, results_directory) %>% rstudioapi::viewer()

Or, to view that same report as a Word document for editing, use generate_report(input_directory, results_directory, output_format = "word_document").

If you would rather work on the source RMarkdown file for direct editing and graphics tweaking, use system.file("rmd", "analyze_risk.Rmd", package = "evaluator") to find the location of the native template on your system.
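When working with simulation output directly, it often helps to condense each scenario's simulated years into a few figures. The sketch below shows one way to do that in base R; the helper summarize_losses, the simulated input, and the 250,000 tolerance are all hypothetical, and the statistics shown are not necessarily the ones summarize_scenarios() reports.

```r
# Hypothetical simulated annual losses for one scenario
set.seed(1)
losses <- rlnorm(10000, meanlog = log(50000), sdlog = 1.2)

# Hypothetical helper: condense one scenario's simulated years
summarize_losses <- function(x, tolerance = 250000) {
  c(
    mean_loss = mean(x),
    median_loss = median(x),
    var_95 = unname(quantile(x, 0.95)),        # 95th percentile annual loss
    share_over_tolerance = mean(x > tolerance) # fraction of years over tolerance
  )
}

loss_summary <- summarize_losses(losses)
```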

Advanced Customization

Evaluator makes several assumptions to get you up and running as quickly as possible. Advanced users may implement several different customizations including:

Where to Go From Here

While Evaluator is a powerful tool, it does not explicitly attempt to address complex analysis of security risks, interaction between risk scenarios, rolling up multiple levels of risk into aggregations, modelling secondary losses or other advanced topics. As you become more comfortable with quantitative risk analysis, you may wish to dive deeper into these areas (and I hope you do!).

Commercial Software

Blogs/Books/Training

Associations