This vignette demonstrates how to use the *auditor* package for auditing residuals of models.
The package provides methods for model verification and validation by error analysis.

Many models, such as random forests and neural networks, are nowadays treated as black boxes, and there is little theory that describes the behavior of their errors. Most methods provided in the *auditor* package are model-agnostic, so they can be used regardless of what is known about the errors.

Some of the graphical error analysis methods also have corresponding scores, which allow comparison of two models.

To illustrate applications of *auditor* to regression problems we will use the artificial dataset `dragons`, available in the *DALEX* package. Our goal is to predict the length of life of dragons.

```
library(DALEX)
data("dragons")
head(dragons)
```

```
##   year_of_birth   height   weight scars colour year_of_discovery
## 1         -1291 59.40365 15.32391     7    red              1700
## 2          1589 46.21374 11.80819     5    red              1700
## 3          1528 49.17233 13.34482     6    red              1700
## 4          1645 48.29177 13.27427     5  green              1700
## 5            -8 49.99679 13.08757     1    red              1700
## 6           915 45.40876 11.48717     2    red              1700
##   number_of_lost_teeth life_length
## 1                   25   1368.4331
## 2                   28   1377.0474
## 3                   38   1603.9632
## 4                   33   1434.4222
## 5                   18    985.4905
## 6                   20    969.5682
```

```
lm_model <- lm(life_length ~ ., data = dragons)
```

```
library("randomForest")
set.seed(59)
rf_model <- randomForest(life_length ~ ., data = dragons)
```

Each analysis begins with the creation of an `explainer` object with the *DALEX* package. It is an object that can be used to audit a model.

```
lm_exp <- DALEX::explain(lm_model, label = "lm", data = dragons, y = dragons$life_length)
```

```
## Preparation of a new explainer is initiated
## -> model label : lm
## -> data : 2000 rows 8 cols
## -> target variable : 2000 values
## -> predict function : yhat.lm will be used (default)
## -> predicted values : numerical, min = 540.9447 , mean = 1370.986 , max = 3925.691
## -> residual function : difference between y and yhat (default)
## -> residuals : numerical, min = -108.2062 , mean = -3.701928e-12 , max = 113.8603
## A new explainer has been created!
```

```
rf_exp <- DALEX::explain(rf_model, label = "rf", data = dragons, y = dragons$life_length)
```

```
## Preparation of a new explainer is initiated
## -> model label : rf
## -> data : 2000 rows 8 cols
## -> target variable : 2000 values
## -> predict function : yhat.randomForest will be used (default)
## -> predicted values : numerical, min = 610.9752 , mean = 1370.181 , max = 3292.296
## -> residual function : difference between y and yhat (default)
## -> residuals : numerical, min = -135.4756 , mean = 0.8047108 , max = 720.0888
## A new explainer has been created!
```

In this section we give a short overview of the visual validation of model errors and propose validation scores. The *auditor* package helps to answer questions that may be crucial for further analyses.

Does the model fit the data? Is it missing any information?

Which model has better performance?

How similar are the models?

In the following sections we give an overview of the *auditor* functions for the analysis of model residuals. They are discussed in alphabetical order.

In this vignette we call the generic `plot()` function with the `type` argument; the equivalent dedicated plotting functions are shown as comments.
First, we need to create model residual objects with the `model_residual()` function.

```
library(auditor)
lm_mr <- model_residual(lm_exp)
rf_mr <- model_residual(rf_exp)
```

The Autocorrelation Function (ACF) plot can be used to check the randomness of errors. If the errors are random, autocorrelations should be near zero for all lag separations. If they are non-random, one or more autocorrelations will be significantly non-zero.

Residuals may be ordered by the values of any model variable or by fitted values. If no variable is specified, the function takes the order from the data set.

```
plot(lm_mr, type = "acf", variable = "year_of_discovery")
```

```
# alternative:
# plot_acf(lm_mr, variable = "year_of_discovery")
```
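To make the idea concrete, the following base-R sketch computes the kind of autocorrelations that such a plot displays. It does not call *auditor*; the simulated data and the names `x`, `y`, `fit`, `ordered_res` and `band` are assumptions made for illustration only.

```r
# Minimal sketch on assumed, simulated data (not auditor's implementation).
set.seed(1)
x <- runif(200)
y <- 2 * x + rnorm(200)
fit <- lm(y ~ x)

# Order residuals by the chosen variable before computing autocorrelations.
ordered_res <- residuals(fit)[order(x)]

# Sample autocorrelations for lags 0..20; the value at lag 0 is always 1.
res_acf <- acf(ordered_res, lag.max = 20, plot = FALSE)
acf_values <- as.numeric(res_acf$acf)

# Approximate 95% band for white noise; values inside it suggest randomness.
band <- qnorm(0.975) / sqrt(length(ordered_res))
print(round(acf_values[2:6], 3))
print(band)
```

Ordering the residuals first matters: the same residuals can look random in data-set order and clearly dependent when sorted by a variable the model handles poorly.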

The autocorrelation plot shows the i-th residual against the (i+1)-th residual. This plot may be useful for checking autocorrelation of residuals.

Sometimes it is difficult to compare two models based only on visualizations. Therefore, we have proposed some scores, which may be useful for choosing the better model.

```
plot(rf_mr, type = "autocorrelation")
```

```
# alternative:
# plot_autocorrelation(rf_mr)
```
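The pairing of the i-th and (i+1)-th residuals used by the autocorrelation plot can be sketched in base R. The residual vector `res` and the data frame `lagged` are hypothetical and only serve as an illustration; this is not the code *auditor* runs.

```r
# Minimal sketch with an assumed residual vector (not auditor code).
res <- c(0.5, -1.2, 0.3, 0.8, -0.4, 1.1, -0.9)

# Pair each residual with its successor: (e_i, e_{i+1}).
lagged <- data.frame(
  current = head(res, -1),
  following = tail(res, -1)
)

# The correlation of the pairs summarises what the scatter plot shows.
lag1_cor <- cor(lagged$current, lagged$following)
print(lagged)
print(lag1_cor)
```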

The DW score and the Runs score are based on the Durbin-Watson and Runs test statistics.

Scores can be calculated with the `score_dw()` and `score_runs()` functions, or with the generic `score()` function by requesting the “DW” or “Runs” score.

```
rf_score_dw <- score_dw(rf_exp)
rf_score_runs <- score_runs(rf_exp)
rf_score_dw$score
```

```
## [1] 1.951918
```

```
rf_score_runs$score
```

```
## [1] -1.881788
```
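The two test statistics behind these scores can be sketched in a few lines of base R. This is an illustration on an assumed toy vector `res`, using the textbook Durbin-Watson and Wald-Wolfowitz runs formulas with residuals split by sign; the exact variant computed by *auditor* may differ.

```r
# Minimal sketch of the two test statistics (not auditor's code).
# `res` is an assumed residual vector, already in the order of interest.
res <- c(1, 1, -1, -1, 1, 1, -1, -1)

# Durbin-Watson statistic: values near 2 suggest no first-order autocorrelation.
dw_stat <- sum(diff(res)^2) / sum(res^2)

# Wald-Wolfowitz runs statistic computed on the signs of the residuals.
signs <- res > 0
n_pos <- sum(signs)
n_neg <- sum(!signs)
n_runs <- 1 + sum(signs[-1] != signs[-length(signs)])
expected_runs <- 2 * n_pos * n_neg / (n_pos + n_neg) + 1
var_runs <- 2 * n_pos * n_neg * (2 * n_pos * n_neg - n_pos - n_neg) /
  ((n_pos + n_neg)^2 * (n_pos + n_neg - 1))
runs_stat <- (n_runs - expected_runs) / sqrt(var_runs)

print(dw_stat)    # 1.5 for this toy vector
print(n_runs)     # 4 runs of equal sign
print(runs_stat)  # negative: fewer runs than expected under randomness
```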

A grid of plots presents the correlation of the dependent variable and the fitted model values.

```
plot(rf_mr, lm_mr, type = "correlation")
```

```
# alternative:
# plot_correlation(rf_mr, lm_mr)
```

Principal Component Analysis (PCA) of model residuals can be used to assess the similarity of the models.

```
plot(rf_mr, lm_mr, type = "pca")
```

```
# alternative:
# plot_pca(rf_mr, lm_mr)
```
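As a rough illustration of what a PCA of residuals operates on, the sketch below builds a residual matrix with one column per model and passes it to `prcomp()`. The simulated data and the two models `fit_linear` and `fit_cubic` are assumptions for this example; the preprocessing used by *auditor* is not shown here.

```r
# Minimal sketch on assumed, simulated data (not auditor's implementation).
set.seed(2)
x <- runif(100)
y <- sin(2 * pi * x) + rnorm(100, sd = 0.2)

# Two hypothetical models fitted to the same data.
fit_linear <- lm(y ~ x)
fit_cubic <- lm(y ~ poly(x, 3))

# One column of residuals per model, one row per observation.
res_matrix <- cbind(linear = residuals(fit_linear), cubic = residuals(fit_cubic))

# PCA of the residual matrix; models with similar errors load on the same component.
res_pca <- prcomp(res_matrix, center = TRUE, scale. = TRUE)
print(res_pca$rotation)
print(cor(res_matrix)[1, 2])
```

With only two models the PCA carries the same information as the correlation of their residuals; the approach becomes more useful when many models are compared at once.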

Basic plot of predictions vs observed or variable values. If no variable is specified, the function takes the order from the data set.

The black line corresponds to the y = x function.

```
plot(rf_mr, lm_mr, variable = "life_length", type = "prediction")
```

```
# alternative:
# plot_prediction(rf_mr, lm_mr, variable = "life_length")
```

Predictions may be ordered by the values of any model variable or by fitted values, and both models may be plotted together.

```
plot(rf_mr, lm_mr, type = "prediction")
```

```
# alternative:
# plot_prediction(rf_mr, lm_mr)
```

Regression Error Characteristic (REC) curves are a generalization of ROC curves. The x-axis shows the error tolerance and the y-axis shows the percentage of observations predicted within the given tolerance. The REC curve estimates the Cumulative Distribution Function (CDF) of the error. The Area Over the REC Curve (AOC) is a biased estimate of the expected error.

```
plot(rf_mr, lm_mr, type = "rec")
```

```
# alternative:
# plot_rec(rf_mr, lm_mr)
```
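The construction of a REC curve can be sketched in base R as follows. The toy residuals in `res` are an assumption, the error is taken to be the absolute residual, and the area is computed for the empirical step curve; this is an illustration, not the routine *auditor* uses.

```r
# Minimal sketch of a REC curve (assumed toy residuals; not auditor code).
res <- c(-1, 2, -3, 4)
abs_err <- sort(abs(res))

# REC curve: share of observations with absolute error within each tolerance.
tolerance <- c(0, abs_err)
accuracy <- sapply(tolerance, function(tol) mean(abs_err <= tol))

# Area over the step curve between 0 and the largest error.
aoc <- sum(diff(tolerance) * (1 - head(accuracy, -1)))

print(data.frame(tolerance, accuracy))
print(aoc)  # 2.5, equal to mean(abs(res)) for this step curve
```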

Basic plot of residuals vs observed, fitted or variable values. It provides information about the structure of the model.

```
plot(rf_mr, type = "residual")
```

```
# alternative:
# plot_residual(rf_mr)
```

Residuals may be ordered by the values of any model variable or by fitted values, and both models may be plotted together. If no variable is specified, the function takes the order from the data set.

```
plot(rf_mr, lm_mr, type = "residual", variable = "_y_hat_")
```

```
# alternative:
# plot_residual(rf_mr, lm_mr, variable = "_y_hat_")
```

Comparison of the absolute values of residuals. The red dot stands for the root mean square of the residuals.

```
plot(lm_mr, rf_mr, type = "residual_boxplot")
```

```
# alternative
# plot_residual_boxplot(lm_mr, rf_mr)
```
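The quantities summarised by such a boxplot are easy to compute by hand. The sketch below uses two assumed residual vectors, `res_a` and `res_b`, for two hypothetical models; the helper `rmse()` is defined here and is not an *auditor* function.

```r
# Minimal sketch with assumed residual vectors for two hypothetical models.
res_a <- c(-3, 4, 0, -1)
res_b <- c(-1, 1, 2, -2)

# Root mean square of residuals, the summary marked by the red dot.
rmse <- function(res) sqrt(mean(res^2))

# Five-number summaries of the absolute residuals, as drawn by a boxplot.
summaries <- sapply(list(a = res_a, b = res_b), function(res) fivenum(abs(res)))
rms_values <- c(a = rmse(res_a), b = rmse(res_b))

print(summaries)
print(rms_values)
```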

The density of residuals may be plotted in different ways, which makes it easy to compare the residuals of several models.

```
plot(rf_mr, lm_mr, type = "residual_density")
```

```
# alternative
# plot_residual_density(rf_mr, lm_mr)
```

Residuals may also be divided by the median of a numeric variable or split by the levels of a factor variable.

```
plot_residual_density(rf_mr, lm_mr, variable = "life_length")
```
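Splitting residuals at the median of a numeric variable can be sketched in base R. The simulated `x` and `res` and the group labels are assumptions for illustration; how *auditor* labels and draws the groups is not shown here.

```r
# Minimal sketch on assumed, simulated data (not auditor's implementation).
set.seed(3)
x <- runif(100)
res <- rnorm(100)

# Divide observations into two groups at the median of the numeric variable.
group <- ifelse(x <= median(x), "at_or_below_median", "above_median")

# Estimate one kernel density of the residuals per group.
dens <- lapply(split(res, group), density)

# Location of the density peak in each group, as a quick numeric comparison.
peaks <- sapply(dens, function(d) d$x[which.max(d$y)])
print(table(group))
print(peaks)
```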

The basic idea of ROC curves for regression (RROC) is to show model asymmetry. The RROC is a plot where the x-axis depicts total over-estimation and the y-axis total under-estimation.

For RROC curves we use a shift, which is equivalent to the threshold for ROC curves. For each observation we calculate a new prediction: $\hat{y}_i' = \hat{y}_i + s$, where $s$ is the shift. Therefore, there are different error values for each shift: $e_i = \hat{y}_i' - y_i$.

Over-estimation is calculated as $OVER = \sum(e_i \mid e_i > 0)$. Under-estimation is calculated as $UNDER = \sum(e_i \mid e_i < 0)$. The shift equal to 0 is represented by a dot.

The Area Over the RROC Curve (AOC) equals the variance of the errors multiplied by $\frac{n^{2}}{2}$.

```
plot(rf_mr, lm_mr, type = "rroc")
```

```
# alternative:
# plot_rroc(rf_mr, lm_mr)
```
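The points of an RROC curve follow directly from the definitions of the shift, OVER and UNDER. The sketch below computes them for assumed toy vectors `y` and `y_hat`; the helper `rroc_point()` is hypothetical and not part of *auditor*.

```r
# Minimal sketch of RROC points (assumed toy data; not auditor's code).
y <- c(10, 12, 15)
y_hat <- c(8, 13, 18)

# Total over- and under-estimation after shifting every prediction by `shift`.
rroc_point <- function(shift) {
  err <- (y_hat + shift) - y
  c(shift = shift, over = sum(err[err > 0]), under = sum(err[err < 0]))
}

# Evaluate a grid of shifts; each row is one point of the curve.
shifts <- seq(-4, 4, by = 1)
rroc <- as.data.frame(t(sapply(shifts, rroc_point)))

print(rroc)
print(rroc[rroc$shift == 0, ])  # over = 4, under = -2
```

Large negative shifts drive OVER to zero and large positive shifts drive UNDER to zero, which is why the curve runs between the two axes.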

This plot shows whether residuals are spread equally along the ranges of predictors.

```
plot(rf_mr, lm_mr, type = "scalelocation")
```

```
# alternative:
# plot_scalelocation(rf_mr, lm_mr)
```
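A scale-location view is commonly built from the square root of the absolute standardized residuals. The sketch below follows that common definition on simulated data whose noise grows with `x`; the standardization used here (dividing by the standard deviation of the residuals) is an assumption and may differ from what *auditor* computes.

```r
# Minimal sketch on assumed, simulated data (not auditor's implementation).
set.seed(4)
x <- runif(150)
y <- 1 + 2 * x + rnorm(150, sd = 0.5 + x)  # noise grows with x
fit <- lm(y ~ x)

# Square root of the absolute standardized residuals.
std_res <- residuals(fit) / sd(residuals(fit))
scale_loc <- sqrt(abs(std_res))

# A linear trend of this quantity against fitted values hints at unequal spread.
trend <- coef(lm(scale_loc ~ fitted(fit)))
print(trend)
```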

For comparing two models we can use the GQ score, which is based on the Goldfeld-Quandt test statistic. It may also be computed with the `score()` function by requesting the “GQ” score.

Two-sided empirical Cumulative Distribution Function (TSECDF) plot: cumulative distribution functions for positive and negative residuals.

```
plot(rf_mr, lm_mr, type = "tsecdf")
```

```
# alternative
# plot_tsecdf(rf_mr, lm_mr)
```
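The separate distribution functions for positive and negative residuals can be sketched with `ecdf()` from base R. The toy vector `res` is an assumption, and negative residuals are compared by their absolute size; this is an illustration, not the plotting code of *auditor*.

```r
# Minimal sketch with an assumed residual vector (not auditor code).
res <- c(-3, -1, -0.5, 0.2, 1, 2, 4)

# Empirical CDFs computed separately for each side.
pos_cdf <- ecdf(res[res > 0])
neg_cdf <- ecdf(abs(res[res < 0]))

# Share of positive and of negative residuals no larger than 1 in absolute value.
print(pos_cdf(1))  # 0.5
print(neg_cdf(1))  # 2/3
```

Comparing the two curves at the same absolute value shows whether the model errs more heavily on one side.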

Other methods and plots are described in vignettes: