%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{Classyfire Cheat Sheet}

Install from CRAN

```
install.packages("classyfire")
```

Load the classyfire package within R

```
library(classyfire)
```

Get the classyfire help overview

```
??classyfire
```

Load some test data, for instance the **iris** dataset

```
data(iris)
irisClass <- iris[,5]
irisData <- iris[,-5]
```

Construct a classification ensemble **in parallel** (using 4 CPUs in this instance) that consists of 10 independent classification models (classifiers), each optimised using 10 bootstrap iterations

```
ens <- cfBuild(inputData = irisData, inputClass = irisClass, bootNum = 10, ensNum = 10,
               parallel = TRUE, cpus = 4, type = "SOCK")
```

Similarly, **in sequence**:

```
ens <- cfBuild(inputData = irisData, inputClass = irisClass, bootNum = 10, ensNum = 10,
               parallel = FALSE)
```
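
When building in parallel, requesting more workers than the machine has cores rarely helps. A minimal sketch, assuming the base `parallel` package (shipped with R), that caps the request at the detected core count; `requested` and `cpusToUse` are illustrative names, not classyfire arguments:

```r
# Cap the requested number of workers at the number of detected cores.
# `requested` and `cpusToUse` are hypothetical names used for illustration.
requested <- 4
available <- parallel::detectCores()   # may return NA on some platforms
cpusToUse <- if (is.na(available)) 1L else min(requested, available)
cpusToUse
```

The resulting value could then be passed as the `cpus` argument of **cfBuild**.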

The list of attributes available for each classifier in the ensemble is provided by the function:

```
attributes(ens)
```

Get the **overall** average test and train accuracy

```
getAvgAcc(ens)$Test
getAvgAcc(ens)$Train
```

Get the **individual** test and train accuracies in the ensemble

```
ens$testAcc
ens$trainAcc
# Alternatively
getAcc(ens)$Test
getAcc(ens)$Train
```

In this instance, we randomly generate test data (representing a new input dataset of unknown classes) and use the generated ensemble to predict their classes. The new dataset must have exactly the same number of columns as the inputData passed as an argument to **cfBuild**. In the following example, 400 values are drawn uniformly at random between 0 and 100; with the 4 columns of irisData, this results in 100 samples (rows).

```
testMatr <- matrix(runif(400)*100, ncol = ncol(irisData))
predRes <- cfPredict(ens, testMatr)
```
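
To see why 400 values give 100 rows, the shape of the generated matrix can be checked directly. A small base R sketch, hard-coding 4 columns to match the four measurement columns of iris; `set.seed` is added here only to make the draw reproducible:

```r
set.seed(1)
nCols <- 4                                  # number of feature columns in iris
testMatr <- matrix(runif(400) * 100, ncol = nCols)
dim(testMatr)                               # 100 rows, 4 columns
range(testMatr)                             # all values lie between 0 and 100
```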

Execute five permutation rounds; in each permutation test, an ensemble of 10 classifiers is constructed, each running 10 bootstrap iterations during the optimisation process. By default, ensNum, bootNum and permNum are all equal to 100 for permutation testing.

```
permObj <- cfPermute(irisData, irisClass, bootNum = 10, ensNum = 10, permNum = 5,
                     parallel = TRUE, cpus = 4, type = "SOCK")
```
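
A permutation test of this kind typically compares the real ensemble against ensembles trained on shuffled class labels. The sketch below illustrates only the label-shuffling step in base R. It is an assumption about the general technique, not a description of the internals of **cfPermute**, and `permClass` is a hypothetical name.

```r
data(iris)
irisClass <- iris[, 5]

set.seed(42)
permClass <- sample(irisClass)   # same labels, random order

table(irisClass)                 # original class counts
table(permClass)                 # counts are unchanged by shuffling
mean(permClass == irisClass)     # fraction of samples that kept their label
```

Because the link between features and labels is broken, a classifier trained on `permClass` would be expected to perform close to chance.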

Get the vector of averaged accuracies, one for each permutation (each permutation is an independent classification ensemble)

```
permObj$avgAcc
```

Get the overall elapsed time for the permutation process, and the vector of individual execution times for each permutation respectively

```
permObj$totalTime[3]
permObj$execTime
```
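
Indexing with `[3]` matches the layout of the timing objects returned by base R's `system.time()` and `proc.time()`, whose third element is the elapsed (wall-clock) time. Assuming `totalTime` is such an object, the following base R sketch shows the same indexing on a hypothetical timing `tm`:

```r
# Time an arbitrary piece of work; `tm` is an illustrative name
tm <- system.time(for (i in 1:1e5) sqrt(i))
names(tm)[3]      # "elapsed"
tm[3]             # elapsed time by position
tm[["elapsed"]]   # the same value, accessed by name
```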

Access the first ensemble in the permutation list

```
permObj$permList[[1]]
```

All the functions for descriptive statistics within classyfire start with the prefix “**get**”. For example:

Get the average test and/or train accuracy of the ensemble

```
getAvgAcc(ens)
getAvgAcc(ens)$Test
getAvgAcc(ens)$Train
```

Get the vectors of test and/or train accuracies of the classifiers in the ensemble

```
getAcc(ens)
getAcc(ens)$Test
getAcc(ens)$Train
```

Get the confusion matrix summarising the performance of the ensemble

```
getConfMatr(ens)
```
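
For readers unfamiliar with confusion matrices, here is a minimal base R sketch built with `table()` on made-up labels. The vectors `actual` and `predicted` are hypothetical, and the layout and scaling returned by **getConfMatr** may differ from this illustration.

```r
actual    <- factor(c("a", "a", "a", "b", "b", "c"))
predicted <- factor(c("a", "a", "b", "b", "b", "c"), levels = levels(actual))

# Rows are the true classes, columns the predicted classes
confMatr <- table(Actual = actual, Predicted = predicted)
confMatr

# Row-wise percentages: share of each actual class assigned to each prediction
round(100 * prop.table(confMatr, margin = 1), 1)

# Overall accuracy: correct predictions (the diagonal) over all samples
sum(diag(confMatr)) / sum(confMatr)
```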

Get the optimal SVM hyperparameters of the classification ensemble

```
optParam <- getOptParam(ens)
optParam
```

Return the “five number summary”, a descriptive statistic that consists of the minimum, first (lower) quartile, median, third (upper) quartile and maximum value of a given distribution. In this case, the function is applied directly on the output of permutation testing, generated by the **cfPermute** function.

```
getPerm5Num(permObj)
getPerm5Num(permObj)$median
getPerm5Num(permObj)$minimum
getPerm5Num(permObj)$maximum
getPerm5Num(permObj)$upperQ
getPerm5Num(permObj)$lowerQ
```
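
The same five values can be computed for any numeric vector with base R. The sketch below uses `quantile()` on a made-up vector of accuracies (`acc` is hypothetical); whether **getPerm5Num** uses this exact quantile definition is an assumption.

```r
acc <- c(30, 32, 35, 36, 40)   # illustrative values only

# Minimum, lower quartile, median, upper quartile and maximum
fiveNum <- quantile(acc, probs = c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
names(fiveNum) <- c("minimum", "lowerQ", "median", "upperQ", "maximum")
fiveNum
```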

All the functions for plotting within classyfire start with the prefix “**gg**” since the library **ggplot2** is in use. For example:

The **ggClassPred** function generates a barplot with the per-class accuracies (%) for all the correctly classified and misclassified samples in the classification ensemble.

```
# Show the percentages of correctly classified samples in
# a barplot without or with text respectively
ggClassPred(ens)
ggClassPred(ens, showText = TRUE)
# Show the percentages of correctly classified and misclassified samples
# simultaneously in a barplot, with and without text
ggClassPred(ens, displayAll = TRUE)
ggClassPred(ens, position = "stack", displayAll = TRUE)
ggClassPred(ens, position = "stack", displayAll = TRUE, showText = TRUE)
# Alternatively, using a dodge position
ggClassPred(ens, position = "dodge", displayAll = TRUE)
ggClassPred(ens, position = "dodge", displayAll = TRUE, showText = TRUE)
```

The **ggEnsTrend** function displays the average test accuracies for every new classifier added to the ensemble, as constructed by the **cfBuild** function.

```
ggEnsTrend(ens)
# Plot with text
ggEnsTrend(ens, showText = TRUE)
# Plot with text; set different limits on y axis
ggEnsTrend(ens, showText = TRUE, ylims=c(90, 100))
```
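
One way to read such a trend is as a running mean over the classifiers' test accuracies. The base R sketch below computes that running mean for a hypothetical vector `testAcc`; this is an interpretation of the description above, not the confirmed implementation of **ggEnsTrend**.

```r
testAcc <- c(96, 94, 98, 96)   # illustrative per-classifier accuracies (%)

# Average accuracy after the 1st, 2nd, 3rd, ... classifier is added
runAvg <- cumsum(testAcc) / seq_along(testAcc)
runAvg
```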

The **ggEnsHist** function generates a histogram of the ensemble results as generated by **cfBuild**.

```
ggEnsHist(ens)
# Density plot of the test accuracies in the ensemble
ggEnsHist(ens, density = TRUE)
# Density plot that highlights additional descriptive statistics
ggEnsHist(ens, density = TRUE, percentiles=TRUE)
ggEnsHist(ens, density = TRUE, percentiles=TRUE, mean=TRUE)
ggEnsHist(ens, density = TRUE, percentiles=TRUE, median=TRUE)
```

The **ggPermHist** function generates a histogram of the permutation results as generated by **cfPermute**.

```
ggPermHist(permObj)
# Density plot
ggPermHist(permObj, density=TRUE)
# Density plot that highlights additional descriptive statistics
ggPermHist(permObj, density=TRUE, percentiles = TRUE, mean = TRUE)
ggPermHist(permObj, density=TRUE, percentiles = TRUE, median = TRUE)
```

Finally, the **ggFusedHist** function generates a histogram for simultaneous visual comparison of the classification and permutation distributions.

```
ggFusedHist(ens, permObj)
```