# Introduction to naturaList

The aim of the {naturaList} package is to classify occurrence records according to the confidence in their species identification. Records are ranked into up to six levels of confidence. The package also provides tools to filter occurrence data by these classification levels, identify possible specialists in a taxon, and evaluate the effects of the filtering procedure on descriptors of the species' spatial distribution (area of distribution and niche breadth). With {naturaList}, users can filter large occurrence datasets using clear, well-established criteria, evaluate the possible effects of data processing on downstream analyses, and explore spatial occurrence data through an interactive interface.

## Installation

Install the package:

```r
install.packages("naturaList")
```

## Classify occurrence records based on confidence in the species identification

The core function of {naturaList} is classify_occ(). The rationale of the classification is that the most reliable identification of a specimen is one made by a specialist in the taxon. To classify an occurrence at this level of confidence, classify_occ() needs both an occurrence dataset and a specialist dataset. The other levels are derived from information contained in the occurrence dataset itself. The default order of the confidence levels is:

• Level 1 - the species was identified by a specialist; if not,
• Level 2 - the record names an identifier who is not a specialist; if not,
• Level 3 - the occurrence record has an associated image; if not,
• Level 4 - the specimen is preserved in a scientific collection; if not,
• Level 5 - the identification was made in a field observation; if not,
• Level 6 - no criterion was met.

Users can alter this order to suit their objectives, except for Level 1, which is always a species determined by a specialist.
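The cascade described above can be sketched in base R. This is only an illustration of the "if not, try the next criterion" logic, not the internals of classify_occ(): the toy data, the column names (identified_by, has_image, basis), the criterion labels, and the classify_sketch() helper are all hypothetical.

```r
# Toy occurrence records (hypothetical columns, not the package's format)
occ <- data.frame(
  identified_by = c("Silva, J.", "Souza, M.", NA, NA, NA, NA),
  has_image     = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  basis         = c("PRESERVED_SPECIMEN", "PRESERVED_SPECIMEN",
                    "HUMAN_OBSERVATION", "PRESERVED_SPECIMEN",
                    "HUMAN_OBSERVATION", "UNKNOWN"),
  stringsAsFactors = FALSE
)
specialists <- c("Silva")  # hypothetical specialist last names

# One logical vector per criterion: TRUE where a record meets it
last_name <- sub(",.*$", "", occ$identified_by)
is_spec   <- last_name %in% specialists
meets <- list(
  specialist          = is_spec,
  non_specialist_name = !is.na(occ$identified_by) & !is_spec,
  image               = occ$has_image,
  collection          = occ$basis == "PRESERVED_SPECIMEN",
  field_obs           = occ$basis == "HUMAN_OBSERVATION"
)

classify_sketch <- function(meets, order = names(meets)) {
  # Level 1 is fixed: identification by a specialist always comes first
  stopifnot(order[1] == "specialist", setequal(order, names(meets)))
  n <- length(meets[[1]])
  level <- rep(length(order) + 1L, n)  # last level: no criterion met
  # Walk from the least to the most reliable criterion so that
  # higher-confidence levels overwrite lower ones
  for (i in rev(seq_along(order))) {
    level[meets[[order[i]]]] <- i
  }
  level
}

default_levels <- classify_sketch(meets)
default_levels  # 1 2 3 4 5 6

# Same records, with scientific collections ranked above named identifiers
custom_levels <- classify_sketch(
  meets,
  order = c("specialist", "collection", "non_specialist_name",
            "image", "field_obs")
)
custom_levels   # 1 2 4 2 5 6
```

Each record receives the level of the first criterion it satisfies in the chosen order, so reordering the criteria changes the level assigned to records that meet more than one of them.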

As an example, we will use the datasets included in {naturaList}: A.setosa, as the occurrence dataset, and speciaLists, as the specialist dataset. A.setosa contains occurrence records for Alsophila setosa, a tree fern of the Brazilian Atlantic Forest. This dataset was downloaded from the Global Biodiversity Information Facility (GBIF). speciaLists is a dataset of specialists in the ferns and lycophytes of Brazil, which we gathered from the authors of this paper.

```r
# Load package and data
library(naturaList)

data("A.setosa")
data("speciaLists")

# See the size of the datasets
dim(A.setosa) # see ?A.setosa for details
dim(speciaLists) # see ?speciaLists for details
```

Classification using the default order of confidence levels:

```r
# Classify the occurrences against the specialist dataset
occ.class <- classify_occ(A.setosa, speciaLists)
dim(occ.class)
```

You can check how many occurrences were classified in each level: