# clustvarsel

An R package implementing *Variable Selection for Gaussian Model-Based Clustering*.

The package performs variable selection for Gaussian model-based clustering as implemented in the **mclust** package. The methodology finds the (locally) optimal subset of variables in a data set that carry group/cluster information. A greedy or headlong search can be used, in either a forward-backward or backward-forward direction, with or without sub-sampling at the hierarchical clustering stage used to initialise the mclust models. By default the search runs sequentially on a single core, but parallel computation is also available.

## Installation

You can install the released version of **clustvarsel** from CRAN using:

`install.packages("clustvarsel")`

## Usage

The papers listed in the References section below describe the main functions and include several examples.

For an introduction, see the vignette **A quick tour of clustvarsel**, which is available with:

`vignette("clustvarsel")`

The vignette is also available in the *Vignette* section of the navigation bar at the top of the package’s web page.
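As a minimal sketch of the search options described above, the following applies `clustvarsel()` to the four numeric columns of the built-in `iris` data. The choice of data set, the range `G = 1:5` and the object names are illustrative assumptions rather than recommendations from the package authors; see `?clustvarsel` for the full argument list and defaults.

```r
# Loading clustvarsel also attaches mclust, on which it depends
library(clustvarsel)

# Illustrative data: the four numeric measurements in iris (species label dropped)
X <- iris[, 1:4]

# Greedy search in the forward-backward direction,
# considering mixture models with 1 to 5 components
out <- clustvarsel(X, G = 1:5, search = "greedy", direction = "forward")
out          # prints the steps of the search
out$subset   # column indices of the selected variables

# Headlong search in the backward-forward direction, as an alternative
out_hl <- clustvarsel(X, G = 1:5, search = "headlong", direction = "backward")
out_hl$subset

# Refit a Gaussian mixture model on the selected variables only
fit <- Mclust(X[, out$subset, drop = FALSE], G = 1:5)
summary(fit)
```

The selected subset may differ between search strategies and directions, since each finds only a locally optimal solution.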

## References

Raftery, A. E. and Dean, N. (2006) Variable Selection for Model-Based Clustering. *Journal of the American Statistical Association*, 101(473), 168-178.

Maugis, C., Celeux, G. and Martin-Magniette, M.-L. (2009) Variable Selection for Clustering With Gaussian Mixture Models. *Biometrics*, 65(3), 701-709.

Scrucca, L. and Raftery, A. E. (2018) clustvarsel: A Package Implementing Variable Selection for Gaussian Model-based Clustering in R. *Journal of Statistical Software*, 84(1), 1-28.