The rsurfer package contains functions to aid the importing and manipulation of data generated by the software suite 'Freesurfer'. 'Freesurfer' is an open-source software suite for the segmentation of brain MRIs (see the Freesurfer website for more information). This package provides functionality to import the data generated by 'Freesurfer', specifically the data generated by its cortical reconstruction command (recon-all). Once this data is imported, rsurfer provides functions to manipulate it easily, along with brain-specific normalisation commonly used when studying structural brain MRIs (a number of intracranial volume normalisation methods are provided). This package was designed using an installation of, and data generated from, 'Freesurfer' version 5.3; it may function with older or newer versions, but this is untested.

If you would like to request features, please look at the rsurfer repository on GitHub, in particular the Issues tab, where bug reports and feature requests can be made. Additionally, if you would like to write the functionality yourself and merge it with rsurfer, please contact me by email.

Importing Data Generated by Freesurfer

Download MRI Data

You need to download the MRI scans you want to analyse, for example MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI).

Process the MRI Data With Freesurfer

With Freesurfer version 5.3, the data can be processed using the following command (adjusting the file locations and subject ID to match your data):

recon-all -i /input_data/MRI_IMAGE.nii.gz -sd /output_data/ -subjid SUBJECT_ID -all -hippo-subfields

This will create a subfolder in “output_data” ready to be imported with rsurfer. For more information regarding this step, please consult the Freesurfer website.

Import with rsurfer

Now that the data has been generated we can begin to use R and rsurfer. If you do not have any MRI data to import please skip this step.

First load rsurfer, installing it beforehand if you have not done so already:

library(rsurfer)

Then there are two lines of code which will import the data:

setfshome("/Applications/freesurfer")
mri_data <- fsimport("/output_data/")

The first line uses the function setfshome, which points rsurfer to your Freesurfer installation. My Freesurfer installation is located in “/Applications/freesurfer”, so you will need to replace this with the location of Freesurfer on your system. The second line uses the function fsimport which, given an input directory, imports all of its subdirectories into a data frame and returns it. Every subdirectory is treated as a subject, so the import will likely fail if the directory contains subdirectories that were not generated by Freesurfer's cortical reconstruction command.

If you are importing a large number of processed subjects into R, it may take some time to run fsimport. Handily, rsurfer has a built-in function which will run fsimport and then serialise the output to a file. When this function is run again, it will detect the serialised file and simply load it.

mri_data <- fsimport.serialise("/output_data/", "serialised.rds")

The function fsimport.serialise, like fsimport, takes as its first parameter the directory of subjects to import. The second parameter is the file where the serialised data will be stored. If the serialised file exists, the function deserialises the data and returns it; otherwise it calls fsimport and, once that is complete, serialises the result.
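
The caching behaviour described above can be sketched in base R. This is only an illustration of the general pattern, not rsurfer's own code: the helper cached_import and the stand-in importer fake_import are hypothetical names introduced for this example.

```r
# Hypothetical helper illustrating the cache-or-compute pattern; not rsurfer's code.
cached_import <- function(import_fn, input_dir, cache_file) {
  if (file.exists(cache_file)) {
    # A serialised copy exists, so load it instead of re-importing
    return(readRDS(cache_file))
  }
  result <- import_fn(input_dir)  # the slow step, run only when no cache exists
  saveRDS(result, cache_file)
  result
}

# Stand-in importer used only for this demonstration
fake_import <- function(dir) {
  data.frame(volume = c(1.5, 2.5), row.names = c("S1", "S2"))
}

cache_file <- tempfile(fileext = ".rds")
first  <- cached_import(fake_import, "/output_data/", cache_file)  # imports and writes the cache
second <- cached_import(fake_import, "/output_data/", cache_file)  # reads the cache
```

Note that with this pattern the serialised file must be deleted by hand if the underlying subjects change, otherwise the stale copy is returned.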

I found that when importing my generated data, some of the rows and columns of the data table are what I would call abnormal (such as missing values or columns where all values are zero). There is a function provided to clean up the imported data.

mri_data <- eliminateabnormalities(mri_data, verbose = TRUE)

The verbose flag is set to true so you will be aware of which rows and columns are removed in case you want to investigate further.
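
To give a feel for what this kind of clean-up involves, here is a minimal base R sketch. The function drop_abnormal is hypothetical and the two rules it applies (all-zero columns, rows with missing values) are assumptions drawn from the examples above; eliminateabnormalities may apply different or additional rules.

```r
# Hypothetical sketch; eliminateabnormalities may apply different rules.
drop_abnormal <- function(df, verbose = TRUE) {
  # Flag numeric columns in which every value is zero
  all_zero <- vapply(df, function(col) {
    is.numeric(col) && all(col == 0, na.rm = TRUE)
  }, logical(1))
  kept <- df[, !all_zero, drop = FALSE]
  # Flag rows with at least one missing value in the remaining columns
  has_na <- !complete.cases(kept)
  if (verbose) {
    message("Removing columns: ", paste(names(df)[all_zero], collapse = ", "))
    message("Removing rows: ", paste(rownames(df)[has_na], collapse = ", "))
  }
  kept[!has_na, , drop = FALSE]
}

toy <- data.frame(a = c(1, 2, NA), b = c(0, 0, 0), c = c(3, 4, 5),
                  row.names = c("S1", "S2", "S3"))
cleaned <- drop_abnormal(toy)  # drops column b and row S3
```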

Generating Random Data With rsurfer

If you have your own MRI data to use, you can skip this step. Otherwise, if you do not have any data to use with rsurfer, you can generate random data that is structured as if it was generated by 'Freesurfer' (note that all the fields have random values but the columns are what is expected as a result from fsimport).

mri_data <- generaterandomsubjects() 

This function generates 40 subjects by default, but you can pass the number of subjects to generate as a parameter; for example, generaterandomsubjects(100) will generate 100 subjects.

Manipulating Data Generated by Freesurfer

If you are importing real MRI data generated from Freesurfer, your code should look similar to:

mri_data <- fsimport.serialise("/output_data/", "serialised.rds")

And if you are generating random data, it will look similar to:

mri_data <- generaterandomsubjects() 

Error Checking

If the Freesurfer data fails to import, then potentially something has gone wrong in the Freesurfer processing step. rsurfer provides a function which checks for any missing files which may cause the import function to fail. The function requires the file path of the subjects' directory and checks every subfolder there, so it can perform error checking on every subject in one call.
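
A check of this kind can be sketched in base R as follows. The helper find_missing_files is hypothetical, and the list of expected files is an assumption chosen for illustration (these are statistics files that recon-all normally writes); the files rsurfer actually looks for may differ.

```r
# Hypothetical sketch; the file list is an assumption, not rsurfer's actual list.
expected_files <- c("stats/aseg.stats", "stats/lh.aparc.stats", "stats/rh.aparc.stats")

find_missing_files <- function(subjects_dir, expected = expected_files) {
  # Every immediate subdirectory is treated as one subject
  subjects <- list.dirs(subjects_dir, full.names = TRUE, recursive = FALSE)
  missing <- lapply(subjects, function(subject) {
    expected[!file.exists(file.path(subject, expected))]
  })
  names(missing) <- basename(subjects)
  # Keep only the subjects that are missing at least one file
  missing[lengths(missing) > 0]
}

# Demonstration: S1 is complete, S2 lacks the two cortical statistics files
root <- tempfile("subjects")
for (f in expected_files) {
  dir.create(dirname(file.path(root, "S1", f)), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(root, "S1", f))
}
dir.create(file.path(root, "S2", "stats"), recursive = TRUE)
file.create(file.path(root, "S2", "stats", "aseg.stats"))

problems <- find_missing_files(root)
```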


Extracting Groups of Features

This section will discuss functions you can use to extract groups of features of the data.

extract.brain.features(mri_data) # Extracts all measurements generated by Freesurfer (this is useful when the data has been augmented with other features)

extract.volumes(mri_data) # Extracts all volumetric measurements generated by Freesurfer

extract.hippocampalvolumes(mri_data) # Extracts all the hippocampal volumes generated with the -hippo-subfields flag

extract.cortical(mri_data) # Extracts all the cortical measurements (areas, volumes, thicknesses and standard deviations of thicknesses) from the data

extract.corticalvolumes(mri_data) # Extracts all the cortical volume measurements

extract.corticalsurfaceareas(mri_data) # Extracts all the cortical surface area measurements

extract.corticalthicknesses(mri_data) # Extracts all the cortical thicknesses

extract.corticalthicknessstddevs(mri_data) # Extracts all the standard deviations of the cortical thicknesses

extract.subcorticalvolumes(mri_data) # Extracts all the subcortical volumes generated by Freesurfer

All of the above functions allow a second parameter to be passed, which is a vector of strings naming any additional features to keep. For example, if we were to add an age and gender feature to our data:

mri_data$Age <- runif(nrow(mri_data), 50, 80)
mri_data <- addrandomgender(mri_data)

They could then be extracted in the output of one of the above functions like:

extract.corticalthicknesses(mri_data, c("Age", "Gender"))
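
As a rough illustration of the idea (selecting one group of columns while keeping some extra ones), the sketch below uses base R on a toy data frame. The helper extract_by_suffix and the assumption that thickness columns end in ".thickness" are hypothetical; rsurfer's own functions may identify the columns differently.

```r
# Hypothetical sketch: selects columns by an assumed name suffix.
extract_by_suffix <- function(df, suffix, additional = character(0)) {
  selected <- names(df)[endsWith(names(df), suffix)]
  # Keep the selected group plus any additional named features
  df[, c(selected, additional), drop = FALSE]
}

toy <- data.frame(lh.bankssts.area = 1000,
                  lh.bankssts.thickness = 2.5,
                  rh.bankssts.thickness = 2.4,
                  Age = 70)
thickness <- extract_by_suffix(toy, ".thickness", c("Age"))
```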

If you want to loop over the above sets of fields, you can use the extract.byname function, which takes as input the MRI data and a string specifying which field set to return. For example:

for (fieldGroup in c("corticalvolumes", "subcortical", "hippocampal", "corticalareas", "corticalthicknesses", "corticalthicknessstds")) {
    extract.byname(mri_data, fieldGroup)
}

Discovering Information About a Feature

rsurfer provides methods to discover information about a specific feature:

feature <- "lh.bankssts.area"

get.hemisphere.side(feature) # Gets the hemisphere of the brain a feature belongs to (left or right), if it is neither of those then central is returned

getfieldgroup(feature) # Given a feature name this function will return some information as to what type the measurement is

getfieldgroup(feature, 2) # A second parameter can be input, determining how specific the returned information is. The default value of 1 is the most specific the returned information can be; the value of 2 is the least specific it can be.

get.opposite.hemisphere.measurement(feature) # Given a left hemisphere measurement, will return the corresponding right hemisphere; and vice-versa
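
For cortical features named like the example above, the hemisphere logic can be sketched in a few lines of base R. The helpers hemisphere_of and opposite_of are hypothetical, and they assume that left and right hemisphere features are prefixed with "lh." and "rh." respectively; rsurfer's functions may handle other naming schemes as well.

```r
# Hypothetical helpers assuming "lh."/"rh." prefixes; not rsurfer's implementation.
hemisphere_of <- function(feature) {
  if (startsWith(feature, "lh.")) {
    "left"
  } else if (startsWith(feature, "rh.")) {
    "right"
  } else {
    "central"
  }
}

opposite_of <- function(feature) {
  if (startsWith(feature, "lh.")) {
    sub("^lh\\.", "rh.", feature)
  } else if (startsWith(feature, "rh.")) {
    sub("^rh\\.", "lh.", feature)
  } else {
    feature  # central measurements have no opposite
  }
}

side  <- hemisphere_of("lh.bankssts.area")
other <- opposite_of("lh.bankssts.area")
```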

Intracranial Volume Normalisation

rsurfer provides methods to implement a variety of intracranial volume (ICV) normalisation techniques. Note that if any of these ICV normalisation methods produces an error saying that 'Gender must be a factor', you can insert the line of code: mri_data$Gender <- as.factor(mri_data$Gender)

Proportional Intracranial Volume Normalisation

This is the most commonly used method and the easiest to implement. The aim is to express a volume as the proportion of the brain it occupies, and it is computed by:

\[v' = \frac{v}{ICV}\]

Where \(v\) is the unnormalised volume of the patient, \({ICV}\) is the intracranial volume of the patient and \(v'\) is the normalised volume of the patient. This method is completely independent of any other subjects.

normalise(mri_data, "normalisation.proportional")
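
As a worked example of the formula, the following base R sketch applies it to a toy data frame. The column names Hippocampus and ICV and the values are made up for illustration; they are not the column names produced by fsimport.

```r
# Toy data with hypothetical column names and values
toy <- data.frame(Hippocampus = c(4000, 3500),
                  ICV = c(1600000, 1400000),
                  row.names = c("S1", "S2"))

# v' = v / ICV, computed independently for each subject
toy$Hippocampus_prop <- toy$Hippocampus / toy$ICV
```

Both toy subjects end up with the same normalised value, since their hippocampal volumes occupy the same proportion of their intracranial volumes.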

Residual Intracranial Volume Normalisation

The residual method assumes a linear relation between the volumes and the subject's ICV and estimates the relationship such that: \({v' = v - w_1(ICV - \overline{ICV})}\) where \({\overline{ICV}}\) is the average intracranial volume across all subjects being considered, and \({w_1}\) is a coefficient computed by solving the linear regression problem \({v = w_1ICV + w_0}\).

normalise(mri_data, "normalisation.residual")
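
The computation can be sketched with base R's lm on a single volume. The function residual_normalise and the toy values are hypothetical; this is a sketch of the formula above under the assumption that the slope comes from an ordinary least squares fit of the raw volume on ICV, not rsurfer's implementation.

```r
# Hypothetical sketch of the residual method for one volume
residual_normalise <- function(v, icv) {
  fit <- lm(v ~ icv)              # fits v = w_0 + w_1 * ICV
  w1 <- unname(coef(fit)[2])      # slope w_1
  v - w1 * (icv - mean(icv))
}

icv <- c(1400, 1500, 1600, 1700)  # toy intracranial volumes
v   <- c(3.0, 3.4, 3.5, 3.9)      # toy regional volumes
v_norm <- residual_normalise(v, icv)
```

After normalisation the fitted slope of v_norm against icv is zero (up to rounding) and the mean volume is unchanged, which is a quick way to sanity-check the result.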

The residual method can be further refined by assuming a different relationship between volume and ICV for male and female subjects. If the subject is male: \({v_{MALE}' = v - w_{1,MALE}(ICV-\overline{ICV_{MALE}})}\) where \({w_{1,MALE}}\) is found by solving the linear regression problem \({v = w_{1,MALE}ICV + w_{0,MALE}}\) considering only male subjects. Similarly, if the subject is female: \({v_{FEMALE}' = v - w_{1,FEMALE}(ICV-\overline{ICV_{FEMALE}})}\) where \({w_{1,FEMALE}}\) is found by solving the linear regression problem \({v = w_{1,FEMALE}ICV + w_{0,FEMALE}}\) considering only female subjects.

normalise(mri_data, "normalisation.residualgender")
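
A per-gender version can be sketched by fitting one regression per group. As before, residual_by_gender and the toy values are hypothetical, and the sketch assumes an ordinary least squares fit within each gender.

```r
# Hypothetical sketch: one residual correction per gender
residual_by_gender <- function(v, icv, gender) {
  gender <- as.factor(gender)
  v_norm <- numeric(length(v))
  for (g in levels(gender)) {
    idx <- gender == g
    # Slope and mean ICV are estimated from this gender only
    w1 <- unname(coef(lm(v[idx] ~ icv[idx]))[2])
    v_norm[idx] <- v[idx] - w1 * (icv[idx] - mean(icv[idx]))
  }
  v_norm
}

gender <- factor(c("M", "M", "M", "F", "F", "F"))
icv <- c(1500, 1600, 1700, 1350, 1450, 1550)
v   <- c(3.4, 3.7, 3.8, 3.0, 3.1, 3.4)
v_norm <- residual_by_gender(v, icv, gender)
```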

Covariate Intracranial Volume Normalisation

The covariate method (Nordenskjöld et al., 2013) is similar to the residual method, but it considers gender at the linear relationship level such that gender is another variable in the linear regression. The normalised volume is computed via \({v' = v - w_1(ICV - \overline{ICV}) + w_2(Gender - \overline{Gender})}\) where \({w_1}\) and \({w_2}\) are found by solving the linear regression problem: \({v = w_0 + w_1ICV + w_2Gender}\).

normalise(mri_data, "normalisation.covariate")

Power Proportional Intracranial Volume Normalisation

The power proportion method (Liu et al., 2014) computes the normalised volume as \({v'=\frac{v}{ICV^{w_1}}}\) where \({w_1}\) is computed by solving the power regression problem: \({v=w_0ICV^{w_1}}\). Note that this power regression problem can be converted to a linear regression problem by taking the natural logarithm of both sides to give \({\ln(v) = w_1\ln(ICV)+\ln(w_0)}\).

normalise(mri_data, "normalisation.powerproportion")
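
The log-linear trick can be sketched with base R's lm. The function power_proportion is hypothetical, and the toy volumes are constructed to follow an exact power law so that the result is easy to check; this is a sketch of the formula above rather than rsurfer's implementation.

```r
# Hypothetical sketch of the power proportion method for one volume
power_proportion <- function(v, icv) {
  fit <- lm(log(v) ~ log(icv))    # ln(v) = ln(w_0) + w_1 * ln(ICV)
  w1 <- unname(coef(fit)[2])      # exponent w_1
  v / icv^w1
}

icv <- c(1400, 1500, 1600, 1700)  # toy intracranial volumes
v   <- 0.002 * icv^0.8            # synthetic volumes following an exact power law
v_norm <- power_proportion(v, icv)
```

Because the toy data follow the power law exactly, the fitted exponent recovers 0.8 and every normalised value equals the constant 0.002.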

Alzheimer's Disease Neuroimaging Initiative Data

The Alzheimer's Disease Neuroimaging Initiative (ADNI) is a database which collects various data (including structural MRIs) to be used in the prediction of Alzheimer's disease. The structural MRIs belong to patients who are healthy or diagnosed with mild cognitive impairment or Alzheimer's disease. The structural MRIs can be processed with Freesurfer; however, the output data does not contain any information about the subject, such as their age, gender or diagnosis. ADNI provides this data in two CSV files, ADNIMERGE.csv and DXSUM_PDXCONV_ADNIALL.csv, and requires you to extract various data from them. rsurfer automates this process for baseline scans:

adni.setfiles("DXSUM_PDXCONV_ADNIALL.csv", "ADNIMERGE.csv") # Point rsurfer to the files
mri_data <- adni.mergewithfreesurferoutput(mri_data) # Merges data assuming unchanged subject IDs (the row names)
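
The general idea of merging on subject IDs held in the row names can be sketched with base R's merge. The data frames and values below are made up for illustration, and the real ADNI files need additional processing that this sketch does not attempt.

```r
# Toy imaging data and subject information, both keyed by row name
mri  <- data.frame(volume = c(3.1, 2.9), row.names = c("S1", "S2"))
info <- data.frame(Age = c(71, 68), Gender = factor(c("F", "M")),
                   row.names = c("S2", "S1"))

# Merge on row names; merge() puts the key in a column called Row.names
merged <- merge(mri, info, by = "row.names")
rownames(merged) <- merged$Row.names
merged$Row.names <- NULL
```

Rows are matched by subject ID rather than by position, so the differing row order of the two inputs does not matter. This is also why the subject IDs must be left unchanged after import.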

Information Extraction from Images Data

Information Extraction from Images (IXI) is a database which collects structural MRIs of healthy subjects from a wide age range. Similar to the ADNI functionality, rsurfer provides methods to merge the subjects' age and gender with their structural MRI data.

ixi.setfile("IXI.csv") # Point rsurfer to the IXI file
mri_data <- ixi.mergewithfreesurferoutput(mri_data) # Merges data assuming unchanged subject IDs (the row names)

CAD Dementia Data

The CAD Dementia challenge provides a set of structural MRIs. As with ADNI and IXI, rsurfer provides functionality to merge subject information with the imported data for both the training and test data.

mri_data <- caddementia.mergewithfreesurferoutput(mri_data)