Analysing Netlogo Simulations Using spartan

Kieran Alden


This vignette focuses on how spartan can provide parameter samples for, and analyse, Netlogo simulations. Use of spartan to analyse such simulations was detailed in our paper "Easing Parameter Sensitivity Analysis of Netlogo Simulations Using SPARTAN" (Alden et al., 2014).

To ensure our demonstration in this tutorial can be replicated, we will utilise a model available in the Netlogo Model Library, albeit with a small number of modifications. The virus model (Wilensky, 1998) simulates how a virus is transmitted and perpetuated amongst a population, based on a number of factors. The model is fully described with the simulation, so we direct readers to study that detail prior to performing this tutorial. As an overview, there are four parameters: the number of people in the population (people), the ease at which the virus spreads (infectiousness), the probability a person recovers from the virus (chance-of-recover), and the duration (weeks) after which the person either recovers or dies (duration). This tutorial will help researchers understand how a Netlogo model can be better understood using the parameter analysis techniques available within spartan. We recommend that you have read either the tutorials for Techniques 1-4 or our PLoS Computational Biology paper (Spartan: A Comprehensive Tool for Understanding Uncertainty in Simulations of Biological Systems, Alden et al., 2013) before commencing this tutorial.

The version of the model in the library has no defined end point, so our first change is to stop the simulation after 100 simulated years. Secondly, judgements concerning each parameter are constructed by varying the value that parameter is assigned and determining the effect on simulation response. The current model reports the percentage of people who are infected and immune at the current timepoint. As we want to examine performance across the simulation, we have added four output measures: counts of the number of people who died through not recovering from the infection (death-thru-sickness), the number who died but were immune (death-but-immune), the number who died through old age and never caught the infection (death-old-age), and the number of people who died while infected but during the time period allowed for recovery (death-old-andsick). We then use spartan to determine how three of the four parameters above impact simulation behaviour (we exclude people, as this parameter is directly linked to the output responses). Thirdly, for reasons that will become clearer later, we introduce a new global parameter, dummy, which has no role in the code, and thus no impact on the simulation. To ease the reproduction of results in this tutorial, we have made the modified version of the model available from the SPARTAN website.


Do note that this tutorial is intended to demonstrate the application of the toolkit, not to act as a full introduction to using sensitivity analysis techniques in the analysis of simulation results. In addition, it is assumed the reader has already worked through the vignette for spartan Techniques 1-4.


Parameter Robustness (Technique 2 in SPARTAN)

The robustness of a Netlogo simulation to parameter alteration can be determined through the use of this approach. Following the method described by Read et al (2012), a set of parameters of interest is identified, and a range of potential values each parameter could lie within is assigned. The technique examines sensitivity to a change in one parameter at a time: the value of each is perturbed independently, with all other parameters remaining at their calibrated values. This technique works with the Netlogo BehaviourSpace feature. In that feature, you can specify a range to explore for each parameter; Netlogo constructs an XML file from this information, runs the experiments, and stores the results in a CSV table, which the researcher can then analyse. spartan works by producing this XML file without the need to be in Netlogo; once the researcher has performed runs based on the information in that file, spartan can analyse the resultant data. Note that this tutorial builds on the information in Technique 2 rather than replacing it, so it is recommended that you are aware of the detail in that tutorial.

Parameter Sampling

Now we are going to declare the variables required by the package to produce the Netlogo experiment file. Firstly, the spartan and XML libraries are imported, the latter to aid production of the Netlogo file. The variables required for this analysis are then declared in capital letters. Each line beginning with a # describes the variable being declared. Set the FILEPATH variable to match the folder where you would like the Netlogo experiment file to be output.

To get the value sets for each parameter:
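The declarations and the sampling call might look like the following. This is a sketch: the paths, file names, and parameter ranges are illustrative, and you should verify the argument order of oat_generate_netlogo_behaviour_space_XML against your installed version of spartan.

```r
library(spartan)
library(XML)
# Folder where the Netlogo experiment file should be output (illustrative)
FILEPATH <- "/home/user/robustness"
# Name to give the generated experiment file
NETLOGO_SETUPFILE_NAME <- "virus_oat_experiment"
# All simulation parameters. Perturbed parameters take a string of the
# form "[min,increment,max]"; parameters held static take a single value
PARAMETERS <- c("people", "infectiousness", "chance-of-recover", "duration")
PARAMVALS <- c(150, "[10,10,90]", "[10,10,90]", "[5,5,40]")
# Netlogo procedures that initialise and run the model
NETLOGO_SETUP_FUNCTION <- "setup"
NETLOGO_RUN_FUNCTION <- "go"
# Simulation responses to record, named as they are in the Netlogo model
MEASURES <- c("death-thru-sickness", "death-but-immune",
              "death-old-age", "death-old-andsick")
# Replicate runs per parameter set, and whether metrics are collected
# at every timestep
EXPERIMENT_REPETITIONS <- 1
RUNMETRICS_EVERYSTEP <- "true"

oat_generate_netlogo_behaviour_space_XML(FILEPATH, NETLOGO_SETUPFILE_NAME,
                                         PARAMETERS, PARAMVALS,
                                         NETLOGO_SETUP_FUNCTION,
                                         NETLOGO_RUN_FUNCTION, MEASURES,
                                         EXPERIMENT_REPETITIONS,
                                         RUNMETRICS_EVERYSTEP)
```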

This will produce one XML file, in the directory specified, containing a set-up for a Netlogo BehaviourSpace run. Run the experiment using Netlogo (we did this in the terminal in Linux using the headless version of Netlogo, code for this available on our website, but running this in Netlogo is fine too). This will in turn produce one CSV file, saved in the directory of your choosing when you run the experiment.
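For reference, the headless run mentioned above can be launched from R with a call along these lines; the Netlogo installation path, model name, and file names are illustrative and depend on your own set-up.

```r
# Launch headless Netlogo against the generated BehaviourSpace set-up
# file; all paths here are illustrative
system2("/opt/netlogo/netlogo-headless.sh",
        args = c("--model", "Virus_Modified.nlogo",
                 "--setup-file", "virus_oat_experiment.xml",
                 "--table", "virus_oat.csv"))
```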

Analysing The Simulation Data

This section shows an analysis using the example data from our lab website, but the steps are just as applicable if you ran the experiments yourself. Here, all the results, for all parameters and values, are in one file. The technique below processes this file, recovering the results for each parameter and value pair for the specified timepoint, extracting these into one CSV file. This CSV file can then be processed using the analysis methods detailed in Technique 2, reducing the need for specific Netlogo techniques.

# Example values below are illustrative; adjust them to your own set-up
# Import the package
library(spartan)
# Folder containing the Netlogo Behaviour Space table, AND where the
# processed results will be written to
FILEPATH <- "/home/user/robustness"
# Name of the Netlogo Behaviour Space table file
NETLOGO_BEHAVIOURSPACEFILE <- "virus_oat.csv"
# Array of the parameters to be analysed, but ONLY those perturbed
PARAMETERS <- c("infectiousness", "chance-of-recover", "duration")
# Value assigned to each parameter at calibration (the baseline value)
BASELINE <- c(65, 50, 20)
# The maximum value explored for each parameter
PMAX <- c(90, 90, 40)
# The minimum value explored for each parameter
PMIN <- c(10, 10, 5)
# Amount the parameter value was incremented during sampling
PINC <- c(10, 10, 5)
# Timestep of interest. The behaviour space table is likely to contain
# all timesteps - this narrows the analysis
TIMESTEP <- 5200
# The simulation output measures being examined. Should be specified
# as they are in the Netlogo file
MEASURES <- c("death-thru-sickness", "death-but-immune",
              "death-old-age", "death-old-andsick")
# For each parameter value being analysed, a file is created
# containing the median of each output measure, of each simulation run
# for that value. This sets the name of this file.
RESULTFILENAME <- "ParamValResponses.csv"
# The results of the A-Test comparisons of each parameter value
# against that of the parameter's baseline value are output as a file.
# This sets the name of this file.
ATESTRESULTSFILENAME <- "VirusOAT_ATests.csv"
# A-Test result value either side of 0.5 at which the difference
# between two sets of results is significant
ATESTSIGLEVEL <- 0.21
# What each measure represents. Used in graphing results
MEASURE_SCALE <- c("Number of People", "Number of People",
                   "Number of People", "Number of People")
# Not used in this case, but required when a simulation is analysed at
# multiple timepoints (see Tutorials 1-4)
TIMEPOINTS <- NULL
TIMEPOINTSCALE <- NULL

# Process the Behaviour Space table into the medians and A-Test files
# (argument order as in the Technique 2 vignette; verify against your
# installed version of spartan)
oat_process_netlogo_result(FILEPATH, NETLOGO_BEHAVIOURSPACEFILE,
                           PARAMETERS, BASELINE, PMIN, PMAX, PINC,
                           MEASURES, RESULTFILENAME,
                           ATESTRESULTSFILENAME, TIMESTEP)

# The CSV file output above can now be analysed as in Technique 2.
# Note that PARAMVALS is set to NULL - we don't use that for Netlogo
# (verify the argument order against your installed version of spartan)
oat_graphATestsForSampleSize(FILEPATH, PARAMETERS, MEASURES,
                             ATESTSIGLEVEL, ATESTRESULTSFILENAME,
                             BASELINE, PMIN, PMAX, PINC, PARAMVALS = NULL,
                             TIMEPOINTS, TIMEPOINTSCALE)

Latin-Hypercube Sampling and Analysis (Technique 3 in SPARTAN)

Though Technique 2 of this toolkit elucidates the effects of perturbations of one parameter, it cannot show any non-linear effects which occur when two or more are adjusted simultaneously. This can be achieved using Technique 3, a global sensitivity analysis technique. A number of parameter value sets are created through a latin-hypercube sampling approach, which selects values for each parameter from the parameter space while aiming to reduce any possible correlations when the sample is produced. spartan then constructs a Netlogo experiment file for each parameter value set. The researcher then runs these in Netlogo. With a number of experiment files being constructed by this technique, we would recommend that the reader uses a scripting language to perform these experiments, using the headless version of Netlogo. The script we have used to do this can be found on our lab website as an example. Once the runs are complete, spartan then analyses the resultant data, revealing any correlations between parameter values and simulation outputs, and thus indicating the parameters of greatest influence on the simulation.

Parameter Sampling

Parameter samples for Netlogo simulations can be generated using the code below, detailed fully in the vignette for Technique 2. The last variable, ALGORITHM, controls whether a fully optimised latin-hypercube algorithm is used, or whether parameter values are chosen randomly from each section of the hypercube. Both these algorithms are taken from the lhs package. Note that although an optimised sample may be preferable, the generation of parameter values using an optimal algorithm may take a long time (in our experience, over 24 hours for just 7 parameters). The ALGORITHM variable can be set to either "normal" or "optimal".
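A sketch of the sampling step follows. The paths, ranges, and sample count are illustrative; verify the argument order of lhc_generate_lhc_sample_netlogo against your installed version of spartan.

```r
library(spartan)
# Folder where the Netlogo experiment files should be output (illustrative)
FILEPATH <- "/home/user/LHC"
# All simulation parameters. Those being perturbed take a "[min,max]"
# range string; those held static take a single value
PARAMETERS <- c("people", "infectiousness", "chance-of-recover", "duration")
PARAMVALS <- c(150, "[10,90]", "[10,90]", "[5,40]")
# Number of parameter value sets to generate from the hypercube
NUMSAMPLES <- 500
# "normal" or "optimal" hypercube sampling (see text)
ALGORITHM <- "normal"
# Replicate runs per set, and whether metrics are output every timestep
EXPERIMENT_REPETITIONS <- 1
RUNMETRICS_EVERYSTEP <- "true"
# Netlogo procedures that initialise and run the model
NETLOGO_SETUP_FUNCTION <- "setup"
NETLOGO_RUN_FUNCTION <- "go"
# Simulation responses to record
MEASURES <- c("death-thru-sickness", "death-but-immune",
              "death-old-age", "death-old-andsick")

lhc_generate_lhc_sample_netlogo(FILEPATH, PARAMETERS, PARAMVALS,
                                NUMSAMPLES, ALGORITHM,
                                EXPERIMENT_REPETITIONS,
                                RUNMETRICS_EVERYSTEP,
                                NETLOGO_SETUP_FUNCTION,
                                NETLOGO_RUN_FUNCTION, MEASURES)
```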

Analysing the Results

Again, the example shown here uses the example data from our lab website. Note that, for space reasons, the Netlogo parameter files themselves are not included in the download. This method iterates through all the Netlogo results for all parameter sets and collapses these into one CSV file, which can then be analysed using the techniques detailed in Technique 3.

# Example values below are illustrative; adjust them to your own set-up
# Import the package
library(spartan)
# Folder containing the Netlogo Behaviour Space tables,
# and where the processed results will be written to
FILEPATH <- "/home/user/LHC"
# Name of the result file (csv) generated by Netlogo, with no file
# extension (the sample number and .csv are appended)
LHCSAMPLE_RESULTFILENAME <- "lhcResult"
# Location of a file containing the parameter value sets
# generated by the hypercube sampling (i.e. the file generated
# in the previous method of this tutorial)
SPARTAN_PARAMETER_FILE <- "LHC_Parameters_for_Runs.csv"
# Number of parameter samples generated from the hypercube
NUMSAMPLES <- 500
# The simulation output measures being examined. Should be specified
# as they are in the Netlogo file
MEASURES <- c("death-thru-sickness", "death-but-immune",
              "death-old-age", "death-old-andsick")
# File name to give to the summary file that is produced showing
# the parameter value sets alongside the median results for each
# simulation output measure
LHC_ALL_SIM_RESULTS_FILE <- "Virus_LHCSummary.csv"
# Timestep of interest. The behaviour space table is likely to contain
# all timesteps - this narrows the analysis
TIMESTEP <- 5200
# Parameters of interest in this analysis
PARAMETERS <- c("infectiousness", "chance-of-recover", "duration")
# What each measure represents. Used in graphing results
MEASURE_SCALE <- c("Number of People", "Number of People",
                   "Number of People", "Number of People")
# File name to give to the file showing the Partial Rank Correlation
# Coefficients for each parameter. Again note no file extension
CORCOEFFSOUTPUTFILE <- "corCoeffs"
# Not used in this case, but required when a simulation is analysed at
# multiple timepoints (see Tutorials 1-4)
TIMEPOINTS <- NULL
TIMEPOINTSCALE <- NULL
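With those declarations in place, the collapsing step is a single call. This is a sketch: verify the argument order of lhc_process_netlogo_result against your installed version of spartan. The resulting CSV file can then be fed into the summary and Partial Rank Correlation Coefficient steps exactly as described in the Technique 3 vignette.

```r
# Collapse the per-sample Netlogo results into one CSV file that the
# standard Technique 3 analysis functions can process
lhc_process_netlogo_result(FILEPATH, LHCSAMPLE_RESULTFILENAME,
                           SPARTAN_PARAMETER_FILE, NUMSAMPLES, MEASURES,
                           LHC_ALL_SIM_RESULTS_FILE, TIMESTEP)
```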




eFAST Sampling and Analysis (Technique 4 in SPARTAN)

This technique analyses simulation results generated through sampling with the eFAST approach (extended Fourier Amplitude Sensitivity Test). This perturbs the value of all parameters at the same time, with the aim of partitioning the variance in simulation output between the input parameters. Values for each parameter are chosen using Fourier frequency curves through a parameter's potential range of values. A selected number of values is taken from points along each curve. Though all parameters are perturbed simultaneously, the method does focus on one parameter of interest in turn, by giving this a very different sampling frequency to that assigned to the other parameters. Thus, for each parameter of interest in turn, a sampling frequency is assigned to each parameter and values are chosen at points along the curve; a set of simulation parameters then exists for each parameter of interest. As this is the case, this method can be computationally expensive, especially if a large number of samples is taken on the parameter search curve, or there is a large number of parameters. On top of this, to ensure adequate sampling, each curve is also resampled with a small adjustment to the frequency, creating more parameter sets on which the simulation should be run. This attempts to limit any correlations and the effect of repeated parameter value sets being chosen. Thus, for a system where 8 parameters are being analysed, and 3 different sample curves used, 24 different sets of parameter value sets will be produced. Each of these 24 sets then contains the parameter values chosen from the frequency curves. The number of samples taken from each curve should be no lower than 65 (see the Marino et al paper for an explanation of how to select sample size). Once the sampling has been performed, simulation runs should be performed for each set generated.
The eFAST algorithm then examines the simulation results for each parameter value set and, taking into account the sampling frequency used to produce those parameter values, partitions the variance in output between the input parameters.

SPARTAN has the ability to sample the space using the technique, produce Netlogo experiment files for these samples, through which the simulations can be run, and then analyse the resultant simulations.

Parameter Sampling

The method below constructs parameter sets using the eFAST approach, detailed fully in the vignette for Technique 4. Note here the additional parameter, dummy. Statistical significance in eFAST is generated through comparing the variance attributed to each parameter with that of a parameter known to have no influence on the simulation; the dummy parameter sample fulfils this role.
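The sampling step might look like the following sketch. The paths, ranges, and curve/sample counts are illustrative; verify the argument order of efast_generate_sample_netlogo against your installed version of spartan.

```r
library(spartan)
# Folder where the Netlogo experiment files should be output (illustrative)
FILEPATH <- "/home/user/eFAST"
# Number of resampling curves, and value samples to take from each curve
NUMCURVES <- 3
NUMSAMPLES <- 65
# Simulation responses to record
MEASURES <- c("death-thru-sickness", "death-but-immune",
              "death-old-age", "death-old-andsick")
# All parameters, including the dummy. Perturbed parameters take a
# "[min,max]" range string; static parameters take a single value
PARAMETERS <- c("people", "infectiousness", "chance-of-recover",
                "duration", "dummy")
PARAMVALS <- c(150, "[10,90]", "[10,90]", "[5,40]", "[1,10]")
# Replicates per set, and whether metrics are output every timestep
EXPERIMENT_REPETITIONS <- 1
RUNMETRICS_EVERYSTEP <- "true"
# Netlogo procedures that initialise and run the model
NETLOGO_SETUP_FUNCTION <- "setup"
NETLOGO_RUN_FUNCTION <- "go"

efast_generate_sample_netlogo(FILEPATH, NUMCURVES, NUMSAMPLES, MEASURES,
                              PARAMETERS, PARAMVALS,
                              EXPERIMENT_REPETITIONS,
                              RUNMETRICS_EVERYSTEP,
                              NETLOGO_SETUP_FUNCTION,
                              NETLOGO_RUN_FUNCTION)
```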

Analysing Simulation Data

The below example analyses the Netlogo virus model results obtained for an eFAST sample, available from the project website. The objective is to get the Netlogo results into one file, one that is then compatible with the analysis methods detailed in the vignette for Technique 4.

# Example values below are illustrative; adjust them to your own set-up
library(spartan)
# The directory where the Netlogo results for each sample are stored,
# and where the processed results will be written to
FILEPATH <- "/home/user/eFAST"
# Name of the result file generated by Netlogo. The sample number and
# .csv are added to this
EFASTSAMPLE_RESULTFILENAME <- "efastResult"
# The parameters being examined in this analysis. Include the dummy
PARAMETERS <- c("infectiousness", "chance-of-recover", "duration", "dummy")
# Number of resampling curves to use
NUMCURVES <- 3
# Number of value samples to take from each curve
NUMSAMPLES <- 65
# The output measures by which you are analysing the results
MEASURES <- c("death-thru-sickness", "death-but-immune",
              "death-old-age", "death-old-andsick")
# File created containing the median of each output measure, of each
# simulation for this parameter set. Note no file extension
RESULTFILENAME <- "ParamValResponses"
# Timestep of interest. The behaviour space table is likely to contain
# all timesteps. This narrows the analysis
TIMESTEP <- 5200
# Output measures to t-test to gain statistical significance
OUTPUTMEASURES_TO_TTEST <- 1:4
# T-Test confidence interval
TTEST_CONF_INT <- 0.95
# Boolean noting whether graphs should be produced
GRAPH_FLAG <- TRUE
# Name of the final result file summarising the analysis, showing the
# partitioning of the variance between parameters. Note no file
# extension
EFASTRESULTFILENAME <- "Virus_eFAST_Analysis"
# Not used in this case, but required when a simulation is analysed at
# multiple timepoints (see Tutorials 1-4)
TIMEPOINTS <- NULL
TIMEPOINTSCALE <- NULL


# Get all the results for each curve into one summary file per curve
# (argument order per the Technique 4 vignette; verify against your
# installed version of spartan)
efast_process_netlogo_result(FILEPATH, EFASTSAMPLE_RESULTFILENAME,
                             PARAMETERS, NUMCURVES, NUMSAMPLES, MEASURES,
                             RESULTFILENAME, TIMESTEP)

# Run the eFAST analysis, partitioning the output variance between the
# parameters (verify the argument order against your installed spartan)
efast_run_Analysis(FILEPATH, MEASURES, PARAMETERS, NUMCURVES,
                   NUMSAMPLES, OUTPUTMEASURES_TO_TTEST, TTEST_CONF_INT,
                   GRAPH_FLAG, EFASTRESULTFILENAME, TIMEPOINTS,
                   TIMEPOINTSCALE)