BioCircos: Generating circular multi-track plots

Loan Vulliard

2018-02-15

Introduction

This package allows to implement in ‘R’ Circos-like visualizations of genomic data, as proposed by the BioCircos.js JavaScript library, based on the JQuery and D3 technologies.
We will demonstrate here how to generate easily such plots and what are the main parameters to customize them. Each example can be run independently of the others.
For a complete list of all the parameters available, please refer to the package documentation.

Motivation

The amount of data produced nowadays in a lot of different fields assesses the relevance of reactive analyses and interactive display of the results. This especially true in biology, where the cost of sequencing data has dropped must faster than the Moore’s law prediction. New ways of integrating different level of information and accelerating the interpretation are therefore needed.

The integration challenge appears to be of major importance, as it allows a deeper understanding of the biological phenomena happening, that cannot be observed in the single analyses independently.

This package aims at offering an easy way of producing Circos-like visualizations to face distinct challenges :

The terminology used here arises from genomics but this tool may be of interest for different situations where different positional or temporal informations must be combined.

Installation

To install this package, you can use CRAN (the central R package repository) to get the last stable release or build the last development version directly from the GitHub repository.

From CRAN

From Github

Generating Circos-like visualizations

Principle

To produce a BioCircos visualization, you need to call the BioCircos method, that accepts a tracklist containing the different tracks to be displayed, the genome to be displayed and plotting parameters.
By default, an empty tracklist is used, and the genome is automatically set to use the chromosome sizes of the reference genome hg19 (GRCh37).

Genome configuration

A genome needs to be set in order to map all the coordinates of the tracks on it.
For now, the only pre-configured genome available is hg19 (GRCh37), for which the length of the main 22 genomic autosomal chromosome pairs and of the sexual chromosomes are available. The Y chromosome can be removed using the ychr parameter. Visual parameters are also available, such as by giving a vector of colors or a RColorBrewer palette to change the colors of each chromosome (parameter genomeFillColor), the space between each chromosome (chrPad) or their borders (displayGenomeBorder).
The ticks, displaying the scale on each chromosome, can be removed with genomeTicksDisplay, and the genome labels (chromosome names) can be brought closer or further away from the chromosomes with genomeLabelDy.

To use your own reference genome, you need to define a named list of chromosomal lengths and use it as the genome parameter. The names and lengths should match the coordinates you plan on using later for your tracks.
You may want to change the scale of the ticks on the chromosomes, to fit to your reference genome, with the genomeTickScale parameters.

Another use of a custom genome can be seen in the Bar tracks section.

Tracklists

The different levels of information will be displayed on different tracks of different types and located at different radii on the visualization. All the track-generating functions of this package return tracklists that can be added together into a single tracklist, to be given as the tracks argument of the BioCircos method.
The different kinds of tracks are presented in the following sections.
All tracks need to be named.

Text track

A first track simply corresponds to text annotations. The obligatory parameters are the track name and the text to be displayed. Some parameters such as the size, the opacity and the coordinates can be customized.

library(BioCircos)

tracklist = BioCircosTextTrack('myTextTrack', 'Some text', size = "2em", opacity = 0.5, 
  x = -0.67, y = -0.5)

BioCircos(tracklist, genomeFillColor = "PuOr",
  chrPad = 0, displayGenomeBorder = FALSE, 
  genomeTicksLen = 2, genomeTicksTextSize = 0, genomeTicksScale = 1e+8,
  genomeLabelTextSize = "9pt", genomeLabelDy = 0)

Background track

Another simple track type correspond to backgrounds, displayed under other tracks, in a given radius interval.

library(BioCircos)

tracklist = BioCircosBackgroundTrack("myBackgroundTrack", minRadius = 0.5, maxRadius = 0.8,
  borderColors = "#AAAAAA", borderSize = 0.6, fillColors = "#FFBBBB")  

BioCircos(tracklist, genomeFillColor = "PuOr",
  chrPad = 0.05, displayGenomeBorder = FALSE, 
  genomeTicksDisplay = FALSE,  genomeLabelTextSize = "9pt", genomeLabelDy = 0)

SNP track

To map punctual information associated with a single-dimensional value on the reference genome, such as a variant or an SNP associated with a confidence score, SNP tracks can be used.
It is therefore needed to specify the chromosome and coordinates where each points are mapped, as well as the corresponding value, which will be used to compute the radial coordinate of the points.
By default, points display a tooltip when hovered by the mouse.

library(BioCircos)

# Chromosomes on which the points should be displayed
points_chromosomes = c('X', '2', '7', '13', '9')
# Coordinates on which the points should be displayed
points_coordinates = c(102621, 140253678, 98567307, 28937403, 20484611) 
# Values associated with each point, used as radial coordinate 
#   on a scale going to minRadius for the lowest value to maxRadius for the highest value
points_values = 0:4

tracklist = BioCircosSNPTrack('mySNPTrack', points_chromosomes, points_coordinates, 
  points_values, colors = c("tomato2", "darkblue"), minRadius = 0.5, maxRadius = 0.9)

# Background are always placed below other tracks
tracklist = tracklist + BioCircosBackgroundTrack("myBackgroundTrack", 
  minRadius = 0.5, maxRadius = 0.9,
  borderColors = "#AAAAAA", borderSize = 0.6, fillColors = "#B3E6FF")  

BioCircos(tracklist, genomeFillColor = "PuOr",
  chrPad = 0.05, displayGenomeBorder = FALSE, yChr =  FALSE,
  genomeTicksDisplay = FALSE,  genomeLabelTextSize = 18, genomeLabelDy = 0)

Arc track

Arc tracks are displaying arcs along the genomic circle, between the radii given as the minRadius and maxRadius parameters. As for an SNP track, the chromosome and coordinates (here corresponding to the beginning and end of each arc) should be specified. By default, arcs display a tooltip when hovered by the mouse.

library(BioCircos)

arcs_chromosomes = c('X', 'X', '2', '9') # Chromosomes on which the arcs should be displayed
arcs_begin = c(1, 45270560, 140253678, 20484611)
arcs_end = c(155270560, 145270560, 154978472, 42512974)

tracklist = BioCircosArcTrack('myArcTrack', arcs_chromosomes, arcs_begin, arcs_end,
  minRadius = 1.18, maxRadius = 1.25, opacities = c(0.4, 0.4, 1, 0.8))

BioCircos(tracklist, genomeFillColor = "PuOr",
  chrPad = 0.02, displayGenomeBorder = FALSE, yChr =  FALSE,
  genomeTicksDisplay = FALSE,  genomeLabelTextSize = 0)

Bar tracks

Bar plots may be added on another type of tracks. The start and end coordinates of each bar, as well as the associated value need to be specified.
By default, the radial range of the track will stretch from the minimal to the maximum value of the track, but other boundaries may be specified with the range parameter.
Here, to add a track to the tracklist at each iteration of the loop, we initialize the tracks tracklist with an empty BioCircosTracklist object.

library(BioCircos)
library(RColorBrewer)
library(grDevices)

# Define a custom genome
genomeChr = LETTERS
lengthChr = 5*1:length(genomeChr)
names(lengthChr) <- genomeChr

tracks = BioCircosTracklist()
# Add one track for each chromosome
for (i in 1:length(genomeChr)){
  # Define histogram/bars to be displayed
  nbBars = lengthChr[i] - 1
  barValues = sapply(1:nbBars, function(x) 10 + nbBars%%x)
  barColor = colorRampPalette(brewer.pal(8, "YlOrBr"))(length(genomeChr))[i]
  # Add a track with bars on the i-th chromosome
  tracks = tracks + BioCircosBarTrack(paste0("bars", i), chromosome = genomeChr[i], 
    starts = (1:nbBars) - 1, ends = 1:nbBars, values = barValues, color = barColor, 
    range = c(5,75))
}

# Add background
tracks = tracks + BioCircosBackgroundTrack("bars_background", colors = "#2222EE")

BioCircos(tracks, genomeFillColor = "YlOrBr", genome = as.list(lengthChr), 
  genomeTicksDisplay = F, genomeLabelDy = 0)