Edges

Thomas Lin Pedersen

2017-02-23

If the natural ggplot2 equivalent to nodes is geom_point(), then surely the equivalent to edges must be geom_segment()? Well, sort of, but there’s a bit more to it than that.

One does not simply draw a line between two nodes

While nodes are the sensible, mature, and predictable geoms, edges are the edgy (sorry), younger cousins that push the boundaries. To put it bluntly:

On the ggraph savannah you definitely want to be an edge!

Meet the geom_edge_*() family

While the introduction might feel a bit over-the-top, it is entirely true. An edge is an abstract concept denoting a relationship between two entities. A straight line is just one of many ways this relationship can be visualised. As we saw when discussing nodes, sometimes it is not drawn at all but implied using containment or position (treemap, circle packing, and partition layouts), but more often it is shown using a line of some sort. This use-case is handled by the large family of edge geoms provided in ggraph. Some of the edges are general while others are dedicated to specific layouts. Let's create some graphs for illustrative purposes first:

library(ggraph)
library(igraph)
set_graph_style(plot_margin = margin(1,1,1,1))
hierarchy <- as.dendrogram(hclust(dist(iris[, 1:4])))

# Classify nodes based on agreement between children
hierarchy <- tree_apply(hierarchy, function(node, children, ...) {
    if (is.leaf(node)) {
        attr(node, 'Class') <- as.character(iris[as.integer(attr(node, 'label')),5])
    } else {
        classes <- unique(sapply(children, attr, which = 'Class'))
        if (length(classes) == 1 && !anyNA(classes)) {
            attr(node, 'Class') <- classes
        } else {
            attr(node, 'Class') <- NA
        }
    }
    attr(node, 'nodePar') <- list(class = attr(node, 'Class'))
    node
}, direction = 'up')

hairball <- graph_from_data_frame(highschool)

# Classify nodes based on popularity gain
# In-degree per year: drop the other year's edges before counting
pop1957 <- degree(delete_edges(hairball, which(E(hairball)$year == 1958)), 
                  mode = 'in')
pop1958 <- degree(delete_edges(hairball, which(E(hairball)$year == 1957)), 
                  mode = 'in')
V(hairball)$pop_devel <- ifelse(pop1957 < pop1958, 'increased',
                                ifelse(pop1957 > pop1958, 'decreased', 
                                       'unchanged'))
V(hairball)$popularity <- pmax(pop1957, pop1958)
E(hairball)$year <- as.character(E(hairball)$year)

Fan

Sometimes the graph is not simple, i.e. it has multiple edges between the same nodes. Using links is a bad choice here because the edges will overlap and the viewer will be unable to discover parallel edges. geom_edge_fan() has you covered here. If there are no parallel edges it behaves like geom_edge_link() and draws a straight line, but if parallel edges exist it will spread them out as arcs with different curvature. Parallel edges are sorted by directionality prior to plotting, so edges flowing in the same direction are plotted together:

ggraph(hairball, layout = 'kk') + 
    geom_edge_fan(aes(colour = year))
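Whether a graph actually contains parallel edges can be checked with igraph before choosing a geom. The following is a minimal sketch on a hypothetical three-node graph (`toy` is made up for illustration and is not part of the example data), assuming igraph's `any_multiple()` and `count_multiple()` helpers:

```r
library(ggraph)
library(igraph)

# Hypothetical toy graph: two parallel edges 1 -> 2 and a single edge 2 -> 3
toy <- make_graph(c(1, 2, 1, 2, 2, 3), directed = TRUE)

# TRUE when at least one pair of nodes is connected by more than one edge
has_parallel <- any_multiple(toy)

# Multiplicity of each edge, i.e. how many edges share its endpoints
edge_multiplicity <- count_multiple(toy)

# Parallel edges are fanned out; the single edge is drawn as a straight line
p <- ggraph(toy, layout = 'kk') + 
    geom_edge_fan()
```

If `has_parallel` is `FALSE`, geom_edge_link() and geom_edge_fan() should give the same picture, so the check mostly tells you whether the fan is worth reaching for.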

Loops

Loops cannot be shown with regular edges as they have no length. A dedicated geom_edge_loop() exists for these cases:

# let's make some of the students love themselves
loopy_hairball <- add_edges(hairball, rep(1:5, each = 2), year = rep('1957', 5))
ggraph(loopy_hairball, layout = 'kk') + 
    geom_edge_link(aes(colour = year), alpha = 0.25) + 
    geom_edge_loop(aes(colour = year))

The direction, span, and strength of the loop can all be controlled, but in general loops will add a lot of visual clutter to your plot unless the graph is very simple.
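As a rough sketch of how these three settings might be supplied, the example below passes them as constant aesthetics on a hypothetical toy graph (`toy` and the chosen values are made up for illustration). It assumes that `direction` and `span` are angles in degrees and that `strength` scales the size of the loop, which is worth verifying against the geom_edge_loop() documentation for your ggraph version:

```r
library(ggraph)
library(igraph)

# Hypothetical toy graph: a ring of five nodes with loops on nodes 1 and 3
toy <- make_ring(5)
toy <- add_edges(toy, c(1, 1, 3, 3))

# Number of loop edges in the graph
n_loops <- sum(which_loop(toy))

# Illustrative values only: direction and span are assumed to be degrees,
# strength is assumed to scale how far the loop extends from the node
p <- ggraph(toy, layout = 'kk') + 
    geom_edge_link() + 
    geom_edge_loop(aes(direction = 90, span = 60, strength = 0.5))
```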

Density

This one is definitely strange, and I'm unsure of its usefulness, but it is here and it deserves an introduction. Consider the case where it is of interest to see which types of edges dominate certain areas of the graph. You can colour the edges, but edges tend to get overplotted, which reduces readability. geom_edge_density() lets you add shading to your plot based on the density of edges in a certain area:

ggraph(hairball, layout = 'kk') + 
    geom_edge_density(aes(fill = year)) + 
    geom_edge_link(alpha = 0.25)

Arcs

While some insist that curved edges should be used in standard "hairball" graph visualisations, it really is a poor choice, as it increases overplotting and reduces interpretability for virtually no gain (unless complexity is your thing). That doesn't mean arcs have no use in graph visualizations. Linear and circular layouts can benefit greatly from them, and geom_edge_arc() is provided precisely for this scenario:

ggraph(hairball, layout = 'linear') + 
    geom_edge_arc(aes(colour = year))

Arcs behave differently in circular layouts as they will always bend towards the center no matter the direction of the edge (the same thing can be achieved in a linear layout by setting fold = TRUE).

ggraph(hairball, layout = 'linear', circular = TRUE) + 
    geom_edge_arc(aes(colour = year)) + 
    coord_fixed()
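The `fold = TRUE` option mentioned above can be tried on the linear layout as follows. This is a small sketch that rebuilds the graph from the `highschool` data so it runs on its own:

```r
library(ggraph)
library(igraph)

hairball <- graph_from_data_frame(highschool)
E(hairball)$year <- as.character(E(hairball)$year)

# fold = TRUE draws all arcs on the same side of the line of nodes,
# regardless of the direction of each edge
p <- ggraph(hairball, layout = 'linear') + 
    geom_edge_arc(aes(colour = year), fold = TRUE)
```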

Elbow

Aah… The classic dendrogram with its right-angle bends. Of course such visualizations are also supported, with geom_edge_elbow(). It goes without saying that this type of edge requires a layout that flows in a defined direction, such as a dendrogram:

ggraph(hierarchy, layout = 'dendrogram') + 
    geom_edge_elbow()

Diagonals

If right angles aren’t really your thing, ggraph provides a smoother version in the form of geom_edge_diagonal(). This edge is a quadratic bezier with control points positioned at the same x-value as the terminal nodes and halfway between the nodes on the y-axis. The result is more organic than the elbows:

ggraph(hierarchy, layout = 'dendrogram') + 
    geom_edge_diagonal()

It tends to look a bit weird with hugely unbalanced trees, so use with care…

Hive

This is certainly a very specific type of edge, intended only for use with hive plots. It draws edges as quadratic beziers with control points positioned perpendicular to the axes of the hive layout:

ggraph(hairball, layout = 'hive', axis = 'pop_devel', sort.by = 'popularity') + 
    geom_edge_hive(aes(colour = year)) + 
    geom_axis_hive(label = FALSE) + 
    coord_fixed()

The three types of edge geoms

Almost all edge geoms come in three variants. The basic variant (no suffix) and the variant suffixed with 2 (e.g. geom_edge_link2()) calculate a number (n) of points along the edge and draw it as a path. The variant suffixed with 0 (e.g. geom_edge_diagonal0()) uses the built-in grid grobs to draw the edges directly (in the case of a diagonal it uses bezierGrob()). It might seem strange to have so many different implementations of the same geoms, but there’s a reason for the insanity…
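To make the naming scheme concrete, here is a small sketch that draws the same dendrogram with the base variant and with the 0-variant of the diagonal geom. The value `n = 50` is an arbitrary choice for illustration:

```r
library(ggraph)

hierarchy <- as.dendrogram(hclust(dist(iris[, 1:4])))

# Base variant: n points are calculated along each edge and drawn as a path
p_base <- ggraph(hierarchy, layout = 'dendrogram') + 
    geom_edge_diagonal(n = 50)

# 0-variant: each edge is handed directly to a grid grob
p_zero <- ggraph(hierarchy, layout = 'dendrogram') + 
    geom_edge_diagonal0()
```

Printing `p_base` and `p_zero` side by side should give visually similar plots; the difference lies in how the curves are produced and which aesthetics are available.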

Base variant

The basic edge geom is drawn by calculating a number of points along the edge path and drawing a line between them. This means that you’re in control of the detail level of curved edges and that all complex calculations happen up front. Generally you will see better performance using the base variant rather than the 0-variant that uses grid grobs, unless you set the number of points to calculate to something huge (50–100 is usually sufficient for a smooth look). Apart from better performance you also get a nice bonus (you actually get several, but only one is discussed here): the possibility of drawing a gradient along the edge. Each calculated point gets an index value between 0 and 1 that specifies how far along the edge it is positioned, and this value can be mapped to, e.g., an alpha level to show the direction of the edge:

ggraph(hairball, layout = 'linear') + 
    geom_edge_arc(aes(colour = year, alpha = ..index..)) + 
    scale_edge_alpha('Edge direction', guide = 'edge_direction')
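The idea behind the index value can be illustrated in a few lines of base R. This is a conceptual sketch for a straight edge only and not ggraph's actual implementation; `edge_points()` is a hypothetical helper made up for this illustration:

```r
# Hypothetical helper (not part of ggraph): n evenly spaced points on a
# straight edge from (x0, y0) to (x1, y1), each tagged with an index in [0, 1]
edge_points <- function(x0, y0, x1, y1, n = 5) {
    index <- seq(0, 1, length.out = n)  # 0 at the source node, 1 at the target
    data.frame(
        x = x0 + index * (x1 - x0),
        y = y0 + index * (y1 - y0),
        index = index
    )
}

pts <- edge_points(0, 0, 4, 2, n = 5)
pts$index
# 0.00 0.25 0.50 0.75 1.00
```

Because every point carries its own index, an aesthetic such as alpha can vary smoothly from the start of the edge to its end, which is what produces the gradient in the plot above.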