`diffusr` implements algorithms for network diffusion such as *Markov random walks with restarts* and *weighted neighbor classification*. Network diffusion has been studied extensively in bioinformatics, e.g. in the field of cancer gene prioritization. Network diffusion algorithms generally spread information in the form of node weights along the edges of a graph to other nodes. These weights can, for example, be interpreted as temperature, an initial amount of water, the activation of neurons in the brain, or the location of a random surfer on the internet. The information (node weights) is iteratively propagated to other nodes until an equilibrium state is reached or a stop criterion is met.

First load the package:

`library(diffusr)`

A Markov random walk with restarts iterates the update

\[ \mathbf{p}_{t+1} = (1 - r) \mathbf{W} \mathbf{p}_t + r \mathbf{p}_0 \]

where \(r \in (0,1)\) is the *restart probability* of the Markov chain, \(\mathbf{W}\) is a column-normalized stochastic matrix (we do the normalization for you) and \(\mathbf{p}_0\) is the starting distribution of the Markov chain. We compute the iterative updates by default; it is also possible to do the math using the nullspace of the matrix (shown later).
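To make the iteration concrete, here is a minimal base-R sketch of a random walk with restarts. It is not the package's implementation: the function name `rwr_iterative`, the default `r = 0.5`, the tolerance and the iteration cap are all assumptions chosen for illustration, and the update rule in the comment is the standard restart form.

```r
# Minimal base-R sketch of a random walk with restarts (hypothetical helper,
# not part of diffusr).
# Assumed update rule: p_{t+1} = (1 - r) * W %*% p_t + r * p0
rwr_iterative <- function(p0, graph, r = 0.5, tol = 1e-10, max_iter = 10000L) {
  # column-normalize so that every column of W sums to one
  W <- sweep(graph, 2, colSums(graph), "/")
  p <- p0
  for (i in seq_len(max_iter)) {
    p_next <- as.vector((1 - r) * (W %*% p) + r * p0)
    # stop once an update no longer changes the distribution
    if (sum(abs(p_next - p)) < tol) {
      p <- p_next
      break
    }
    p <- p_next
  }
  p
}

set.seed(1)
n <- 5
p0 <- c(0, 0, 1, 0, 0)
graph <- matrix(abs(rnorm(n * n)), n, n)
pt <- rwr_iterative(p0, graph)
sum(pt)  # the result is still a probability distribution
```

Because \(\mathbf{W}\) is column-stochastic, every update keeps the entries of the vector summing to one, so the loop only has to check that the vector has stopped changing.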

If you want to use Markov random walks just try something like this:

```
# count of nodes
n <- 5
# starting distribution (has to sum to one)
p0 <- as.vector(rmultinom(1, 1, prob=rep(.2, n)))
# adjacency matrix (either normalized or not)
graph <- matrix(abs(rnorm(n*n)), n, n)
# computation of stationary distribution
pt <- random.walk(p0, graph)
```

The stationary distribution should have changed quite a bit from the starting distribution:

`print(t(p0))`

```
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    0    1    0    0
```

`print(t(pt))`

```
##           [,1]       [,2]     [,3]      [,4]      [,5]
## [1,] 0.1579699 0.09090171 0.568345 0.1371669 0.0456165
```

You can also pass a matrix as `p0`:

```
p0 <- matrix(c(p0, runif(20)), nrow=n)
pt <- random.walk(p0, graph)
pt
```

```
##            [,1]       [,2]      [,3]      [,4]       [,5]
## [1,] 0.15797855 0.22293778 0.2070523 0.2169080 0.31336196
## [2,] 0.09090428 0.17121634 0.1190796 0.2168514 0.12881567
## [3,] 0.56834002 0.23619908 0.2407777 0.2375290 0.19927161
## [4,] 0.13716304 0.27505107 0.2894142 0.2606216 0.31220979
## [5,] 0.04561411 0.09459572 0.1436763 0.0680899 0.04634098
```

In the latter case, a random walk is run over each column of `p0` separately.

In the previous section we computed the stationary distribution iteratively. You can also compute it analytically. This requires taking the inverse of the transition matrix, which can be numerically unstable, but it usually runs faster than the iterative version.

```
pt <- random.walk(p0, graph, do.analytical=TRUE)
pt
```

```
##            [,1]       [,2]      [,3]       [,4]       [,5]
## [1,] 0.15797754 0.22293812 0.2070527 0.21690821 0.31336300
## [2,] 0.09090392 0.17121639 0.1190797 0.21685139 0.12881601
## [3,] 0.56834038 0.23619873 0.2407775 0.23752855 0.19927109
## [4,] 0.13716367 0.27505105 0.2894139 0.26062183 0.31220924
## [5,] 0.04561450 0.09459571 0.1436761 0.06809002 0.04634066
```

Diffusion using *nearest neighbors* traverses a (weighted) graph and collects all neighbors of a node until a certain depth in the graph is reached. We find shortest paths using priority queues:

```
# count of nodes
n <- 10
# indices (integer) of nodes for which neighbors should be searched
node.idxs <- c(1L, 5L)
# the adjacency matrix (does not need to be symmetric)
graph <- rbind(cbind(0, diag(n-1)), 0)
# compute the neighbors until depth 3
neighs <- nearest.neighbors(node.idxs, graph, 3)
```

Let’s see which nodes we got:

`print(neighs)`

```
## $`1`
## [1] 2 3 4
##
## $`5`
## [1] 6 7 8
```

In a Laplacian heat diffusion process, \(\mathbf{h}_0\) is the initial heat distribution, \(\mathbf{h}_t\) is the heat distribution at time \(t\) and \(\boldsymbol \lambda\) are the eigenvalues of the Laplacian of your graph.
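How the eigenvalues enter the computation can be sketched in base R. The sketch rests on several assumptions that the text does not confirm for the package: it symmetrizes the weights, uses the unnormalized Laplacian \(\mathbf{L} = \mathbf{D} - \mathbf{A}\) with eigendecomposition \(\mathbf{L} = \mathbf{V} \operatorname{diag}(\boldsymbol \lambda) \mathbf{V}^T\), and computes \(\mathbf{h}_t = \mathbf{V} \exp(-\boldsymbol \lambda t) \mathbf{V}^T \mathbf{h}_0\). The helper name `heat_sketch` and the default `time = 0.5` are hypothetical, and the package may normalize the Laplacian differently, so the numbers need not match its output.

```r
# Sketch: heat diffusion via the eigendecomposition of a graph Laplacian
# (hypothetical helper, not part of diffusr).
# Assumptions: symmetrized weights, unnormalized Laplacian L = D - A, and
#   h_t = V %*% diag(exp(-lambda * time)) %*% t(V) %*% h0
heat_sketch <- function(h0, graph, time = 0.5) {
  A <- (graph + t(graph)) / 2        # symmetrize so eigenvectors are orthonormal
  diag(A) <- 0                       # drop self-loops
  L <- diag(rowSums(A)) - A          # unnormalized graph Laplacian
  eig <- eigen(L, symmetric = TRUE)
  V <- eig$vectors
  lambda <- eig$values
  # project h0 onto the eigenvectors, damp each mode, project back
  as.vector(V %*% (exp(-lambda * time) * (t(V) %*% h0)))
}

set.seed(1)
n <- 5
h0 <- c(0, 0, 0, 0, 1)
graph <- matrix(abs(rnorm(n * n)), n, n)
ht <- heat_sketch(h0, graph)
sum(ht)  # total heat is conserved for this choice of Laplacian
```

With `time = 0` the sketch returns `h0` unchanged, and larger values of `time` spread the heat more evenly over the nodes.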

You can use the *Laplacian heat diffusion process* like this:

```
# count of nodes
n <- 5
# starting distribution (has to sum to one)
h0 <- as.vector(rmultinom(1, 1, prob=rep(.2, n)))
# adjacency matrix (either normalized or not)
graph <- matrix(abs(rnorm(n*n)), n, n)
# computation of the heat distribution after diffusion
ht <- heat.diffusion(h0, graph)
```

Here are the results:

`print(t(h0))`

```
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    0    0    0    1
```

`print(t(ht))`

```
##            [,1]      [,2]      [,3]      [,4]      [,5]
## [1,] 0.03745158 0.1105449 0.2103066 0.1081645 0.6503663
```

As before, `h0` can also be a matrix:

```
h0 <- matrix(c(h0, runif(20)), nrow=n)
ht <- heat.diffusion(h0, graph)
ht
```

```
##            [,1]       [,2]      [,3]      [,4]      [,5]
## [1,] 0.03745158 0.06625331 0.2997306 0.3283995 0.6595426
## [2,] 0.11054488 0.11250140 0.6529470 0.3586668 0.4893143
## [3,] 0.21030660 0.28323035 0.3480478 0.4977635 0.8236085
## [4,] 0.10816445 0.19031576 0.4486015 0.2845056 0.3425159
## [5,] 0.65036629 0.46126983 0.4220911 0.6177126 0.3958938
```