26  Dynamic Network Visualization

This tutorial demonstrates methods for dynamic visualization of the spread of infectious disease over networks using the ndtv package for R. ndtv is part of the larger Statnet suite of software for the representation, modeling, and analysis of network data. It has been designed to work with network-based epidemic modeling data from EpiModel.

26.1 Installation

We need to start by installing the ndtv package. This package version has just been updated on CRAN. You can first start by trying:

install.packages("ndtv")

If that does not work (because binaries for ndtv may not yet be available on CRAN), you can try with:

# install this package in case you do not already have it installed
install.packages("remotes")
remotes::install_github("statnet/ndtv", subdir = "ndtv")

Then check the package version with, which should match the version number below.

library("ndtv")
packageVersion("ndtv")
[1] '0.13.3'

26.2 Setup

We start by loading EpiModel. The htmlwidgets library additionally allows for embedding ndtv movies into RMarkdown HTML pages like this (you do not need to install/load it for this course but we need it to build this page).

library("EpiModel")
library("htmlwidgets")

In general, one primary goal of dynamic network visualization is to gain insight into how the structure of networks over time can influence transmission dynamics. This is different from estimating epidemic outcomes such as disease incidence or prevalence over time. The latter will typically require large networks (thousands of nodes) over large numbers of time steps (decades of time) with many hundreds or thousands of simulations. This allows for more accurate and stable estimation of outcome averages and spread. Dynamic visualization, on the other hand, usually only requires a single simulation of a small network size over a short number of time steps. That is because the visualization is meant to demonstrate the mechanisms rather than estimate the effects.

26.3 Model 1: Edges-Only

We start by setting a seed to ensure reproducible results. You can skip this to get different results from the tutorial page.

set.seed(1234)

26.3.1 Epidemic Model

This shows an SI epidemic in a closed population where the infection probability is 1, and we will start to build up complexity to the network model step-by-step. We start with a network of 100 nodes, with an edges-only model in the formation and dissolution.

nw <- network_initialize(n = 100)
formation <- ~edges
target.stats <- 40
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 20)
est <- netest(nw, formation, target.stats, coef.diss)

For the epidemic model, we will replicate the poker chip example in which the transmission probability per act is 100%. We could modify this to be a lower (more relistic for most diseases) value, but we choose a high value here to show a speedy epidemic spread. We simulate the model over 25 time steps with a single simulation.

param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)

26.3.2 Network Visualization

ndtv is an extension package in Statnet that allows for dynamic visualization for networks over time. Because this involves animation processing, this may require some heavy duty computation.

26.3.2.1 Extract and Process the Networks

First we need to extract the network objects from the larger netsim object. This was one of the key uses of this object type first discussed in Session 3 today. Note that this object has details on the distinct time changes.

nw <- get_network(sim)
nw
NetworkDynamic properties:
  distinct change times: 28 
  maximal time range: -Inf until  Inf 

 Dynamic (TEA) attributes:
  Vertex TEAs:    testatus.active 

Includes optional net.obs.period attribute:
 Network observation period info:
  Number of observation spells: 2 
  Maximal time range observed: 0 until 26 
  Temporal mode: discrete 
  Time unit: step 
  Suggested time increment: 1 

 Network attributes:
  vertices = 100 
  directed = FALSE 
  hyper = FALSE 
  loops = FALSE 
  multiple = FALSE 
  bipartite = FALSE 
  net.obs.period: (not shown)
  total edges= 104 
    missing edges= 0 
    non-missing edges= 104 

 Vertex attribute names: 
    active status testatus.active vertex.names 

 Edge attribute names: 
    active 

Next, we need to add a time-varying nodal attribute to each network that we will animate. This will allow ndtv to color the nodes by disease status over time. See the help documentation for color_tea to see some of the options and details for this step.

nw <- color_tea(nw, verbose = FALSE)

26.3.2.2 Revisiting Static Visualizations

Before we start making a movie with this updated network object, recall the types of static visualizations of the co-evolving dynamic network and epidemic data that are possible with EpiModel. One approach is to extract and work with the transmission matrix within the netsim object.

tm <- get_transmat(sim)
tm
# A tibble: 39 × 7
# Groups:   at, sus [39]
      at   sus   inf infDur transProb actRate finalProb
   <dbl> <int> <int>  <dbl>     <dbl>   <dbl>     <dbl>
 1     2    13    64     16         1       1         1
 2     2    47    64     16         1       1         1
 3     3    72    47      1         1       1         1
 4     4    67    64     18         1       1         1
 5     8    81    67      4         1       1         1
 6     9     1    81      1         1       1         1
 7    10    29    67      6         1       1         1
 8    10    68     1      1         1       1         1
 9    11    16    68      1         1       1         1
10    11    57    68      1         1       1         1
# ℹ 29 more rows

Next, there are three built-in visualizations for the transmission matrix, as shown here. Previously we converted the transmission matrix into a phylo class object and plotted a phylogram with the ape package. Here, we are showing that we can also plot this transmat object directly as a phylogram, and also as two related formats. Each show the time-directed chain of transmissions over time, starting with an initial seed. The middle plot is a directed network (showing who infected whom, here in a single large component because there was one seed). The right plot is a “transmission timeline” that plots the temporal time (the 25 time steps in our model) against generation time for transmissions.

par(mfrow = c(1, 3), mar = c(3, 3, 2, 1))
plot(tm, style = "phylo")
plot(tm, style = "network", displaylabels = TRUE)
plot(tm, style = "transmissionTimeline")

Finally, with the updated ndtv network object from above, we can plot a “proximity timeline” that shows the proximity of the nodes in directed temporal graph space, colored by disease status. Here the network is the full contact network, not just the transmission network of infected nodes. This helps identify the micro-outbreaks of infection that are occurring within sub-clusters of the network with greater closeness.

par(mfrow = c(1, 1), mar = c(3, 3, 2, 1))
proximity.timeline(nw, 
                   vertex.col = "ndtvcol",
                   spline.style = "color.attribute",
                   mode = "sammon",
                   default.dist = 10,
                   chain.direction = "reverse")

26.3.2.3 Dynamic Network Movie

There are lots of possibilities for how to model our networks over time. Here we animate the first 25 time steps of the simulation. To find out all the options, consult the main vignette for the ndtv package.

slice.par <- list(start = 1, end = 25, interval = 1, 
                  aggregate.dur = 1, rule = "any")
render.par <- list(tween.frames = 10, show.time = FALSE)
plot.par <- list(mar = c(0, 0, 0, 0))

This step figures out where the nodes should be displayed in each animation frame. There are several dynamic layout options available, but we use the defaults here.

compute.animation(nw, slice.par = slice.par, verbose = TRUE)

Finally, we render the animation and save it out to an HTML file using d3 Javascript. This makes for nice-looking animations within webpages. See the ndtv help for all the potential options in this step.

render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = 0.9,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  filename = paste0(getwd(), "/movie.html"))

To embed movies into an RMarkdown document like this, we can set the output.mode accordingly.

render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = 0.9,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  output.mode = "htmlWidget")

26.4 Model 2: No Concurrency

Let’s return to the core network model and experiment with extra terms, and then run the movie again. Next we will add concurrent term in the model to control the number of nodes with overlapping partnerships. For reference, the expected number of concurrent nodes is a function of a Poisson distribution with the parameter equal to the mean degree (which is 2 x edges/N):

ppois(1, lambda = 0.8, lower.tail = FALSE) * 100
[1] 19.12079

One helpful ERGM term is concurrent. This is the number of nodes with degree of 2 or higher. From the target stats above, next we can add a value for concurrent above or below the expected value of 19.1, and see how that plays out for our network movie. You may find that your model does not fit if the concurrent target stat is too high. That’s because you can’t have a huge number of concurrent persons if the mean degree (related to concurrency) is 0.8. In other words, the model terms and associated target statistics must be compatible.

26.4.1 Epidemic Model

In this example, we select an extreme concurrency statistic of 0 (no nodes exhibit concurrency).

nw <- network_initialize(n = 100)
formation <- ~edges + concurrent
target.stats <- c(40, 0)
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 20)
est <- netest(nw, formation, target.stats, coef.diss)
Warning: 'glpk' selected as the solver, but package 'Rglpk' is not available;
falling back to 'lpSolveAPI'. This should be fine unless the sample size and/or
the number of parameters is very big.

We use the same epidemic parameters as above, but you can experiment with these too now.

set.seed(12345)
param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)

26.4.2 Static Transmission Output

Let’s just look at what the static transmission outputs look like without network concurrency but with the same epidemic parameters. This network with much more sparse connectivity results in a transmission tree that is also sparser.

par(mfrow = c(1, 3), mar = c(3, 3, 2, 1))
tm <- get_transmat(sim)
plot(tm, style = "phylo")
plot(tm, style = "network", displaylabels = TRUE)
plot(tm, style = "transmissionTimeline")

26.4.3 Dynamic Network Movie

We follow the same steps to make the movie. Notice here that nearly all nodes are connected within a dyad, and therefore that no concurrency implies few isolates given this relatively high mean degree (0.8). In fact, the logical maximum mean degree in a network with no concurrency would be 1.0. Notice how this “serial monogamy” constrains the speed of epidemic spread.

nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = 0.9,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  filename = paste0(getwd(), "/movie.html"))

26.5 Model 3: Relational Duration

One feature of network models is to represent persistent relationships, in which there are repeated contacts between the same set of persons over time. This involves modeling the incident formation and persistence of edges over time. We can also model the extremes of relational duration to see its impact on epidemic potential.

26.5.1 Model 3a: Very Long Durations

Here we use the same model as Model 1, but set the duration to 100,000 time steps. This effectively makes all edges fixed over the 25 simulation time steps.

nw <- network_initialize(n = 100)
formation <- ~edges
target.stats <- 40
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 1e5)
est <- netest(nw, formation, target.stats, coef.diss)

We use the same epidemic parameters as above.

param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)

We follow the same steps to make the movie. Nothing much happening, even with a transmission probability of 100%.

nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = 0.9,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  filename = paste0(getwd(), "/movie.html"))

26.5.2 Model 3b: Short-Duration Contacts

Here we use the same model as Model 1, but set the duration to 2 time steps. This makes edges randomly disolve with high probabilities at each time step.

nw <- network_initialize(n = 100)
formation <- ~edges
target.stats <- 40
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 2)
est <- netest(nw, formation, target.stats, coef.diss)

We use the same epidemic parameters as above.

param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)

Yikes! There is not only massive turnover in the edges at each time step (which makes the visualization a jumble of moving nodes), but the epidemic quickly reaches a saturation point.

nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = 0.9,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  filename = paste0(getwd(), "/movie.html"))

26.6 Model 4: Triangles

Next let’s try another model where we add extra “triangles”: clusters of three people connected together in a local group. The gwesp term allows us to do this without problems of model degeneracy (see Day 2 materials). The null value for the target statistic is less than 1, so let’s ramp it up a bit and see what happens. We will recycle the same epidemic parameters from the model above.

nw <- network_initialize(n = 100)
formation <- ~edges + gwesp(0, TRUE)
target.stats <- c(40, 25)
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 20)
est <- netest(nw, formation, target.stats, coef.diss)
sim <- netsim(est, param, init, control)

We follow the same steps to make the movie. Notice the distinct triangle formation in the network edges. Are triangles associated with faster or slower epidemic spread?

nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = 0.9,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  filename = paste0(getwd(), "/movie.html"))

26.7 Model 5: Extra Nodal Attributes

Finally, we show an example of how to work with additional vertex attributes within the model. Unlike in today’s Session 5 tutorial, which used the special group attribute, here we show how to use other (non-special) attributes in combination.

In this model, we want to parameterize a network model in which age and race/ethnicity attributes are represented. Race will be a single binary variable (0 and 1), whereas age will be an integer in years between 18 and 50. We use R’s built-in random sampling functions to generate values for these attributes and then set them on the network object.

nw <- network_initialize(n = 100)
nw <- set_vertex_attribute(nw, "race", rbinom(100, 1, 0.5))
nw <- set_vertex_attribute(nw, "age", sample(18:50, 100, TRUE))
nw
 Network attributes:
  vertices = 100 
  directed = FALSE 
  hyper = FALSE 
  loops = FALSE 
  multiple = FALSE 
  bipartite = FALSE 
  total edges= 0 
    missing edges= 0 
    non-missing edges= 0 

 Vertex attribute names: 
    age race vertex.names 

No edge attributes

This will be the most complex TERGM formula that we have seen so far. It introduces a couple of new ERGM terms for Day 3 and chains them all together in a single model. We start with an edges term with an associated target statistic of 50 (so an overall mean degree of 1). Next is a nodematch("race") term that specifies that 40 of those 50 edges are within the same race group. After that the nodefactor("race") term allows the mean degree to vary by race. We will work on parameterizing this statistic further in Day 4, but with a target statistic of 70, this means that the race=1 group has a mean degree of 70/50=1.4 (the nodefactor term has a default reference group that is the lowest value of the referenced attribute). Next, the absdiff term is a mixing term, like nodematch, but for a continuous attribute (like age). The target statistic of 100 represents the sum of the absolute differences in the age attribute across all dyads in the network; with 50 edges, this means that dyads are on average 2 years apart in age (in either direction). Finally, we include a concurrent term to parameterize network degree. In the dissolution model, we specify that the average duration of edges is 20 time steps in relations not matched on race, and 10 time steps in relations matched on race.

formation <- ~edges + nodematch("race") + nodefactor("race") +
              absdiff("age") + concurrent
target.stats <- c(50, 40, 70, 100, 30)
coef.diss <- dissolution_coefs(dissolution = ~offset(edges) + offset(nodematch("race")),
                               duration = c(20, 10))
est <- netest(nw, formation, target.stats, coef.diss)

Phew! The epidemic model is simple, and parameterized as before.

param <- param.net(inf.prob = 1)
init <- init.net(i.num = 10)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)

The first steps of making the network movie are the same as before.

nw <- get_network(sim)
nw <- color_tea(nw)
slice.par <- list(start = 1, end = 25, interval = 1,
                  aggregate.dur = 1, rule = "any")
render.par <- list(tween.frames = 10, show.time = FALSE)
plot.par <- list(mar = c(0, 0, 0, 0))
compute.animation(nw, slice.par = slice.par, verbose = TRUE)

But to visualize race and age in the movie, we will extract the attributes back from the network object. And then scale them for appropriate visuals. We convert race into two numbers, 4 and 50, the will represent the number of sides in the node (so: a square and a circle). We scale age by a factor of 1/25 to allow the node size to be a simple function of age.

race <- get_vertex_attribute(nw, "race")
race.shape <- ifelse(race == 1, 4, 50)

age <- get_vertex_attribute(nw, "age")
age.size <- age/25

And here’s our final movie!

render.d3movie(
  nw,
  render.par = render.par,
  plot.par = plot.par,
  vertex.cex = age.size,
  vertex.sides = race.shape,
  vertex.col = "ndtvcol",
  edge.col = "darkgrey",
  vertex.border = "lightgrey",
  displaylabels = FALSE,
  filename = paste0(getwd(), "/movie.html"))