EpiModelHPC extends the EpiModel R package for running stochastic network epidemic models on high-performance computing (HPC) systems. If you are already using EpiModel’s netsim function to simulate epidemic dynamics on local hardware and need to scale up – running hundreds of simulations across parameter scenarios on a multi-node cluster – EpiModelHPC provides the tools to get there.

How EpiModelHPC Relates to EpiModel

EpiModel is the core R package for simulating mathematical models of infectious disease dynamics using stochastic, individual-based network models based on exponential-family random graph models (ERGMs). EpiModel handles network estimation (netest), epidemic simulation (netsim), and analysis of results on a single machine.

EpiModelHPC does not replace any of this. Instead, it wraps and extends EpiModel’s simulation engine with functionality needed when models become too computationally intensive for a single workstation:

  • Parallelization: Distributes simulation replicates across multiple cores on an HPC node.
  • Checkpointing: Automatically saves intermediate simulation state at configurable intervals, so that long-running jobs can resume from where they left off if interrupted (e.g., by a wall-time limit).
  • Scenario Batching: Runs multiple parameter scenarios (defined via EpiModel::create_scenario_list) in batched parallel jobs, with deterministic file naming for downstream merging.
  • Result Merging: Merges outputs from distributed batch jobs back into single simulation objects or tidy data frames for analysis.

If your simulations complete in reasonable time on a laptop, you likely do not need this package. EpiModelHPC is designed for the point at which you need to run large numbers of replicates or sweep across many scenarios, or at which a model's per-replicate runtime exceeds what is practical without job scheduling and checkpointing.
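For orientation, a minimal local EpiModel run looks like the following; the ncores argument of control.net() is the per-node parallelization lever that EpiModelHPC builds on when it distributes replicates across cores. This is a generic EpiModel sketch, not an EpiModelHPC-specific example:

```r
## Minimal local stochastic network model run with EpiModel.
library(EpiModel)

## Estimate a simple dynamic network model (netest step).
nw <- network_initialize(n = 500)
est <- netest(nw,
              formation = ~edges,
              target.stats = 125,
              coef.diss = dissolution_coefs(~offset(edges), duration = 50))

## Epidemic parameters, initial conditions, and controls.
param <- param.net(inf.prob = 0.3, act.rate = 1)
init  <- init.net(i.num = 10)
ctrl  <- control.net(type = "SI", nsteps = 100,
                     nsims = 8, ncores = 4)  # 8 replicates over 4 cores

## Simulate (netsim step); EpiModelHPC scales this same call up.
sim <- netsim(est, param, init, ctrl)
```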

How slurmworkflow Fits In

slurmworkflow is a companion R package that provides a general-purpose framework for defining and submitting multi-step Slurm job workflows from R. It handles the mechanics of writing sbatch scripts, managing job dependencies, and organizing output directories.

EpiModelHPC builds directly on slurmworkflow by providing step templates – pre-built workflow steps tailored to common EpiModel tasks:

  • step_tmpl_netsim_scenarios() – Submit scenario-based netsim simulations as a Slurm array job
  • step_tmpl_merge_netsim_scenarios() – Merge batched simulation files into one file per scenario
  • step_tmpl_merge_netsim_scenarios_tibble() – Convert merged results to tidy tibble format
  • step_tmpl_netsim_swfcalib_output() – Run simulations using calibrated parameters from swfcalib
  • step_tmpl_renv_restore() – Ensure the HPC project environment is up to date via renv

Each step template has a corresponding standalone function (e.g., netsim_scenarios()) that runs the same logic locally for testing before submitting to the cluster.
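A hedged sketch of that local-testing pattern follows. The function names come from this page; the exact argument names (path_to_x, scenarios_list, n_rep, n_cores, output_dir) are assumptions, so consult ?netsim_scenarios before relying on them:

```r
## Hypothetical sketch of local scenario testing with EpiModelHPC.
## Argument names are assumptions; check ?netsim_scenarios.
library(EpiModelHPC)

## A scenario data frame varies parameters by scenario;
## EpiModel::create_scenario_list() converts it to the list format
## that scenario-based simulation expects.
scenarios_df <- data.frame(
  .scenario.id = c("low", "high"),
  .at          = 1,
  inf.prob     = c(0.2, 0.4)
)
scenarios <- EpiModel::create_scenario_list(scenarios_df)

## Run a few replicates per scenario locally before going to the cluster.
## `path_to_x` is assumed to point at a saved netest object on disk.
netsim_scenarios(
  path_to_x, param, init, control,
  scenarios_list = scenarios,
  output_dir = "data/intermediate/test",
  n_rep = 2, n_cores = 2
)
```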

A typical applied workflow looks like:

  1. Estimate networks locally with EpiModel::netest.
  2. Test simulations locally with netsim_scenarios() on a small number of replicates.
  3. Define a slurmworkflow using step_tmpl_netsim_scenarios() and step_tmpl_merge_netsim_scenarios() to run at scale on the cluster.
  4. Merge and analyze results locally or on the cluster.
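Step 3 above can be sketched as follows. The step template names come from this page; the slurmworkflow functions (create_workflow(), add_workflow_step()) and all argument names are assumptions drawn from typical usage, so verify them against the slurmworkflow documentation:

```r
## Hedged sketch of a two-step cluster workflow; names beyond the
## step templates listed above are assumptions.
library(slurmworkflow)
library(EpiModelHPC)

wf <- create_workflow(
  wf_name = "epi_scenarios",
  default_sbatch_opts = list("partition" = "standard", "mail-type" = "FAIL")
)

## Step 1: run the scenario simulations as a Slurm array job.
wf <- add_workflow_step(
  wf_summary = wf,
  step_tmpl = step_tmpl_netsim_scenarios(
    path_to_x, param, init, control,   # `path_to_x`: saved netest object
    scenarios_list = scenarios,
    output_dir = "data/intermediate/scenarios",
    n_rep = 100, n_cores = 28
  ),
  sbatch_opts = list("cpus-per-task" = 28, "time" = "04:00:00")
)

## Step 2: merge the batched outputs into one file per scenario.
wf <- add_workflow_step(
  wf_summary = wf,
  step_tmpl = step_tmpl_merge_netsim_scenarios(
    sim_dir = "data/intermediate/scenarios",
    output_dir = "data/processed"
  ),
  sbatch_opts = list("cpus-per-task" = 1)
)
```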

EpiModelHPC also provides pre-configured cluster settings for specific HPC environments (swf_configs_hyak() for the University of Washington HYAK cluster, swf_configs_rsph() for the Emory RSPH cluster) that supply sensible default sbatch options and R module-loading commands.
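For example, on HYAK the configuration helper can supply those defaults when building a workflow. The argument name and the returned element name below are assumptions; see the function documentation:

```r
## Hedged sketch: `partition` and `$default_sbatch_opts` are assumed names.
hpc_configs <- swf_configs_hyak(partition = "ckpt")

## The returned settings would then seed the workflow's sbatch defaults,
## e.g. (names assumed):
## create_workflow(wf_name = "epi_scenarios",
##                 default_sbatch_opts = hpc_configs$default_sbatch_opts)
```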

Installation

EpiModelHPC and its companion packages are hosted on GitHub. Install with:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("EpiModel/EpiModelHPC")

This will also install slurmworkflow and swfcalib from their GitHub repositories.

Key Functions

HPC Configuration

  • swf_configs_hyak() / swf_configs_rsph() – Return lists of sbatch options, renv build settings, and R module-loading commands for supported clusters.
  • pull_env_vars() – Extract Slurm environment variables (e.g., SLURM_ARRAY_TASK_ID) into R’s global environment.
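In base-R terms, the kind of lookup that pull_env_vars() automates looks like:

```r
## Read a Slurm array variable directly; pull_env_vars() wraps this kind
## of lookup for the common Slurm environment variables.
task_id <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID", unset = "1"))
```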

System Requirements

While developed for Linux-based HPC clusters running the Slurm workload manager, the core parallelization and checkpointing functionality works on any system with multiple cores, including macOS and Windows workstations. The slurmworkflow integration and step templates are specific to Slurm-managed clusters.