swfcalib automates the calibration of complex multi-parameter, multi-output models on Slurm-equipped HPC systems. It uses slurmworkflow to orchestrate an iterative propose-evaluate loop without requiring a long-running pilot job, and was built to calibrate stochastic network epidemic models developed with EpiModel.
Role in the EpiModel Ecosystem
swfcalib sits at the intersection of several EpiModel packages that together support large-scale epidemic modeling on HPC clusters:
┌─────────────────────────────────────────────────────────────────────┐
│                        EpiModelHIV-Template                         │
│     (project repo: workflows, data, params, calibration config)     │
│                                                                     │
│  ┌──────────────┐   ┌──────────────┐   ┌────────────────────────┐   │
│  │  EpiModel /  │   │slurmworkflow │   │        swfcalib        │   │
│  │  EpiModelHIV │   │              │   │                        │   │
│  │              │   │  Slurm step  │   │  calibration logic:    │   │
│  │  simulation  │◄──┤  sequencing &├──►│  waves, jobs,          │   │
│  │  engine      │   │  job arrays  │   │  proposals, & results  │   │
│  └──────────────┘   └──────────────┘   └────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
EpiModel / EpiModelHIV
EpiModel is the simulation engine for stochastic network epidemic models. EpiModelHIV extends it with HIV-specific modules. swfcalib treats the simulation as a black box: it passes a one-row tibble of parameter values (the proposal) to a user-defined simulator function and receives a one-row tibble of summary statistics (the outcomes) back. This means swfcalib is not limited to EpiModel – any model that conforms to this interface can be calibrated.
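As a minimal sketch of that interface, the toy simulator below takes a one-row proposal and returns one-row outcomes. The model itself, the parameter names (`trans_rate`, `recov_rate`), and the outcome name (`prevalence`) are hypothetical, and a base R `data.frame` stands in for a tibble to keep the example dependency-free.

```r
# Hypothetical simulator: one-row proposal in, one-row outcomes out.
toy_simulator <- function(proposal) {
  stopifnot(nrow(proposal) == 1)
  # Toy endemic prevalence: 1 - recovery / transmission, floored at 0
  equilibrium <- max(0, 1 - proposal$recov_rate / proposal$trans_rate)
  # Add a little noise to mimic a stochastic model
  noisy <- equilibrium + rnorm(1, mean = 0, sd = 0.01)
  # Clamp to [0, 1] and return a single row of summary statistics
  data.frame(prevalence = min(1, max(0, noisy)))
}

set.seed(1)
proposal <- data.frame(trans_rate = 0.5, recov_rate = 0.2)
outcomes <- toy_simulator(proposal)
```

Any function with this shape, whether or not it wraps EpiModel, could in principle play the role of the simulator.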
slurmworkflow
slurmworkflow provides the mechanism for chaining Slurm jobs into multi-step workflows. swfcalib uses it to implement an iterative loop on the HPC:
- Step 1 (`calibration_step1`): Process results from the previous iteration, assess convergence, and generate new proposals (or advance to the next wave / finish).
- Step 2 (`calibration_step2`): Run the simulation in parallel across a Slurm job array – one simulation per proposal. The last batch rewinds the workflow back to Step 1.
- Step 3 (`calibration_step3`): Finalize calibration, export the calibrated parameter set.
This looping structure means the calibration can run for days or weeks without a continuously running pilot job occupying a Slurm allocation.
EpiModelHIV-Template
The EpiModelHIV-Template is a project template that provides the directory structure, workflow scripts, and configuration files for running EpiModelHIV analyses on an HPC. The template includes a calibration workflow that uses swfcalib + slurmworkflow out of the box. In practice, researchers:
- Define their model’s simulator function and calibration targets in the template’s workflow scripts.
- Configure the `calib_object` (waves, jobs, proposers, convergence criteria).
- Use `slurmworkflow` to build the three-step workflow and submit it to the HPC.
- `swfcalib` iterates autonomously until all parameters converge or the iteration limit is reached.
- The calibrated parameters are exported as a CSV and fed into downstream simulation and analysis workflows defined in the template.
Should You Use swfcalib?
swfcalib is not the simplest calibration system to set up. It was designed to solve a specific set of problems. If you already have a system that works well, you should probably not switch to swfcalib.
However, swfcalib may be a good fit if:
- Your model has many parameters to calibrate and produces many outputs.
- You have domain knowledge about which parameters influence which outputs (allowing decomposition into independent calibration jobs).
- Your model outputs are noisy and require many replications per parameter set.
- You cannot or do not want a Slurm job running continuously for the duration of the calibration (days to weeks).
Design
The calibration process follows an iterative propose-evaluate loop: the model is run with a set of parameter proposals, results are assessed, and new proposals are generated. This continues until the model is fully calibrated.
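The loop can be illustrated locally with a few lines of base R. This is only a sketch under simplifying assumptions: a single parameter, a hypothetical `run_model()` function, and a naive range-halving rule. It runs in one R session and does not reflect how swfcalib distributes the work across Slurm jobs.

```r
# Hypothetical one-parameter model: noisy outcome that grows with `rate`
run_model <- function(proposal) {
  data.frame(outcome = 2 * proposal$rate + rnorm(1, sd = 0.01))
}

calibrate <- function(target, lower, upper, n_props = 20, tol = 0.02,
                      max_iter = 50) {
  for (iter in seq_len(max_iter)) {
    # Propose: draw candidate values in the current range
    rates <- runif(n_props, lower, upper)
    # Evaluate: run the model once per proposal
    outcomes <- vapply(
      rates,
      function(r) run_model(data.frame(rate = r))$outcome,
      numeric(1)
    )
    # Assess: stop if the best proposal is close enough to the target
    errors <- abs(outcomes - target)
    best <- rates[which.min(errors)]
    if (min(errors) < tol) {
      return(list(rate = best, iterations = iter, converged = TRUE))
    }
    # Otherwise halve the range around the best proposal and loop again
    width <- (upper - lower) / 4
    lower <- best - width
    upper <- best + width
  }
  list(rate = best, iterations = max_iter, converged = FALSE)
}

set.seed(42)
fit <- calibrate(target = 1, lower = 0, upper = 2)
```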
Terminology
- Model: a function taking a proposal and returning some outcomes.
- Proposal: a set of parameter values to pass to the model.
- Outcomes: the output of a model run for a given proposal.
- Job: a calibration sub-problem – a set of parameters to calibrate using a subset of the outcomes.
- Wave: a set of independent jobs that can be calibrated using the same model runs.
Waves and Jobs
The calibration is split into sequential waves. Each wave contains one or more jobs that run in parallel, each focusing on a subset of parameters and their related outcomes. At each iteration within a wave, the model is run once per proposal and each job evaluates only the outcomes it cares about.
Once all jobs in a wave converge, their calibrated parameter values are locked in and the system advances to the next wave. This allows later waves to calibrate parameters that depend on those fixed earlier – for example, calibrating transmission scalers only after diagnosis and treatment rates have been determined.
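One way to picture this decomposition is as a nested list of waves and jobs. The layout below is illustrative only: the field names (`params`, `targets`), the parameter and outcome names, and the numbers are all assumptions made for this sketch, not the swfcalib schema.

```r
# Illustrative layout only: names are assumptions, not the swfcalib schema
waves <- list(
  wave1 = list(
    diagnosis = list(params = "test_rate", targets = "prop_diagnosed"),
    treatment = list(params = "tx_init_rate", targets = "prop_on_treatment")
  ),
  wave2 = list(
    transmission = list(params = "trans_scale", targets = "prevalence")
  )
)

# Shared model runs for one iteration of wave 1 (made-up numbers)
results <- data.frame(
  test_rate = c(0.1, 0.2, 0.3),
  tx_init_rate = c(0.5, 0.6, 0.7),
  prop_diagnosed = c(0.70, 0.84, 0.91),
  prop_on_treatment = c(0.55, 0.62, 0.66)
)

# Each job looks only at its own parameters and outcomes,
# even though both jobs reuse the same model runs
job_views <- lapply(
  waves$wave1,
  function(job) results[, c(job$params, job$targets)]
)
```

Here both wave 1 jobs share three model runs, and the transmission job in wave 2 would only start once the wave 1 values are fixed.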
User-Supplied Functions
swfcalib makes no assumptions about how proposals should be generated or how convergence should be assessed. Each job requires two user-supplied functions:
- `make_next_proposals`: given current results, produce the next set of parameter proposals.
- `get_result`: given current results, determine if calibration is complete for this job (return a one-row tibble of calibrated values) or not yet (return `NULL`).
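The pair of functions described above can be sketched as follows. The signatures are deliberately simplified and should be read as assumptions: each function here receives only a `results` data frame with hypothetical `rate` and `outcome` columns, whereas the functions swfcalib actually calls may take additional arguments. A base R `data.frame` again stands in for a tibble.

```r
target <- 0.2   # hypothetical calibration target
tol <- 0.01     # hypothetical tolerance

# Propose: sample new values within 10% of the best proposal seen so far
make_next_proposals <- function(results, n = 10) {
  best <- results$rate[which.min(abs(results$outcome - target))]
  data.frame(rate = runif(n, best * 0.9, best * 1.1))
}

# Assess: a one-row data.frame when the job is done, NULL otherwise
get_result <- function(results) {
  errors <- abs(results$outcome - target)
  if (min(errors) > tol) {
    return(NULL)
  }
  data.frame(rate = results$rate[which.min(errors)])
}

results <- data.frame(
  rate = c(0.1, 0.2, 0.3),
  outcome = c(0.12, 0.205, 0.31)
)
done <- get_result(results)
next_proposals <- make_next_proposals(results)
```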
swfcalib provides built-in function factories for common strategies:
| Function | Strategy |
|---|---|
| `make_proposer_se_range()` | Retain the best proposals (by squared error) and sample the next round from their ranges via Latin Hypercube Sampling. |
| `make_shrink_proposer()` | Shrink the proposal range by a factor around the current best guess. |
| `determ_end_thresh()` | Finish when enough simulations produce outcomes within specified thresholds of the targets. |
| `determ_poly_end()` | Fit a polynomial regression to predict the optimal parameter value; finish when predictions stabilize. |
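To make the first strategy concrete, here is a base R sketch of the general idea: rank past proposals by squared error, keep the best ones, and draw the next round from their ranges with Latin Hypercube Sampling. The helpers `lhs_unit()` and `propose_from_best()` are hypothetical names written for this illustration; this is not the implementation behind `make_proposer_se_range()`.

```r
# Latin Hypercube Sampling on [0, 1]^d: one point per stratum in each dimension
lhs_unit <- function(n, d) {
  draws <- vapply(
    seq_len(d),
    function(j) (sample(n) - runif(n)) / n,
    numeric(n)
  )
  matrix(draws, nrow = n)
}

propose_from_best <- function(results, params, outcome, target,
                              keep = 0.25, n = 20) {
  # Rank past proposals by squared error against the target
  se <- (results[[outcome]] - target)^2
  n_keep <- min(nrow(results), max(2, ceiling(keep * nrow(results))))
  best <- results[order(se)[seq_len(n_keep)], params, drop = FALSE]
  # Rescale unit-cube LHS draws to the range spanned by the retained proposals
  unit <- lhs_unit(n, length(params))
  out <- lapply(seq_along(params), function(j) {
    rng <- range(best[[params[j]]])
    rng[1] + unit[, j] * (rng[2] - rng[1])
  })
  as.data.frame(setNames(out, params))
}

set.seed(7)
past <- data.frame(a = runif(40), b = runif(40))
past$y <- past$a + past$b   # hypothetical model output
next_props <- propose_from_best(past, c("a", "b"), "y", target = 1)
```

Compared with plain uniform sampling, the stratified draws spread the new proposals more evenly across each retained range.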
Installation
You can install the development version of swfcalib like so:
`remotes::install_github("EpiModel/swfcalib")`
Example
See the Getting Started vignette for a complete worked example calibrating an HIV epidemic model, including wave and job configuration, simulator function setup, and the slurmworkflow integration.