Entity generation

Page last modified

November 26, 2025

Learning objectives:

Create a simple arrivals model.
Organise simulation distributions with a registry.

Pre-reading:

As this builds on concepts introduced in prior pages, we suggest first reading:

Acknowledgements: Inspired by Rosser et al. (2025).

Required packages:

These should be available from environment setup in the “Test yourself” section of Environments.

import numpy as np
import simpy
from sim_tools.distributions import DistributionRegistry, Exponential

library(R6)
library(simmer)

How to run the example code

The examples below can be run in a script, notebook, or as part of a package. Setup and imports differ slightly depending on your method; see below for details.

Script / Notebook

Start with necessary imports (e.g., library(simmer)).
Use functions/class directly, or prefix with simmer::.
Roxygen @importFrom tags do not affect scripts - but are included in case you choose to use the package structure.

Package structure - recommended (see Structuring as a package page for more details)

Use @importFrom in docstrings.
Run devtools::document() (to update imports in your NAMESPACE) and devtools::load_all() (to load your package and dependencies).
Do not need library() imports in script, as loading your package will make all imported functions available automatically.

Creating a simple simulation model with arrivals

Before writing any code, it’s important that you have defined a conceptual model of the process to guide design of your simulation, and that the necessary parameters are defined. With these essentials in place, the process of building the simulation can start.

The best approach is to build the model step by step, testing and understanding each component before adding more complexity. Typically, the first step is to write code that generates arrivals into the system.

Parameter class

First, we need an object with the simulation parameters. Here, we use a class similar to that on the Parameters from script page, with some tweaks to the parameters contained.

class Parameters:
    """
    Parameter class.

    Attributes
    ----------
    interarrival_time : float
        Mean time between arrivals (minutes).
    run_length : int
        Total duration of simulation (minutes).
    verbose : bool
        Whether to print messages as simulation runs.
    """
    def __init__(
        self, interarrival_time=5, run_length=50, verbose=True
    ):
        """
        Initialise Parameters instance.

        Parameters
        ----------
        interarrival_time : float
            Time between arrivals (minutes).
        run_length : int
            Total duration of simulation (minutes).
        verbose : bool
            Whether to print messages as simulation runs.
        """
        self.interarrival_time = interarrival_time
        self.run_length = run_length
        self.verbose = verbose

Patient class

We’ll create a class to represent patients. Classes are useful as they can keep track of state and group logic (see Code organisation page for more).

For now, this will just have two attributes: patient_id and arrival_time. However, when we start tracking Performance measures, this class will be modified to store other attributes useful for output analysis.

class Patient:
    """
    Represents a patient.

    Attributes
    ----------
    patient_id : int
        Unique patient identifier.
    arrival_time : float
        Time patient entered the system (minutes).
    """
    def __init__(self, patient_id, arrival_time):
        """
        Initialises a new patient.

        Parameters
        ----------
        patient_id : int
            Unique patient identifier.
        arrival_time : float
            Time patient entered the system (minutes).
        """
        self.patient_id = patient_id
        self.arrival_time = arrival_time

Model class

We’ll also create a class for the model. This class is explained line by line below.

class Model:
    """
    Simulation model.

    Attributes
    ----------
    param : Parameters
        Simulation parameters.
    run_number : int
        Run number for random seed generation.
    env : simpy.Environment
        The SimPy environment for the simulation.
    patients : list
        List of Patient objects.
    arrival_dist : Exponential
        Distribution used to generate random patient inter-arrival times.
    """
    def __init__(self, param, run_number):
        """
        Create a new Model instance.

        Parameters
        ----------
        param : Parameters
            Simulation parameters.
        run_number : int
            Run number for random seed generation.
        """
        self.param = param
        self.run_number = run_number

        # Create SimPy environment
        self.env = simpy.Environment()

        # Create a random seed sequence based on the run number
        ss = np.random.SeedSequence(self.run_number)
        seeds = ss.spawn(1)

        # Set up attributes to store results
        self.patients = []

        # Initialise distributions
        self.arrival_dist = Exponential(
            mean=self.param.interarrival_time,
            random_seed=seeds[0]
        )

    def generate_arrivals(self):
        """
        Process that generates patient arrivals.
        """
        while True:
            # Sample and pass time to next arrival
            sampled_iat = self.arrival_dist.sample()
            yield self.env.timeout(sampled_iat)

            # Create a new patient
            patient = Patient(patient_id=len(self.patients)+1,
                              arrival_time=self.env.now)
            self.patients.append(patient)

            # Print arrival time
            if self.param.verbose:
                print(f"Patient {patient.patient_id} arrives " +
                      f"at time: {patient.arrival_time:.3f}")

    def run(self):
        """
        Run the simulation.
        """
        # Schedule arrival generator
        self.env.process(self.generate_arrivals())

        # Run the simulation
        self.env.run(until=self.param.run_length)

Explaining the model class

def __init__(self, param, run_number):
    """
    Create a new Model instance.

    Parameters
    ----------
    param : Parameters
        Simulation parameters.
    run_number : int
        Run number for random seed generation.
    """
    self.param = param
    self.run_number = run_number

There are two arguments required by the initialiser: param and run_number.

The param argument accepts an instance of the Parameters object, which groups all necessary simulation values in one place. This minimises the number of arguments required by Model, as it only needs one input for parameters.

The run_number will be used when generating the random seeds. This is important when performing multiple replications, as it ensures each run gets a unique seed (see the Replications page for more details).

# Create SimPy environment
self.env = simpy.Environment()

A SimPy Environment is the main simulation container. Everything that happens in the simulation is managed inside this environment. It tracks the flow of time and manages when events, such as arrivals, should occur.

For example, when patient arrivals are added, the environment is responsible for controlling their timing and execution.

# Create a random seed sequence based on the run number
ss = np.random.SeedSequence(self.run_number)
seeds = ss.spawn(1)

On the previous page, you saw how using separate random generators (streams) prevents changes in one process from affecting another. But to get truly independent random streams, you also need to make sure those generators start with seeds that are themselves independent. This avoids unwanted correlations and ensures reproducibility.

You might be tempted to create seeds for each generator using a simple pattern like: seed1 = run_number, seed2 = run_number * 2, seed = run_number*3.

However, it is more robust to use NumPy’s SeedSequence method. You simply provide a single starter seed (e.g., run_number), then call SeedSequence.spawn() to generate a list of independent seeds. In this case, we just create a list with one seed (spawn(1)).

Why do we want independent seeds? When you manually create seeds using simple rules (e.g., +1, +2), there’s a chance that your random streams will end up correlated. These hidden correlations can make simulation results less reliable, particularly if you run many experiments, parallelise tasks, or rely on random streams interacting. np.random.SeedSequence() is specifically designed to produce seeds that will generate independent streams.

Independent seeding also helps you avoid subtle mistakes that can happen with manual schemes. For example, if you used [run_number * 1, run_number * 2, run_number * 3] to make seeds for different generators, then for run_number = 0 every stream would get a seed of 0, making all your streams identical!

# Set up attributes to store results
self.patients = []

The patients list will store the Patient instances created for each arrival. The list stores references to the objects, so any changes made to a Patient’s attributes after being added will be reflected in the list.

# Initialise distributions
self.arrival_dist = Exponential(
    mean=self.param.interarrival_time,
    random_seed=seeds[0]
)

We’ve used the Exponential distribution from sim-tools to generate inter-arrival times for patients.

def generate_arrivals(self):
    """
    Process that generates patient arrivals.
    """
    while True:
        # Sample and pass time to next arrival
        sampled_iat = self.arrival_dist.sample()

The generate_arrivals method is where patient arrivals are produced throughout the model run.

Using while True ensures the arrival generator continues to cycle and create new arrivals until the overall simulation time limit is reached.

Each time the loop cycles, it draws the next inter-arrival time from the exponential distribution, waits for that interval, and then triggers an arrival event.

yield self.env.timeout(sampled_iat)

The statement yield self.env.timeout(sampled_iat) is how the simulation waits between successive arrivals. It tells the SimPy environment to pause this arrivals process for the sampled time interval.

# Create a new patient
patient = Patient(patient_id=len(self.patients)+1,
                  arrival_time=self.env.now)
self.patients.append(patient)

Create an instance of the Patient class for each arrival, with a patient_id and arrival_time.

# Print arrival time
if self.param.verbose:
    print(f"Patient {patient.patient_id} arrives " +
          f"at time: {patient.arrival_time:.3f}")

Printing the time of each arrival gives a quick and visible check that the arrival process is working as expected. For now, simple print statements are effective, but this can later be replaced by a more sophisticated logging approach (see Logging).

def run(self):
    """
    Run the simulation.
    """
    # Schedule arrival generator
    self.env.process(self.generate_arrivals())

Our final method in Model is run, which is responsible for setting up and running the simulation itself.

Before running the simulation, we have to tell SimPy that the process for creating arrivals should be managed as an “event generator” by calling self.env.process(). SimPy will now keep track of the arrival process and schedule its actions (timeouts and events) according to the rules laid out in our method.

# Run the simulation
self.env.run(until=self.param.run_length)

The simulation is now ready to go! We run it by calling self.env.run(), which will run the scheduled events until the desired end time is reached (taken from the Parameters object);

Run the model

To try running the model, set up the parameters and model objects as shown, then call run().

param = Parameters()
model = Model(param=param, run_number=0)
model.run()

Patient 1 arrives at time: 16.468
Patient 2 arrives at time: 20.283
Patient 3 arrives at time: 26.545
Patient 4 arrives at time: 27.675
Patient 5 arrives at time: 28.779
Patient 6 arrives at time: 37.778
Patient 7 arrives at time: 38.108
Patient 8 arrives at time: 42.611
Patient 9 arrives at time: 44.088

Parameter function

First, we need an object with the simulation parameters. Here, we use a function similar to that on the Parameters from script page, with some tweaks to the parameters contained.

#' Generate parameter list.
#'
#' @param interarrival_time Numeric. Time between arrivals (minutes).
#' @param run_length Numeric. Total duration of simulation (minutes).
#' @param verbose Boolean. Whether to print messages as simulation runs.
#'
#' @return A named list of parameters.
#' @export

create_params <- function(interarrival_time = 5L,
                          run_length = 50L,
                          verbose = TRUE) {
  list(
    interarrival_time = interarrival_time,
    run_length = run_length,
    verbose = verbose
  )
}

Model function

We’ll then create a function for the model (explained line by line below).

#' Run simulation.
#'
#' @param param List. Model parameters.
#' @param run_number Numeric. Run number for random seed generation.
#'
#' @importFrom simmer add_generator run simmer timeout trajectory
#'
#' @export

model <- function(param, run_number) {

  # Set random seed based on run number
  set.seed(run_number)

  # Create simmer environment
  env <- simmer("simulation", verbose = param[["verbose"]])

  # Define the patient trajectory
  patient <- trajectory("consultation") |>
    timeout(1L)

  env <- env |>
    # Add patient generator
    add_generator("patient", patient, function() {
      rexp(n = 1L, rate = 1L / param[["interarrival_time"]])
    }) |>
    # Run the simulation
    simmer::run(until = param[["run_length"]])

}

Explaining the model function

#' Run simulation.
#'
#' @param param List. Model parameters.
#' @param run_number Numeric. Run number for random seed generation.
#' 
#' @importFrom simmer add_generator run simmer timeout trajectory
#' 
#' @export

model <- function(param, run_number) {

There are two arguments required by the function: param and run_number.

The param argument requires the named list of parameters from the create_params() function, which groups all necessary simulation values in one list. This minimise the number of arguments required by model(), as it only needs one input for parameters.

# Set random seed based on run number
set.seed(run_number)

A global random seed is set based on the run_number.

# Create simmer environment
env <- simmer("simulation", verbose = param[["verbose"]])

The simmer environment is the main simulation container. Everything that happens in the simulation is organised inside this environment. It tracks the flow of time and manages when events, such as arrivals, should occur.

For example, when patient arrivals are added, the environment is responsible for controlling their timing and execution.

The verbose argument is set to TRUE in param, and enables activity information to be printed during simulation execution. This gives a quick and visible check that the arrival process is working as expected. Custom logging approaches are explored later (see Logging).

# Define the patient trajectory
patient <- trajectory("consultation") |>
  timeout(1)

In simmer, a trajectory must be defined for any type of arrival. It specifies the sequence of steps each arrival follows in the simulation. For now, we will just set a simple trajectory: timeout(1), which means the arrival waits for one time unit, effectively doing nothing.

env <- env |>
  # Add patient generator
  add_generator("patient", patient, function() {
    rexp(n = 1L, rate = 1L / param[["interarrival_time"]])
  }) |>
  # Run the simulation
  simmer::run(until = param[["run_length"]])

We use add_generator to set up the patient arrivals. It uses the exponential distribution (rexp()) to sample inter-arrival times (with mean specified by param).

The simulation is then run for a specified duration using the run() method. Because some R packages also have a function run, its safest to use simmer::run() to avoid function name conflicts.

Run the model

To try running the model, set up the parameters and run model().

param <- create_params()
model(param = param, run_number = 1L)

         0 |    source: patient          |       new: patient0         | 3.77591
   3.77591 |   arrival: patient0         |  activity: Timeout          | 1
   3.77591 |    source: patient          |       new: patient1         | 9.68412
   9.68412 |   arrival: patient1         |  activity: Timeout          | 1
   9.68412 |    source: patient          |       new: patient2         | 10.4127
   10.4127 |   arrival: patient2         |  activity: Timeout          | 1
   10.4127 |    source: patient          |       new: patient3         | 11.1116
   11.1116 |   arrival: patient3         |  activity: Timeout          | 1
   11.1116 |    source: patient          |       new: patient4         | 13.292
    13.292 |   arrival: patient4         |  activity: Timeout          | 1
    13.292 |    source: patient          |       new: patient5         | 27.7668
   27.7668 |   arrival: patient5         |  activity: Timeout          | 1
   27.7668 |    source: patient          |       new: patient6         | 33.9146
   33.9146 |   arrival: patient6         |  activity: Timeout          | 1
   33.9146 |    source: patient          |       new: patient7         | 36.613
    36.613 |   arrival: patient7         |  activity: Timeout          | 1
    36.613 |    source: patient          |       new: patient8         | 41.3959
   41.3959 |   arrival: patient8         |  activity: Timeout          | 1
   41.3959 |    source: patient          |       new: patient9         | 42.1311
   42.1311 |   arrival: patient9         |  activity: Timeout          | 1
   42.1311 |    source: patient          |       new: patient10        | 49.0848
   49.0848 |   arrival: patient10        |  activity: Timeout          | 1
   49.0848 |    source: patient          |       new: patient11        | 52.8949

Distribution registry

In more advanced models, you may need multiple different random distributions (e.g., for arrivals, consultations, transfers, service times). Managing these with individual distribution objects quickly becomes tedious and can clutter your code.

The DistributionRegistry class from sim-tools provides a structured way to organise all your distributions. Instead of instantiating each distribution separately, you define them in a single configuration dictionary and let the registry handle the set-up.

This will feel excessive for our basic example, but having a distribution registry can make things easier for larger models.

Parameter class

In the Parameters class, distributions are now specified in a nested dictionary called dist_config. Each entry describes the distribution type and its parameters.

Here, the distribution type is hard-coded whilst the mean can be changed. This is fine in our simple example, but for more complex or flexible models, these settings would usually be read from an external configuration file, which allows for easier adjustment without hard coding.

class Parameters:
    """
    Parameter class.

    Attributes
    ----------
    dist_config : dict
        Holds all distribution specifications.
    run_length : int
        Total duration of simulation (minutes).
    verbose : bool
        Whether to print messages as simulation runs.
    """
    def __init__(
        self, interarrival_time=5, run_length=50, verbose=True
    ):
        """
        Initialise Parameters instance.

        Parameters
        ----------
        interarrival_time : float
            Time between arrivals (minutes).
        run_length : int
            Total duration of simulation (minutes).
        verbose : bool
            Whether to print messages as simulation runs.
        """
        self.dist_config = {
            "interarrival_time": {
                "class_name": "Exponential",
                "params": {"mean": interarrival_time}
            }
        }
        self.run_length = run_length
        self.verbose = verbose

Patient class

No changes needed!

Model class

The model now uses DistributionRegistry.create_batch() to set up all required distributions at once.

This creates a dictionary and we can access specific distributions via e.g., self.dist["interarrival_time"].

Seed spawning is now also handled by the registry.

class Model:
    """
    Simulation model.

    Attributes
    ----------
    param : Parameters
        Simulation parameters.
    run_number : int
        Run number for random seed generation.
    env : simpy.Environment
        The SimPy environment for the simulation.
    patients : list
        List of Patient objects.
    arrival_dist : Exponential
        Distribution used to generate random patient inter-arrival times.
    """
    def __init__(self, param, run_number):
        """
        Create a new Model instance.

        Parameters
        ----------
        param : Parameters
            Simulation parameters.
        run_number : int
            Run number for random seed generation.
        """
        self.param = param
        self.run_number = run_number

        # Create SimPy environment
        self.env = simpy.Environment()

        # Set up attributes to store results
        self.patients = []

        # Create all the distributions
        self.dist = DistributionRegistry.create_batch(
            config=dict(self.param.dist_config), main_seed=self.run_number
        )
        if self.param.verbose:
            print(self.dist)

    def generate_arrivals(self):
        """
        Process that generates patient arrivals.
        """
        while True:
            # Sample and pass time to next arrival
            sampled_iat = self.dist["interarrival_time"].sample()
            yield self.env.timeout(sampled_iat)

            # Create a new patient
            patient = Patient(patient_id=len(self.patients)+1,
                              arrival_time=self.env.now)
            self.patients.append(patient)

            # Print arrival time
            if self.param.verbose:
                print(f"Patient {patient.patient_id} arrives " +
                      f"at time: {patient.arrival_time:.3f}")

    def run(self):
        """
        Run the simulation.
        """
        # Schedule arrival generator
        self.env.process(self.generate_arrivals())

        # Run the simulation
        self.env.run(until=self.param.run_length)

Run the model

We can see that results are consistent with before.

param = Parameters()
model = Model(param=param, run_number=0)

{'interarrival_time': Exponential(mean=5)}

model.run()

Patient 1 arrives at time: 16.468
Patient 2 arrives at time: 20.283
Patient 3 arrives at time: 26.545
Patient 4 arrives at time: 27.675
Patient 5 arrives at time: 28.779
Patient 6 arrives at time: 37.778
Patient 7 arrives at time: 38.108
Patient 8 arrives at time: 42.611
Patient 9 arrives at time: 44.088

Changes in results when switch to using distribution registry

If you switch from setting up distributions manually to using the sim-tools DistributionRegistry, you may see differences in which seeds get used for your distributions, even if your overall parameters and input seed stay the same. This is because the order in which the seed sequence is assigned to each distribution can change depending on your setup.

Manual set-up (you choose the assignment order)

Say you’re simulating child and adult arrivals to a clinic. When you set up the distributions manually, you decide on which seeds are used. For example:

# Manual order: child first, adult second
child_arrival_dist = Exponential(mean=..., random_seed=seeds[0])
adult_arrival_dist = Exponential(mean=..., random_seed=seeds[1])

Registry set-up

If you use DistributionRegistry.create_batch() with sort=True (the default), the distributions are sorted alphabetically before assigning seeds. This means they are always initialised in a predictable, consistent order - regardless of how the configuration dictionary has been written. However, this will often mean changes from your manual approach. For example:

Adult arrivals -> seeds[0]
Child arrivals -> seeds[1]

With sort=False, seeds are assigned using the order in your configuration dictionary as written. This can mean it is consistent with your manual approach - although also means it will change if you change the order in the dictionary. For example:

Child arrivals -> seeds[0]
Adult arrivals -> seeds[1]

One solution is to write a DistributionRegistry class (based on that in the Python package sim-tools) to provide a structured way to organise all your distributions. Instead of creating each distribution separately in the code, you define them in a single configuration dictionary and let the registry handle the set-up.

This will feel excessive for our basic example, but having a distribution registry can make things easier for larger models.

DistributionRegistry

This is coped from the R stroke pathway example simulation model.

Click to visit rdesrap_stroke repository

#' Registry for parametrised probability distributions
#'
#' @description
#' The DistributionRegistry manages and generates parameterized samplers for a
#' variety of probability distributions. Common distributions are included by
#' default, and more can be added.
#'
#' Once defined, you can create sampler objects for each distribution -
#' individually (`create`) or in batches (`create_batch`) - and then easily
#' draw random samples from these objects.
#'
#' To add more built-in distributions, edit `initialize()`. To add custom ones
#' at any time, use `register()`.
#'
#' @docType class
#' @importFrom R6 R6Class
#' @export

DistributionRegistry <- R6Class("DistributionRegistry", list( # nolint: object_name_linter

  #' @field registry Named list of registered distribution generator functions.
  registry = list(),

  #' @description
  #' Pre-registers a set of common base R distribution generators.
  initialize = function() {
    self$register("exponential", function(mean) {
      function(size = 1L) rexp(size, rate = 1L / mean)
    })
    self$register("uniform", function(min, max) {
      function(size = 1L) runif(size, min = min, max = max)
    })
    self$register("discrete", function(values, prob) {
      values <- unlist(values)
      prob <- unlist(prob)
      stopifnot(length(values) == length(prob), prob >= 0L)
      if (round(abs(sum(prob) - 1L), 2L) > 0.01) {
        stop(sprintf(
          "'prob' must sum to 1 +- 0.01. Sum: %s", abs(sum(unlist(prob)))
        ), call. = FALSE)
      }
      # Sampling function
      function(size = 1L) {
        sample(values, size = size, replace = TRUE, prob = prob)
      }
    })
    self$register("normal", function(mean, sd) {
      function(size = 1L) rnorm(size, mean = mean, sd = sd)
    })
    self$register("lognormal", function(meanlog, sdlog) {
      function(size = 1L) rlnorm(size, meanlog = meanlog, sdlog = sdlog)
    })
    self$register("poisson", function(lambda) {
      function(size = 1L) rpois(size, lambda = lambda)
    })
    self$register("binomial", function(size_param, prob) {
      function(size = 1L) rbinom(size, size = size_param, prob = prob)
    })
    self$register("geometric", function(prob) {
      function(size = 1L) rgeom(size, prob = prob)
    })
    self$register("beta", function(shape1, shape2) {
      function(size = 1L) rbeta(size, shape1 = shape1, shape2 = shape2)
    })
    self$register("gamma", function(shape, rate) {
      function(size = 1L) rgamma(size, shape = shape, rate = rate)
    })
    self$register("chisq", function(df) {
      function(size = 1L) rchisq(size, df = df)
    })
    self$register("t", function(df) {
      function(size = 1L) rt(size, df = df)
    })
  },

  #' @description
  #' Register a distribution generator under a name.
  #'
  #' Typically, the generator should be a function that:
  #' 1. Accepts parameters for a distribution.
  #' 2. Returns another function - the *sampler* - which takes a `size`
  #' argument and produces that many random values from the specified
  #' distribution.
  #'
  #' By storing generators rather than fixed samplers, you can create as many
  #' different samplers as you want later, each with different parameters,
  #' while reusing the same generator code.
  #'
  #' @param name Distribution name (string)
  #' @param generator Function to create a sampler given its parameters.
  register = function(name, generator) {
    self$registry[[name]] <- generator
  },

  #' @description
  #' Get a registered distribution generator by name.
  #'
  #' @param name Distribution name (string)
  #' @return Generator function for the distribution.
  get = function(name) {
    if (!(name %in% names(self$registry)))
      stop(
        sprintf(
          c("Distribution '%s' not found.\nAvailable distributions:\n\t%s\n",
            "Use register() to add new distributions."),
          name, paste(names(self$registry), collapse = ",\n\t")
        ),
        call. = FALSE
      )
    self$registry[[name]]
  },

  #' @description
  #' Convert mean/sd to lognormal parameters, returning the corresponding
  #' \code{meanlog} and \code{sdlog} parameters used by R's \code{rlnorm()}.
  #'
  #' @param params Named list with two elements: mean and sd.
  #' @return A named list of the same structure, but with elements
  #' \code{meanlog} and \code{sdlog} for each patient type.
  transform_to_lnorm = function(params) {
    variance <- params$sd^2L
    sigma_sq <- log(variance / (params$mean^2L) + 1L)
    sdlog <- sqrt(sigma_sq)
    meanlog <- log(params$mean) - sigma_sq / 2L
    list(meanlog = meanlog, sdlog = sdlog)
  },

  #' @description
  #' Create a parameterised sampler for a distribution.
  #'
  #' The returned function draws random samples of a specified size from
  #' the given distribution with fixed parameters.
  #'
  #' For "lognormal", if "meanlog" and "sdlog" are present in the parameters,
  #' they will be used as-is. If not, but "mean" and "sd" are provided, these
  #' will be transformed into "meanlog"/"sdlog" automatically.
  #'
  #' @param name Distribution name
  #' @param ... Parameters for the generator
  #' @return A function that draws samples when called.
  create = function(name, ...) {
    dots <- list(...)
    if (name == "lognormal") {
      if (!is.null(dots$meanlog) && !is.null(dots$sdlog)) {
        dots <- dots
      } else if (!is.null(dots$mean) && !is.null(dots$sd)) {
        transformed <- self$transform_to_lnorm(dots)
        dots <- c(transformed, dots[setdiff(names(dots), c("mean", "sd"))])
      } else {
        stop("Please supply either 'meanlog' and 'sdlog', or 'mean' and 'sd' ",
             "for a lognormal distribution.", call. = FALSE)
      }
    }
    # Calls the `get()` method above which finds the distribution generator
    # function, then do.call() populates it with dots (a list of arguments).
    generator <- self$get(name)
    do.call(generator, dots)
  },

  #' @description
  #' Batch-create samplers from a configuration list.
  #'
  #' The configuration should be a list of lists, each sublist specifying a
  #' `class_name` (distribution) and `params` (parameter list for that
  #' distribution).
  #'
  #' @param config Named or unnamed list. Each entry is a list with
  #'   'class_name' and 'params'.
  #' @return List of parameterised samplers (named if config is named).
  create_batch = function(config) {
    if (!is.list(config)) {
      stop("config must be a list (named or unnamed)", call. = FALSE)
    }
    # Calls `create()` for each distribution specified in config
    lapply(config, function(cfg) {
      do.call(self$create, c(cfg$class_name, cfg$params))
    })
  }
)
)

Parameter function

In the create_params function, distributions are now specified in a nested list called dist_config. Each entry describes the distribution type and its parameters.

#' Generate parameter list.
#'
#' @param interarrival_time Numeric. Time between arrivals (minutes).
#' @param run_length Numeric. Total duration of simulation (minutes).
#' @param verbose Boolean. Whether to print messages as simulation runs.
#'
#' @return A named list of parameters.
#' @export

create_params <- function(interarrival_time = 5L,
                          run_length = 50L,
                          verbose = TRUE) {
  list(
    dist_config = list(
      interarrival_time = list(
        class_name = "exponential", 
        params = list(mean = interarrival_time)
      )
    ), 
    run_length = run_length,
    verbose = verbose
  )
}

Model function

#' Run simulation.
#'
#' @param param List. Model parameters.
#' @param run_number Numeric. Run number for random seed generation.
#'
#' @importFrom simmer add_generator run simmer timeout trajectory
#'
#' @export

model <- function(param, run_number) {

  # Set random seed based on run number
  set.seed(run_number)

  # Create simmer environment
  env <- simmer("simulation", verbose = param[["verbose"]])

  # Set up sampling distributions 
  registry <- DistributionRegistry$new() 
  param[["dist"]] <- registry$create_batch( 
    as.list(param[["dist_config"]]) 
  ) 

  # Define the patient trajectory
  patient <- trajectory("consultation") |>
    timeout(1L)

  env <- env |>
    # Add patient generator
    add_generator("patient", patient, function() {
      param[["dist"]][["interarrival_time"]]() 
    }) |>
    # Run the simulation
    simmer::run(until = param[["run_length"]])

}

Run the model

We can see that results are consistent with before.

param <- create_params()
model(param = param, run_number = 1L)

         0 |    source: patient          |       new: patient0         | 3.77591
   3.77591 |   arrival: patient0         |  activity: Timeout          | 1
   3.77591 |    source: patient          |       new: patient1         | 9.68412
   9.68412 |   arrival: patient1         |  activity: Timeout          | 1
   9.68412 |    source: patient          |       new: patient2         | 10.4127
   10.4127 |   arrival: patient2         |  activity: Timeout          | 1
   10.4127 |    source: patient          |       new: patient3         | 11.1116
   11.1116 |   arrival: patient3         |  activity: Timeout          | 1
   11.1116 |    source: patient          |       new: patient4         | 13.292
    13.292 |   arrival: patient4         |  activity: Timeout          | 1
    13.292 |    source: patient          |       new: patient5         | 27.7668
   27.7668 |   arrival: patient5         |  activity: Timeout          | 1
   27.7668 |    source: patient          |       new: patient6         | 33.9146
   33.9146 |   arrival: patient6         |  activity: Timeout          | 1
   33.9146 |    source: patient          |       new: patient7         | 36.613
    36.613 |   arrival: patient7         |  activity: Timeout          | 1
    36.613 |    source: patient          |       new: patient8         | 41.3959
   41.3959 |   arrival: patient8         |  activity: Timeout          | 1
   41.3959 |    source: patient          |       new: patient9         | 42.1311
   42.1311 |   arrival: patient9         |  activity: Timeout          | 1
   42.1311 |    source: patient          |       new: patient10        | 49.0848
   49.0848 |   arrival: patient10        |  activity: Timeout          | 1
   49.0848 |    source: patient          |       new: patient11        | 52.8949

Explore the example models

Nurse visit simulation

Click to visit pydesrap_mms repository

Key file	`simulation/param.py` `simulation/patient.py` `simulation/model.py`
What to look for?	The structure is similar to the first example on this page. The `Param` class (in `param.py`) stores all simulation settings, including `patient_inter`. The `Model` class (in `model.py`) sets up the simulation environment, initialises the required distribution for arrivals, and `generate_patient_arrivals()` samples inter-arrival times and triggers arrivals.
Why it matters?	Clear simple set-up with separation of parameters and model logic.

Click to visit rdesrap_mms repository

Key file	`R/parameters.R` `simulation/patient.py` `R/model.R`
What to look for?	The structure is similar to the first example on this page. The `parameters` function stores all simulation settings, including `patient_inter`. The `model` function sets up the simulation environment, creates a patient arrival generator, and runs the simulation.
Why it matters?	Clear simple set-up with separation of parameters and model logic.

Stroke pathway simulation

Click to visit pydesrap_stroke repository

Key file	`simulation/parameters.py` `simulation/model.py`
What to look for?	The model uses a `DistributionRegistry` to manage multiple distributions for different entities and processes. The `dist_config` from `Param` (`parameters.py`) is used by the registry. The dictionary of distributions is nested for easy access - restructured as `dist[type][unit][patient]`. Arrivals are generated using the adaptable `patient_generator()` method, which selects the appropriate distribution to sample based on patient type and unit.
Why it matters?	Using a distribution registry and a flexible arrival generator method avoids repeated, hard-coded logic and simplifies maintenance in this more complex model.

Click to visit rdesrap_stroke repository

Key file	`inst/extdata/parameters.json` `R/parameters.R` `R/model.R` `R/distribution_registry.R`
What to look for?	The model uses a `DistributionRegistry` class to manage multiple distributions for different entities and processes. The `dist_config` from `parameters()` is used by the registry. The list of distributions is nested for easy access - restructured as `dist[type][unit][patient]`. The arrival generator is created using the `add_patient_generator()` functions, which sets up an appropriate distribution to sample based on patient type and unit.
Why it matters?	Using a distribution registry and a flexible arrival generator method avoids repeated, hard-coded logic and simplifies maintenance in this more complex model.

Test yourself

Have a go at setting up a basic simulation model with entity generation, while following good package organisation.

Task:

Copy the code: Take Parameters and Model class from above (simple or with distribution registry) and place them in modules within your package (e.g., parameters.py and model.py within a simulation/ folder).
Set up a notebook: Create a new notebook file (e.g., notebooks/run_model.ipynb).
Import and run the model: From your notebook, import Parameters and Model from your package and run the model.

Copy the code: Take the create_params() and model() functions from above (simple or with distributoin registry) and place them in modules within your package (e.g., create_params.R and model.R within an R/ folder).
Set up an Rmarkdown file: Create a new Rmarkdown file (e.g., rmarkdown/run_model.Rmd).
Import and run the model: From your Rmarkdown file, import create_params and model from your package and run the model.

Experiment:
- If you copied the simple versions, try using the versions with distribution registries, and vice versa.
- Try different seeds by varying run_number in your notebook.
- Try adding a new arrival type - add the distribution parameters in Parameters, and see how you can sample from them in Model.

Tip

If you are unsure how to organise these classes as a package, checkout out the Structuring as a Package page for a tutorial on this.

Further information

“HSMA - little book of DES” from Sammi Rosser, Dan Chalk and Amy Heather 2025

Introduction to writing discrete event simulation models for healthcare in SimPy.

N/A