Replications

Learning objectives:

  • Adapt our model to run multiple replications.

Pre-reading:

This page continues on from: Performance measures.

Entity generation → Entity processing → Initialisation bias → Performance measures → Replications

Required packages:

These should be available from environment setup in the “Test yourself” section of Environments.

from IPython.display import HTML
from itables import to_html_datatable
import numpy as np
import pandas as pd
import scipy.stats as st
import simpy
from sim_tools.distributions import Exponential

library(dplyr)
library(kableExtra)
library(knitr)
library(simmer)
library(tidyr)


Replications in discrete event simulation are necessary because each run includes randomness, leading to different results every time.

Running multiple replications reveals the range and variability of possible system outcomes, allowing the calculation of averages and confidence intervals for results.
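
As a minimal illustration (a toy example, not the clinic model itself), the same sampling process gives different averages under different seeds — exactly the run-to-run variation that replications are designed to capture:

# Toy example: mean of 50 exponential "consultation times" under three seeds
for seed in range(3):
    rng = np.random.default_rng(seed)
    samples = rng.exponential(scale=10, size=50)
    print(f"Seed {seed}: mean = {samples.mean():.2f}")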

How to run replications

We will continue working with the model developed on the previous pages. This model contains all the metrics from the Performance measures page (with utilisation implemented via the time-weighted approach).

Parameter class

Let’s add a new attribute number_of_runs. We’ve arbitrarily set this to five, but the Number of replications page shows how to choose a suitable number to run.

class Parameters:
    """
    Parameter class.

    Attributes
    ----------
    interarrival_time : float
        Mean time between arrivals (minutes).
    consultation_time : float
        Mean length of doctor's consultation (minutes).
    number_of_doctors : int
        Number of doctors.
    warm_up_period : int
        Duration of the warm-up period (minutes).
    data_collection_period : int
        Duration of the data collection period (minutes).
    number_of_runs : int
        The number of runs (i.e., replications).
    verbose : bool
        Whether to print messages as simulation runs.
    """
    def __init__(
        self, interarrival_time=5, consultation_time=10,
        number_of_doctors=3, warm_up_period=30, data_collection_period=40,
        number_of_runs=5, verbose=False
    ):
        """
        Initialise Parameters instance.

        Parameters
        ----------
        interarrival_time : float
            Time between arrivals (minutes).
        consultation_time : float
            Length of consultation (minutes).
        number_of_doctors : int
            Number of doctors.
        warm_up_period : int
            Duration of the warm-up period (minutes).
        data_collection_period : int
            Duration of the data collection period (minutes).
        number_of_runs : int
            The number of runs (i.e., replications).
        verbose : bool
            Whether to print messages as simulation runs.
        """
        self.interarrival_time = interarrival_time
        self.consultation_time = consultation_time
        self.number_of_doctors = number_of_doctors
        self.warm_up_period = warm_up_period
        self.data_collection_period = data_collection_period
        self.number_of_runs = number_of_runs
        self.verbose = verbose

Patient class

No changes required.

class Patient:
    """
    Represents a patient.

    Attributes
    ----------
    patient_id : int
        Unique patient identifier.
    period : str
        Arrival period (warm up or data collection) with emoji.
    arrival_time : float
        Time patient entered the system (minutes).
    wait_time : float
        Time spent waiting for the doctor (minutes).
    time_with_doctor : float
        Time spent in consultation with a doctor (minutes).
    end_time : float
        Time that the patient leaves (minutes), or NaN if they remain in the system.
    """
    def __init__(self, patient_id, period, arrival_time):
        """
        Initialises a new patient.

        Parameters
        ----------
        patient_id : int
            Unique patient identifier.
        period : str
            Arrival period (warm up or data collection) with emoji.
        arrival_time : float
            Time patient entered the system (minutes).
        """
        self.patient_id = patient_id
        self.period = period
        self.arrival_time = arrival_time
        self.wait_time = np.nan
        self.time_with_doctor = np.nan
        self.end_time = np.nan

Monitored resource class

No changes required.

class MonitoredResource(simpy.Resource):
    """
    SimPy Resource subclass that tracks utilisation, queue length and mean
    number of patients in the system.

    Attributes
    ----------
    time_last_event : list
        Time of the most recent resource request or release event.
    area_resource_busy : list
        Cumulative time resources were busy, for computing utilisation.
    area_n_in_queue : list
        Cumulative time patients spent queueing for a resource, for
        computing average queue length.

    Notes
    -----
    Class adapted from Monks, Harper and Heather 2025.
    """

    def __init__(self, *args, **kwargs):
        """
        Initialise the MonitoredResource, resetting monitoring statistics.

        Parameters
        ----------
        *args :
            Positional arguments to be passed to the parent class.
        **kwargs :
            Keyword arguments to be passed to the parent class.
        """
        # Initialise a SimPy Resource
        super().__init__(*args, **kwargs)
        # Run the init_results() method
        self.init_results()

    def init_results(self):
        """
        Reset monitoring attributes.
        """
        self.time_last_event = [self._env.now]
        self.area_resource_busy = [0.0]
        self.area_n_in_queue = [0.0]

    def request(self, *args, **kwargs):
        """
        Update time-weighted statistics before requesting a resource.

        Parameters
        ----------
        *args :
            Positional arguments to be passed to the parent class.
        **kwargs :
            Keyword arguments to be passed to the parent class.

        Returns
        -------
        simpy.events.Event
            Event for the resource request.
        """
        # Update time-weighted statistics
        self.update_time_weighted_stats()
        # Request a resource
        return super().request(*args, **kwargs)

    def release(self, *args, **kwargs):
        """
        Update time-weighted statistics before releasing a resource.

        Parameters
        ----------
        *args :
            Positional arguments to be passed to the parent class.
        **kwargs :
            Keyword arguments to be passed to the parent class.

        Returns
        -------
        simpy.events.Event
            Event for the resource release.
        """
        # Update time-weighted statistics
        self.update_time_weighted_stats()
        # Release a resource
        return super().release(*args, **kwargs)

    def update_time_weighted_stats(self):
        """
        Update the time-weighted statistics for resource usage.

        Notes
        -----
        - These sums can be referred to as "the area under the curve".
        - They are called "time-weighted" statistics as they account for how
          long certain events or states persist over time.
        """
        # Calculate time since last event
        time_since_last_event = self._env.now - self.time_last_event[-1]

        # Add record of current time
        self.time_last_event.append(self._env.now)

        # Add "area under curve" of resources in use
        # self.count is the number of resources in use
        self.area_resource_busy.append(self.count * time_since_last_event)

        # Add "area under curve" of people in queue
        # len(self.queue) is the number of requests queued
        self.area_n_in_queue.append(len(self.queue) * time_since_last_event)

Model class

No changes required.

class Model:
    """
    Simulation model.

    Attributes
    ----------
    param : Parameters
        Simulation parameters.
    run_number : int
        Run number for random seed generation.
    env : simpy.Environment
        The SimPy environment for the simulation.
    doctor : simpy.Resource
        SimPy resource representing doctors.
    patients : list
        List of Patient objects.
    results_list : list
        List of dictionaries with the attributes of each patient.
    area_n_in_system : list of float
        List containing incremental area contributions used for
        time-weighted statistics of the number of patients in the system.
    time_last_n_in_system : float
        Simulation time at last update of the number-in-system statistic.
    n_in_system : int
        Current number of patients present in the system, including
        waiting and being served.
    arrival_dist : Exponential
        Distribution used to generate random patient inter-arrival times.
    consult_dist : Exponential
        Distribution used to generate length of a doctor's consultation.
    """
    def __init__(self, param, run_number):
        """
        Create a new Model instance.

        Parameters
        ----------
        param : Parameters
            Simulation parameters.
        run_number : int
            Run number for random seed generation.
        """
        self.param = param
        self.run_number = run_number

        # Create SimPy environment
        self.env = simpy.Environment()

        # Create resource
        self.doctor = MonitoredResource(
            self.env, capacity=self.param.number_of_doctors
        )

        # Create a random seed sequence based on the run number
        ss = np.random.SeedSequence(self.run_number)
        seeds = ss.spawn(2)

        # Set up attributes to store results
        self.patients = []
        self.results_list = []
        self.area_n_in_system = [0]
        self.time_last_n_in_system = self.env.now
        self.n_in_system = 0

        # Initialise distributions
        self.arrival_dist = Exponential(mean=self.param.interarrival_time,
                                        random_seed=seeds[0])
        self.consult_dist = Exponential(mean=self.param.consultation_time,
                                        random_seed=seeds[1])

    def update_n_in_system(self, inc):
        """
        Update the time-weighted statistics for number of patients in system.

        Parameters
        ----------
        inc : int
            Change in the number of patients (+1, 0, -1).
        """
        # Compute time since last event and calculate area under curve for that
        duration = self.env.now - self.time_last_n_in_system
        self.area_n_in_system.append(self.n_in_system * duration)
        # Update time and n in system
        self.time_last_n_in_system = self.env.now
        self.n_in_system += inc

    def generate_arrivals(self):
        """
        Process that generates patient arrivals.
        """
        while True:
            # Sample and pass time to next arrival
            sampled_iat = self.arrival_dist.sample()
            yield self.env.timeout(sampled_iat)

            # Check whether arrived during warm-up or data collection
            if self.env.now < self.param.warm_up_period:
                period = "\U0001F538 WU"
            else:
                period = "\U0001F539 DC"

            # Create a new patient
            patient = Patient(patient_id=len(self.patients)+1,
                              period=period,
                              arrival_time=self.env.now)
            self.patients.append(patient)

            # Update the number in the system
            self.update_n_in_system(inc=1)

            # Print arrival time
            if self.param.verbose:
                print(f"{patient.period} Patient {patient.patient_id} " +
                      f"arrives at time: {patient.arrival_time:.3f}")

            # Start process of consultation
            self.env.process(self.consultation(patient))

    def consultation(self, patient):
        """
        Process that simulates a consultation.

        Parameters
        ----------
        patient :
            Instance of the Patient() class representing a single patient.
        """
        start_wait = self.env.now
        # Patient requests access to a doctor (resource)
        with self.doctor.request() as req:
            yield req

            # Record how long patient waited before consultation
            patient.wait_time = self.env.now - start_wait

            if self.param.verbose:
                print(f"{patient.period} Patient {patient.patient_id} " +
                      f"starts consultation at: {self.env.now:.3f}")

            # Sample consultation duration and pass time spent with doctor
            patient.time_with_doctor = self.consult_dist.sample()
            yield self.env.timeout(patient.time_with_doctor)

            # Update number in system
            self.update_n_in_system(inc=-1)

            # Record end time
            patient.end_time = self.env.now
            if self.param.verbose:
                print(f"{patient.period} Patient {patient.patient_id} " +
                      f"leaves at: {patient.end_time:.3f}")

    def reset_results(self):
        """
        Reset results.
        """
        self.patients = []
        self.doctor.init_results()
        # For number in system, we reset the area and time but not the count,
        # as it should still include any warm-up patients remaining in the system
        self.area_n_in_system = [0]
        self.time_last_n_in_system = self.env.now

    def warmup(self):
        """
        Reset result collection after the warm-up period.
        """
        if self.param.warm_up_period > 0:
            # Delay process until warm-up period has completed
            yield self.env.timeout(self.param.warm_up_period)
            # Reset results variables
            self.reset_results()
            if self.param.verbose:
                print(f"Warm up period ended at time: {self.env.now}")

    def run(self):
        """
        Run the simulation.
        """
        # Schedule arrival generator and warm-up
        self.env.process(self.generate_arrivals())
        self.env.process(self.warmup())

        # Run the simulation
        self.env.run(until=(self.param.warm_up_period +
                            self.param.data_collection_period))

        # At simulation end, update time-weighted statistics by accounting
        # for the time from the last event up to the simulation finish.
        self.doctor.update_time_weighted_stats()

        # Run final calculation of number in system
        self.update_n_in_system(inc=0)

        # Create list of dictionaries containing each patient's attributes
        self.results_list = [x.__dict__ for x in self.patients]

Summary statistics function

This function computes descriptive statistics for a dataset, returning the mean, standard deviation, and 95% confidence interval.

  • If the input has no valid values, all outputs are NaN.

  • If the input has only one or two values, it returns the mean only, with the rest as NaN.

  • If the input has 3 or more values, it calculates the mean, standard deviation, and a 95% confidence interval using the t-distribution (scipy.stats).

  • If all values are identical (zero variance), the confidence interval equals the mean.

def summary_stats(data):
    """
    Calculate mean, standard deviation and 95% confidence interval (CI).

    Parameters
    ----------
    data : pd.Series
        Data to use in calculation.

    Returns
    -------
    tuple
        (mean, standard deviation, CI lower, CI upper).
    """
    # Remove any NaN from the series
    data = data.dropna()

    # Find number of observations
    count = len(data)

    # If there are no observations, then set all to NaN
    if count == 0:
        mean, std_dev, ci_lower, ci_upper = np.nan, np.nan, np.nan, np.nan
    # If there are only one or two observations, we can do the mean but not the others
    elif count < 3:
        mean = data.mean()
        std_dev, ci_lower, ci_upper = np.nan, np.nan, np.nan
    # With three or more observations, we can calculate everything...
    else:
        mean = data.mean()
        std_dev = data.std()
        # Special case for CI if variance is 0
        if np.var(data) == 0:
            ci_lower, ci_upper = mean, mean
        else:
            # Calculation of CI uses t-distribution, which is suitable for
            # smaller sample sizes (n<30)
            ci_lower, ci_upper = st.t.interval(
                confidence=0.95,
                df=count-1,
                loc=mean,
                scale=st.sem(data))
    return mean, std_dev, ci_lower, ci_upper
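
For example, a quick check of the function on a small series (illustrative values only):

wait_times = pd.Series([4.2, 5.1, 3.8, 6.0, 4.9])
mean, std_dev, ci_lower, ci_upper = summary_stats(wait_times)
print(f"Mean: {mean:.2f}, SD: {std_dev:.2f}, "
      f"95% CI: {ci_lower:.2f} to {ci_upper:.2f}")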

Runner class

We run replications with a new method run_reps(), explained line-by-line below.

class Runner:
    """
    Run the simulation.

    Attributes
    ----------
    param : Parameters
        Simulation parameters.
    """
    def __init__(self, param):
        """
        Initialise a new instance of the Runner class.

        Parameters
        ----------
        param : Parameters
            Simulation parameters.
        """
        self.param = param

    def run_single(self, run):
        """
        Runs the simulation once and performs results calculations.

        Parameters
        ----------
        run : int
            Run number for the simulation.

        Returns
        -------
        dict
            Contains patient-level results and results from each run.
        """
        model = Model(param=self.param, run_number=run)
        model.run()

        # Patient results
        patient_results = pd.DataFrame(model.results_list)
        patient_results["run"] = run
        patient_results["time_in_system"] = (
            patient_results["end_time"] - patient_results["arrival_time"]
        )
        # For each patient, if they haven't seen a doctor yet, calculate
        # their wait as current time minus arrival, else set as missing
        patient_results["unseen_wait_time"] = np.where(
            patient_results["time_with_doctor"].isna(),
            model.env.now - patient_results["arrival_time"], np.nan
        )

        # Run results
        run_results = {
            "run_number": run,
            "arrivals": len(patient_results),
            "mean_wait_time": patient_results["wait_time"].mean(),
            "mean_time_with_doctor": (
                patient_results["time_with_doctor"].mean()
            ),
            "mean_utilisation_tw": (
                sum(model.doctor.area_resource_busy) / (
                    self.param.number_of_doctors *
                    self.param.data_collection_period
                )
            ),
            "mean_queue_length": (
                sum(model.doctor.area_n_in_queue) /
                self.param.data_collection_period
            ),
            "mean_time_in_system": patient_results["time_in_system"].mean(),
            "mean_patients_in_system": (
                sum(model.area_n_in_system) /
                self.param.data_collection_period
            ),
            "unseen_count": patient_results["time_with_doctor"].isna().sum(),
            "unseen_wait_time": patient_results["unseen_wait_time"].mean()
        }

        return {
            "patient": patient_results,
            "run": run_results
        }

    def run_reps(self): 
        """
        Execute a single model configuration for multiple runs.

        Returns
        -------
        dict
            Contains patient-level results, results from each run and
            overall results.
        """
        # Run replications
        all_results = [self.run_single(run)
                       for run in range(self.param.number_of_runs)]

        # Separate results from each run into appropriate lists
        patient_results_list = [result["patient"] for result in all_results]
        run_results_list = [result["run"] for result in all_results]

        # Convert lists into dataframes
        patient_results_df = pd.concat(
            patient_results_list, ignore_index=True
        )
        run_results_df = pd.DataFrame(run_results_list)

        # Calculate average results and uncertainty from across all runs
        uncertainty_metrics = {}
        run_col = run_results_df.columns

        # Loop through the run performance measure columns
        # Calculate mean, standard deviation and 95% confidence interval
        for col in run_col[~run_col.isin(["run_number", "scenario"])]: 
            uncertainty_metrics[col] = dict(zip(
                ["mean", "std_dev", "lower_95_ci", "upper_95_ci"], 
                summary_stats(run_results_df[col])
            ))
        # Convert to dataframe
        overall_results_df = pd.DataFrame(uncertainty_metrics)

        return {
            "patient": patient_results_df, 
            "run": run_results_df, 
            "overall": overall_results_df
        }

Explaining Runner.run_reps()

def run_reps(self):
    """
    Execute a single model configuration for multiple runs.

    Returns
    -------
    dict
        Contains patient-level results, results from each run and
        overall results.
    """
    # Run replications
    all_results = [self.run_single(run)
                    for run in range(self.param.number_of_runs)]

It starts by calling run_single() repeatedly for the requested number of simulation runs, passing the run number to set the random seed (ensuring each run uses a different seed!).
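
Recall from the Model class that the run number seeds a SeedSequence, which is spawned into independent child seeds for the arrival and consultation distributions. A minimal standalone sketch of that mechanism, using NumPy generators directly rather than the sim_tools Exponential class:

# Each run number seeds its own SeedSequence, which is spawned into
# independent child seeds for the arrival and consultation distributions
for run in range(3):
    ss = np.random.SeedSequence(run)
    arrival_seed, consult_seed = ss.spawn(2)
    arrival_rng = np.random.default_rng(arrival_seed)
    consult_rng = np.random.default_rng(consult_seed)
    print(f"Run {run}: first inter-arrival = "
          f"{arrival_rng.exponential(scale=5):.2f}, "
          f"first consultation = {consult_rng.exponential(scale=10):.2f}")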


# Separate results from each run into appropriate lists
patient_results_list = [result["patient"] for result in all_results]
run_results_list = [result["run"] for result in all_results]

# Convert lists into dataframes
patient_results_df = pd.concat(
    patient_results_list, ignore_index=True
)
run_results_df = pd.DataFrame(run_results_list)

The results then get stacked on top of each other, so you end up with one big dataframe for all patients, and another for all runs. The run column (in the patient results) and the run_number column (in the run results) show which replication each row came from.
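
A standalone sketch of the stacking step with toy data (not model output):

dfs = [
    pd.DataFrame({"run": 0, "wait_time": [1.2, 0.0]}),
    pd.DataFrame({"run": 1, "wait_time": [3.4]}),
]
combined = pd.concat(dfs, ignore_index=True)
# "combined" has one row per patient across all runs, with "run"
# identifying the replication each row came from
print(combined)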


# Calculate average results and uncertainty from across all runs
uncertainty_metrics = {}
run_col = run_results_df.columns

# Loop through the run performance measure columns
# Calculate mean, standard deviation and 95% confidence interval
for col in run_col[~run_col.isin(["run_number", "scenario"])]:
    uncertainty_metrics[col] = dict(zip(
        ["mean", "std_dev", "lower_95_ci", "upper_95_ci"],
        summary_stats(run_results_df[col])
    ))
# Convert to dataframe
overall_results_df = pd.DataFrame(uncertainty_metrics)

We then loop over the performance measure columns in run_results_df (skipping identifier columns such as run_number) and use summary_stats() to calculate the mean, standard deviation, and 95% confidence interval of each measure across the runs.
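
The resulting overall_results_df has one column per performance measure and one row per statistic. A toy sketch of that structure (illustrative values only):

toy = {
    "mean_wait_time": dict(zip(
        ["mean", "std_dev", "lower_95_ci", "upper_95_ci"],
        (1.1, 0.5, 0.4, 1.8))),
    "arrivals": dict(zip(
        ["mean", "std_dev", "lower_95_ci", "upper_95_ci"],
        (7.6, 2.3, 4.7, 10.5))),
}
# Outer keys become columns, inner keys become the row index
print(pd.DataFrame(toy))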


return {
    "patient": patient_results_df,
    "run": run_results_df,
    "overall": overall_results_df
}

At the end, the function returns a dictionary with all patient results (patient), all run results (run), and a summary table with statistics calculated across all runs (overall).

Run the model

runner = Runner(param=Parameters())
result = runner.run_reps()
HTML(to_html_datatable(result["patient"]))
HTML(to_html_datatable(result["run"]))
HTML(to_html_datatable(result["overall"]))
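
Specific values can also be pulled straight from the overall results, for example (the exact numbers will depend on the runs):

mean_wait = result["overall"].loc["mean", "mean_wait_time"]
ci_lower = result["overall"].loc["lower_95_ci", "mean_wait_time"]
ci_upper = result["overall"].loc["upper_95_ci", "mean_wait_time"]
print(f"Mean wait time: {mean_wait:.2f} minutes "
      f"(95% CI: {ci_lower:.2f} to {ci_upper:.2f})")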

Parameter function

Let’s add a new attribute number_of_runs. We’ve arbitrarily set this to five, but the Number of replications page shows how to choose a suitable number to run.

#' Generate parameter list.
#'
#' @param interarrival_time Numeric. Time between arrivals (minutes).
#' @param consultation_time Numeric. Mean length of doctor's
#'   consultation (minutes).
#' @param number_of_doctors Numeric. Number of doctors.
#' @param warm_up_period Numeric. Duration of the warm-up period (minutes).
#' @param data_collection_period Numeric. Duration of the data collection
#'   period (minutes).
#' @param number_of_runs Numeric. The number of runs (i.e., replications).
#' @param verbose Boolean. Whether to print messages as simulation runs.
#'
#' @return A named list of parameters.
#' @export

create_params <- function(
  interarrival_time = 5L,
  consultation_time = 10L,
  number_of_doctors = 3L,
  warm_up_period = 30L,
  data_collection_period = 40L,
  number_of_runs = 5L, 
  verbose = FALSE
) {
  list(
    interarrival_time = interarrival_time,
    consultation_time = consultation_time,
    number_of_doctors = number_of_doctors,
    warm_up_period = warm_up_period,
    data_collection_period = data_collection_period,
    number_of_runs = number_of_runs, 
    verbose = verbose
  )
}

Warm-up function

No changes required.

#' Filters arrivals and resources to remove warm-up results.
#'
#' @param result Named list with two tables: monitored arrivals & resources.
#' @param warm_up_period Length of warm-up period.
#'
#' @importFrom dplyr arrange filter group_by mutate n slice ungroup
#' @importFrom rlang .data
#'
#' @return The named list `result`, but with the tables (`arrivals` and
#' `resources`) filtered to remove warm-up results.
#' @export

filter_warmup <- function(result, warm_up_period) {
  # Skip filtering if the warm-up period is zero
  if (warm_up_period == 0L) return(result)

  # Arrivals: Keep only patients who came during the data collection period
  result[["arrivals"]] <- result[["arrivals"]] |>
    group_by(.data[["name"]]) |>
    filter(all(.data[["start_time"]] >= warm_up_period)) |>
    ungroup()

  # Resources: Filter to resource events in the data collection period
  dc_resources <- filter(result[["resources"]],
                         .data[["time"]] >= warm_up_period)

  if (nrow(dc_resources) > 0L) {

    # For each resource, get the last event during warm-up, and replace time
    # for that event with the start time of the data collection period
    last_usage <- result[["resources"]] |>
      filter(.data[["time"]] < warm_up_period) |>
      arrange(.data[["time"]]) |>
      group_by(.data[["resource"]]) |>
      slice(n()) |>
      mutate(time = warm_up_period) |>
      ungroup()

    # Set that last event of the resource/s as the first row of the dataframe
    # else calculations would assume all resources were idle at the start of
    # the data collection period
    result[["resources"]] <- rbind(last_usage, dc_resources)

  } else {
    # No events after warm-up; use filtered resource table as is
    result[["resources"]] <- dc_resources
  }

  result
}

Functions to calculate different metrics

No changes required.

#' Calculate the number of arrivals
#'
#' @param arrivals Dataframe with times for each patient with each resource.
#'
#' @importFrom dplyr n_distinct summarise ungroup
#' @importFrom rlang .data
#'
#' @return Tibble with column containing total number of arrivals.
#' @export

calc_arrivals <- function(arrivals) {
  arrivals |>
    summarise(arrivals = n_distinct(.data[["name"]])) |>
    ungroup()
}


#' Calculate the mean wait time for each resource
#'
#' @param arrivals Dataframe with times for each patient with each resource.
#'
#' @importFrom dplyr group_by summarise ungroup
#' @importFrom rlang .data
#' @importFrom tidyr drop_na pivot_wider
#' @importFrom tidyselect any_of
#'
#' @return Tibble with columns containing result for each resource.
#' @export

calc_mean_wait <- function(arrivals) {

  # Create subset of data that removes patients who were still waiting
  complete_arrivals <- drop_na(arrivals, any_of("wait_time"))

  # Calculate mean wait time for each resource
  complete_arrivals |>
    group_by(.data[["resource"]]) |>
    summarise(mean_wait_time = mean(.data[["wait_time"]])) |>
    pivot_wider(names_from = "resource",
                values_from = "mean_wait_time",
                names_glue = "mean_wait_time_{resource}") |>
    ungroup()
}


#' Calculate the mean length of time patients spent with each resource
#'
#' @param arrivals Dataframe with times for each patient with each resource.
#' @param resources Dataframe with times patients use or queue for resources.
#'
#' @importFrom dplyr group_by summarise ungroup
#' @importFrom rlang .data
#' @importFrom tidyr drop_na pivot_wider
#' @importFrom tidyselect any_of
#'
#' @return Tibble with columns containing result for each resource.
#' @export

calc_mean_serve_length <- function(arrivals, resources) {

  # Create subset of data that removes patients who were still waiting
  complete_arrivals <- drop_na(arrivals, any_of("serve_length"))

  # Calculate mean serve time for each resource
  complete_arrivals |>
    group_by(.data[["resource"]]) |>
    summarise(mean_serve_time = mean(.data[["serve_length"]])) |>
    pivot_wider(names_from = "resource",
                values_from = "mean_serve_time",
                names_glue = "mean_time_with_{resource}") |>
    ungroup()
}


#' Calculate the resource utilisation
#'
#' Utilisation is given by the total effective usage time (`in_use`) over
#' the total time intervals considered (`dt`).
#'
#' Credit: The utilisation calculation is adapted from the
#' `plot.resources.utilization()` function in simmer.plot 0.1.18, which is
#' shared under an MIT Licence (Ucar I, Smeets B (2023). simmer.plot: Plotting
#' Methods for 'simmer'. https://r-simmer.org
#' https://github.com/r-simmer/simmer.plot.).
#'
#' @param resources Dataframe with times patients use or queue for resources.
#'
#' @importFrom dplyr group_by lead mutate summarise ungroup
#' @importFrom rlang .data
#' @importFrom tidyr pivot_wider
#'
#' @return Tibble with columns containing result for each resource.
#' @export

calc_utilisation <- function(resources) {

  # Calculate utilisation
  util_df <- resources |>
    group_by(.data[["resource"]]) |>
    mutate(
      # Time between this row and the next
      interval_duration = lead(.data[["time"]]) - .data[["time"]],
      # Ensures effective capacity is never less than number of servers in
      # use (in case of situations where servers may exceed "capacity").
      effective_capacity = pmax(.data[["capacity"]], .data[["server"]]),
      # Divide number of servers in use by effective capacity
      # Set to NA if effective capacity is 0
      utilisation = ifelse(.data[["effective_capacity"]] > 0L,
                           .data[["server"]] / .data[["effective_capacity"]],
                           NA_real_)
    )

  # Find total utilisation
  util_df |>
    summarise(
      # Multiply each utilisation by its own unique duration. The total of
      # those is then divided by the total duration of all intervals.
      # Hence, we are calculating a time-weighted average utilisation.
      utilisation = (sum(.data[["utilisation"]] *
                            .data[["interval_duration"]], na.rm = TRUE) /
                        sum(.data[["interval_duration"]], na.rm = TRUE))
    ) |>
    pivot_wider(names_from = "resource",
                values_from = "utilisation",
                names_glue = "utilisation_{resource}") |>
    ungroup()
}


#' Calculate the time-weighted mean queue length.
#'
#' @param arrivals Dataframe with times for each patient with each resource.
#'
#' @importFrom dplyr arrange group_by lead mutate summarise ungroup
#' @importFrom rlang .data
#' @importFrom tidyr pivot_wider
#'
#' @return Tibble with column containing mean queue length.
#' @export

calc_mean_queue_length <- function(arrivals) {
  arrivals |>
    group_by(.data[["resource"]]) |>
    # Sort by arrival time
    arrange(.data[["start_time"]]) |>
    # Calculate time between this row and the next
    mutate(
      interval_duration = (lead(.data[["start_time"]]) - .data[["start_time"]])
    ) |>
    # Multiply each queue length by its own unique duration. The total of
    # those is then divided by the total duration of all intervals.
    # Hence, we are calculating a time-weighted average queue length.
    summarise(mean_queue_length = (
      sum(.data[["queue_on_arrival"]] *
            .data[["interval_duration"]], na.rm = TRUE) /
        sum(.data[["interval_duration"]], na.rm = TRUE)
    )
    ) |>
    # Reshape dataframe
    pivot_wider(names_from = "resource",
                values_from = "mean_queue_length",
                names_glue = "mean_queue_length_{resource}") |>
    ungroup()
}


#' Calculate the mean time in the system for finished patients.
#'
#' @param arrivals Dataframe with times for each patient with each resource.
#'
#' @importFrom dplyr summarise ungroup
#' @importFrom rlang .data
#'
#' @return Tibble with column containing the mean time in the system.
#' @export

calc_mean_time_in_system <- function(arrivals) {
  arrivals |>
    summarise(
      mean_time_in_system = mean(.data[["time_in_system"]], na.rm = TRUE)
    ) |>
    ungroup()
}


#' Calculate the time-weighted mean number of patients in the system.
#'
#' @param patient_count Dataframe with patient counts over time.
#'
#' @importFrom dplyr arrange lead mutate summarise ungroup
#' @importFrom rlang .data
#'
#' @return Tibble with column containing mean number of patients in the system.
#' @export

calc_mean_patients_in_system <- function(patient_count) {
  patient_count |>
    # Sort by time
    arrange(.data[["time"]]) |>
    # Calculate time between this row and the next
    mutate(interval_duration = (lead(.data[["time"]]) - .data[["time"]])) |>
    # Multiply each patient count by its own unique duration. The total of
    # those is then divided by the total duration of all intervals.
    # Hence, we are calculating a time-weighted average patient count.
    summarise(
      mean_patients_in_system = (
        sum(.data[["count"]] * .data[["interval_duration"]], na.rm = TRUE) /
          sum(.data[["interval_duration"]], na.rm = TRUE)
      )
    ) |>
    ungroup()
}

Run results function

No changes required.

#' Get average results for the provided single run.
#'
#' @param results Named list with `arrivals` from `get_mon_arrivals()` and
#'   `resources` from `get_mon_resources()`
#'   (`per_resource = TRUE` and `ongoing = TRUE`).
#' @param run_number Integer index of current simulation run.
#'
#' @importFrom dplyr bind_cols
#' @importFrom tibble tibble
#'
#' @return Tibble with processed results from replication.
#' @export

get_run_results <- function(results, run_number) {
  metrics <- list(
    calc_arrivals(results[["arrivals"]]),
    calc_mean_wait(results[["arrivals"]]),
    calc_mean_serve_length(results[["arrivals"]]),
    calc_utilisation(results[["resources"]]),
    calc_mean_queue_length(results[["arrivals"]]),
    calc_mean_time_in_system(results[["arrivals"]]),
    calc_mean_patients_in_system(results[["patients_in_system"]])
  )
  dplyr::bind_cols(tibble(replication = run_number), metrics)
}

Model function

No changes required.

#' Run simulation.
#'
#' @param param List. Model parameters.
#' @param run_number Numeric. Run number for random seed generation.
#'
#' @importFrom dplyr arrange bind_rows desc left_join mutate transmute
#' @importFrom rlang .data
#' @importFrom simmer add_generator add_resource get_mon_arrivals
#' @importFrom simmer get_attribute get_mon_attributes get_mon_resources
#' @importFrom simmer get_queue_count now release run seize
#' @importFrom simmer set_attribute simmer timeout trajectory
#' @importFrom tidyr drop_na pivot_wider
#' @importFrom tidyselect all_of
#'
#' @return Named list with tables `arrivals`, `resources` and `run_results`.
#' @export

model <- function(param, run_number) {

  # Set random seed based on run number
  set.seed(run_number)

  # Create simmer environment
  env <- simmer("simulation", verbose = param[["verbose"]])

  # Define the patient trajectory
  patient <- trajectory("consultation") |>
    # Record queue length on arrival 
    set_attribute("doctor_queue_on_arrival", 
                  function() get_queue_count(env, "doctor")) |> 
    seize("doctor", 1L) |>
    # Record time resource is obtained
    set_attribute("doctor_serve_start", function() now(env)) |>
    # Record sampled length of consultation 
    set_attribute("doctor_serve_length", function() { 
      rexp(n = 1L, rate = 1L / param[["consultation_time"]]) 
    }) |> 
    timeout(function() get_attribute(env, "doctor_serve_length")) |> 
    release("doctor", 1L)

  env <- env |>
    # Add doctor resource
    add_resource("doctor", param[["number_of_doctors"]]) |>
    # Add patient generator (mon=2 so can get manual attributes)
    add_generator("patient", patient, function() {
      rexp(n = 1L, rate = 1L / param[["interarrival_time"]])
    }, mon = 2L) |>
    # Run the simulation
    simmer::run(until = (param[["warm_up_period"]] +
                           param[["data_collection_period"]]))

  # Extract information on arrivals and resources from simmer environment
  result <- list(
    arrivals = get_mon_arrivals(env, per_resource = TRUE, ongoing = TRUE),
    resources = get_mon_resources(env)
  )

  # Get the extra arrivals attributes
  extra_attributes <- get_mon_attributes(env) |>
    dplyr::select("name", "key", "value") |>
    # Add column with resource name, and remove that from key
    mutate(resource = gsub("_.+", "", .data[["key"]]),
           key = gsub("^[^_]+_", "", .data[["key"]])) |>
    pivot_wider(names_from = "key", values_from = "value")

  # Merge extra attributes with the arrival data
  result[["arrivals"]] <- left_join(
    result[["arrivals"]], extra_attributes, by = c("name", "resource")
  )

  # Add time in system (unfinished patients will be set to NA)
  result[["arrivals"]][["time_in_system"]] <- ( 
    result[["arrivals"]][["end_time"]] - result[["arrivals"]][["start_time"]] 
  )

  # Filter to remove results from the warm-up period
  result <- filter_warmup(result, param[["warm_up_period"]])

  # Gather all start and end times, with a row for each, marked +1 or -1 
  # Drop NA for end time, as those are patients who haven't left system 
  # at the end of the simulation 
  arrivals_start <- transmute( 
    result[["arrivals"]], time = .data[["start_time"]], change = 1L 
  ) 
  arrivals_end <- result[["arrivals"]] |> 
    drop_na(all_of("end_time")) |> 
    transmute(time = .data[["end_time"]], change = -1L) 
  events <- bind_rows(arrivals_start, arrivals_end) 

  # Determine the count of patients in the service with each entry/exit 
  result[["patients_in_system"]] <- events |> 
    # Sort events by time 
    arrange(.data[["time"]], desc(.data[["change"]])) |> 
    # Use cumulative sum to find number of patients in system at each time 
    mutate(count = cumsum(.data[["change"]])) |> 
    dplyr::select(c("time", "count")) 

  # Replace "replication" value with appropriate run number
  result[["arrivals"]] <- mutate(result[["arrivals"]],
                                 replication = run_number)
  result[["resources"]] <- mutate(result[["resources"]],
                                  replication = run_number)
  result[["patients_in_system"]] <- mutate(result[["patients_in_system"]], 
                                           replication = run_number) 

  # Calculate wait time for each patient
  result[["arrivals"]] <- result[["arrivals"]] |>
    mutate(wait_time = .data[["serve_start"]] - .data[["start_time"]])

  # Calculate the average results for that run
  result[["run_results"]] <- get_run_results(result, run_number)

  result
}

Runner function

This new function runner() runs the simulation model multiple times based on the specified param[["number_of_runs"]]. It passes the run number to set the random seed (ensuring each run uses a different seed!).

It combines the results into unified output tables for arrivals, resources, patients in the system, and run-level results. If only one replication is requested, it simply returns that single result without combining.

#' Run simulation for multiple replications.
#'
#' @param param Named list of model parameters.
#'
#' @return Named list with four tables: arrivals, resources, run_results and
#'   patients_in_system.
#' @export

runner <- function(param) {
  # Run simulation for multiple replications
  results <- lapply(
    seq_len(param[["number_of_runs"]]),
    function(i) {
      model(run_number = i, param = param)
    }
  )

  # If only one replication, return without changes
  if (param[["number_of_runs"]] == 1L) {
    return(results[[1L]])
  }

  # Combine results from all replications into unified tables
  all_arrivals <- do.call(
    rbind, lapply(results, function(x) x[["arrivals"]])
  )
  all_resources <- do.call(
    rbind, lapply(results, function(x) x[["resources"]])
  )

  # run_results may have missing data - use bind_rows to fill NAs automatically
  all_run_results <- dplyr::bind_rows(
    lapply(results, function(x) x[["run_results"]])
  )

  all_patients_in_system <- dplyr::bind_rows(
    lapply(results, function(x) x[["patients_in_system"]])
  )

  # Return a list containing the combined tables
  list(
    arrivals = all_arrivals,
    resources = all_resources,
    run_results = all_run_results,
    patients_in_system = all_patients_in_system
  )
}

Run the model

param <- create_params()
result <- runner(param = param)
result[["arrivals"]] |>
  arrange(replication, start_time) |>
  kable() |>
  scroll_box(height = "400px")
name start_time end_time activity_time resource replication queue_on_arrival serve_start serve_length time_in_system wait_time
patient6 30.75713 NA NA doctor 1 0 30.75713 44.2393422 NA 0.0000000
patient7 36.02984 46.38228 10.3524395 doctor 1 0 36.02984 10.3524395 10.3524395 0.0000000
patient8 45.41002 51.95749 6.5474664 doctor 1 0 45.41002 6.5474664 6.5474664 0.0000000
patient9 47.09469 52.97948 5.8847972 doctor 1 0 47.09469 5.8847972 5.8847972 0.0000000
patient10 58.91726 65.33619 6.4189259 doctor 1 0 58.91726 6.4189259 6.4189259 0.0000000
patient11 60.38786 66.04652 5.6586552 doctor 1 0 60.38786 5.6586552 5.6586552 0.0000000
patient12 60.91823 NA NA doctor 1 0 65.33619 11.7331211 NA 4.4179608
patient13 61.21542 NA NA doctor 1 1 66.04652 9.9681296 NA 4.8310963
patient14 64.10899 NA NA doctor 1 2 NA NA NA NA
patient7 34.95654 42.92251 7.9659722 doctor 2 0 34.95654 7.9659722 7.9659722 0.0000000
patient8 42.18289 NA NA doctor 2 0 42.18289 44.9194233 NA 0.0000000
patient9 50.70005 56.90389 6.2038429 doctor 2 0 50.70005 6.2038429 6.2038429 0.0000000
patient10 52.48242 59.37100 6.8885801 doctor 2 0 52.48242 6.8885801 6.8885801 0.0000000
patient11 56.64556 NA NA doctor 2 0 56.90389 15.9005238 NA 0.2583307
patient12 58.31782 NA NA doctor 2 0 59.37100 10.8950082 NA 1.0531762
patient13 58.71667 NA NA doctor 2 1 NA NA NA NA
patient14 61.47146 NA NA doctor 2 1 NA NA NA NA
patient15 61.49116 NA NA doctor 2 2 NA NA NA NA
patient16 64.59513 NA NA doctor 2 3 NA NA NA NA
patient7 31.68591 33.68054 1.9946313 doctor 3 0 31.68591 1.9946313 1.9946313 0.0000000
patient8 33.62819 45.72654 12.0983475 doctor 3 0 33.62819 12.0983475 12.0983475 0.0000000
patient9 39.55955 57.04574 17.4861950 doctor 3 0 39.55955 17.4861950 17.4861950 0.0000000
patient10 43.92088 53.91676 8.1902235 doctor 3 0 45.72654 8.1902235 9.9958851 1.8056616
patient11 44.62077 59.64239 5.7256241 doctor 3 1 53.91676 5.7256241 15.0216166 9.2959925
patient12 54.85752 58.46069 1.4149504 doctor 3 0 57.04574 1.4149504 3.6031694 2.1882191
patient13 63.70977 NA NA doctor 3 0 63.70977 13.7027903 NA 0.0000000
patient14 67.52795 NA NA doctor 3 0 67.52795 33.0249086 NA 0.0000000
patient10 38.64686 42.86897 1.6786021 doctor 4 1 41.19037 1.6786021 4.2221176 2.5435155
patient11 39.86742 61.20384 18.3348719 doctor 4 1 42.86897 18.3348719 21.3364246 3.0015528
patient12 67.56626 68.15052 0.5842628 doctor 4 0 67.56626 0.5842628 0.5842628 0.0000000
patient13 69.52374 NA NA doctor 4 0 69.52374 12.4758717 NA 0.0000000
patient9 35.27459 41.59611 6.3215214 doctor 5 0 35.27459 6.3215214 6.3215214 0.0000000
patient10 44.05495 47.86583 3.8108866 doctor 5 0 44.05495 3.8108866 3.8108866 0.0000000
patient11 54.93294 60.80512 5.8721868 doctor 5 0 54.93294 5.8721868 5.8721868 0.0000000
patient12 56.26097 68.57552 12.3145493 doctor 5 0 56.26097 12.3145493 12.3145493 0.0000000
patient13 56.92370 67.95916 7.1540346 doctor 5 0 60.80512 7.1540346 11.0354604 3.8814258
patient14 60.15735 69.76202 7.2308198 doctor 5 1 62.53120 7.2308198 9.6046736 2.3738538
patient15 66.24633 NA NA doctor 5 0 67.95916 14.7750160 NA 1.7128224
kable(result[["resources"]]) |> scroll_box(height = "400px")
resource time server queue capacity queue_size system limit replication
doctor 30.00000 2 0 3 Inf 2 Inf 1
doctor 30.75713 3 0 3 Inf 3 Inf 1
doctor 32.18941 2 0 3 Inf 2 Inf 1
doctor 35.63447 1 0 3 Inf 1 Inf 1
doctor 36.02984 2 0 3 Inf 2 Inf 1
doctor 45.41002 3 0 3 Inf 3 Inf 1
doctor 46.38228 2 0 3 Inf 2 Inf 1
doctor 47.09469 3 0 3 Inf 3 Inf 1
doctor 51.95749 2 0 3 Inf 2 Inf 1
doctor 52.97948 1 0 3 Inf 1 Inf 1
doctor 58.91726 2 0 3 Inf 2 Inf 1
doctor 60.38786 3 0 3 Inf 3 Inf 1
doctor 60.91823 3 1 3 Inf 4 Inf 1
doctor 61.21542 3 2 3 Inf 5 Inf 1
doctor 64.10899 3 3 3 Inf 6 Inf 1
doctor 65.33619 3 2 3 Inf 5 Inf 1
doctor 66.04652 3 1 3 Inf 4 Inf 1
doctor 30.00000 3 0 3 Inf 3 Inf 2
doctor 30.99578 2 0 3 Inf 2 Inf 2
doctor 34.95654 3 0 3 Inf 3 Inf 2
doctor 38.60992 2 0 3 Inf 2 Inf 2
doctor 42.18289 3 0 3 Inf 3 Inf 2
doctor 42.92251 2 0 3 Inf 2 Inf 2
doctor 45.44842 1 0 3 Inf 1 Inf 2
doctor 50.70005 2 0 3 Inf 2 Inf 2
doctor 52.48242 3 0 3 Inf 3 Inf 2
doctor 56.64556 3 1 3 Inf 4 Inf 2
doctor 56.90389 3 0 3 Inf 3 Inf 2
doctor 58.31782 3 1 3 Inf 4 Inf 2
doctor 58.71667 3 2 3 Inf 5 Inf 2
doctor 59.37100 3 1 3 Inf 4 Inf 2
doctor 61.47146 3 2 3 Inf 5 Inf 2
doctor 61.49116 3 3 3 Inf 6 Inf 2
doctor 64.59513 3 4 3 Inf 7 Inf 2
doctor 30.00000 2 0 3 Inf 2 Inf 3
doctor 31.68591 3 0 3 Inf 3 Inf 3
doctor 32.05731 2 0 3 Inf 2 Inf 3
doctor 33.62819 3 0 3 Inf 3 Inf 3
doctor 33.68054 2 0 3 Inf 2 Inf 3
doctor 39.55955 3 0 3 Inf 3 Inf 3
doctor 43.92088 3 1 3 Inf 4 Inf 3
doctor 44.62077 3 2 3 Inf 5 Inf 3
doctor 45.72654 3 1 3 Inf 4 Inf 3
doctor 53.91676 3 0 3 Inf 3 Inf 3
doctor 54.85752 3 1 3 Inf 4 Inf 3
doctor 57.04574 3 0 3 Inf 3 Inf 3
doctor 58.46069 2 0 3 Inf 2 Inf 3
doctor 59.64239 1 0 3 Inf 1 Inf 3
doctor 63.70977 2 0 3 Inf 2 Inf 3
doctor 67.52795 3 0 3 Inf 3 Inf 3
doctor 30.00000 3 2 3 Inf 5 Inf 4
doctor 31.07024 3 1 3 Inf 4 Inf 4
doctor 38.64686 3 2 3 Inf 5 Inf 4
doctor 38.97027 3 1 3 Inf 4 Inf 4
doctor 39.86742 3 2 3 Inf 5 Inf 4
doctor 41.19037 3 1 3 Inf 4 Inf 4
doctor 42.86897 3 0 3 Inf 3 Inf 4
doctor 43.89745 2 0 3 Inf 2 Inf 4
doctor 51.43846 1 0 3 Inf 1 Inf 4
doctor 61.20384 0 0 3 Inf 0 Inf 4
doctor 67.56626 1 0 3 Inf 1 Inf 4
doctor 68.15052 0 0 3 Inf 0 Inf 4
doctor 69.52374 1 0 3 Inf 1 Inf 4
doctor 30.00000 3 1 3 Inf 4 Inf 5
doctor 32.59593 3 0 3 Inf 3 Inf 5
doctor 33.65738 2 0 3 Inf 2 Inf 5
doctor 35.27459 3 0 3 Inf 3 Inf 5
doctor 41.59611 2 0 3 Inf 2 Inf 5
doctor 44.05495 3 0 3 Inf 3 Inf 5
doctor 45.40062 2 0 3 Inf 2 Inf 5
doctor 47.86583 1 0 3 Inf 1 Inf 5
doctor 54.93294 2 0 3 Inf 2 Inf 5
doctor 56.26097 3 0 3 Inf 3 Inf 5
doctor 56.92370 3 1 3 Inf 4 Inf 5
doctor 60.15735 3 2 3 Inf 5 Inf 5
doctor 60.80512 3 1 3 Inf 4 Inf 5
doctor 62.53120 3 0 3 Inf 3 Inf 5
doctor 66.24633 3 1 3 Inf 4 Inf 5
doctor 67.95916 3 0 3 Inf 3 Inf 5
doctor 68.57552 2 0 3 Inf 2 Inf 5
doctor 69.76202 1 0 3 Inf 1 Inf 5
kable(result[["run_results"]])
replication arrivals mean_wait_time_doctor mean_time_with_doctor utilisation_doctor mean_queue_length_doctor mean_time_in_system mean_patients_in_system
1 9 1.1561321 12.600360 0.7276328 0.0867587 6.972457 2.3250216
2 10 0.2185845 15.462225 0.7846974 0.3030654 7.019465 2.3551596
3 8 1.6612341 11.704709 0.8021880 0.2856074 10.033307 1.7861155
4 4 1.3862671 8.268402 0.5661071 0.9366037 8.714268 0.8466787
5 7 1.1383003 8.211288 0.8055934 0.1965982 8.159880 1.5215678
result[["patients_in_system"]] |>
  kable() |>
  scroll_box(height = "400px")
time count replication
30.75713 1 1
36.02984 2 1
45.41002 3 1
46.38228 2 1
47.09469 3 1
51.95749 2 1
52.97948 1 1
58.91726 2 1
60.38786 3 1
60.91823 4 1
61.21542 5 1
64.10899 6 1
65.33619 5 1
66.04652 4 1
34.95654 1 2
42.18289 2 2
42.92251 1 2
50.70005 2 2
52.48242 3 2
56.64556 4 2
56.90389 3 2
58.31782 4 2
58.71667 5 2
59.37100 4 2
61.47146 5 2
61.49116 6 2
64.59513 7 2
31.68591 1 3
33.62819 2 3
33.68054 1 3
39.55955 2 3
43.92088 3 3
44.62077 4 3
45.72654 3 3
53.91676 2 3
54.85752 3 3
57.04574 2 3
58.46069 1 3
59.64239 0 3
63.70977 1 3
67.52795 2 3
38.64686 1 4
39.86742 2 4
42.86897 1 4
61.20384 0 4
67.56626 1 4
68.15052 0 4
69.52374 1 4
35.27459 1 5
41.59611 0 5
44.05495 1 5
47.86583 0 5
54.93294 1 5
56.26097 2 5
56.92370 3 5
60.15735 4 5
60.80512 3 5
66.24633 4 5
67.95916 3 5
68.57552 2 5
69.76202 1 5

Explore the example models

Nurse visit simulation

GitHub Click to visit pydesrap_mms repository

Key files simulation/param.py
simulation/summary_stats.py
simulation/runner.py
What to look for? This is set up to run replications in parallel by default, as introduced on the Parallel processing page.

GitHub Click to visit rdesrap_mms repository

Key files R/parameters.R
R/runner.R
What to look for? This is set up to run replications sequentially by default, but has functionality for running them in parallel, as introduced on the Parallel processing page. This means some differences in set-up (the sequential option also uses future_lapply in that case; see the Parallel processing page for more details).

Stroke pathway simulation

GitHub Click to visit pydesrap_stroke repository

Key files simulation/parameters.py
simulation/runner.py
What to look for? Like the nurse visit simulation, this is also set up to support parallel processing.

GitHub Click to visit rdesrap_stroke repository

Key files R/parameters.R
R/runner.R
What to look for? Like the nurse visit simulation, this is also set up to support parallel processing.

Test yourself

Extend your model to support multiple replications, as shown above.