CCU models#

import pandas as pd
import numpy as np

Model outputs#

The results of the two generated simulation models were identical to 2 decimal places. The results for the stage 1 and stage 2 models are reported below in Table 1. The table shows the 4 performance measures in the design and how these vary across six experiments with the model.

The model results did not replicate those reported in Griffiths et al.; our results have a higher overall arrival rate and higher occupancy of the CCU. The explanation appears to be that we did not have access to information about the empirical distributions used for elective patients in the original article.

caption = "Table 1: Comparison of critical care model outputs: stage 1 versus stage 2 " \
          "(internal replication). Figures are mean (sd)."
ccu_summary = pd.read_csv("ccu_model_comparison.csv", index_col=['Study Stage', 
                                                                 'metric'])
ccu_summary = ccu_summary.style.set_caption(caption)
ccu_summary
Table 1: Comparison of critical care model outputs: stage 1 versus stage 2 (internal replication). Figures are mean (sd).

| Study Stage | metric | 23 beds | 24 beds | 25 beds | 26 beds | 27 beds | 28 beds |
|---|---|---|---|---|---|---|---|
| Stage 1 | 0. Patient Count | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) |
| Stage 1 | 1. Cancelled Elective Operations | 390.6 (30.57) | 337.8 (38.75) | 279.0 (39.13) | 231.4 (33.83) | 178.4 (32.46) | 139.8 (27.58) |
| Stage 1 | 2. Bed Utilization | 0.9 (0.02) | 0.9 (0.02) | 0.9 (0.02) | 0.9 (0.02) | 0.8 (0.02) | 0.8 (0.02) |
| Stage 1 | 3. Bed Occupancy | 21.3 (0.49) | 21.8 (0.50) | 22.3 (0.54) | 22.6 (0.56) | 23.0 (0.58) | 23.3 (0.62) |
| Stage 1 | 4. Mean Unplanned Admission Waiting Time (hours) | 103.8 (72.08) | 62.5 (55.23) | 35.0 (29.28) | 20.8 (15.59) | 12.0 (7.66) | 7.0 (3.76) |
| Stage 2 | 0. Patient Count | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) | 1,650.4 (17.83) |
| Stage 2 | 1. Cancelled Elective Operations | 390.6 (30.57) | 337.8 (38.75) | 279.0 (39.13) | 231.4 (33.83) | 178.4 (32.46) | 139.8 (27.58) |
| Stage 2 | 2. Bed Utilization | 0.9 (0.02) | 0.9 (0.02) | 0.9 (0.02) | 0.9 (0.02) | 0.8 (0.02) | 0.8 (0.02) |
| Stage 2 | 3. Bed Occupancy | 21.3 (0.49) | 21.8 (0.50) | 22.3 (0.54) | 22.6 (0.56) | 23.0 (0.58) | 23.3 (0.62) |
| Stage 2 | 4. Mean Unplanned Admission Waiting Time (hours) | 103.8 (72.08) | 62.5 (55.23) | 35.0 (29.28) | 20.8 (15.59) | 12.0 (7.66) | 7.0 (3.76) |
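
The claim that the two models agree to 2 decimal places can be checked programmatically. The following is a minimal sketch, not the comparison code used in the study: the helper `identical_to_dp` is a hypothetical name, and the two series simply reuse the mean cancelled elective operations reported in Table 1 (which are only shown to 1 decimal place there).

```python
import pandas as pd

# Mean cancelled elective operations per scenario, copied from Table 1.
beds = ["23 beds", "24 beds", "25 beds", "26 beds", "27 beds", "28 beds"]
stage1 = pd.Series([390.6, 337.8, 279.0, 231.4, 178.4, 139.8], index=beds)
stage2 = pd.Series([390.6, 337.8, 279.0, 231.4, 178.4, 139.8], index=beds)


def identical_to_dp(a, b, decimals=2):
    """Return True if two result sets match after rounding.

    Hypothetical helper for illustration only.
    """
    return a.round(decimals).equals(b.round(decimals))


print(identical_to_dp(stage1, stage2))
```

Rounding both sides before comparing avoids false mismatches from floating-point noise below the chosen precision.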

Model code#

The final code files from stage 1 and stage 2 (our internal replication) for the critical care unit model and its interface were very similar overall. Minor differences included the naming of variables, functions and classes. Another minor difference was the setup of random number generators for each activity in the model; both the stage 1 and stage 2 approaches were acceptable. Disregarding comments and documentation, stage 1 generated a model consisting of 262 lines of code and stage 2 generated 355 lines of code. Both models passed the same batch of 28 verification tests.
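
To illustrate what setting up a random number generator for each activity can look like, here is one common NumPy pattern. It is a sketch under stated assumptions and is not the code generated in either stage: the activity names, the function `create_streams` and the placeholder mean of 1.0 are all hypothetical.

```python
import numpy as np

# Hypothetical activity names; the generated models may use different ones.
ACTIVITIES = ["unplanned_arrivals", "elective_arrivals", "treatment"]


def create_streams(master_seed, activities=ACTIVITIES):
    """Return one statistically independent Generator per activity."""
    seed_seq = np.random.SeedSequence(master_seed)
    # spawn() derives non-overlapping child seeds from the master seed.
    child_seeds = seed_seq.spawn(len(activities))
    return {
        name: np.random.default_rng(child)
        for name, child in zip(activities, child_seeds)
    }


# Two experiments built from the same master seed draw identical samples
# for a given activity, which is what common random numbers relies on.
streams_a = create_streams(42)
streams_b = create_streams(42)
sample_a = streams_a["treatment"].exponential(1.0)  # placeholder mean
sample_b = streams_b["treatment"].exponential(1.0)
print(sample_a == sample_b)
```

Giving each activity its own stream means that changing one input (for example the number of beds) does not shift the samples drawn for the other activities.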

A more substantial difference is that the stage 2 code is arguably easier for a new user to understand than the stage 1 code. For example, in stage 2 the LLM generated an Experiment class where each parameter used in a statistical distribution was implemented as a named variable: the mean inter-arrival times all had a clear variable that could be set. In the stage 1 code, by contrast, the LLM generated an Experiment class where inter-arrival means were set via a list of unnamed parameter values. This increased clarity resulted in more lines of code in stage 2 than in stage 1, although we consider this neither a good nor a bad outcome. A similar difference in clarity can be seen in the code that converts the mean and standard deviation of a lognormal distribution into scale and shape parameters (suitable for the NumPy lognormal functions). In stage 2 the logic was (optimally) implemented in a reusable function. In stage 1 the conversion logic is coded directly into each process and is harder to follow and test. The difference in the design of the Python classes representing an experiment and the CCU model logic can be seen in the number of class attributes and methods in Table 3.
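
The lognormal conversion discussed above can be sketched as a small reusable function. This is an illustrative version based on the standard moment-matching formulas, not the function generated in stage 2; the name `normal_moments_from_lognormal` and the example mean of 10.0 and standard deviation of 5.0 are assumptions chosen only for demonstration.

```python
import math

import numpy as np


def normal_moments_from_lognormal(mean, std):
    """Convert a lognormal mean and sd to the mu and sigma of the
    underlying normal distribution, as expected by NumPy."""
    phi = math.sqrt(std**2 + mean**2)
    mu = math.log(mean**2 / phi)
    sigma = math.sqrt(math.log(phi**2 / mean**2))
    return mu, sigma


# Illustrative values only; these are not CCU model parameters.
mu, sigma = normal_moments_from_lognormal(mean=10.0, std=5.0)

rng = np.random.default_rng(42)
samples = rng.lognormal(mean=mu, sigma=sigma, size=100_000)
print(samples.mean(), samples.std())
```

Keeping the conversion in one function means it can be unit tested once (the sample mean and standard deviation should be close to the requested values) rather than being repeated inside each process.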

Table 3 Description of model code components Stage 1 versus Stage 2.#

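| Component | Number of Attributes: Stage 1 | Number of Attributes: Stage 2 | Number of Methods/Functions: Stage 1 | Number of Methods/Functions: Stage 2 |
|---|---|---|---|---|
| Experiment class | 13 | 27 | 3 | 2 |
| CCU model logic class | 4 | 9 | 10 | 12 |
| Functions | N/A | N/A | 6 | 6 |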

Lines of code data#

!pygount --suffix=py --format=summary ../../02_CCU/ccu_formatted_code.py
┏━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━━━━┳━━━━━┓
┃ Language ┃ Files ┃     % ┃ Code ┃    % ┃ Comment ┃   % ┃
┡━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━━━━╇━━━━━┩
│ Python   │     1 │ 100.0 │  262 │ 73.6 │       4 │ 1.1 │
├──────────┼───────┼───────┼──────┼──────┼─────────┼─────┤
│ Sum      │     1 │ 100.0 │  262 │ 73.6 │       4 │ 1.1 │
└──────────┴───────┴───────┴──────┴──────┴─────────┴─────┘
# code for stroke
# !pygount --suffix=py --format=summary ../../03_stroke/stroke_rehab_model.py
# !pygount --suffix=py --format=summary ../../03_stroke/stroke_rehab_interface.py
!pygount --suffix=py --format=summary ../../02_CCU/ccu_formatted_code_stage2.py
┏━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━━━━┳━━━━━┓
┃ Language ┃ Files ┃     % ┃ Code ┃    % ┃ Comment ┃   % ┃
┡━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━━━━╇━━━━━┩
│ Python   │     1 │ 100.0 │  355 │ 74.0 │      17 │ 3.5 │
├──────────┼───────┼───────┼──────┼──────┼─────────┼─────┤
│ Sum      │     1 │ 100.0 │  355 │ 74.0 │      17 │ 3.5 │
└──────────┴───────┴───────┴──────┴──────┴─────────┴─────┘

Prompts#

In total, 22 iterations were used to build the model and interface. In stage 1 this consisted of 26 prompts passed to the LLM; the number of prompts increased to 36 in stage 2. Five of the 10 extra prompts occurred in the first 2 iterations of the model. Some minor additional prompting was needed to ensure comparable performance measures. The final iteration (iteration 23) was a bug fix that was only relevant to stage 1; stage 2 therefore saved one prompt. The table below presents the number of prompts used at each iteration.

| Iteration | Added functionality | Stage 1 | Stage 2 | Difference |
|---|---|---|---|---|
| 1 | Unplanned arrivals | 1 | 4 | 3 |
| 2 | Add treatment | 1 | 3 | 2 |
| 3 | Elective patients | 2 | 2 | 0 |
| 4 | Organise input parameters | 2 | 2 | 0 |
| 5 | Add a warm-up period | 1 | 1 | 0 |
| 6 | Elective cancellations (KPI) | 1 | 1 | 0 |
| 7 | Bed Utilisation (KPI) | 2 | 2 | 0 |
| 8 | Waiting time (KPI) | 1 | 2 | 1 |
| 9 | Bed occupancy (KPI) | 1 | 3 | 2 |
| 10 | Patient count (KPI) | 1 | 3 | 2 |
| 11 | Multiple replications (1) | 1 | 1 | 0 |
| 12 | Multiple replications (2) | 1 | 1 | 0 |
| 13 | Multiple replications (3) | 1 | 1 | 0 |
| 14 | Summarise results | 1 | 1 | 0 |
| 15 | Common random numbers (1) | 1 | 1 | 0 |
| 16 | Common random numbers (2) | 1 | 1 | 0 |
| 17 | Common random numbers (3) | 1 | 2 | 1 |
| 18 | Common random numbers (4) | 1 | 1 | 0 |
| 19 | Batching experiments | 1 | 1 | 0 |
| 20 | streamlit interface (1) | 1 | 1 | 0 |
| 21 | streamlit interface (2) | 1 | 1 | 0 |
| 22 | streamlit interface (3) | 1 | 1 | 0 |
| 23 | Bug fix | 1 | 0 | -1 |
| Totals | | 26 | 36 | 10 |
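
The totals and the share of extra prompts in the first two iterations can be reproduced from the per-iteration counts. The sketch below re-enters the counts from the prompts table by hand rather than reading the project's CSV file, so the variable names are illustrative only.

```python
import pandas as pd

# Prompts per iteration (1 to 23), copied from the prompts table.
stage1 = [1, 1, 2, 2, 1, 1, 2, 1, 1, 1, 1, 1,
          1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
stage2 = [4, 3, 2, 2, 1, 1, 2, 2, 3, 3, 1, 1,
          1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 0]

prompts = pd.DataFrame(
    {"Stage 1": stage1, "Stage 2": stage2},
    index=pd.RangeIndex(1, 24, name="Iteration"),
)
prompts["Difference"] = prompts["Stage 2"] - prompts["Stage 1"]

# Column sums give the totals row: 26, 36 and 10.
totals = prompts.sum()

# Extra stage 2 prompts used in iterations 1 and 2: 3 + 2 = 5.
first_two = prompts.loc[1:2, "Difference"].sum()

print(totals)
print(first_two)
```

Note that the difference of 10 is a net figure: stage 2 used 11 extra prompts across six iterations and saved 1 on the bug fix.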

LaTeX for manuscript#

This does not include the totals row.

caption = "The number of prompts given to the LLM " \
          + "at each iteration of the CCU model."

prompt_results = (
    pd.read_csv("data/ccu_prompt_table.csv",
                index_col=['Iteration'])
)

print(prompt_results.style.to_latex(caption=caption))
\begin{table}
\caption{The number of prompts given to the LLM at each iteration of the CCU model.}
\begin{tabular}{llrrr}
 & Added functionality & Stage 1 & Stage 2 & Difference \\
Iteration &  &  &  &  \\
1 & Unplanned arrivals & 1 & 4 & 3 \\
2 & Add treatment & 1 & 3 & 2 \\
3 & Elective patients & 2 & 2 & 0 \\
4 & Organise input parameters & 2 & 2 & 0 \\
5 & Add a warm-up period & 1 & 1 & 0 \\
6 & Elective cancellations (KPI) & 1 & 1 & 0 \\
7 & Bed Utilisation (KPI) & 2 & 2 & 0 \\
8 & Waiting time (KPI) & 1 & 2 & 1 \\
9 & Bed occupancy (KPI) & 1 & 3 & 2 \\
10 & Patient count (KPI) & 1 & 3 & 2 \\
11 & Multiple replications (1) & 1 & 1 & 0 \\
12 & Multiple replications (2) & 1 & 1 & 0 \\
13 & Multiple replications (3) & 1 & 1 & 0 \\
14 & Summarise results & 1 & 1 & 0 \\
15 & Common random numbers (1) & 1 & 1 & 0 \\
16 & Common random numbers (2) & 1 & 1 & 0 \\
17 & Common random numbers (3) & 1 & 2 & 1 \\
18 & Common random numbers (4) & 1 & 1 & 0 \\
19 & Batching experiments & 1 & 1 & 0 \\
20 & streamlit interface (1) & 1 & 1 & 0 \\
21 & streamlit interface (2) & 1 & 1 & 0 \\
22 & streamlit interface (3) & 1 & 1 & 0 \\
23 & Bug fix & 1 & 0 & -1 \\
\end{tabular}
\end{table}