Day 11

reproduce
Author

Amy Heather

Published

October 7, 2024

Note

Ran the remaining experiments and created figures and tables, although unfortunately the study was not reproduced and I have no further troubleshooting ideas. Total time used: 17h 41m (44.2%)

09.20-09.35: Running and processing Experiment 3

I was going to alter Experiment 3 to run just the two shorter scenarios, but then realised those had finished on Friday, and the process had only kept running for the longer scenario. However, the timings were missing, as those would only have been recorded once everything completed. Hence, I still amended the script to run just the two shorter scenarios, and set these running, as I need to know the times.

source ~/miniconda3/bin/activate
conda activate hernandez2015
python -m Experiment3

I checked the results and, unsurprisingly given previous findings, these are similar to but differ from those in the article.

Once this finished, it had a run time of 3 hours 13 minutes.

09.36-09.41, 09.50-10.07: Experiment 4

Experiment 4 is the tri-objective problem with a maximum of 1, 2 or 3 line managers (known as greeters in the code), run with a population of 50 and 25 generations. The upper bounds on staff numbers are set in StaffAllocationProblem.py:

self.upperBounds = [60, 60, 60, 60]

Hence, I’m assuming I’ll need to alter this, in turn, to each of:

self.upperBounds = [1, 60, 60, 60]
self.upperBounds = [2, 60, 60, 60]
self.upperBounds = [3, 60, 60, 60]

To do so programmatically, I’ll need to make this an input to the class StaffAllocationProblem(), and then likewise in ExperimentRunner() and main.py where it is called, as I did for objectiveTypes.
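
A minimal sketch of the change I have in mind, assuming a default that preserves the current behaviour (the real constructor and call sites may differ):

# Sketch only (not the project's exact code): make upperBounds a constructor
# argument, defaulting to the original hard-coded values, so each experiment
# script can override just the greeter bound.
class StaffAllocationProblem:
    def __init__(self, objectiveTypes, upperBounds=None):
        self.objectiveTypes = objectiveTypes
        if upperBounds is None:
            upperBounds = [60, 60, 60, 60]  # original defaults
        self.upperBounds = upperBounds

# In the Experiment 4 script, loop over the three greeter caps
# (objectiveTypes here is an illustrative placeholder):
objectiveTypes = ['throughput', 'time', 'staff']  # illustrative only
for maxGreeters in [1, 2, 3]:
    problem = StaffAllocationProblem(objectiveTypes,
                                     upperBounds=[maxGreeters, 60, 60, 60])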

10.09-10.14, 10.18-10.28: Experiment 5

Experiment 5 has 6 dispensing, 6 screening, 4 line manager and 1 medical staff. It then varies the number of replications from 1 to 7.

To set the staff numbers, I’m assuming I’ll need to set upperBounds and lowerBounds to the same values, so I modified the code accordingly to allow input of lowerBounds.

To set the number of replications, I’m assuming this refers to runs, as that is an input that was already set up, and I have previously assumed that when the article says to run something three times, it is referring to that parameter.

It doesn’t state the population and generations, but I’m assuming generations is 1 (as it’s a fixed number of staff), and that the population is 1000.
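
To make these assumptions concrete, here is a minimal sketch of the setup I have in mind (the constructor and runner arguments are illustrative, not the project's exact signatures):

# Sketch of the assumed Experiment 5 setup. Staff numbers are fixed by
# setting lowerBounds == upperBounds, in the order
# [greeter, screener, dispenser, medic] used elsewhere in the code.
staff = [4, 6, 6, 1]  # 4 line managers, 6 screening, 6 dispensing, 1 medical

for runs in range(1, 8):  # replications 1 to 7
    problem = StaffAllocationProblem(objectiveTypes,
                                     lowerBounds=staff,
                                     upperBounds=staff)
    # assumed settings: population 1000, generations 1, 'runs' = replications
    runner = ExperimentRunner(problem, population=1000, generations=1,
                              runs=runs)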

10.51-10.57: Processing Experiment 4

Run time: 37 minutes

As observed for similar figures previously, although the patterns are similar, the axis values differ sufficiently that this is not reproduced (e.g. 10-120 staff members instead of 10-70, and 2000-6000 throughput rather than 500-5000).

11.09-11.55, 11.57-12.10, 12.14-12.24: Troubleshooting Experiment 5

Run time: 41 minutes

However, the result from each replication setting was identical, so I wondered whether I had changed the right thing. Looking at PODSimulation.py, there is the option of amending capacities:

########################
name = 'greeter'
n = 0
self.resources[name] = simpy.Resource(capacity=capacities[n], 
                                        name=name,
                                        monitored=True)
self.monitors[name] = simpy.Monitor(name=name, ylab=ylab)

########################
name = 'screener'
n = 1
self.resources[name] = simpy.Resource(capacity=capacities[n], 
                                        name=name,
                                        monitored=True)
self.monitors[name] = simpy.Monitor(name=name, ylab=ylab)

########################        
name = 'dispenser'
n = 2
self.resources[name] = simpy.Resource(capacity=capacities[n], 
                                        name=name,
                                        monitored=True)        
self.monitors[name] = simpy.Monitor(name=name, ylab=ylab)

########################
name = 'medic'
n = 3
self.resources[name] = simpy.Resource(capacity=capacities[n], 
                                        name=name,
                                        monitored=True)
self.monitors[name] = simpy.Monitor(name=name, ylab=ylab)

Since n indexes into capacities, this implies the list is ordered with greeter at index 0, screener at 1, dispenser at 2 and medic at 3.

I’m wondering whether I should be running this as I have done so far, i.e. by searching through candidate solutions, given that this should simply be the result of running the discrete event simulation. I completely changed Experiment5.py to run PODSimulation() directly, borrowing code from StaffAllocationProblem.py.

Running this manually, I found I could get the same result (459.333333 throughput, 64.820665 time).

Then, in PODSimulation.py, I stumbled across code that looks like it was designed to run this experiment:

if __name__ == '__main__':
    #greeter, screener, dispenser, medic
    
    startTime = datetime.datetime.now()
    print "program started:", startTime
    #capacities = [1,1,1,1]
    capacities = [4, 6, 6, 1]
    
    seeds = get_20_seeds()
    #seeds = [123]
    simulations = []
    for seed in seeds:
        simul = PODSimulation(capacities)
        simul.model(seed)
        simulations.append(simul)
    resultsAnalyzer = ResultsAnalyzer.ResultsAnalyzer(simulations)
    resultsAnalyzer.show_results()
    endTime = datetime.datetime.now()
    print "program finished:", endTime 
    print "simulation length: ", endTime - startTime

I copied this into Experiment5.py, adapting it so that it saved the individual results to a file (rather than printing average results), and so that it saved the run time to a file too. It took a bit of work to figure out how to get the individual results.

From StaffAllocationProblem.py, we know throughput = simulatorRunner.get_processed_count().

In SimulatorRunner.py, we see that:

def get_processed_count(self):
    return self.avgProcessedCount

And -

self.avgProcessedCount = self.resultsAnalyzer.get_avg_total_number_out()

From ResultsAnalyzer.py, we can see that:

def get_avg_total_number_out(self):
    return self.avgTotalNumberOut

And -

self.avgTotalNumberOut += simul.get_number_out() / float(n)

Where self.n = len(simulations).

Hence, to get the throughput per simulation, we just need simul.get_number_out(). I saved this to a .txt file.
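
The adapted script ended up along these lines (a sketch in the project's Python 2 style, sitting inside Experiment5.py where PODSimulation and get_20_seeds are available; the output file names are my own):

# Sketch of my adaptation: run PODSimulation directly for each seed and save
# each simulation's own throughput, rather than printing the averages via
# ResultsAnalyzer.show_results().
import datetime

startTime = datetime.datetime.now()
capacities = [4, 6, 6, 1]  # greeter, screener, dispenser, medic

results = open('exp5_results.txt', 'w')
for seed in get_20_seeds():
    simul = PODSimulation(capacities)
    simul.model(seed)
    # get_number_out() is the per-simulation throughput that
    # ResultsAnalyzer averages into avgTotalNumberOut
    results.write('%s\n' % simul.get_number_out())
results.close()

endTime = datetime.datetime.now()
with open('exp5_time.txt', 'w') as f:
    f.write(str(endTime - startTime))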

Run time: 8 seconds

I checked plotting_staff_results.r, but it didn’t seem to have any code for this figure, so I wrote some to produce it.

And, as I have found for other figures, I see a similar pattern in the results, although with different values on the axes.

13.21-13.51, 13.58-14.27: Appendix A.1 (Table 3)

Appendix A.1 shows mean and confidence intervals for several different metrics. It is run with:

  • 10% pre-screened
  • 4 line managers
  • 6 dispensing
  • 6 screening
  • 1 medical

Each is run with twenty replications. Hence, it appears this directly uses the segment of code I had previously identified in PODSimulation.py and adapted for Experiment 5. These return the average and half-width (as I understand from this source, the distance from the mean to the edge of the confidence interval can be called the precision, margin of error or half-width).

It appears to have all the metrics needed, so all I needed to do was convert the output into a table. From def __str__ I could see which components were used to make the printed output, and so which I needed for the table.

I added pandas to the environment so I could output a dataframe, although I had to find a version compatible with numpy 1.8.0, which meant installing pandas from pip, as conda doesn’t go back that far. It took a long time to install the pip dependencies, and I then hit module errors of ImportError: No module named dateutil.tz, which remained unresolved after installing python-dateutil, so I decided to just output to CSV and process it in R.

However, I then started getting an error when importing myutils: ImportError: matplotlib requires dateutil. I think this probably resulted from my adding python-dateutil and then pruning the environment, so I deleted the environment and rebuilt it.
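
The CSV conversion itself is then simple: the half-width is the distance from the mean to either edge of the confidence interval, so the table's bounds are just the mean plus or minus the half-width. A minimal sketch (the metric tuples are placeholders for the components taken from __str__; the Wait time values are illustrative):

# Sketch: convert (name, mean, half-width) components from __str__ into
# the Table 3 CSV, with bounds = mean - half-width and mean + half-width.
metrics = [('Wait time', 66.66, 2.91)]  # placeholder values

f = open('table3.csv', 'w')
f.write('Estimate,Avg,LowerBound,UpperBound\n')
for name, avg, hw in metrics:
    f.write('%s,%.2f,%.2f,%.2f\n' % (name, avg, avg - hw, avg + hw))
f.close()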

Run time: 10 seconds

Again, however, this differed from the original.

import pandas as pd

pd.read_csv('table3.csv')
                  Estimate       Avg  LowerBound  UpperBound
 0               Wait time     66.66       63.75       69.57
 1      No. in (Designees)  11979.15    11935.90    12022.40
 2     No. out (Designees)    473.20      458.87      487.53
 3    Dispensing wait time     18.84       18.33       19.35
 4    Line mngr. wait time     23.73       23.60       23.86
 5    Med. eval. wait time      9.06        6.05       12.06
 6     Screening wait time     15.04       14.79       15.29
 7  Dispensing no. waiting    434.81      423.35      446.27
 8  Line mngr. no. waiting   4733.55     4705.73     4761.38
 9  Med. eval. no. waiting      2.19        1.42        2.96
10   Screening no. waiting    569.06      560.52      577.61

14.29-14.38: Appendix A.2 (Table 4)

Table 4 is described as the results with a line manager and 10% pre-screened. Hence, I’m assuming it’s just the pre-screen 10 results from Experiment 1?

Those results have 255 rows though, so I’m assuming Table 4 is just the first 40.

It doesn’t appear to be reproduced, with quite different results, as I had expected given the prior findings.

res = pd.read_csv('exp1_prescreen10.txt', sep='\t')

print(res.shape)

res.head(40)
(255, 7)
    greeter  screener  dispenser  medic  resources   throughput       time
 0        4         1         14      1         20   435.000000  52.355130
 1        4         1         14      3         22   435.666667  51.441467
 2        7         1         14      1         23   650.000000  46.791091
 3        8         1         14      1         24   669.666667  48.432571
 4        7         1         14      3         25   647.000000  46.533784
 5        4         6         14      1         25  1046.000000  53.590160
 6        9         1         14      2         26   752.333333  43.787625
 7        4         6         14      2         26  1048.333333  46.674927
 8        5         6         14      1         26  1058.333333  61.717741
 9        9         1         14      3         27   753.666667  43.688801
10        4         6         14      3         27  1096.666667  45.042496
11        4         6         14      4         28  1073.666667  44.577338
12        8         6         14      1         29  1103.000000  60.900434
13        4         6         14      6         30  1073.333333  44.503737
14       16         1         14      1         32  1079.666667  35.059892
15        4        10         14      4         32  1103.333333  43.141365
16        8         6         14      4         32  1111.666667  48.838884
17        8         6         14      5         33  1107.666667  48.787158
18        4        10         14      6         34  1094.000000  42.299897
19        4         6         25      1         36  1347.666667  46.158593
20        4         6         25      2         37  1351.666667  41.018076
21        5         6         25      1         37  1436.666667  50.700946
22        4         6         25      3         38  1371.666667  38.838681
23        5         6         25      2         38  1449.000000  42.375245
24        4         6         25      4         39  1371.000000  38.189643
25        5         6         25      3         39  1432.666667  42.123040
26        7         6         25      1         39  1546.666667  45.183401
27        5         6         25      4         40  1433.000000  39.702079
28        7         6         25      2         40  1545.333333  42.707594
29        4        10         25      1         40  1876.000000  43.735103
30        4         6         25      6         41  1373.666667  38.180682
31        4        10         25      2         41  1877.666667  38.542261
32        4        10         27      1         42  2055.000000  43.479165
33        4        10         25      4         43  1862.000000  31.677448
34        4        10         27      2         43  2045.666667  37.892850
35        4        10         25      5         44  1909.000000  31.112816
36        4        10         27      3         44  2006.333333  35.759831
37       18         1         25      1         45  1322.333333  30.449133
38        4        10         25      6         45  1915.333333  31.019967
39        4        10         27      4         45  2045.333333  32.732751

14.49-14.57: Final look over

I looked over the code again, trying to spot anything I could alter to help resolve the discrepancy, but had no further ideas. I double-checked the capacities in PODSimulation.py but was satisfied that these were coming from the provided inputs.

With no further ideas, I will stop at this point, and (a) get consensus from another team member on reproduction success, and (b) message the author to inform them and ask for suggestions if they wish (although, being quite aware that this was a very long time ago for them, I shouldn’t imagine that would be appropriate in this case).

Time at this point: 1061 minutes.

Timings

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 858

# Times from today
times = [
    ('09.20', '09.35'),
    ('09.36', '09.41'),
    ('09.50', '10.07'),
    ('10.09', '10.14'),
    ('10.18', '10.28'),
    ('10.51', '10.57'),
    ('11.09', '11.55'),
    ('11.57', '12.10'),
    ('12.14', '12.24'),
    ('13.21', '13.51'),
    ('13.58', '14.27'),
    ('14.29', '14.38'),
    ('14.49', '14.57')]

calculate_times(used_to_date, times)
Time spent today: 203m, or 3h 23m
Total used to date: 1061m, or 17h 41m
Time remaining: 1339m, or 22h 19m
Used 44.2% of 40 hours max