Day 7 – Reproducing Shoaib and Ramamohan 2022

Note

Reproduced in-text result 5 but couldn’t get 7 so emailed author. Total time used: 28h 14m (70.6%). Completed individual evaluation against guidelines (just require second opinion on uncertain and unmet).

9.15-9.25, 9.31-9.41, 9.46-9.47: Continuing to troubleshoot in-text result 5

I noticed that my “fix” of changing a 6 yesterday actually messed up the ipd bed occ results (they’re now 1.1 something). The change was:

bed_util.append(bed_time/(days*1440*6)) to
bed_util.append(bed_time/(days*1440*delivery_bed_n))

However, logically, this looks like it should be right? Since the calculation is days (length of simulation), 1440 (24h in min), and then 6 (the standard number of beds). And it’s working out the proportion of time that beds are utilised, so it makes sense that this six should be the number of beds. Moreover, for the first result, it should’ve remained the same as before (as delivery_bed_n=6).

I then realised my mistake! I had input delivery_bed_n when I should have input inpatient_bed_n! And this fixed it!

Reviewing the values, although they are a little lower (3-4%), they do represent a marked increase from the 6 bed values, and so I feel (bearing that in mind), that this difference is minimal enough to consider them successfully reproduced as of 9.47.

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 1536

# Times from today
times = [
    ('9.15', '9.25'),
    ('9.31', '9.41'),
    ('9.46', '9.47')
]

calculate_times(used_to_date, times)

Time spent today: 21m, or 0h 21m
Total used to date: 1557m, or 25h 57m
Time remaining: 843m, or 14h 3m
Used 64.9% of 40 hours max

9.53-10.03, 10.07-10.08, 10.35-11.00, 11.04-11.23: Continuing to troubleshoot in-text result 7

I can’t spot any issues in PHC.py currently, so I tried instead outputting the full results spreadsheet and to see if there could be any differences. However, the results for NCD nurse utilisation matched up with the results from the replication results.

Continued looking over article, code and results to see if I can spot anything, to no avail.

Tried changing the proportion of cases assigned to the staff nurse to see what impact this had (although well aware that text does state that it is 10%, and not more). As expected, the utilisation drops further - here, with 20% of cases assigned, the utilisation dropped to 69% (which is much closer to the original study’s 71%). However, that doesn’t align with how they described the scenario in the text.

import pandas as pd

pd.read_csv('intext7_20.csv')

	Output	Normal	Admin from NCD to staff nurse	20% OPD appointments from NCD to staff nurse	Admin and 20% OPD appointments from NCD to staff nurse
0	NCD nurse utilisation	1.237	1.007	0.924	0.689

I considered changing the arrivals although, realistically, this should be fine, since in-text result 7 just builds on the scenario established in in-text result 6, which matches up to the text, and - as stated in Figure 2B - the doctor’s consultation time should not affect the outcome (so shouldn’t need to change that either).

I kept looking back over everything, but still, could not spot anything. Have emailed Tom and Alison to ask for their thoughts (as well as to get consensus on reproduction of prior items). I’m not sure if this is necessarily a case for contacting the author or not, given that the scenario appears to have been successfully set up (with utilisation dropping - and dropping further if 20% assigned).

11.30-12.22, 13.15-13.45: Email from Tom and email to Shoaib

Tom confirmed the other items were successfully reproduced. He also gave thoughts on in-text result 7, having looked over the code.

He thought the scenario code I’d add was straightforward and logic seemed correct.
He suggested changing percentage of time to see what value is needed to get apx. 70% - I confirmed that 0.2 got close.
He asked about stability with replication. I double checked the change in NCD nurse utilisation between 10 and 100 replications (got results from Table 6 with 10 replications from old commit, processed and saved it with the logbook post from 24th June). Confirmed it was tiny. However, to rule it out, will re-run over lunch.
He suggested otherwise to contact the study authors to let them know we’ve been very successful but see if they have any suggestions for this one final thing that we can’t quite get to work.

I did a bit of tidying prior to emailing to Shoaib, drafted the email, and ran 100 replications for the final scenario over lunch.

I also then add the results from setting the help as 20% rather than 10% (modifying the way I set this up in PHC.py so I could easily do so from the notebook).

I then sent the email to Shoaib. This now marks the end of my work on in-text result 7 (since emailing was part of the troubleshooting for that) - although I may work on this further if we receive a response from Shoaib with any suggestions.

Timings for reproduction

After this point, unless otherwise noted, timings will be for the evaluation stage (which are seperate to the reproduction timings).

# Minutes used prior to today
used_to_date = 1536

# Times from today
times = [
    ('9.15', '9.25'),
    ('9.31', '9.41'),
    ('9.46', '9.47'),
    ('9.53', '10.03'),
    ('10.07', '10.08'),
    ('10.35', '11.00'),
    ('11.04', '11.23'),
    ('11.30', '12.22'),
    ('13.15', '13.45')
]

calculate_times(used_to_date, times)

Time spent today: 158m, or 2h 38m
Total used to date: 1694m, or 28h 14m
Time remaining: 706m, or 11h 46m
Used 70.6% of 40 hours max

14.05-14.17: Badges

Uncertain:

License: Couldn’t remember if we’d agreed that a license added upon our request should be marked as a condition met or not. Tentatively marked as ‘1’.
Complete: Uncertain on this, as I had to modify the script alot to get results (i.e. writing scenario logic). However, arguably, I did have all the information I would need to do that? Tentatively marked as ‘1’.
Structure: Does use object-oriented programming (which is great), but parameters are hard coded and uses global variables. Tentatively marked as ‘0’.

14.18-14.26: STARS

Uncertain:

License: As for badges.
ORCID: Have marked as not met, as there are no ORCID in the repository, although Varun Ramamohan does have ORCID ID with publication in SIMULATION.

14.28-14.57, 15.10-15.47: STRESS-DES

Uncertain:

1.2 Model outputs: Is what I have quoted sufficient for rating of “Partially”?

Reflections on STRESS-DES

Doesn’t necessarily feel like names are essential (re: 1.3)

Not certain on difference between (A) and (B) (re: 1.3)

Overlap between 2.4 scheduling of arrivals and 2.5.5 detail the arrival mechanism

15.48-16.28: ISPOR-SDM

Uncertain:

12 Cross-validation: Is my conclusion reasonable?
14 Predictive validity: There’s alot of validation work done, but not sure I can spot this, or any justification against it?

Timings for evaluation

# Minutes used prior to today
used_to_date = 0

# Times from today
times = [
    ('14.05', '14.17'),
    ('14.18', '14.26'),
    ('14.28', '14.57'),
    ('15.10', '15.47'),
    ('15.48', '16.28')
]

calculate_times(used_to_date, times, limit=False)

Time spent today: 126m, or 2h 6m
Total used to date: 126m, or 2h 6m

Summary report

With evaluation complete (barring discussion on uncertainities and unmet), now started working on summary report.