Day 4 – Reproducing Hernandez et al. 2015

Note

Created Figures 5 and 6 from mini-run, and started trying to setup to run on remote machine. Total time used: 6h 26m (16.1%)

09.08-09.22, 09.32-09.46: Run and plot all 9 scenarios from experiment 1, with mini version of parameters

Modified main.py into Experiment1.py. Changed so it saves the time for each experiment run to txt (as important to record this, but could potentially get lost if just print to screen, when we start running larger numbers).

Ran all with 10 population and 1 generation. Wrote code to identify and loop through importing the results from experiment1/ folder.

09.47-09.48, 10.00-10.05: Re-ran to confirm that results were consistent between runs

This was to check that provided seeds ensured consistency. At the same time, tweaked how time was output. Indeed, consistent between runs, which is great and really helpful that this was already set up by the author.

Total run time for the 9 scenarios was 7 minutes.

11.48-12.00, 13.11-13.37: Creating Figure 6

There is some code in plotting_staff_results.r for “barplot staff, stratified by prescreened”, which appears relevant to Figure 6.

I installed reshape2 to melt the data.

As before, this function gave a great starting point to get matching figure, although it did require more work this time to get it more similar to the article.

Changing facet wrap to throughput
Article appears to plot ascending throughput, and perhaps a subsample
When add fill colour by pre-screen, realise that it’s including multiple rows in the same plot. This doesn’t appear to be the case in the original. They also don’t have any consecutive graphs with the same throughput. Hence, it appears I’d maybe need to remove duplicates with same throughput? I tried this and it looked like it maybe is heading in the right direction

14.30-14.47: Running on remote machine

Cloned and built environment on remote machine (more computational power than my local machine, but not HPC). Had to install conda -

wget https://repo.anaconda.com/miniconda/Miniconda3-py39_24.7.1-0-Linux-x86_64.sh
bash Miniconda3-py39_24.7.1-0-Linux-x86_64.sh

source ~/miniconda3/bin/activate
conda activate hernandez2015

python -m Experiment1

However, this had an error:

...
13, in <module>
    import ExperimentRunner
  File "ExperimentRunner.py", line 15, in <module>
    import SimulatorRunner
  File "SimulatorRunner.py", line 11, in <module>
    import PODSimulation
  File "PODSimulation.py", line 13, in <module>
    import SimPy.SimPlot as simplot
  File "/home/amyremote/miniconda3/envs/hernandez2015/lib/python2.7/site-packages/SimPy/SimPlot.py", line 68, in <module>
    class SimPlot(object):
  File "/home/amyremote/miniconda3/envs/hernandez2015/lib/python2.7/site-packages/SimPy/SimPlot.py", line 69, in SimPlot
    def __init__(self, root = Tk()):
  File "/home/amyremote/miniconda3/envs/hernandez2015/lib/python2.7/lib-tk/Tkinter.py", line 1815, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

This is likely related to me working in a terminal-only set-up, with Tkinter wanting access to a display for GUI operations. echo $DISPLAY returned blank. Tried setting to export DISPLAY=:0.0 but then just an error _tkinter.TclError: couldn't connect to display ":0.0".

Timings

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 297

# Times from today
times = [
    ('09.08', '09.22'),
    ('09.32', '09.46'),
    ('09.47', '09.48'),
    ('10.00', '10.05'),
    ('11.48', '12.00'),
    ('13.11', '13.37'),
    ('14.30', '14.47')]

calculate_times(used_to_date, times)

Time spent today: 89m, or 1h 29m
Total used to date: 386m, or 6h 26m
Time remaining: 2014m, or 33h 34m
Used 16.1% of 40 hours max