Day 5 – Reproducing Kim et al. 2021

Note

Finished runnning scenarios, and reproduced in-text result 1, figure 5 and table 3. Total time used: 13h 25m (33.5%).

09.12-09.19, 09.22-09.24: Get aorta sizes from AAA deaths in surv scenario 0

Create a script functions/extra_functions.R and moved the function I created yesterday - get_aaa_death_aorta_sizes() - from run_aaamodel_surv_scen1.R to extra_functions.R. This is so that:

It sits alongside other model functions
It can be used in scenario 0 and scenario 1 without duplication

I test ran it on a small sample before setting it to run on a remote machine.

Timings:

Full run (one scenario): 4.633 minutes = 278 seconds = 4 minutes 38 seconds

09.34-09.35, 10.28-10.29, 11.05-11.07: Set surv scenario 4C-E to run

Running on the remote machine…

Timings for 4C:

First run: 5.452 minutes = 327 seconds = 5 minutes 27 seconds
All runs: 35.234 minutes = 2114 seconds = 35 minutes 14 seconds

Timings for 4D:

First run: 5.429 minutes = 326 seconds = 5 minutes 26 seconds
All runs: 35.083 minutes= 2105 seconds = 35 minutes 5 seconds

Timings for 4E:

First run: 5.465 minutes = 328 seconds = 5 minutes 28 seconds
All runs: 34.994 minutes = 2096 seconds = 34 minutes 56 seconds

Also ran some others again on the remote machine that were missing run times:

65yo scen 0: 3.528 min = 212 seconds = 3 minutes 32 seconds
65yo scen 1: 29.230 min = 1754 seconds = 29 minutes 14 seconds
Surv scen 4b: 34.955 min = 2097 seconds = 34 minutes 57 seconds

Reassuringly, for those re-run, no change in results observed (so seed control all working as anticipated).

15.42-16.02: In-text result 1

Combined results from my function added to surv scenario 0 and 1, to view deaths by aorta size. The numbers are pretty similar to the original - exacty the same for one year (0, 2 and 7 deaths in the small, medium and large groups repsectively). They differed a little more for the suspension over 2 years:

Small: Original 1, Mine 5
Medium: Original 17, Mine 10
Large: Original 22, Mine 27

However, given the reduced sample size we run, I would consider this reasonably reproduced (with a consistent pattern, and relatively similar numbers), although will see what others say when send for a consensus opinion.

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 741

# Times from today
times = [
    ('09.12', '09.19'),
    ('09.22', '09.24'),
    ('09.34', '09.35'),
    ('10.28', '10.29'),
    ('11.05', '11.07'),
    ('15.42', '16.02')]

calculate_times(used_to_date, times)

Time spent today: 33m, or 0h 33m
Total used to date: 774m, or 12h 54m
Time remaining: 1626m, or 27h 6m
Used 32.2% of 40 hours max

16.06-16.17: Figure 5

The model output from surv scen 3 includes thres 5.5 period 0, and then thresh 7 for periods 0.5 +. This aligns with the description in Table 1 in the paper. Hence, I’m assuming that calculations are in comparison against 5.5 period 0 (and that there is not a thresh 7 period 0 anywhere, since thresh 5.5 is otherwise the default).

Feel this is reproduced at 16.17.

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 741

# Times from today
times = [
    ('09.12', '09.19'),
    ('09.22', '09.24'),
    ('09.34', '09.35'),
    ('10.28', '10.29'),
    ('11.05', '11.07'),
    ('15.42', '16.02'),
    ('16.06', '16.17')]

calculate_times(used_to_date, times)

Time spent today: 44m, or 0h 44m
Total used to date: 785m, or 13h 5m
Time remaining: 1615m, or 26h 55m
Used 32.7% of 40 hours max

16.19-16.26: Finishing up Table 3

Created the final section of table 3 from scenario 3, and then combined with prior sections to produced Table 3.

Feel this is reproduced at 16.26.

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 741

# Times from today
times = [
    ('09.12', '09.19'),
    ('09.22', '09.24'),
    ('09.34', '09.35'),
    ('10.28', '10.29'),
    ('11.05', '11.07'),
    ('15.42', '16.02'),
    ('16.06', '16.17'),
    ('16.19', '16.27')]

calculate_times(used_to_date, times)

Time spent today: 52m, or 0h 52m
Total used to date: 793m, or 13h 13m
Time remaining: 1607m, or 26h 47m
Used 33.0% of 40 hours max

16.40-16.53: Supplementary table 2

Looking through the code and results:

Scenario 4A has scan suspension + 10% dropout over 1 year
Scenario 4B has scan suspension + 10% dropout over 2 years
Scenario 4C has scan suspension + 10% dropout over 2 years + 7cm threshold for 2 years
Scenario 4D has scan suspension + 10% dropout over 2 years + 7cm threshold for 2 years + 2 lockdown (op suspension) periods
Scenario 4E has scan suspension + 10% dropout over 2 years + 7cm threshold for 3 months and 5.5cm threshold for 3 months

Supplementary table 1 and supplementary figure 3 appears to include just 4A to 4C, plus a prior scenario that just had scan suspension. Hence, I removed their run times from the README.

Scan suspension is presented in Figure 3, which uses scenarios 0 and 1.

Timings

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 741

# Times from today
times = [
    ('09.12', '09.19'),
    ('09.22', '09.24'),
    ('09.34', '09.35'),
    ('10.28', '10.29'),
    ('11.05', '11.07'),
    ('15.42', '16.02'),
    ('16.06', '16.17'),
    ('16.19', '16.26'),
    ('16.40', '16.53')]

calculate_times(used_to_date, times)

Time spent today: 64m, or 1h 4m
Total used to date: 805m, or 13h 25m
Time remaining: 1595m, or 26h 35m
Used 33.5% of 40 hours max