Day 2

setup
scope
reproduce
Author

Amy Heather

Published

July 4, 2024

Note

Defined scope and problem-solving renv. Total time used: 2h 29m (6.2%)

Untimed: Set up RStudio and test quarto site with R

I did not time this as it is not specific to this reproduction, but additional set-up as not done reproduction in R yet (since the test-run was conducted in Python).

This involved installing/updating RStudio, learning how to run and work with a quarto book on that platform, and and troubleshooting any issues in getting the quarto book up and running.

Environment

  • Updating to the latest version of RStudio, as suggested in the Quarto docs
  • Installing renv: install.packages("renv")
  • Setting the working directory: setwd("~/Documents/stars/stars-reproduce-huang-2019")
  • Initialised an empty R environment: renv::init(bare=TRUE)
  • Set renv to use explicit dependencies: renv::settings$snapshot.type("explicit")
  • Created a DESCRIPTION file
  • Ran renv::snapshot() which returned that project is not activated yet, so I selected option to Activate the project and use the project library. This generated an .Rprofile file.
  • I then tried to open the project (File > Open Project) but this failed. So I tried File > New Project > Existing Directory (which created an .Rproj file), then reran renv::init(bare=TRUE), then renv::snapshot(), and selected to install packages and then snapshot.
  • Synced with GitHub (excluding .Rhistory, which is just a history of executed commands), using Git panel in top right corner
  • Add rmarkdown to DESCRIPTION and rebuilt environment (via renv::snapshot() and selecting to install)

Then came across pkgr, and decided to give that a go, following their tutorial

  • Deleted renv and associated files (.Rprofile and renv.lock) with renv::deactivate(clean=TRUE)
  • Installed pkgr following the instructions on their latest release:
sudo wget https://github.com/metrumresearchgroup/pkgr/releases/download/v3.1.1/pkgr_3.1.1_linux_amd64.tar.gz -O /tmp/pkgr.tar.gz
sudo tar xzf /tmp/pkgr.tar.gz pkgr
sudo mv pkgr /usr/local/bin/pkgr
sudo chmod +x /usr/local/bin/pkgr
  • Created a pkgr.yml file
# Version of pkgr.yml and, at this point, should always say Version: 1
Version: 1

# pkgr will pull dependencies listed in DESCRIPTION
Descriptions:
- DESCRIPTION

# If DESCRIPTION is provided, then this section only needs to include packages
# that you would like to use for development purposes that are not in your
# DESCRIPTION file (i.e. not formal dependencies of your package) - e.g. devtools
# Packages:

# Specify where to pull packages from
# If list CRAN and MPN, will look on CRAN first, then MPN (which is useful for
# dependencies no on CRAN). Can list a location for specific packages in Packages:
Repos:
  - CRAN: https://cran.rstudio.com
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-02-11 # used for mrgval

# Specify Lockfile or Library to tell pkgr where to install packages
# We are using renv to isolate our package environment - renv will tell pkgr where to install them
Lockfile:
  Type: renv
  • In terminal, ran pkgr plan, but get error ARN[0000] error getting library path from renv: Error in loadNamespace(x) : there is no package called ‘renv’
    • If I start a new R session and run packageVersion("renv"), it returns that it is installed
    • Trying to reinstall with install.packages("renv") makes no difference.
    • Tried restarting R and opening a new terminal

I looked through issues and couldn’t spot anything, and then realised this was a fairly small package which hadn’t had any changes in half a year, so on reflection, probably not a reliable option to choose. So went back to set up similar to before of:

  • renv::init(bare=TRUE) with explicit snapshot
  • renv::snapshot() (and realised it didn’t update with change to DESCRIPTION before simply because I hadn’t put a comma after each package!)

To render the Quarto book (in a similar to way to how we did in VSCode), just click the Render button.

Now, returning to what started this - trying to get the .TIFF supplementary file to display…

  • Add tiff to DESCRIPTION
  • renv::status() showed that the package was used but not installed, and renv::snapshot() with option 2 installed the package

Using specific versions

  • Add explict versions of R and packages to DESCRIPTION
  • Attempted to downgrade tiff. renv::status() and renv::snapshot() did not noticed. From this issue, it appears that this should work for renv::install() and, indeed, that recognises it although get issue:
Warning: failed to find source for 'tiff 0.1.11' in package repositories
Error: failed to retrieve package 'tiff@0.1.11'
  • I checked the archive for tiff on CRAN and found there is a 0.1-11 (prior to the current 0.1-12)
  • If I deleted it (remove.packages("tiff")) and then redid renv::snapshot(), it again would not notice the versions
  • I tried to do it manually with remotes (rather than devtools as devtools has so many dependencies) - I installed remotes and then ran remotes::install_version("tiff", "0.1.11"). This seemed successful, except packageVersion("tiff") still returned 0.1.12? Although actually, on inspection, you can see it if 0.1.11. However, it wasn’t able to do that from DESCRIPTION.
  • I removed it and tried again with a direct renv::install("tiff@0.1-11") which was successful
  • I then tried again with DESCRIPTION, but instead set it to tiff@0.1-11, which was successful likewise! And if it was tiff (==0.1-11)! So it appears its a bit fussy about matching up to the format in the CRAN archive .tar.gz files.
  • I then found that renv::snapshot() ignores the version if it’s tiff (==0.1-11) but adheres if it is tiff@0.1-11 - yay!

Having finished with this experiment, I deleted and rebuilt with latest versions - but found it had errors installing them where defined like tiff@0.1-12. Hence, returned to tiff (==0.1-11), and just had to make sure to do renv::install() before renv::snapshot() (rather than rely on snapshot to install the packages).

Fixing GitHub action to render and publish the book

With no changes to GitHub action, had an error of:

[14/18] quarto_site/study_publication.qmd
Error in file(filename, "r", encoding = encoding) : 
  cannot open the connection
Calls: source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file 'renv/activate.R': No such file or directory
Execution halted
Error in file(filename, "r", encoding = encoding) : 
  cannot open the connection
Calls: source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file 'renv/activate.R': No such file or directory
Execution halted
Problem with running R found at /usr/bin/Rscript to check environment configurations.
Please check your installation of R.

ERROR: Error
    at renderFiles (file:///opt/quarto/bin/quarto.js:78079:29)
    at eventLoopTick (ext:core/01_core.js:153:7)
    at async renderProject (file:///opt/quarto/bin/quarto.js:78477:25)
    at async renderForPublish (file:///opt/quarto/bin/quarto.js:109332:33)
    at async renderForPublish (file:///opt/quarto/bin/quarto.js:104864:24)
    at async Object.publish1 [as publish] (file:///opt/quarto/bin/quarto.js:105349:26)
    at async publishSite (file:///opt/quarto/bin/quarto.js:109369:38)
    at async publish7 (file:///opt/quarto/bin/quarto.js:109588:61)
    at async doPublish (file:///opt/quarto/bin/quarto.js:109548:13)
    at async publishAction (file:///opt/quarto/bin/quarto.js:109559:9)
Error: Process completed with exit code 1

Attempting to solve this…

  • Add installation of R and set up of R environment with actions from r-lib (trying setup-renv and setup-r-dependencies) for environment. However, it fails for installation of R dependencies with the error message:
Run r-lib/actions/setup-r-dependencies@v2
Run # Set site library path
Error in file(filename, "r", encoding = encoding) : 
  cannot open the connection
Calls: source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file 'renv/activate.R': No such file or directory
Execution halted
Error: Process completed with exit code 1.
  • Based on this forum post, I tried removing the .Rprofile from git
  • This seemed to improve slightly, although setup-r-dependencies then failed with an error in a pak subprocess seemingly for a package called “.”. Tried switching to setup-renv (which bases on renv.lock) which was then successful! (although takes 4 minutes to install R dependencies, so 6m 55s total)

14.14-14.31: Reading the article

Read throughout and highlighted a copy of the article.

14.33-14.50: Define scope of article

Went through figures and tables to define scope (and convert and crop the .TIFF supplementary to .JPG so easier to display). From looking through text of article, identified a few extra results not in the figures: the quoted decrease in wait times. Although these are very related to the figures, as it wouldn’t be able to look at the figure and deduce the average wait time reduction, these represent additional results.

There was one line in the discussion that caught my attention - “The quality of the ECR service appears to be robust to important parameters, such as the number of radiologists” - but I feel the interpretation of this is quite ambiguous (as to whether it is a model result or interpretation from other results), and doesn’t have anything specific to action, so will not include in scope.

15.05-15.10: Consensus on scope with Tom

Discussed with Tom (and he also had another look over afterwards). Happy with scope choices, and agree that the line from the discussion is simply too ambiguous to action.

15.35-15.43: Exploring app and simulation visualisation

As an addendum to the reading, explored the app and linked simulation configuration visualisation.

For the configuration, it just opened to the CLOUDES homepage, so I tried creating an account then going to the link (turns out you need an account to access). The link still did not work nor the ID, but when I search for “Huang”, I was able to find a diagram: https://beta.cloudes.me/loadSim?simId=17482&pageId=rTbqE (ID 17482). When run, this played through the simulation showing arrivals and queues etc.

15.44-15.47: Prepare release

Modified CHANGELOG and CITATION ahead of release.

15.55-15.58: Archived on Zenodo

Created GitHub release with archiving activated on Zenodo.

16.04-16.58: Look over code and set up environment

No dependency management, so will create renv based on the imports and the dates of the repository - with exception that article mentions:

  • Simmer (version 4.1.0)

The article dates are:

  • Received - 31 March 2019
  • Accepted - 4 June 2019
  • Published - 27 June 2019

The GitHub repository has two commits, both on 27 May 2019. As per protocol, will go with earliest of published and code, which is 27 May 2019.

It looks likely that all the relevant code will be in server.R (with ui.R just being for the ShinyApp, which is not in scope to reproduce, as it is not presented as a key result within the paper). As such, looking at the imports from that R script, and identifying versions on or prior to 27 May 2019…

I’ll set each of these to be max these versions, to help with dependency conflicts when set-up environment, but then convert to fixed versions once know what worked.

Created a DESCRIPTION file in reproduction/:

Title: huang2019
Depends: 
    R (<= 3.6)
Imports:
    simmer (<=4.2.2),
    simmer.plot (<=0.1.15),
    dplyr (<=0.8.1),
    plotly (<=4.9.0),
    gridExtra (<=2.2.1)

Want to create another renv for that sub-folder (seperate to the renv in our main folder). To do so I ran the following commands in the console:

  1. setwd("~/Documents/stars/stars-reproduce-huang-2019/reproduction") (to move to reproduction/)
  2. renv::deactivate()
  3. renv::status() to confirm none were active
  4. renv::init(bare=TRUE) and selected 1 for using the explicit dependencies from DESCRIPTION. This then restarted the R session and created and opened a new project: reproduction. It made the following new files and folders:
  • .Rprofile (with just source("renv/activate.R"))
  • reproduction.Rproj
  • renv/ with the environment
  1. renv::install() to install the packages and their specified versions. However, looking over the versions it planned to install, we had:
  • simmer [4.4.6.3]
  • simmer.plot [0.1.18]
  • dplyr [1.1.4]
  • plotly [4.10.4]
  • gridExtra [2.3]

I cancelled it and tried changing everything to explicit versions (==). This then matched up to what I wanted in the planned installs -

  • simmer [4.2.2]
  • simmer.plot [0.1.15]
  • dplyr [1.1.4]
  • plotly [4.9.0]
  • gridExtra [2.2.1]

However, there was an error with simmer: ERROR: compilation failed for package ‘simmer’, and so still just have renv in environment. I tried installing this specific version manually with remotes:

  • renv::install("remotes")
  • remotes::install_version("simmer", "4.2.2")

Unfortunately, the same error appeared. I then tried installing from GitHub instead of CRAN:

  • remotes::install_github("r-simmer/simmer@v4.2.2")

But this failed again as before.

I tried focusing just on R to begin with, as I realised I have to install and change that manually. I followed this tutorial and ran in terminal:

  • sudo snap install curl
  • sudo apt-get update
  • sudo apt-get install gdebi-core
  • export R_VERSION=3.6
  • curl -O https://cdn.rstudio.com/r/ubuntu-2204/pkgs/r-${R_VERSION}_1_amd64.deb
  • sudo gdebi r-${R_VERSION}_1_amd64.deb

However, I then got an error: Failed to open the software package. The package might be corrupted or you are not allowed to open the file. Check the permissions of the file.

I switched over to the R documentation and clicked on Ubuntu and then “For older R releases, see the corresponding README.” This said:

To obtain the latest R 3.6 packages, use:

deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/
or

deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/
or

deb https://cloud.r-project.org/bin/linux/ubuntu trusty-cran35/

Timings

import sys
sys.path.append('../')
from timings import calculate_times

# Minutes used prior to today
used_to_date = 45

# Times from today
times = [
    ('14.14', '14.31'),
    ('14.33', '14.50'),
    ('15.05', '15.10'),
    ('15.35', '15.43'),
    ('15.55', '15.58'),
    ('16.04', '16.58')]

calculate_times(used_to_date, times)
Time spent today: 104m, or 1h 44m
Total used to date: 149m, or 2h 29m
Time remaining: 2251m, or 37h 31m
Used 6.2% of 40 hours max