Day 9

compendium
Author

Amy Heather

Published

June 11, 2024

Note

Continued working on research compendium (online coding environment, documentation).

Work log

Research compendium

Accessing the repository on browser without any installs

Explored options for virtual hosting of materials…

Google Colab:

!python3 --version
!sudo apt-get install python3.8
!sudo apt-get update -y
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1
!sudo update-alternatives --config python3
!python3 --version

Open In Colab

BinderHub:

  • Created BinderHub config file to use the GHCR image (reproduction/binderhub/config.yaml)
config:
  BinderHub:
    use_registry: true
    image_prefix: ghcr.io/pythonhealthdatascience/covid19:latest

Binder

Binder
  • Tried simplifying this (as Binder recommends against Docker), so removed the config file and instead copied the environment file into a .binder folder in the root, as then Binder should ignore configuration files elsewhere…
    • Requires the .binder/ folder to be in the root (so can’t be within the reproduction folder)
    • Took 3 minutes 20s to set up
    • Test-run of reproduction.ipynb failed - said there was “no module named ‘pandas’”, and !pip list shows that expected packages from environment.yaml haven’t been installed
    • Check of python version (!python --version) returns 3.10.14 (incorrect)
  • Tried a final time, creating a requirements.txt instead with packages, and runtime.txt with the Python version within the .binder/ folder…
    • Took 3m 25s to set up
    • When attempting to test it, connection to the kernel failed and it could not run the cells (“A connection to the notebook server could not be established. The notebook will continue trying to reconnect. Check your network connection or notebook server configuration.”). This persisted after a restart.

Requirements:

joblib==0.14.1
jupyterlab==1.2.6
matplotlib==3.1.3
notebook==6.0.3
numpy==1.18.1
pandas==1.0.1
pip==20.0.2
pytest==5.3.5
scipy==1.4.1
seaborn==0.10.0
simpy==3.0.11
statsmodels==0.11.0

Runtime:

python-3.8.1

Conclusion: Decided that for the purposes of these reproduction assessments, to not continue pursuing this at the current moment in time, and to just keep with the options of the environment file or the Binder image (locally or from GHCR).

Other aspects of research compendium

README:

  • Add operating system (found by running cat /etc/os-release, lscpu and free -g (to see RAM in GB))
  • Recorded runtime (with measurement of time elapsed added to the notebook)
  • Add model summary (copied from final report, with addition of the pathway figure)
  • Add reproduction scope (copied from scope.qmd)
  • Created repository overview
  • Tidied instructions on environment set up and running the model
  • Add some extra details on citation and license.

Notebook:

  • Made some minor additions, but primarily referred back to the README.md