Day 11

compendium
Author

Amy Heather

Published

July 18, 2024

Note

Working on research compendium: Finishing up tests, and lots of troubleshooting docker.

Untimed: Research compendium

Tests

Having re-run all the scenarios from scratch, I replaced the files in tests/testthat/expected_results/ and then ran testthat::test_dir("tests/testthat").

is_true(compare) returned the error Error in is_true(compare): unused argument (compare), so I switched back to expect_equal().

However, these were then all successful! I added instructions for running these tests, their run time, and what you might expect to see, to the reproduction README.

Docker

Troubleshooting installation of packages when building images

Ran sudo docker build --tag huang2019 . -f ./docker/Dockerfile from reproduction/ (which is where the renv is located). Hit an error:

15.45 Warning: failed to find source for 'Matrix 1.7-0' in package repositories
15.45 Warning: error downloading 'https://cloud.r-project.org/src/contrib/Archive/Matrix/Matrix_1.7-0.tar.gz' [cannot open URL 'https://cloud.r-project.org/src/contrib/Archive/Matrix/Matrix_1.7-0.tar.gz']
15.45 Error: failed to retrieve package 'Matrix@1.7-0'

...

ERROR: failed to solve: process "/bin/sh -c R -e \"renv::restore()\"" did not complete successfully: exit code: 1

I looked at the address, and found that 1.7-0 was indeed not in the Archive, but it is the latest version of the package. It is available at https://cran.r-project.org/src/contrib/Matrix_1.7-0.tar.gz or at https://cloud.r-project.org/src/contrib/Matrix_1.7-0.tar.gz. This was only the second package it tried to install - the first was MASS 7.3-60.2, and that wasn’t the latest version. Looking at other packages, it seems common that the latest version is not in the CRAN archive.
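The two locations above follow a fixed pattern, so they can be generated for any package and version. A minimal sketch, where cran_urls is a hypothetical helper (not part of renv or of the workflow in this post) and the mirror defaults to the one in the error message:

```shell
#!/bin/sh
# Hypothetical helper: print the two places a CRAN source tarball can live
# for a given package and version.
#   $1 = package name, $2 = version, $3 = mirror (optional)
cran_urls() {
  pkg="$1"
  ver="$2"
  base="${3:-https://cloud.r-project.org}"
  # Current release: top level of src/contrib
  printf '%s/src/contrib/%s_%s.tar.gz\n' "$base" "$pkg" "$ver"
  # Superseded releases: src/contrib/Archive/<package>/
  printf '%s/src/contrib/Archive/%s/%s_%s.tar.gz\n' "$base" "$pkg" "$pkg" "$ver"
}

cran_urls Matrix 1.7-0
```

Each printed URL could then be checked with something like curl -sfI "$url" > /dev/null, which exits non-zero when the file is not there.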

I tried out a bunch of things, but the same issue persisted throughout:

  • I found a post with the same issue - that renv only looks in the CRAN archive in a Docker image. They suggested renv::restore(repos = c(CRAN = "https://cloud.r-project.org")).
    • I changed the Dockerfile (but used single quotes for the URL) and re-ran - RUN R -e "renv::restore(repos = c(CRAN = 'https://cloud.r-project.org'))"
    • I tried with double quotes as above, but including \ to escape the inner quotes - RUN R -e "renv::restore(repos = c(CRAN = \"https://cloud.r-project.org\"))"
  • Based on some online posts, I wondered if this might be to do with system dependencies. Based on this post, I opened a fresh R session (so not in renv) and tried to install getsysreqs, although it was not available for my version of R. The RStudio Package Manager (RSPM) was recommended. I also stumbled across containerit, which can make a Dockerfile for you and would include the system dependencies. However, I decided first to try the simplest option, which is to just install a fairly standard list of Linux libraries that R packages need, like here.
  • Based on this issue, I added ENV RENV_WATCHDOG_ENABLED FALSE to disable the renv watchdog.

Based on Tom’s Dockerfile, which is from Peter Solymos, I tried changing the CRAN source: RUN R -e "renv::restore(repos = c(CRAN = \"https://packagemanager.rstudio.com/all/__linux__/focal/latest\"))". This resolved the issue, as it was then able to download Matrix. All packages successfully downloaded, but I then hit an issue installing the packages:

ERROR: this R is version 4.1.1, package 'MASS' requires R >= 4.4.0
install of package 'MASS' failed [error code 1]
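A mismatch like this can be caught before building by comparing the R version recorded in renv.lock with the one in the image tag. A rough sketch, assuming the lockfile has renv's usual pretty-printed layout; lock_r_version is a hypothetical helper, and the demo lockfile contents and the image_r value are stand-ins rather than the real project files:

```shell
#!/bin/sh
# Hypothetical helper: print the R version recorded in an renv lockfile.
# Assumes "Version" sits on a line after the top-level "R" key, as renv
# writes it; a proper JSON parser would be more robust.
lock_r_version() {
  sed -n '/"R"[[:space:]]*:/,/"Version"/ s/.*"Version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Demo on a cut-down lockfile (assumed contents, using versions from this post)
demo_lock="$(mktemp)"
cat > "$demo_lock" <<'EOF'
{
  "R": {
    "Version": "4.4.1"
  },
  "Packages": {
    "MASS": {
      "Package": "MASS",
      "Version": "7.3-60.2"
    }
  }
}
EOF

image_r="4.1.1"   # assumption: the R version written in the image tag
lock_r="$(lock_r_version "$demo_lock")"
if [ "$image_r" != "$lock_r" ]; then
  echo "R version mismatch: image has $image_r, lockfile expects $lock_r"
fi
rm -f "$demo_lock"
```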

I then realised I had accidentally put R 4.1.1, when I meant to put R 4.4.1! I changed this and re-ran. This was successful until attempting to install igraph, at which point it hit an error:

Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared object '/home/code/renv/staging/2/igraph/libs/igraph.so':
  libglpk.so.40: cannot open shared object file: No such file or directory

I added libglpk-dev to the list of system dependencies to install, then tried again. It eventually failed again with another, similar issue:

Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared object '/home/code/renv/staging/2/stringi/libs/stringi.so':
  libicui18n.so.66: cannot open shared object file: No such file or directory

I briefly tried adding containerit to my renv to see if that was simpler, but I kept getting errors and it wasn’t a quick fix, so I decided to pause on that, remove it, and keep trying as before. I removed it from DESCRIPTION then ran renv::clean() and renv::snapshot().

I added libicu-dev and tried again. This failed with the same error as before.

Looking at the rocker rstudio image, it runs on Ubuntu 22.04. Posit lists system dependencies for Ubuntu 22.04 as apt install -y libcairo2-dev libssl-dev make libcurl4-openssl-dev libmysqlclient-dev unixodbc-dev libnode-dev default-jdk libxml2-dev git libfontconfig1-dev libfreetype6-dev libssh2-1-dev zlib1g-dev libglpk-dev libjpeg-dev imagemagick libmagick++-dev gsfonts cmake libpng-dev libtiff-dev python3 libglu1-mesa-dev libgl1-mesa-dev libgdal-dev gdal-bin libgeos-dev libproj-dev libsqlite3-dev libsodium-dev libicu-dev tcl tk tk-dev tk-table libfribidi-dev libharfbuzz-dev libudunits2-dev. I replaced the line in my Dockerfile and tried again. This failed with the same error as before (so I returned it to the simpler list).

I found this issue with the same error, where it appears there is an issue with the stringi binary being built for the wrong Ubuntu version, since libicui18n.so.66 is for 20.04. However, the fix appeared to be that they fixed the Bioconductor container, and it wasn’t super clear to me what I should do.

Based on this example linked from this issue, I tried switching libicu-dev to libicu. However, this returned error Unable to locate package libicu. I then instead tried adding RUN R -e "install.packages('stringi')" before renv::restore(). That ran successfully, but hit a new error:

Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared object '/home/code/renv/staging/2/openssl/libs/openssl.so':
  libssl.so.1.1: cannot open shared object file: No such file or directory

libssl is in the list of system dependencies that were being installed. A quick google shows there are issues related to libssl.so.1.1 on Ubuntu 22.04. I tried the simplest solution first - installing it separately (as that worked for stringi above).
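The three dyn.load() failures so far share one message format, so the missing libraries can be pulled out of a saved build log in one go. A small sketch, assuming the log is piped in on standard input; missing_libs is a hypothetical helper, and it relies on grep -o, which GNU and BSD grep both provide:

```shell
#!/bin/sh
# Hypothetical helper: list the shared libraries that could not be opened,
# one per line, de-duplicated. Reads log text on standard input.
missing_libs() {
  grep -o '[^[:space:]]*\.so[^[:space:]:]*: cannot open shared object file' \
    | sed 's/:.*//' \
    | sort -u
}

# Demo using the three messages quoted in this post
missing_libs <<'EOF'
  libglpk.so.40: cannot open shared object file: No such file or directory
  libicui18n.so.66: cannot open shared object file: No such file or directory
  libssl.so.1.1: cannot open shared object file: No such file or directory
EOF
```

The version suffix is the useful part: as with libicui18n.so.66, a suffix that belongs to a different Ubuntu release suggests the package binary was built for another release, rather than a missing -dev package. On Debian or Ubuntu, apt-file search (from the apt-file package) can show which system package ships a given file.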

The docker image then built successfully!

Troubleshooting empty container

Having successfully built the image (sudo docker build --tag huang2019 . -f ./docker/Dockerfile), I then tried creating a container. After some trial and error with the command (and using sudo docker rm huang2019_docker to remove and recreate the container), I got to:

(sleep 2 && xdg-open http://localhost:8888) & sudo docker run -it -p 8888:8787 -e DISABLE_AUTH=true --name huang2019_docker huang2019

This opened up RStudio - although there were no files and none of the libraries I had added were listed in the packages (e.g. no simmer).

I spent quite a while searching for and trying suggestions on this issue, with little success. E.g.

  • I tried running sudo docker inspect huang2019 but couldn’t spot anything amiss.
  • Based on this post, I checked my .dockerignore, which has **/renv/, which should just be preventing upload of renv.

I then tried copying everything into /home/rstudio, rather than making a new directory in the Dockerfile, and remaking the image. I got the error This project does not contain a lockfile, so I added the path to renv::restore(). This built successfully, so I used the command above to create a container and open RStudio. This successfully included all the files!

Troubleshooting renv in the container

On opening RStudio, the console now showed a long list of packages that are missing entries in the cache, saying These packages will need to be reinstalled. It then says Project '~/' loaded. [renv 1.0.7] and The following package(s) have broken symlinks into the cache, with the same list of packages again, and Use renv::repair() to try and reinstall these packages.

Running renv::repair(), we get the following message:

# Library cache links ---------------------------------------------
renv appears to be unable to access the requested library path:
- /home/rstudio/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu
Check that the 'rstudio' user has read / write access to this directory.

Do you want to proceed? [Y/n]

I spent a while googling and tried a few things, including:

  • Selecting to proceed, it installs the packages from CRAN.
  • I tried making another container without disabling authentication - (sleep 2 && xdg-open http://localhost:8888) & sudo docker run -it -p 8888:8787 -e PASSWORD=password --name huang2019_docker huang2019, then login with username rstudio password password - but had the same issue.
  • Looking in the renv/ folder, it appears to only contain the package renv.
  • I tried changing the port mapping to 8787:8787 but this made no difference.
  • Looking at the installed packages, it appears to not include anything we added (including stringi and openssl, which were installed separately from the rest).

Based on this issue, it appears that /home/rstudio/ is owned by the user docker rather than by us (the user rstudio).

  • Tried (sleep 2 && xdg-open http://localhost:8787) & sudo docker run --rm -it --user docker -v $(pwd)/home/$USER/foo -e USER=$USER -e USERID=$UID -p 8787:8787 --name huang2019_docker huang2019 but got error docker: Error response from daemon: unable to find user docker: no matching entries in passwd file.
  • Based on this issue, tried changing ownership of the renv directory when created in the Dockerfile using RUN mkdir -p /renv/ && chown -c rstudio /renv/, but the same error message remained.
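Ownership can be checked directly from a terminal before trying fixes. A minimal sketch, where owned_by is a hypothetical helper that compares a directory's numeric owner with a given UID; the demo runs against a temporary directory rather than the real container paths:

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is owned by numeric UID $2.
owned_by() {
  dir="$1"
  uid="$2"
  # ls -ldn prints the numeric owner in the third field
  owner="$(ls -ldn "$dir" | awk '{print $3}')"
  [ "$owner" = "$uid" ]
}

# Demo on a throwaway directory
demo_dir="$(mktemp -d)"
if owned_by "$demo_dir" "$(id -u)"; then
  echo "$demo_dir is owned by the current user"
else
  echo "$demo_dir is owned by someone else"
fi
rmdir "$demo_dir"
```

Inside the container, the equivalent check would be something like owned_by /home/rstudio/renv "$(id -u rstudio)", since the renv::repair() message names a path under /home/rstudio/renv/ rather than /renv/.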

Based on this issue, tried changing user in Dockerfile for the renv::restore(). I realised whilst doing this that, when I’d made the renv folder, I hadn’t set it to /home/rstudio/ like I’d done for the rest of them! This might have been the issue. This produced an error while building the image:

7.633 # Installing packages --------------------------------------------------------
7.888 - Installing stringi ...                        OK [built from source and cached in 1.3m]
83.54 Error: could not copy / move file '/home/rstudio/Documents/RStudio/PROJECT/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu/stringi' to '/home/rstudio/Documents/RStudio/PROJECT/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu/.renv-backup-7-stringi73f0bb75'
83.54 move: cannot rename file '/home/rstudio/Documents/RStudio/PROJECT/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu/stringi' to '/home/rstudio/Documents/RStudio/PROJECT/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu/.renv-backup-7-stringi73f0bb75', reason 'Permission denied'
83.54 copy: source file '/home/rstudio/Documents/RStudio/PROJECT/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu/.renv-copy-7606940ba' does not exist

I tried temporarily removing the separate installations of stringi and openssl, to see if renv::restore() would be successful. This was successful - up until stringi, which had the error from before re: libicui18n.so.66.

I reintroduced the separate installations, and tried switching install.packages("stringi") to renv::install("stringi"), but this had the issue from above.

I then tried to make it more similar to the line installing renv, which does work, so did RUN R -e "install.packages('stringi', repos = c(CRAN = 'https://cloud.r-project.org'))", but no change.

I then tried following the structure used to install renv in the aforementioned GitHub issue - so:

# Set version of packages installed outside the lockfile
ENV RENV_VERSION 'v1.0.7'
ENV STRINGI_VERSION '1.8.4'
ENV OPENSSL_VERSION '2.2.0'

# Install remotes (which we will use to install packages)
RUN Rscript -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"

# Install renv from GitHub
RUN Rscript -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"

# Install stringi separately due to issue with not detecting libicu
RUN Rscript -e "remotes::install_version('stringi', version='${STRINGI_VERSION}')"

# Install openssl separately due to issue with not detecting libssl
RUN Rscript -e "remotes::install_version('openssl', version='${OPENSSL_VERSION}')"

However, we are now getting the libicui18n.so.66 issue again, so it appears the problem is that, when I installed it separately, I hadn’t installed it into the right place for renv to find it.

I moved the installation to after having copied over the renv files:

# Copy files (including renv.lock!) into the image
# Thanks to .dockerignore, should not copy over renv/ (which is very large)
COPY . .

# Copy renv auto-loader tools
RUN mkdir -p renv/library
COPY .Rprofile .Rprofile
COPY renv/activate.R renv/activate.R

However, we again got the first issue (could not copy / move file…).

I tried moving it around a few different locations in the file to try and figure out what might be right. I also tried adding in an renv::activate().

Fix Quarto GitHub action

Returned to the broken Quarto render action (which fails to find rmarkdown despite it having been installed with setup-renv). Some ideas: