Reproducibility and RAPs

1 Reproducibility

Reproducibility = the ability to regenerate published results using the provided code and data.

It’s helpful to distinguish reproducibility from related concepts:

  • Reusability refers to whether code can be adapted and applied to new contexts.
  • Replicability means others can independently implement your methods (without your code) and get the same results.


2 Why does reproducibility matter?

2.1 Advantages for you (the original author)

  • Reuse. Reproducibility makes it much easier for you to revisit and reuse your own code, whether you’re updating outputs or conducting new analyses.

Example: Peer review. The mean time from submission to publication in biomedical research ranges from 27 to 639 days (Andersen et al. 2021) - so it’s crucial that your code remains functional and understandable long after its initial execution.

  • Quality. Aiming for reproducibility encourages you to structure code clearly and document your workflow thoroughly. This will help reduce the risk of errors and ambiguities, and improve code quality.

  • Time. Even if you retain access to your code, it can become non-reproducible within a few months if, for example, you did not record the exact parameters for every scenario or document the computational environment used. Troubleshooting non-reproducible code can be time-consuming, or even impossible if the required information is lost.

Many of these advantages apply regardless of whether code is shared publicly or remains proprietary.
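
The "Time" point above mentions recording exact parameters and the computational environment. As a minimal sketch of one way this could be done in Python (the function name describe_run, the scenario values, and the choice of numpy as the package to look up are all hypothetical, chosen only for illustration), a run record can be built from the standard library and saved alongside the results:

```python
import json
import platform
from importlib import metadata


def describe_run(params, packages=()):
    """Collect the parameters and environment details for one run."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence rather than failing the whole run
            versions[name] = "not installed"
    return {
        "parameters": dict(params),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "package_versions": versions,
    }


# Hypothetical scenario parameters, for illustration only
scenario = {"seed": 42, "arrival_rate": 5.0, "replications": 10}
record = describe_run(scenario, packages=["numpy"])

# Serialise the record so it can be stored next to the outputs
record_json = json.dumps(record, indent=2, sort_keys=True)
print(record_json)
```

Writing record_json to a file next to each set of outputs means the settings behind a result can be recovered later, even if the code has since changed.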

2.2 Advantages for others

  • Trust. If work is reproducible, it builds trust, as others can independently confirm that your results are reliable and your methods are transparent.

  • Reuse. If others can reproduce your work, this verifies that the code works as expected when they run it. Knowing this, they can confidently reuse and adapt the code for new contexts and projects.


3 Reproducible analytical pipelines (RAP)

Reproducible analytical pipeline (RAP) = a systematic approach to data analysis and modelling in which every step of the process (end-to-end) is transparent, automated, and repeatable.

Rather than relying on manual processes or undocumented decisions, a RAP ensures that the entire workflow - from data ingestion and cleaning, to modelling, and through to analysis (and sometimes also reporting) - is scripted, documented, and reproducible.

There should be no (or minimal) manual intervention required to generate the results. The requirements of a RAP are outlined in the NHS RAP Maturity Framework.
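
To make the idea of a scripted, end-to-end workflow concrete, here is a minimal sketch in Python. It is not taken from the NHS framework: the step functions (ingest, clean, analyse), the simulated data, and the cleaning rule are all hypothetical stand-ins. The point it illustrates is that a single entry point runs every step in order, with a fixed seed, so no manual intervention is needed to regenerate the result.

```python
import random
import statistics


def ingest(seed, n=100):
    """Stand-in for data ingestion: simulate raw values with a fixed seed."""
    rng = random.Random(seed)
    return [rng.gauss(10, 2) for _ in range(n)]


def clean(raw):
    """Drop non-positive values (a scripted, documented cleaning rule)."""
    return [x for x in raw if x > 0]


def analyse(data):
    """Summarise the cleaned data."""
    return {"n": len(data), "mean": round(statistics.mean(data), 3)}


def run_pipeline(seed=42):
    """Run every step end-to-end from one entry point."""
    return analyse(clean(ingest(seed)))


# Running the pipeline twice with the same seed gives identical results
first = run_pipeline()
second = run_pipeline()
print(first == second)
```

In a real project each step would usually live in its own module and read from versioned inputs, but the structure is the same: one command regenerates everything.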


4 Reproducible work vs. RAPs

You can do reproducible work without building a full RAP - but a RAP takes things further. Reproducible analysis may still involve manual steps, selective re-running of scripts, or undocumented decisions. This leaves room for human error and inefficiency, and makes it harder to check and re-run the work.

A RAP is distinct because it automates the entire workflow, minimising manual intervention and using software engineering best practices like version control, peer review, and modular code. This makes your analysis more efficient and robust, and easier to re-run and review.


5 Focus of this book

This book does both! As detailed on the guidelines page:

  • It aims to make your work reproducible, drawing on recommendations from Heather et al. (2025) for reproducible discrete-event simulation (DES).
  • It also guides you through building a full RAP, following the NHS “Levels of RAP” maturity framework.