Testing in Research Workflows
  1. Types of test
  2. Smoke tests

This site contains materials for the testing module on HDR UK’s RSE001 Research Software Engineering training course. It was developed as part of the STARS project.


Smoke tests

Examples below are shown in both Python and R.

What is a smoke test?

Smoke tests (also known as build verification tests) are a “sanity check” run before a complete test suite. They are extremely quick and only check that the code can run end-to-end, not that the results are correct.

If a smoke test fails, it usually means something fundamental is broken (for example, a missing dependency, a changed column name, or a syntax error). In that case, there is no point running slower, more detailed tests until the basic problem is fixed.
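As a minimal illustration of the idea (the two pipeline steps here are hypothetical stand-ins, not part of the case study below), a smoke test runs the whole pipeline and checks only that something comes back:

```python
# Minimal illustration of a smoke test. `load` and `process` are
# hypothetical stand-ins for real workflow steps.

def load(raw: str) -> list[int]:
    """Pretend data-loading step: parse comma-separated integers."""
    return [int(x) for x in raw.split(",")]

def process(values: list[int]) -> float:
    """Pretend analysis step: return the mean."""
    return sum(values) / len(values)

def test_smoke():
    # Run the pipeline end-to-end and only check that *something* came back;
    # checking that the value is correct is left to more detailed tests.
    result = process(load("1,2,3"))
    assert result is not None
```

If even this fails, something fundamental is broken and there is no value in running the slower tests.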

Example: waiting times case study

We will create a simple smoke test for our case study workflow. This workflow uses three functions:

  • import_patient_data() - reads patient data from CSV.
  • calculate_wait_times() - calculates wait times.
  • summary_stats() - produces summary statistics.

For the smoke test, we do not care whether the CSV, wait times or statistics are correct. We only care that each function is able to run successfully, and that at least something is returned.

We will need the following imports in our test script:

import pandas as pd
from waitingtimes.patient_analysis import (
    import_patient_data, calculate_wait_times, summary_stats
)

Smoke test

In the test, we build a tiny dummy dataset so that the test runs quickly.

We use pytest’s tmp_path fixture, which gives us a temporary folder that is created for the test and automatically cleaned up afterwards, so we do not touch any real files on your machine.

We then run the full workflow end-to-end, and finish with a minimal assertion: checking only that stats exists.

def test_smoke(tmp_path):
    """Smoke: end-to-end workflow produces the expected final output shape."""
    # Create test data
    test_data = pd.DataFrame(
        {
            "PATIENT_ID": ["p1", "p2", "p3"],
            "ARRIVAL_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
            "ARRIVAL_TIME": ["0800", "0930", "1015"],
            "SERVICE_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
            "SERVICE_TIME": ["0830", "1000", "1045"],
        }
    )

    # Write test CSV
    csv_path = tmp_path / "patients.csv"
    test_data.to_csv(csv_path, index=False)

    # Run complete workflow
    df = import_patient_data(csv_path)
    df = calculate_wait_times(df)
    stats = summary_stats(df["waittime"])

    # Final check
    assert stats is not None
Tip: What is a fixture?

In pytest, a fixture is a small helper function that prepares something your test needs and then passes it into the test function as a parameter.

When you write a test like def test_name(tmp_path):, the name tmp_path tells pytest to call its built-in tmp_path fixture first. Whatever that fixture returns is then given to the test as the tmp_path argument.

In this case, the fixture creates a temporary folder and passes a path object called tmp_path into the test. At the end of the test, pytest removes that folder, so you do not leave any files behind.

In R, we write the test data to a temporary file using tempfile(), so we do not touch any real files on the machine.

We then run the full workflow end-to-end, and finish with a minimal expectation: checking only that stats exists.

test_that("smoke: end-to-end workflow produces some output", {
  # Create small, fast test data
  test_data <- data.frame(
    PATIENT_ID   = c("p1", "p2", "p3"),
    ARRIVAL_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
    ARRIVAL_TIME = c("0800", "0930", "1015"),
    SERVICE_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
    SERVICE_TIME = c("0830", "1000", "1045"),
    stringsAsFactors = FALSE
  )

  # Write test CSV to a temporary file
  csv_path <- tempfile(fileext = ".csv")
  utils::write.csv(test_data, csv_path, row.names = FALSE)

  # Run complete workflow
  df <- import_patient_data(csv_path)
  df <- calculate_wait_times(df)
  stats <- summary_stats(df$waittime)

  # Final smoke-test check: did we get *any* result?
  expect_false(is.null(stats))
})

How to run the smoke test

You can run only the smoke test file with:

pytest test_smoke.py

However, you may wish to treat the smoke test as a gate - i.e., if that test fails, no others are run. So, instead of just calling pytest when running your test suite, you can call:

pytest test_smoke.py && pytest --ignore=test_smoke.py
In R, you can run only the smoke test file with:

testthat::test_file("tests/testthat/test_smoke.R")

However, you may wish to treat the smoke test as a gate - i.e., if that test fails, no others are run. So, instead of just calling devtools::test() when running your test suite, you can create a Makefile:

test:
    Rscript -e "testthat::test_file('tests/testthat/test_smoke.R')" && \
    Rscript -e "devtools::test()"

This is then run by calling:

make test

This works as follows:

  • The first command runs the smoke test.
  • && means “only run the next command if the previous one succeeded”.
  • The second command runs the full test suite.

If the smoke test fails, the second command is never run, so the full test suite is not executed.
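The same gate can also be driven from a small Python helper instead of the shell's `&&`. This is a sketch: the helper name `run_gated` is invented here, and the usage comment assumes the pytest file layout shown above.

```python
# Sketch of the smoke-test "gate" in Python. The commands are passed in as
# argument lists, so the same helper can gate any two test commands.
import subprocess
import sys

def run_gated(smoke_cmd, full_cmd):
    """Run smoke_cmd; run full_cmd only if the smoke command succeeded."""
    smoke = subprocess.run(smoke_cmd)
    if smoke.returncode != 0:
        # Smoke test failed: stop here, do not run the slower suite.
        return smoke.returncode
    return subprocess.run(full_cmd).returncode

# Usage (the same two commands as the shell one-liner above):
#   exit_code = run_gated(
#       [sys.executable, "-m", "pytest", "test_smoke.py"],
#       [sys.executable, "-m", "pytest", "--ignore=test_smoke.py"],
#   )
#   sys.exit(exit_code)
```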

Running our example test

Note: Test output
============================= test session starts ==============================
platform linux -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0
rootdir: /__w/stars-testing-intro/stars-testing-intro/examples/python_package
configfile: pyproject.toml
plugins: cov-7.0.0
collected 1 item

../examples/python_package/tests/test_smoke.py .                         [100%]

============================== 1 passed in 0.98s ===============================

══ Testing test_smoke.R ════════════════════════════════════════════════════════

[ FAIL 0 | WARN 0 | SKIP 0 | PASS 0 ]
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ] Done!
  • Code licence: MIT. Text licence: CC-BY-SA 4.0.