iBOTS Learn: Unit Testing with pytest

Unit Testing with pytest

Author

Dr. Nicholas A. Del Grosso

Practice the core mechanics of automated testing in Python: reading test output, fixing assertions, comparing scientific data structures, parameterizing repeated checks, and using property-based tests for broader input coverage.

Setup

To make it easy to write and run automated tests in a notebook, we’re using the ipytest package. Run the code below to get it set up.

# Run this to install the packages used in these exercises
# %pip install pytest ipytest hypothesis numpy pandas

import pytest
import ipytest

# Enable the `%%ipytest` cell magic and set options,
# or just call `ipytest.autoconfig()` without options to use the defaults.
ipytest.config(
    magics=True, 
    defopts="auto", 
    addopts=[
        "-q",  # quiet output
        "-W", "ignore:Module already imported so cannot be rewritten:pytest.PytestAssertRewriteWarning",
    ],
    coverage=False
)

Section 1: Background

What is Automated Testing?

Anything that your code can do, you can check. If you find yourself manually checking for something, putting that check in an automated test lets you keep running it, automatically, as you develop your project further.

pytest contains a test runner that looks for automated tests; one way it finds them is by looking for functions whose names start with test_, in files whose names start with test_. It runs each function it finds and records the outcome:

Passed: The function ran without raising an exception.
Failed: The function raised an exception, most often an AssertionError from a failed check.
Errored: Something went wrong outside the test function itself, for example in a fixture or while collecting the tests.

While pytest is far and away the most popular testing framework in Python, there are alternatives, including:

Library	When to Choose It over pytest
`unittest`	When you want something built into Python.
`doctest`	When you want automated test code in your function documentation.
`behave`	When you want automated test code to be readable and writable by non-coding teammates.
A pytest plugin	When you want to add features to your tests, make special test types easier to write, or work with tricky-to-test frameworks.

Why Write Automated Tests?

While software developers often discuss automated testing in terms of quality control for gaining others’ trust in our code in large projects, in practice, automated testing tools are used for a wide variety of development tasks that help speed up the development process. Here are just a few examples, to show how pervasive automated testing is:

Checklist Automation: Do you find yourself rerunning your code a lot to confirm that it still works? This can slow down your workflow and hurt your creative flow if it takes too long. For each check you need to do, just write an automated test and free up your brain space!
Colleague Onboarding: Is everything ready for your colleagues to make contributions to your code? If they can run some automated tests, then you can be confident that things are installed and ready on their machines.
Troubleshooting Time Reduction: Each time you run your code, does it take you a few minutes to work out where the problem is? Unit tests turn those minutes into seconds, making it easier to quickly pin down why your code isn’t working.
Design Tools for Tricky Algorithms: Does a particular function feel more like a brain teaser than usual? Write a few tests as you work on it, to free up the brain space you would otherwise spend on checking the code.
Code Structuring Guidance: Wondering if your code is still modular? Writing tests is a great software architecture check: unit-testable code is modular code!
UX Guidance: Wondering if your functions are intuitive to use for others? A great check is to write unit tests, and see if the test code is complicated. Simple tests mean intuitive interfaces!
Getting Started Help: Not even sure how to get started with a project, or what you really should do? Write a test for a program you would like to write! The test will fail, of course (because the code isn’t written yet), but you’ll then have a clear idea of what needs to be done, and in what order.
Bug Fixing: A user is reporting a bug in your code? Write an automated test to recreate the bug, so that the test fails when the bug is present. Then, fix the code!
Code Review Simplification: Want to speed up code review when accepting contributions? Require tests on new code! If you’re happy with the tests, and they pass, then the code is likely already good to go.
Mentoring Aid: Have a junior who wants to contribute, but isn’t sure how? Sit together with them and write some tests that they should get to pass. That way they have feedback on their progress to their goal, and you get code that works!

Unit Tests vs “Other” Tests

We talk a lot about unit tests because they are the easiest to write, and in a big project they make up the vast majority of the automated tests. There are many different types of tests, though; which you choose depends on your goals for that test. Here are a few of the options out there:

Test Type	When you want to…	Example
Unit Test	Check that a function or method works.	Check: Calling `predict([1, 2])` returns a transformed array `[3, 4]`.
Property Test	A type of unit test; it checks that all calls to a function or method result in something with a desired property.	Check: Calling `predict()` always returns a 1D array of floats.
Integration Test	Check that a function, method, or class calls other functions or methods the way you expected.	Check: Calling `predict()` calls the OpenAI API with certain parameters.
System Test	Check that the whole project works on a high level.	Check: when I run my pipeline on my data, I get the figures I want.
Smoke Test	Checks that nothing is crashing.	Check: When I run my script, it doesn’t error out.
Behavior Test	Checks that the program works according to the user’s expectations.	Check: when I press the `fit` button, I see a model fit on-screen.
Snapshot Test	Checks that the program still does the same thing it did yesterday.	Check: my pipeline still produces the same figure it did last time I ran the test.

In this workshop, we’ll focus on Unit Testing and Property Testing. If there’s interest, we can expand to other test types in future workshops.

Writing automated tests is work; it doesn’t come for free. Through these exercises, we’ll get familiar with the basics of the pytest framework and a few useful supplementary libraries, writing unit tests in a concise manner, so we can spend less time writing boilerplate test code and more time building our projects.

Section 2: Checking Test Code

Automated tests have three main parts: a Test Runner, Test Functions, and Test Assertions.

Term	Code	Description
Test Runner	`%%ipytest`	The program that finds, runs, and records tests and their results.
Test Function	`def test_xxx():`	The function run by the test runner. Must start with `test_`, followed by a description of what is checked.
Test Assertion	`assert x`	The check itself. If all checks in a test function pass without error, then the test is considered to have passed.

Let’s start with pre-written tests, fixing small parts of the code to see how the test runner gives feedback on what it finds.

Task: One of the tests below has an error in it, and so the test is failing. Use the output from pytest to find the failing test and fix it so all tests pass.

%%ipytest

def test_sum_1_2_is_3():
    assert sum([1, 2]) == 3


def test_sum_2_3_is_5():
    assert sum([1, 2]) == 4

Solution

%%ipytest

def test_sum_1_2_is_3():
    assert sum([1, 2]) == 3


def test_sum_2_3_is_5():
    assert sum([2, 3]) == 5

Task: Below are three unit tests that check for three different types of things: a value, a type, and an error. Edit the code so all tests do their intended checks successfully.

%%ipytest

def test_sum_3_4_is_7():
    # Hint: assert sum([..., ...]) == ...
    raise NotImplementedError()

def test_sum_of_ints_is_an_int():
    # Hint: assert isinstance(sum([...]), int)
    raise NotImplementedError()

def test_sum_strings_a_b_raises_typeerror():
    raise NotImplementedError()
    with pytest.raises(TypeError):
        ... # Put code that should result in an error here.

Solution

%%ipytest

def test_sum_3_4_is_7():
    assert sum([3, 4]) == 7


def test_sum_of_ints_is_an_int():
    assert isinstance(sum([1, 2, 3]), int)


def test_sum_strings_a_b_raises_typeerror():
    with pytest.raises(TypeError):
        sum(["a", "b"])
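
Beyond checking that an error is raised at all, `pytest.raises` can also check the error message and hand back the exception for further inspection. The sketch below assumes a hypothetical helper, `parse_count`, invented only for illustration.

```python
import pytest


def parse_count(text):
    """Hypothetical helper: convert user input to an integer count."""
    return int(text)


def test_parse_count_returns_int():
    assert parse_count("12") == 12


def test_parse_count_rejects_letters():
    # `match` is a regular expression searched for in the error message.
    with pytest.raises(ValueError, match="invalid literal"):
        parse_count("abc")


def test_parse_count_error_can_be_inspected():
    with pytest.raises(ValueError) as excinfo:
        parse_count("abc")
    # excinfo.value is the exception object that was raised.
    assert "abc" in str(excinfo.value)
```

Checking the message is optional; it is mostly useful when the same exception type can be raised for several different reasons.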

Section 3: Checking Equality of Numpy Arrays with `numpy.testing` and Pandas DataFrames with `pandas.testing`

Because we do a lot of data science work, we often check that arrays and dataframes are what we expected. To make this check easier, many packages include a testing subpackage with special assert_*() functions that simplify writing tests on their data structures. Not only do they make the code easier to write, they also usually give quite descriptive error messages when the tests fail, making troubleshooting simpler. Let’s try it out with numpy and pandas.
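
As a small illustration of both points, the sketch below compares float arrays within a tolerance using `npt.assert_allclose`, and captures the message numpy writes when two arrays differ. The function names are made up for this example.

```python
import numpy as np
import numpy.testing as npt


def test_elementwise_sum_of_floats():
    observed = np.array([0.1, 0.2]) + np.array([0.2, 0.1])
    expected = np.array([0.3, 0.3])
    # Exact comparison would fail here because of floating-point rounding,
    # so compare within a tolerance instead.
    npt.assert_allclose(observed, expected)


def mismatch_report():
    """Return the message numpy produces when two arrays differ."""
    try:
        npt.assert_array_equal(np.array([1, 2, 3]), np.array([1, 2, 4]))
    except AssertionError as err:
        return str(err)
    return ""


print(mismatch_report())
```

The printed report points at the mismatched elements, which is usually enough to see what went wrong without opening a debugger.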

Task: Write a unit test to check that the computed numpy array is the expected one. When the test fails, use the error messages to fix the test so that it passes.

%%ipytest

import numpy as np
import numpy.testing as npt
# npt.assert_array_equal()

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
expected = np.array([5, 7, 8])
observed = a + b
npt.assert_array_equal(expected, observed)

Solution

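%%ipytest

import numpy as np
import numpy.testing as npt


# Wrapping the check in a test_ function lets the test runner collect and report it.
def test_elementwise_sum_of_arrays():
    a = np.array([1, 2, 3])
    b = np.array([4, 5, 6])
    expected = np.array([5, 7, 9])
    observed = a + b
    npt.assert_array_equal(observed, expected)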

Task: Write a unit test to check whether the two methods below produce the same dataframes. When the test fails, use the error messages to fix the test so that it passes.

%%ipytest

import pandas as pd
import pandas.testing as pdt
# pdt.assert_frame_equal()

# Method one: from dictionary
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 11, 12]})

# Method two: Stepwise DataFrame Mutation
df2 = pd.DataFrame()
df2['b'] = [10, 11, 12]
df2['a'] = [1, 3, 3]

pdt.assert_frame_equal(df1, df2)

Solution

%%ipytest

import pandas as pd
import pandas.testing as pdt


# Wrapping the check in a test_ function lets the test runner collect and report it.
def test_both_construction_methods_give_same_frame():
    # Method one: from dictionary
    df1 = pd.DataFrame({"a": [1, 2, 3], "b": [10, 11, 12]})

    # Method two: Stepwise DataFrame Mutation
    df2 = pd.DataFrame()
    df2["a"] = [1, 2, 3]
    df2["b"] = [10, 11, 12]

    pdt.assert_frame_equal(df1, df2)
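
If column order should not count as a difference, `assert_frame_equal` accepts `check_like=True`, which ignores the order of the index and columns. A minimal sketch, with variable and test names chosen only for this example:

```python
import pandas as pd
import pandas.testing as pdt
import pytest

by_dict = pd.DataFrame({"a": [1, 2, 3], "b": [10, 11, 12]})

# Same data, but the columns are added in the opposite order.
stepwise = pd.DataFrame()
stepwise["b"] = [10, 11, 12]
stepwise["a"] = [1, 2, 3]


def test_frames_match_when_column_order_is_ignored():
    pdt.assert_frame_equal(by_dict, stepwise, check_like=True)


def test_frames_differ_when_column_order_matters():
    # Without check_like, column order is part of the comparison.
    with pytest.raises(AssertionError):
        pdt.assert_frame_equal(by_dict, stepwise)
```

Whether column order matters depends on the code under test, so it is worth making that choice explicit in the test.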

Section 4: Test Parameterization: Check More Cases with Less Code

It’s valuable to check many different sets of inputs and confirm that the outputs are all correct; strange little bugs appear in many functions when certain values weren’t what we expected. But writing a function for every single set of inputs we want to check is needlessly verbose. Parametrizing test functions makes that code more condensed, and pytest provides a decorator for doing this: @pytest.mark.parametrize().
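
As a rough sketch of what this looks like, the example below runs one test function against three cases. Wrapping each case in `pytest.param(..., id=...)` is optional; it gives the case a readable label in the test report. The test name and cases are illustrative only.

```python
import pytest

cases = [
    pytest.param([3, 1, 2], 3, id="unsorted"),
    pytest.param([-5, -2], -2, id="all-negative"),
    pytest.param([7], 7, id="single-value"),
]


# The first argument names the test function's parameters;
# each case supplies one value per name.
@pytest.mark.parametrize("values,expected", cases)
def test_max_returns_largest_value(values, expected):
    assert max(values) == expected
```

When run with verbose output, each case is reported separately, for example as `test_max_returns_largest_value[unsorted]`, so a single failing case is easy to spot.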

Task: The code below uses pytest’s parametrization feature. Without writing a new function, use that feature to add two more checks (a.k.a. “test cases”) to the test below, so that a total of 4 tests run:

3 + 7 = 10
-2 + 3 = 1

%%ipytest

cases = [
    [[1, 2], 3],
    [[2, 3], 5],
]
@pytest.mark.parametrize('inputs,output', cases)
def test_sum_of_integers(inputs, output):
    assert sum(inputs) == output

Solution

%%ipytest

cases = [
    [[1, 2], 3],
    [[2, 3], 5],
    [[3, 7], 10],
    [[-2, 3], 1],
]


@pytest.mark.parametrize("inputs,output", cases)
def test_sum_of_integers(inputs, output):
    assert sum(inputs) == output

Task: Rewrite the three test functions below into a single test function, using parametrize to continue checking each case individually. Note that pytest includes an approx() function for helping check floats, since there are often little rounding errors with them.

%%ipytest

def test_5p2_minus_2p1_is_3p1():
    assert 5.2 - 2.1 == pytest.approx(3.1)

def test_6p5_minus_1p7_is_4p8():
    assert 6.5 - 1.7 == pytest.approx(4.8)

def test_0p3_minus_0p2_is_0p1():
    assert 0.3 - 0.2 == pytest.approx(0.1)

Solution

%%ipytest

cases = [
    (5.2, 2.1, 3.1),
    (6.5, 1.7, 4.8),
    (0.3, 0.2, 0.1),
]


@pytest.mark.parametrize("left,right,expected", cases)
def test_subtraction(left, right, expected):
    assert left - right == pytest.approx(expected)
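
To see why `approx()` is needed in the first place, the sketch below contrasts plain equality with `pytest.approx` on one of the cases above, and shows that the tolerance can be widened with the `rel` argument. The test names are illustrative.

```python
import pytest


def test_plain_equality_is_tripped_up_by_rounding():
    # 0.3 - 0.2 is not stored as exactly 0.1 in binary floating point.
    assert 0.3 - 0.2 != 0.1


def test_approx_absorbs_rounding_error():
    assert 0.3 - 0.2 == pytest.approx(0.1)


def test_tolerance_can_be_widened():
    # The default relative tolerance is tight (1e-6)...
    assert 1.001 != pytest.approx(1.0)
    # ...but it can be loosened when the calculation is less precise.
    assert 1.001 == pytest.approx(1.0, rel=1e-2)
```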

Section 5: Property Testing with `hypothesis`

There are also cases where you want to check a bunch of inputs to make sure that the code works correctly, but:

you don’t know exactly which inputs are the best to check,
and you aren’t sure exactly how to calculate the expected result,
but you know what aspect of the result you want to check (i.e., the “property” you want the result to have).

This is called “Property Testing”, and the hypothesis library helps with that. Just describe the inputs that should go in, and write your test, and it will check your code with a wide range of inputs!
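
A minimal sketch of the idea, using the built-in `sorted` as the code under test: the test never states an expected output, only two properties that every output should have. The test name is made up for this example.

```python
from hypothesis import given
from hypothesis import strategies as st


# Hypothesis calls the test many times with different generated lists of integers.
@given(st.lists(st.integers()))
def test_sorted_output_is_ordered_and_keeps_every_item(values):
    result = sorted(values)
    # Property 1: sorting never adds or drops items.
    assert len(result) == len(values)
    # Property 2: each item is no larger than the one after it.
    assert all(a <= b for a, b in zip(result, result[1:]))
```

If a property does not hold, Hypothesis reports a small input that breaks it, which tends to make the failure easy to reproduce.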

Task: The test function below isn’t checking what it is meant to check (as described by the function name), and so Hypothesis keeps finding sets of inputs that make the test fail. Fix the inputs and the test function body, so the test is correct.

%%ipytest 

from hypothesis import given
from hypothesis import strategies as st
# For the curious, a full list of "strategy" functions (how hypothesis generates inputs): 
# https://hypothesis.readthedocs.io/en/latest/reference/strategies.html


@given(
    st.lists(st.integers(min_value=-10), min_size=0),
)
def test_sum_of_positive_integers_always_a_positive_integer(inputs):
    assert sum(inputs) > 1

Solution

%%ipytest

from hypothesis import given
from hypothesis import strategies as st


@given(
    st.lists(st.integers(min_value=1), min_size=1),
)
def test_sum_of_positive_integers_always_a_positive_integer(inputs):
    result = sum(inputs)
    assert isinstance(result, int)
    assert result > 0

Task: Have hypothesis generate float values in order to test the function below.

%%ipytest 

from hypothesis import given
from hypothesis import strategies as st

@given(
    # Put a strategy here for the first float
    # Put a strategy here for the second float
)
def test_sum_of_two_floats_is_always_equivalent_to_using_plus_operator(first, second):
    assert sum([first, second]) == pytest.approx(first + second)

Solution

%%ipytest

from hypothesis import given
from hypothesis import strategies as st


@given(
    st.floats(min_value=-1e100, max_value=1e100, allow_nan=False, allow_infinity=False),
    st.floats(min_value=-1e100, max_value=1e100, allow_nan=False, allow_infinity=False),
)
def test_sum_of_two_floats_is_always_equivalent_to_using_plus_operator(first, second):
    assert sum([first, second]) == pytest.approx(first + second)

Section 6: Integrating Pytest into Python Projects

Separating your tests/ folder (where your tests live) from your src/ folder (where your actual importable code lives) is very helpful for keeping your project manageable. Getting your Python project into a standard format also makes it easy for frameworks like pytest to find your code; below is a standard structure for making this work:

<project_name>
|
├── src/
|   ├── <package_name>/
|   |   ├── __init__.py
|   |   └── <module>.py
├── tests/
|   └── test_<package_name>.py
|
└── pyproject.toml
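
For readers who prefer to see the layout built step by step, here is a small standard-library sketch that creates the same empty structure in a temporary folder. The names `scaffold`, `my_project`, `my_package`, and `core.py` are placeholders chosen for this example, not part of any tool.

```python
from pathlib import Path
import tempfile


def scaffold(root, package_name):
    """Create an empty src-layout project under `root` and return its files."""
    files = [
        root / "src" / package_name / "__init__.py",
        root / "src" / package_name / "core.py",  # stands in for <module>.py
        root / "tests" / f"test_{package_name}.py",
        root / "pyproject.toml",
    ]
    for path in files:
        # Create the parent folders first, then an empty file.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return files


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my_project"
    created = [path.relative_to(root).as_posix() for path in scaffold(root, "my_package")]

print(created)
```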

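If you’d like a tool to set things up for you, uv can scaffold a project with a single command: uv init (adding the --lib flag gives the src/ layout shown above; the tests/ folder still needs to be created by hand).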

The pyproject.toml file contains a description of the project, which most Python development tools know how to reference. This file lives in the project’s root directory so that tools can find it easily.

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "<your-project-name>"
version = "0.0.1"
requires-python = ">=3.11"
dependencies = []

[project.optional-dependencies]
dev = ["pytest"]

[tool.hatch.build.targets.wheel]
packages = ["src/<package_name>"]

Once the pyproject.toml file exists, just install the package in “development” (a.k.a. “editable”) mode, and all your tools, including pytest, will be able to import your package with import <package_name>:

Using pip: python -m pip install -e ".[dev]"
Using uv: uv pip install -e ".[dev]" or uv sync --extra dev
