Reduce Duplication in Pytest Parametrised Tests using the Walrus Operator

Do you find yourself having to repeat literal values (like strings and integers) in parametrised tests? I often do, and have been looking for ways to reduce this duplication. As an example, consider this trivial function:

def get_second_word(text: str) -> str | None:
    """Get the second word from text."""
    words = text.split()
    return words[1] if len(words) >= 2 else None

This is an example of a function that can easily be tested using pytest parametrised test functions. In the past I would have written tests like this:

import pytest

from example import get_second_word


@pytest.mark.parametrize(
    "text, expected_word",
    [
        pytest.param("", None, id="empty text"),
        pytest.param("first", None, id="single word"),
        pytest.param("first second", "second", id="multiple words"),
        pytest.param("alpha beta", "beta", id="multiple words alternate"),
        pytest.param("first second third", "second", id="many words"),
    ],
)
def test_get_second_word(text: str, expected_word: str | None):
    returned_word = get_second_word(text)

    assert returned_word == expected_word

The third, fourth and fifth test cases require writing the expected word twice: once in the input text and once as the expected output. It is easy to make a mistake or to forget to update one or the other when the tests change. With the walrus operator (:=, available from Python 3.8), this duplication can be removed:

@pytest.mark.parametrize(
    "text, expected_word",
    [
        pytest.param("", None, id="empty text"),
        pytest.param("first", None, id="single word"),
        pytest.param(f"first {(second := 'second')}", second, id="multiple words"),
        pytest.param(
            f"alpha {(second := 'beta')}", second, id="multiple words alternate"
        ),
        pytest.param(f"first {(second := 'second')} third", second, id="many words"),
    ],
)
def test_get_second_word_walrus(text: str, expected_word: str | None):
    returned_word = get_second_word(text)

    assert returned_word == expected_word
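As a minimal sketch of the mechanics, independent of pytest, the snippet below shows what one of these test cases evaluates to. The names word and pair are only for illustration and are not part of the tests above.

```python
# The parentheses are required: without them, ":=" inside the braces is
# read as the start of a format specifier, not as an assignment expression.
text = f"first {(second := 'second')} third"

# The name bound inside the f-string is an ordinary variable afterwards.
expected_word = second

# Arguments and tuple items are evaluated left to right, so the f-string
# that binds the name has to come before the place where the name is used.
pair = (f"alpha {(word := 'beta')}", word)

print(text)           # first second third
print(expected_word)  # second
print(pair)           # ('alpha beta', 'beta')
```

Because evaluation runs left to right, swapping the two arguments would use the name before it is bound.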

As you can see, instead of writing the string literal "second" twice, the text is defined with an f-string in which the walrus operator binds the second word to the variable second. That variable is then used as the expected word to be returned from the function. You do have to be careful not to use a left-over value from a previous test case, which can lead to errors:

@pytest.mark.parametrize(
    "text, expected_word",
    [
        pytest.param("", None, id="empty text"),
        pytest.param("first", None, id="single word"),
        pytest.param(f"first {(second := 'second')}", second, id="multiple words"),
        pytest.param(
            f"alpha {(second := 'beta')}", second, id="multiple words alternate"
        ),
        pytest.param(f"first {(second_ := 'second')} third", second, id="many words"),
    ],
)
def test_get_second_word_walrus_mistake(text: str, expected_word: str | None):
    returned_word = get_second_word(text)

    assert returned_word == expected_word

There is a mistake in the last test case: the text definition binds the variable second_, but the expected word uses second without the trailing underscore. As a result, the expected word for that test is the second word from the fourth test case ("beta"), and the test fails. Also, if you are using pylint, it doesn’t seem to understand the way the walrus operator works quite yet, as it complains about unused and undefined variables. It will also count the variables defined here and warn you about too many local variables if you have many tests. These checks can be disabled around just the definition of the test cases:

# pylint: disable=unused-variable,undefined-variable,too-many-locals
@pytest.mark.parametrize(
    "text, expected_word",
    [
        pytest.param("", None, id="empty text"),
        pytest.param("first", None, id="single word"),
        pytest.param(f"first {(second := 'second')}", second, id="multiple words"),
        pytest.param(
            f"alpha {(second := 'beta')}", second, id="multiple words alternate"
        ),
        pytest.param(f"first {(second := 'second')} third", second, id="many words"),
    ],
)
# pylint: enable=unused-variable,undefined-variable,too-many-locals
def test_get_second_word_walrus_pylint(text: str, expected_word: str | None):
    returned_word = get_second_word(text)

    assert returned_word == expected_word
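The left-over value problem described earlier can be reproduced without pytest. This is only a sketch: cases is a hypothetical stand-in for the list of test cases, using plain tuples instead of pytest.param.

```python
# Hypothetical stand-in for the parametrize list, so it runs without pytest.
cases = [
    (f"alpha {(second := 'beta')}", second),
    # Typo: the f-string binds second_, so second still holds 'beta'
    # from the previous case.
    (f"first {(second_ := 'second')} third", second),
]

text, expected_word = cases[-1]
print(text)           # first second third
print(expected_word)  # beta
```

No error is raised when the list is built, because second already exists from the earlier case. The slip only shows up when the assertion in the test fails, or not at all if the stale value happens to match.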

This technique is useful for many functions that transform the input to calculate the output. If you ever find yourself repeating literals in both the input and the expected output, it could be helpful. Further examples include transforming one data structure into another (for example, picking values out of a dictionary, or assembling a data structure from a few variables) and simple mathematical operations. It is advisable to keep the logic for calculating the output relatively simple, otherwise you may find yourself just re-implementing the code of the function under test. For example, the following might be going too far in this case:

@pytest.mark.parametrize(
    "text, expected_word",
    [
        pytest.param("", None, id="empty text"),
        pytest.param("first", None, id="single word"),
        pytest.param(text := "first second", text.split()[1], id="multiple words"),
        pytest.param(
            text := "alpha beta", text.split()[1], id="multiple words alternate"
        ),
        pytest.param(text := "first second third", text.split()[1], id="many words"),
    ],
)
def test_get_second_word_walrus_too_far(text: str, expected_word: str | None):
    returned_word = get_second_word(text)

    assert returned_word == expected_word

Here, the code of the get_second_word function is essentially replicated in the tests. Sometimes it might be reasonable to go to this level of complexity, but for this function it would probably be better to stick to just re-using literal values so that the tests stay easy to understand.

That's it! Thanks for reading, and please leave any feedback as comments below.
