You're Writing the Wrong Tests

Testing is important. Testing is boring. Coming up with test cases on your own is a problem because you have to think of many specific examples. This is tedious, and your examples sometimes may not cover interesting corner cases.

def my_sort(a):
    # ...

@pytest.mark.parametrize("to_sort, expected", [
    ([2, 1],       [1, 2]),
    ([3, 2, 1],    [1, 2, 3]),
    ([4, 3, 2, 1], [1, 2, 3, 4]),
])
def test_eval(to_sort, expected):
    assert my_sort(to_sort) == expected

There’s a pattern in these test cases. They’re all reverse-sorted lists. There are many test cases here, but they all have the same shape because I was bored. What if my sort was just implemented as a reverse() ? It’d pass on this output, but would break on shuffled lists. What if my implementation crashed on empty lists? What if my implementation messed up on already sorted lists?

I have to think about those weird corner cases and check them.

Property testing’s value comes in two forms.

Less is More

The obvious one is you get these boring test cases generated for you for free. Hundreds of them. Thousands of them if you look up the config variable to set. Millions of them if you are paranoid and want to make your computer warm up. Billions of them if you’re looking for an excuse to not do work for the next several hours. Trillions of them if you’d rather welcome the heat-death of the universe than see another incomplete bug report with no instruction as to how you could reproduce it.

You Get Better Understanding of the Problem You’re Solving

The really interesting benefit from property testing is how to better formalize your definition of the problem. What’s the one common thing in all of the expected output of the sort function? What does it mean for a list to be sorted? In unit tests you state concretely what the data should be coming out, but nothing is said about why it’s correct. If one of my test cases is just wrong, but the others are right, it would take deliberate effort to inspect that case and correct that mistake. But coming up with an “it is sorted” property will hold for all test cases generated.

# the property
def it_is_sorted(a):
    for i in range(len(a) - 1):
        if a[i] > a[i + 1]:
            return False
    return True

# the test
from hypothesis import given
import hypothesis.strategies as st

@given(st.lists(st.integers()))
def test_eval(numbers):
    assert it_is_sorted(my_sort(numbers))

Once you define a property that describes the shape of the solution your code should generate, you’re free to let Hypothesis handle the rest. Not only you save yourself the tedium of coming up with your own test cases, but now your test suite is better at describing what you want your code to be doing.