Don't avoid test randomness, embrace and control it

I was listening to a podcast the other day and heard something that struck me as a mistake: avoiding the root cause of a problem instead of properly solving it. This is my personal point of view, but I'll try to justify it as best I can.

In that podcast episode, the folks were talking about good testing practices, and the guest said: "one of the first things I do when I begin writing tests is Faker.seed(1234) so I don't get random errors". Faker is an amazing Python library that creates fake data of many kinds, perfect for your tests: from full names to full addresses and credit card numbers, with support for different locales, the value it provides is so good. As a developer, you get a magical and easy-to-use collection of data factories, and to make it easy to create different users, its output is random by default. Unless you provide it with a seed value, it will choose a random one and generate different values on each test and on each run (even of the same test).

"Sticking" the seed to a fixed value was defended with the argument of getting deterministic, reproducible test runs (one of the speakers had a bug in a certain address field, and it took him a while to find it because the test only failed sometimes). Now, of course when you stick to the same data you get deterministic results, and if you're not using Faker at all I'd see it as a valid approach. But what Faker provides is an abundance of varied test data. You no longer have only a "John Doe" and a "Jane Doe"; you can try many different names, from different locales.

To me, the correct way of using Faker is to generate a random seed (per test, per test file, per build, whatever you prefer) and seed the generator with it, but also print the seed to stdout. This way, you get the best of both worlds:

Different runs use different data, so your data has more entropy and you get more chances of finding hidden bugs that fixed data won't reveal
If a test fails due to the random data, you have the seed value in the output/test logs, so you can reproduce the failure
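
As a rough illustration of that idea, here is a minimal sketch using only Python's standard random module, so it runs without any dependency. The names pick_seed, seeded_rng and the TEST_SEED environment variable are made up for this example and are not part of Faker or any other library. With Faker, the same seed value would be handed to Faker.seed(...) instead.

```python
import os
import random


def pick_seed(env_var: str = "TEST_SEED") -> int:
    """Reuse the seed from the environment if set, otherwise pick a fresh one.

    TEST_SEED is a hypothetical variable name chosen for this sketch.
    """
    raw = os.environ.get(env_var)
    if raw is not None:
        return int(raw)
    # SystemRandom draws from the OS entropy source, so every run differs.
    return random.SystemRandom().randrange(2**32)


def seeded_rng(seed: int) -> random.Random:
    """Print the seed to stdout, then build a generator seeded with it."""
    print(f"Using test data seed={seed}")
    # With Faker, this is the point where Faker.seed(seed) would be called.
    return random.Random(seed)


if __name__ == "__main__":
    rng = seeded_rng(pick_seed())
    # Stand-in for generated test data.
    print([rng.randrange(100) for _ in range(5)])
```

When a run fails, copying the printed seed into TEST_SEED and running again should regenerate the same data, which is the "control" half of the deal.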

A good example of how this works is another Python library I like to use, pytest-randomly, which shuffles the test execution order, so you can detect nasty bugs due to shared state, cache issues, and the like. On each run, it'll output a line like the following:

Using --randomly-seed=3888935657

Sometimes making your test data static even makes the test suite more fragile and harder to evolve, because the more data it contains, the more the tests rely on specific values: e.g. always assuming user_id = 1 instead of always referencing your test user #1 (with whatever id it has). Then you need to create yet another user with different data, or go change dozens of hardcoded values, fixture JSON files, and so on. In the end you end up with a bunch of hardcoded test users, all of them with hardcoded data that you now also have to explicitly maintain. We now like and try to have configuration as code, so why do we need to manually maintain test fixtures instead of relying as much as possible on auto-generated ones?
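
To make the user_id = 1 point a bit more concrete, here is a small sketch of a test that references the generated test user instead of a hardcoded id. Everything in it (User, InMemoryUserRepo, make_user, check_lookup_returns_stored_user) is hypothetical and only there for illustration. The factory uses the standard random module, but a Faker-based one would play the same role.

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


class InMemoryUserRepo:
    """Hypothetical stand-in for whatever storage the real code uses."""

    def __init__(self) -> None:
        self._users: dict[int, User] = {}

    def add(self, user: User) -> None:
        self._users[user.id] = user

    def get(self, user_id: int) -> User:
        return self._users[user_id]


def make_user(rng: random.Random) -> User:
    """Build a user from generated data instead of a hardcoded fixture."""
    return User(id=rng.randrange(1, 1_000_000), name=f"user-{rng.randrange(10_000)}")


def check_lookup_returns_stored_user(seed: int) -> None:
    rng = random.Random(seed)
    repo = InMemoryUserRepo()
    user = make_user(rng)
    repo.add(user)
    # Fragile: repo.get(1) only passes while the fixture happens to use id 1.
    # Robust: ask for the id that the test user actually has.
    assert repo.get(user.id) == user
```

Because the test never mentions a concrete id or name, the factory can change what it generates without touching the assertions.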

Ultimately, adding randomness to your tests doesn't add flakiness; what it really does is uncover hidden bugs and help you achieve a bit more antifragility. Instead of freezing your test data, add a bit of logging so you have enough information to deterministically reproduce any failure. Embrace and control randomness.

Tags: Development Patterns & Practices Python Testing

Article written by Kartones. Published on 2021-11-01