I have been burning the midnight oil over the past week writing a data validator program (basically a front-end to the PETL validate function) in Python. This is my third such program, and this one is meant to be configured entirely, and flexibly, through configuration files. In theory, setting up validation rules should require little or no coding. It has been a fun project thus far, and it is nearing the point where all the pieces will come together.

Lately I have been writing unit tests, and that is where all the pain points have been. I used to use Python’s standard unittest framework for unit testing, but it requires so much repetitive code that I decided to use pytest instead. Pytest lets you parametrize unit test functions, so one test function can run against many input cases, which reduces code repetition considerably.
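
To show what parametrizing looks like, here is a minimal sketch. The is_positive_int rule is made up for illustration and is not part of my validator; the only real pytest API in it is the pytest.mark.parametrize decorator.

```python
import pytest


def is_positive_int(text):
    """Hypothetical rule: True if text parses as an integer greater than zero."""
    try:
        return int(text) > 0
    except ValueError:
        # Non-numeric strings fail the rule instead of raising.
        return False


# One test function, four cases: pytest runs and reports each tuple separately.
@pytest.mark.parametrize(
    "text, expected",
    [
        ("42", True),
        ("0", False),
        ("-7", False),
        ("abc", False),
    ],
)
def test_is_positive_int(text, expected):
    assert is_positive_int(text) is expected
```

With unittest, each of those cases would typically be its own method or a hand-written loop; here, adding a case means adding one tuple.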

I had never used pytest before, and I ended up spending a lot of time trying to figure out how to get it working correctly. Getting my test module access to the package I’m testing was the first major hurdle. After that I spent a couple of hours figuring out that I couldn’t make test fixtures that return lists, and that I couldn’t use test fixtures as test parameters. For the latter, I found a workaround in the form of the pytest-lazy-fixture package.
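
For comparison, pytest has a built-in way to get a similar effect without a plugin: parametrize over fixture names and look each one up with request.getfixturevalue. To be clear, this is a different technique from pytest-lazy-fixture, not what that package does internally, and the rule fixtures below are hypothetical stand-ins rather than anything from my project.

```python
import pytest


def make_rule(field, required):
    """Hypothetical helper that builds a rule as a plain dict."""
    return {"field": field, "required": required}


@pytest.fixture
def id_rule():
    return make_rule("id", True)


@pytest.fixture
def notes_rule():
    return make_rule("notes", False)


# Parameters are collected before fixtures run, so the fixture names are
# passed as strings and resolved inside the test at run time.
@pytest.mark.parametrize("rule_name", ["id_rule", "notes_rule"])
def test_rule_has_expected_keys(rule_name, request):
    rule = request.getfixturevalue(rule_name)
    assert set(rule) == {"field", "required"}
```

The trade-off is that the fixture names are plain strings, so a typo only shows up when the test runs.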