Let's begin with a small story: At a previous job (a big company with a far from trivial number of engineers), there were quite a few services written in Python. Some of them were moderate or big in size and, most importantly, some had existed for quite a few years. While the code was well structured, with a few comments here and there, and in general easy to read, you could notice that it was not homogeneous: single quotes here and double quotes there, tightly packed methods here and methods separated by triple line breaks there, a few classes with UpperCamelCase method names out of place in a Python codebase (maybe they were written by a Java developer?)...
Those were some symptoms, and to be honest you can live with them. What began to worry me more as time passed, and what I saw happen multiple times, was pull requests being blocked by engineers (often in a different timezone) for "critical" reasons like:
- "imports are not alphabetically sorted"
- "you must leave two new lines between imports and the class name"
- "comment needs to be triple-double quotes because it's multi-line"
We love to talk about "getting into the flow", about increasing developer productivity, and yet we ruthlessly impede others, sometimes until the next day, because of such trivialities. In most cases the code was otherwise perfect.
But the real problem wasn't bike shedding; the real issue was that nobody had stopped and thought "why are we doing this manual process over and over?" or "why are we spending neurons on the form and not on the content of the code?". If you look at the examples above, they already feel as if they were written by an automaton.
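To show how mechanical such a review comment is, here is a minimal sketch of the "imports are not alphabetically sorted" check using only the standard library. The function name `unsorted_imports` is my own invention for illustration, and the ordering rule (plain case-insensitive comparison of top-level imports) is a deliberate simplification: a real tool like isort also groups imports into sections and handles many more cases.

```python
import ast


def unsorted_imports(source: str) -> list[str]:
    """Return top-level imported module names that break alphabetical order.

    Hypothetical helper for illustration only; the comparison is a naive
    case-insensitive one, not the rule set of any real import sorter.
    """
    tree = ast.parse(source)
    names = []
    for node in tree.body:  # only module-level statements
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports like "from . import x" have no module name
            names.append(node.module or "")
    # A name is "out of order" if it sorts before the one preceding it
    return [cur for prev, cur in zip(names, names[1:]) if cur.lower() < prev.lower()]


if __name__ == "__main__":
    print(unsorted_imports("import os\nimport ast\n"))
```

If a dozen lines can flag the problem, a human reviewer should not have to.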
Thankfully, we have had both linters and formatters for most languages for quite some time. Combined, they give you consistent and uniform code, and let people focus on the important points of code reviews. Oh, and many linters also warn you about unused variables, methods or imports, so they are also nice for housekeeping.
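As a rough idea of what the housekeeping side of a linter does, the following sketch reports imports that are never referenced in a module. It is not how any particular linter is implemented; `unused_imports` is a hypothetical helper, and it ignores things real linters handle, such as `__all__`, re-exports, and names used only inside strings or type comments.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return the sorted names bound by imports that are never referenced.

    Hypothetical, simplified helper: it only looks at plain name usages.
    """
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import a.b as c" binds "c"
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    # Every identifier that appears as a bare name anywhere in the module
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


if __name__ == "__main__":
    print(unused_imports("import os\nimport sys\nprint(sys.argv)\n"))
```

In practice you would not write this yourself; the point is only that these checks are cheap for a machine and tedious for a person.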
In the case of Python, I already wrote about isort, so check that out if interested.
As for the work tale at the beginning of this blog post, I got approval to add the Python linter and formatters to the biggest repository as pre-commit hooks, and we ran them over all the existing code. Opening a few pull requests that modify more than 40k files in total is "interesting": despite splitting the change into a few PRs, GitHub couldn't even render the list of collapsed files 😅. There was some pushback at first, but after a week or two the anger dissipated, and since then there has not been a single discussion over formatting. It even helped us settle fundamental questions like spaces vs tabs or single vs double quotes, as a nice thing about black is that it is opinionated and almost non-configurable, so it's either This Way or No Way.