Today, December 2nd, 2019, marks a special date: exactly 15 years ago I decided to start a blog to dump my development learnings and the occasional dull rant. The first blog post was about Jamagic, a game development framework I was using back then for my master's thesis project.
That original blog post now lives here, although originally I used Google's Blogger as the platform to host my content.
I had another website with similar kinds of posts (mostly tech-oriented news), but it was in Spanish, and I wanted both to practice writing in English and to be able to speak about any topic I wanted.
For the record, I'll list the different platforms and approximate dates, because on one hand it's been quite a challenge to migrate the data from one place to another, but on the other hand I've learned a lot about becoming more pragmatic and adopting leaner and simpler platforms, data formats and approaches in general.
From JOINing half a dozen SQL DB tables to just maintaining a physical file per post or page.
Quite a journey, fighting to keep my content always available. Call me crazy, but I hate it when you search for something old and there are no results. Google tends to de-rank and hide old content, DuckDuckGo started crawling the web later so it doesn't know everything, and most often companies, platforms and domains just disappear. So to me it pays off to keep your stuff somewhere you can ensure stays available.
Especially considering I mostly do only backend development nowadays.
To wrap up and stop wasting more time of any poor reader who made it this far, I'd like to be sincere: what I write here is usually and mostly irrelevant. Lately I also feel less inclined to blog, both because I feel I'm not going to add anything worthy and because the web is nowadays full of too many experts with shallow blog posts, posts that we then tend to take as the real truth. I feel I should instead be focusing on reading more books rather than contributing to these quick doses of "information", so the real message of this post is more like... don't spend too much time reading only blog posts; to dig deep into a topic, research books and papers, and experiment.
Talk less, do more.
One of my pet projects, Finished Games, is reaching a state in which it already serves decently as a catalog and tracker. Sure, I have a ton of ideas to add, more sources to get games from and many improvements, but the base system is working, so I can start to tackle other areas, like automations.
I can manually add games to the database. I can import them from "catalog sources" and, if a game already exists, match it with the existing title, update certain fields, etc. But I still need to manually mark the games I own, so if, as in this example, a platform like Steam can provide me with a list of which titles I've got, and maybe which ones I've completed (by checking certain achievements), it's way easier and nicer.
So, without further ado, here's a brief introduction to the Steam Web API endpoints I'm going to use soon to sync user catalogs.
You can register for an API key at https://steamcommunity.com/dev, and it is instantaneous; no need to wait for manual approval.
Once you have an API key, the official docs are at https://developer.valvesoftware.com/wiki/Steam_Web_API.
I just need three endpoints to grab user data relevant to my use case.
Resolving the steamid from a vanityurl (an account's friendly name), like "kartones". Not everybody has one set up, but I do, for example, so better be prepared:
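A minimal sketch of calling that endpoint with Python's standard library (the helper names are mine; only the endpoint path and parameters come from the Steam docs):

```python
import json
import urllib.parse
import urllib.request

STEAM_API_BASE = "https://api.steampowered.com"


def build_resolve_vanity_url(api_key: str, vanity_name: str) -> str:
    """Build the full URL for the ISteamUser/ResolveVanityURL endpoint."""
    query = urllib.parse.urlencode({"key": api_key, "vanityurl": vanity_name})
    return f"{STEAM_API_BASE}/ISteamUser/ResolveVanityURL/v1/?{query}"


def resolve_vanity_url(api_key: str, vanity_name: str) -> str:
    """Return the 64-bit steamid for a vanity name, or raise if not found."""
    with urllib.request.urlopen(build_resolve_vanity_url(api_key, vanity_name)) as resp:
        data = json.load(resp)["response"]
    if data.get("success") != 1:  # 1 == match found
        raise ValueError(f"no steamid found for {vanity_name!r}")
    return data["steamid"]
```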
Fetching the list of games owned by a given user, including each game's name, which saves you an additional call to fetch game details (a call which also returns no name for some titles! 😵):
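Again as a sketch (function names are mine), `include_appinfo=1` is what makes the response carry the game names:

```python
import json
import urllib.parse
import urllib.request


def build_owned_games_url(api_key: str, steamid: str) -> str:
    """IPlayerService/GetOwnedGames; include_appinfo=1 adds game names."""
    query = urllib.parse.urlencode({
        "key": api_key,
        "steamid": steamid,
        "include_appinfo": 1,
        "format": "json",
    })
    return f"https://api.steampowered.com/IPlayerService/GetOwnedGames/v1/?{query}"


def fetch_owned_games(api_key: str, steamid: str) -> list:
    """Return the list of owned games; each entry includes at least
    appid, name and playtime_forever (in minutes)."""
    with urllib.request.urlopen(build_owned_games_url(api_key, steamid)) as resp:
        response = json.load(resp)["response"]
    return response.get("games", [])
```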
Retrieving achievements by game and user: not only the unlock status, but also the epoch timestamp of when each was unlocked (useful for deltas):
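A possible way to call it and filter down to the unlocked achievements (the parsing helper and its return shape are my own choice, not part of the API):

```python
import urllib.parse


def build_achievements_url(api_key: str, steamid: str, appid: int) -> str:
    """ISteamUserStats/GetPlayerAchievements for one user and one game."""
    query = urllib.parse.urlencode({"key": api_key, "steamid": steamid, "appid": appid})
    return f"https://api.steampowered.com/ISteamUserStats/GetPlayerAchievements/v1/?{query}"


def unlocked_achievements(payload: dict) -> list:
    """Return (apiname, unlock epoch) tuples for achievements with achieved == 1."""
    achievements = payload.get("playerstats", {}).get("achievements", [])
    return [
        (a["apiname"], a["unlocktime"])
        for a in achievements
        if a.get("achieved") == 1
    ]
```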
If you want way more info about a game, from the description, release date or developer name to screenshots, platforms (Windows, Linux, Mac), genres and more, there is a store endpoint that works without authentication:
store.steampowered.com is rate-limited against abuse. I read somewhere that the limit seems to be around 200 requests in a 5-minute window, so you should cache those call results.
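A sketch of that unauthenticated call, with a naive in-memory cache as a nod to the rate limit (a real sync job would want persistent caching; the helper names and cache are mine):

```python
import json
import urllib.parse
import urllib.request

_cache: dict = {}  # naive in-memory cache, keyed by appid


def build_appdetails_url(appid: int) -> str:
    """Unauthenticated storefront endpoint with extended game details."""
    query = urllib.parse.urlencode({"appids": appid})
    return f"https://store.steampowered.com/api/appdetails?{query}"


def fetch_app_details(appid: int) -> dict:
    """Fetch (and cache) the 'data' section for one appid."""
    if appid not in _cache:
        with urllib.request.urlopen(build_appdetails_url(appid)) as resp:
            payload = json.load(resp)
        # the response is keyed by the appid as a string
        _cache[appid] = payload[str(appid)]["data"]
    return _cache[appid]
```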
Author: Chris Kohler
I had this 2005 book forgotten on a bookshelf and decided to read it during some vacations. At around 300 pages, it makes for an interesting and easy read if the topic of (mostly retro) Japanese videogames interests you.
From the evolution of arcades and consoles to the videogames themselves, more than half of the book covers what we could call the core: how games evolved, the author's opinion on why they were so important and influential, and how they fit into 2003-2004. The remainder covers less common but very interesting topics: game music, translations, and interactions and collaborations between American and Japanese game development studios, and between them and Nintendo. I especially liked the music and translation chapters because they were areas I knew almost nothing about, and I was surprised by how relevant they are, both to Japanese people and, for example, to explaining those horrible translations of arcades and early console games.
If I had to criticise something about the book, the only two minor things I can come up with are the following:
As mentioned, these are just minor things; it is a delightful read, and you can clearly feel how much the author enjoys videogames and Japan's cultural differences (at least as applied to games), and in general how nicely he researched for the book.
I think I've found the four horsemen of the Apocalypse of the Python world. A combo that, while it will cause pain and destruction at first, will afterwards leave a much better codebase: stricter but uniform, and less prone to certain bugs.
Who are these raiders?
mypy: Not a newcomer to my life (see I & II). Each day I'm more convinced that any non-trivial Python project should embrace type hints, both as self-documentation and as a safety measure to reduce type-related bugs.
flake8: The classic; so useful and almost always customized, losing some of its power when used alone. Still certainly useful, it just needs to be configured to adapt to black.
isort: Automatically formats your imports. By itself it supports certain settings, but it should also be configured to please black.
black: The warmonger. Opinionated, radical, almost non-configurable, but PEP 8-compliant and with decent reasoning behind each and every rule it applies when auto-formatting files. It will probably make you scream in anger when it first modifies all files, even some you didn't know your project had, even Django migrations and settings files 🤣... But it is the ultimate tool to cut out nitpicking and stupid discussions in pull request reviews. Everyone will be able to focus on reviewing the code itself instead of how it looks.
Both isort and black are meant to run with a pre-commit tool or a similar one, instead of as a test (black even ignores stdout process piping). After some experiments, the truth is that it makes more sense to keep auto-formatters at a different level than test runners and linters, and as flake8 will also fail the pre-commit hook, I decided to move everything except mypy there.
The Go programming language has, among other things, taken a great step by making a great decision: it provides one official way to format your code, and it fixes the formatting automatically by itself (instead of emitting warnings/errors).
I was reluctant to try black and isort because I was worried about the chaos they can cause. But again, reviewing code often means coding style discussions here and there, so, encouraged by a colleague, I decided to try them both at work (in a softer and more gradual way) and at home (going all in). Almost everybody will hate at least one or two changes black automatically performs, but it leaves no more room for discussion, as you can only configure the maximum line length. Period. At home I ran isort and black through my whole project, but at work they only format created and modified files, which is good for big codebases.
It takes some time to configure all of the linters and formatters until you're able to do a few sweeps and finally commit, so here are my configuration values:
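As a reference for how the pieces typically fit together (these are illustrative values commonly used to make flake8 and isort play nicely with black, not necessarily my exact configs):

```ini
# setup.cfg
[flake8]
max-line-length = 88   # black's default line length
extend-ignore = E203   # black's slice formatting conflicts with this check

[isort]
line_length = 88
multi_line_output = 3          # vertical hanging indent, black-compatible
include_trailing_comma = True
```

black itself is configured (if at all) in pyproject.toml, usually just with `line-length`.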
I lately read a non-trivial amount of code diffs almost on a daily basis, so I'm learning a thing or two not only via the code itself, but also via the decisions taken and the "why"s of those decisions.
A recent example that I asked about was the following: you notice there's a DB query that causes a MySQL deadlock timeout. The query operates over a potentially big list of items, and the engineer decided to split it into small chunks (let's say 10 items per chunk).
My knowledge of MySQL is pretty much average; I know the usual differences between MyISAM and InnoDB, a few differences with respect to PostgreSQL, and not much more. And I consider I still know more about PostgreSQL than MySQL (although I haven't actively used PG since 2016). But in general, what I've often seen, learned and been told is to go for one bulk query instead of multiple small individual ones: you make fewer calls between processes and software pieces, fewer data transformations, the query planner can be smarter as it knows "the full picture" of your intentions (e.g. operating on 1k items) and, who knows, maybe the rows you use have good data locality and are stored contiguously on disk or in memory, so they get loaded and saved faster. It is true you should keep your transactions scoped to the smallest surface possible, but at the same time the cost of opening and closing N transactions is bigger than doing it a single time, so there are advantages in that regard too.
With that "general" SQL knowledge, I went and read a few articles about the topic, and asked the DB experts: "Unlike other RDBMSs, is it better in MySQL to chunk big queries?" And the answer is yes. MySQL's query planner is simpler than PostgreSQL's by design, and as JOINs sometimes hurt, a way to get some extra performance is delegating the joining of data to the application layer, or transforming the JOINs into IN(s). So, to avoid lock contention and potential deadlocks, it is good to split potentially large, locking queries into small blocks, as this way other queries can execute in between.
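The chunking approach described above can be sketched roughly like this (the connection object follows the DB-API shape of drivers like mysqlclient or PyMySQL, and the table, column and chunk size are made up for illustration):

```python
# Splitting one big locking query into fixed-size chunks so that other
# transactions can interleave between them.

CHUNK_SIZE = 10  # illustrative; tune to your workload


def chunked(items: list, size: int) -> list:
    """Split items into consecutive chunks of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def archive_orders(connection, order_ids: list) -> None:
    """Hypothetical example: archive orders in small batches instead of
    one big UPDATE, keeping each transaction (and its locks) short."""
    for chunk in chunked(order_ids, CHUNK_SIZE):
        placeholders = ", ".join(["%s"] * len(chunk))
        with connection.cursor() as cursor:
            cursor.execute(
                f"UPDATE orders SET archived = 1 WHERE id IN ({placeholders})",
                chunk,
            )
        connection.commit()  # short transaction per chunk -> less lock contention
```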
I also learned that, when using row-level locking, InnoDB normally uses next-key locking, so for each record it also locks the gap before it (yes, the gap before, not after).
This differentiation is very interesting because it affects your data access patterns. Despite minimizing transaction scope, ensuring you have the appropriate indexes in place, properly building the query, and other good practices, if you use MySQL transactions you need to take lock contention into account (more frequently than with other engines; not that you can't cause it with suboptimal queries anywhere else).
A curious fact is that this is the second time I've found MySQL being noticeably different from other RDBMSs. Using Microsoft's SQL Server first, and then PostgreSQL, you are always encouraged to use stored routines (stored procedures and/or stored functions) because of the benefits they provide, one of them being higher performance. With MySQL, even a database trigger hurts performance, and everybody avoids stored procedures because they perform worse than application logic making queries. As for the why, I haven't had the time nor the will to investigate.