Kartones Blog

Be the change you wanna see in this world

Opera Soft's PIC to PNG exporter

The other day I was trying to play an old MS-DOS game, Mutan Zone from Opera Soft. Despite being terribly hard (and requiring pixel-perfect jumps), it was one of the games I owned for the old AMSTRAD PC/W back in the eighties and nostalgia hit me, so I played it a few times... and decided it would be great to try to extract some of those fancy graphics. Lately I get bored more easily with the games themselves and instead like to tinker with their internals.

This is the game's title screen, the (at first unknown) goal I'd end up achieving. On older systems it was also called the "loading screen", because it was what you'd see while the game loaded into memory:

Mutan Zone

Before digging into code, I searched online for graphics formats. There was a .PIC file format used by old painting programs, but it had a very noticeable header (01234h), which the game's .PIC files lacked.

My first naive approach was to assume that 1 byte held one pixel (typical for MS-DOS games, as a VGA card could display 256 colors). I hex-analyzed some PIC files (both from Mutan Zone and Abadia del Crimen), and they didn't seem to have anything strange other than lots of byte repetitions, so my bet was that the files had no compression. Also, from my small retro gaming knowledge I had a few facts and guesses:

  • Games were written in Assembler, and ports to other computers were very frequent (many times by the same company)
  • There were almost no tools, so building an image editor that converted to different formats would already be hard for them
  • Disk space was an issue, but so was RAM, and CPU was a big issue too, so compressing graphics would add more complexity than value

Based on this, I tried to simply dump the bytes, one at a time, in 8x8 sprites, using the pixel byte value as the green component of an RGB PNG. The results were... nothing recognizable. I tried 8x16 and 16x16, but there were no visible patterns resembling anything.

I built a crappy ASCII dumper that put @ everywhere there wasn't a zero, and toyed around with the first chunks of bytes. What I did find was that, instead of resembling sprites, the pixels would make more sense arranged horizontally, in a single row... and then I had this eureka moment: the PIC file might just be storing the loading screen first! It might also be storing some metadata as a header, but it didn't look like one (lots of zeroes, atypical in header info), so I would first try to just dump all the content as a 320x200 image.

So I grabbed a screenshot of the title screen:

Title screenshot.

Comparing the "black & white" dump against the loading screen, there were similarities, but the dump still didn't match the first row of pixels...

Then, I decided to check the graphics mode for hints. In CGA you only use 4 colors, and the game ones were from a standard palette (0 black, 1 cyan, 2 magenta, 3 white)... so converting to binary we just need two bits to store the chosen color of a pixel:

00 -> black
01 -> cyan
10 -> magenta
11 -> white
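As a small illustration of the mapping above, the four 2-bit values can be kept in a lookup table. The RGB triples below are an assumption (the bright variant of the CGA cyan/magenta/white palette), and `CGA_PALETTE` and `color_of` are hypothetical names used only for this sketch:

```python
# Assumed RGB values for the four colors; the post only names the colors,
# so these exact triples are illustrative.
CGA_PALETTE = {
    0b00: (0, 0, 0),        # black
    0b01: (85, 255, 255),   # cyan
    0b10: (255, 85, 255),   # magenta
    0b11: (255, 255, 255),  # white
}


def color_of(value):
    """Return the RGB triple for a 2-bit pixel value (0 to 3)."""
    return CGA_PALETTE[value & 0b11]
```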

I also thought that if I were to build an image editor for that era's graphics and memory, I'd squeeze 4 pixels into each byte. Analysing the size of PIC files from another game from the company, Sol Negro, I found out that the EGA version's files were double the size of the CGA ones... so bits-per-pixel (bpp) were "in use" instead of a full byte per pixel. It also kind of confirmed my guess that PIC files weren't compressed (otherwise the sizes would differ, but being almost exactly twice as big... was suspicious).
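The size argument can be checked with a bit of arithmetic. This sketch assumes a 320x200 screen, 2 bits per pixel for CGA and 4 bits per pixel for a 16-color EGA image; `raw_size_bytes` is a hypothetical helper, not part of the original script:

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Size in bytes of an uncompressed, tightly packed image."""
    return width * height * bits_per_pixel // 8


cga_size = raw_size_bytes(320, 200, 2)  # 4 pixels per byte
ega_size = raw_size_bytes(320, 200, 4)  # 2 pixels per byte (assumed 16 colors)
print(cga_size, ega_size)  # 16000 32000
```

With no compression, doubling the bits per pixel doubles the file size, which fits the "almost exactly twice" observation.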

The first bytes of the file in binary were:

00110000 00000000 11000000 11000000

Which, if you count in pairs, matches the pixels in the first row of the title screen:

Title screen upper-left corner zoom

With that in mind, I changed my code to keep reading one byte at a time, but operate with pairs of bits.
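As a worked example, here is a minimal sketch that decodes the four bytes quoted above into 2-bit pixel values, leftmost pair first. `unpack_2bpp` is a hypothetical name chosen for illustration:

```python
# The four bytes quoted above, written as binary literals.
first_bytes = bytes([0b00110000, 0b00000000, 0b11000000, 0b11000000])


def unpack_2bpp(data):
    """Split every byte into four 2-bit values, leftmost pair first."""
    pixels = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            pixels.append((byte >> shift) & 0b11)
    return pixels


pixels = unpack_2bpp(first_bytes)
print(pixels)
# [0, 3, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 3, 0, 0, 0]
```

Each 3 is a white pixel and each 0 a black one, using the color table listed earlier.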

Early experiment of splitting a byte into 4 pixels / 2bpp. Color wasn't extracted properly, but I was getting somewhere:

WIP screenshot #1

Maybe I was doing something wrong (although it looked straightforward), so I decided to check how the vigasoco project was reading the Abadia del Crimen data and placing pixels on the screen.

Copy & pasting the pixel unpacking method improved things but didn't fix the bug:

WIP screenshot #2

So I went back to my code and tried with Abadia del Crimen:

WIP screenshot #3

Something was surely wrong, so in the end I wrote the simplest boolean algebra logic I could, to be 100% sure I was masking, shifting and grabbing the correct pixels:

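# Bit masks for each 2-bit pixel inside a byte:
# 00000011 1 + 2 = 3
# 00001100 8 + 4 = 12
# 00110000 16 + 32 = 48
# 11000000 128 + 64 = 192
# data: one byte read from the file; pixel: position 0-3, leftmost first
def get_pixel(data, pixel):
    if pixel == 0:
        return (data & 192) >> 6
    elif pixel == 1:
        return (data & 48) >> 4
    elif pixel == 2:
        return (data & 12) >> 2
    else:
        return data & 3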

And voilà! It worked and was reading every pixel right... or maybe not:

WIP screenshot #4

Uhm... a half-size image with weird cuts below. This looked like some kind of interlacing, so I assumed I was reading all the even rows first and then all the odd ones:

WIP screenshot #5

Almost there, but there were black lines. Going back to the hex PIC data, I saw that between the last "even row" and the first "odd row" there were 192 bytes/768 pixels of zeroes. I still have to find out why those padding zeroes are there, but I simply skipped them and tried again:

Mutan Zone

Finally! A pixel-perfect PNG dump of the game's title screen.
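A minimal sketch of that final layout, assuming a 320x200 image at 2 bits per pixel (80 bytes per row): first the 100 even rows, then 192 padding bytes, then the 100 odd rows. The names are hypothetical and this is not the actual script. As a side note, 8000 + 192 = 8192 (0x2000), which is the offset at which CGA video memory places the odd scanlines, so the file may simply mirror video memory; that is a guess, not something confirmed here.

```python
WIDTH, HEIGHT = 320, 200
BYTES_PER_ROW = WIDTH // 4                   # 4 pixels per byte at 2bpp
FIELD_SIZE = BYTES_PER_ROW * (HEIGHT // 2)   # 8000 bytes of even (or odd) rows
PADDING = 192                                # zero bytes between both halves


def deinterlace(data):
    """Return the image rows (as bytes) in top-to-bottom order."""
    even = data[:FIELD_SIZE]
    odd_start = FIELD_SIZE + PADDING
    odd = data[odd_start:odd_start + FIELD_SIZE]
    rows = []
    for i in range(HEIGHT // 2):
        start = i * BYTES_PER_ROW
        rows.append(even[start:start + BYTES_PER_ROW])  # row 2*i
        rows.append(odd[start:start + BYTES_PER_ROW])   # row 2*i + 1
    return rows
```

Each returned row still holds packed bytes, so it would need to be split into 2-bit pixels before being written to a PNG.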

I've uploaded all the code to my GitHub so, despite it being small, better go there if you wish to see all the details. It is a simple Python script that reads bytes, operates on them and saves the data (using PIL) into a PNG, but I want to keep it at hand for the future in case I need to do bitwise operations, bit representations of integers and the like again.

Other results

Considering that I had almost got the Abadia del Crimen title screen earlier, I decided to try a few more PIC files from other Opera Soft games I knew... and as long as they are CGA-based, it works perfectly:

Abadia del Crimen

Corsarios

Livingstone Supongo 2

Sol Negro

The future

I've been able to convert a PIC file, but it seems game sprites and backgrounds don't live there. I've already done some initial peeking, and the OVL files (present in most DOS games from Opera Soft) contain at least a COM executable (newer games like Sol Negro include a DOS EXE binary, with its identifiable MZ header), so each level runs independently as a separate binary... but COM files have no clear separation between code instructions and data, so it'll require more work.

This is also why the Python script is so specific to title screens. Until I figure out where and how sprites are stored, it doesn't make sense to make the extractor more generic (and maybe I'll even duplicate the code and keep the title one intact).

Update: Added .PIC research (interesting but not critical) and corrected typo.


MVP: Minimum Viable Product

Walking back home from work, I recently listened to a Rework podcast episode titled You need less than you think, which reminded me of a few technical examples that also fit that topic.

Agility

When I started studying computer science, we were taught only the classic and dreaded waterfall model. Much has changed since then, and I had the luck of learning TDD and XP in 2004 and sometimes applying them to my work, but we're midway through 2018 and still too many startups and tech companies in general struggle to generate value fast enough to really be considered agile (in my opinion). I've experienced both small and mid-sized tech teams struggling to deliver products on time, but I've also experienced the opposite: very small teams (like 3 engineers at Minijuegos.com, counting the CTO) and workforces of 100+ engineers (like Tuenti when it was a social network), where we were able to deliver what I consider high-quality, complex products in very short time periods.

But those two examples I mentioned are outliers. Usually you either get into big, long projects, or instead accomplish tiny projects without an ambitious scope. At a previous job, we had a real need to be agile: We were an early stage startup, with some initial funding but the need to build the product and start generating money [1], a small team (we peaked at ~10 engineers IIRC), and some "critical" requirements like being "highly scalable" from day 1.

I had the luck of working with eferro, who tries to be agile, pragmatic and iterative when building products, and he introduced us to the concept of building MVPs (Minimum Viable Products). I'm not going to go into detail about how it works, but instead use a classic picture that perfectly represents it:

Visual explanation of how to build an MVP

We tried to be extremely pragmatic whenever possible, and even iterated on some existing services we had built in order to simplify them, as we moved towards a fully automated platform (from infrastructure to every service, everything was triggered through code, with no intermediate manual steps). I'm going to talk about three scenarios where we built MVPs that some people would consider almost "hacks" but that worked really well in our case.

#1: You don't always need users

When we got our first paying clients, we not only billed them manually, but also didn't monitor usage of the platform per user, because we didn't actually have users.

We reached the point where the platform could work as an API-based self-service, so product came to us with the need of "adding users so we can bill them". But we still had lots of critical pieces to build (like an actual website!), so we discussed a bit what was needed and arrived at the real need: "we need a way to track usage of the platform per customer to charge them appropriately". Look at the tiny critical detail: if we were able to somehow track that usage without actual users, as long as we provided accurate metrics, it would be fine.

We had the following data pieces:

  • we knew the email of the customer as we were sending them a notification when the job was done
  • each customer had an API key (stored in a simple Python config dictionary with an API key -> email mapping)
  • we were gathering metrics, just per job instead of per user
  • our AWS SQS message-based architecture allowed us to easily add any new field to the messages without breaking existing services (e.g. add a correlation id that travels with and marks the full journey of a job)

What we decided to do was, at the API, build a SHA hash of the API key for each job as our "user id", add it to the messages, and implement a quick & simple CSV export job that would be manually triggered and would return a list of all the job metrics for a given user email and start-end datetime range.
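A minimal sketch of the idea, under stated assumptions: the post only says "SHA hash", so SHA-256 is a guess, and `user_id_from_api_key`, `tag_message` and the `user_id` field name are hypothetical.

```python
import hashlib


def user_id_from_api_key(api_key):
    """Derive a stable pseudo user id from an API key (SHA-256 assumed)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def tag_message(message, api_key):
    """Return a copy of a job message with the derived user id added."""
    tagged = dict(message)
    tagged["user_id"] = user_id_from_api_key(api_key)
    return tagged
```

Because the hash is deterministic, every job sent with the same API key carries the same id, which is enough to group metrics per customer without a user service.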

This approach allowed us to keep building other pieces for a few months, until we really had to add a user service to the platform.

#2: You don't always need a database

The platform we were building allowed customizing some parameters based on templates. Those templates were displayed on a small website, like a customer-facing catalog, and could also be used to run single-task jobs as demos.

  • Some data was stored in PostgreSQL, while other data was read from AWS S3. You always had to query the DB just to display a few items, but you also always had to fetch metadata files plus the actual template data.
  • Having to work with a Python ORM (SQLAlchemy in this case) to perform such trivial queries (we didn't even have search) was overkill
  • We had sample videos showcasing some templates, which were created by the template builders but needed to manually be resized (and weren't optimized)

None of these alone would have been a deciding factor to rewrite the internals of the service, but combined they made this apparently trivial system a huge source of pain for our "internal users" (the template makers), as they would make changes that were not reflected on the site and had to do lots of trial and error and manual corrections.

We also had fewer than a hundred templates, with not so many variants available, so why mess with a DB for a few hundred "rows" of total data?

What we did was:

  • Revisit all the S3 policies to ensure consistent metadata + data files. Either you got the newest version "of everything", or would get the previous version when calling the service from jobs.
  • Create a script that, when (manually) run, would reencode the chosen FullHD sample video at a configured web resolution and optimize it a bit (mostly remove audio track and reduce some frames)
  • Remove the database, using a JSON file "per table"

S3 scales insanely well, so we got rid of the ORM and of having to set up and use any database engine... And the code got really simple, up to the point that data reading methods became mere "dump this JSON file's contents to the output" or "load this JSON and dump item X".
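To give a feel for how small those reading methods can get, here is a sketch that uses local files as a stand-in for S3 objects. The function names and the assumption that each "table" file is a JSON object keyed by item id are hypothetical.

```python
import json


def load_table(path):
    """Load a whole JSON "table" file (assumed: an object keyed by item id)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def get_item(path, item_id):
    """Return a single item from the table, or None if it is missing."""
    return load_table(path).get(item_id)
```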

Local development also got faster, now being so easy to understand, test and extend.

Later this demo website was removed and a template service was developed, both to serve the main API and the would-be self-service webpage. I proposed for it to be also DB-less but the service owner decided to build and keep it relational just in case.

I like relational databases and think they are quite useful in many, many scenarios, but I've also realized that sometimes, if you can rely on something that "scales well", maybe there's a simpler solution removing the need of adding a DB, at least for the first MVPs.

#3: You don't always need a relational database (nor a NoSQL one)

This was an evolution of the second example. We had a distributed job coordinator, a single process that had to read and manage state of thousands of messages whenever the platform had work to do. The state was stored in a PostgreSQL database, and while the DB wasn't the main problem, it was also quite overkill for a single state storage.

We were also doing on-call, so after suffering a fun night with the service crashing multiple times, we decided to rewrite it, and came up with a much simpler solution: keep a simple structure of general job data and individual job items/tasks, wait until every item has either finished or failed X times (we had retries for some tasks), and just persist everything into AWS S3 in plain JSON files (taking care of its by-default eventually consistent I/O).

Something like this:

job {
    id: "...",
    ...
    tasks: {
        "task-id-1": null,
        "task-id-2": "ok",
        "task-id-3": "ko",
        ...
        "task-id-N": null,
    }
}

This allows you to both know the status of the general job (if count of null > 0 still ongoing) and know if a finished job had errors (if count of ko > 0 has errors), without keeping an explicit general state. I've come to dread state machines when you can use actual data to calculate on the fly that same state.
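A minimal sketch of computing that state on the fly, assuming the job is loaded as a Python dict where JSON `null` becomes `None`. The function name and the returned labels are hypothetical:

```python
def job_status(job):
    """Derive the overall job state from its task results."""
    results = list(job["tasks"].values())
    if any(result is None for result in results):
        return "ongoing"               # at least one task has no result yet
    if any(result == "ko" for result in results):
        return "finished_with_errors"  # all done, but some task failed
    return "finished_ok"


job = {"id": "example", "tasks": {"task-1": None, "task-2": "ok", "task-3": "ko"}}
print(job_status(job))  # ongoing
```

No separate state field needs to be stored or kept in sync; the task results are the single source of truth.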

We could have used Redis to keep the states, but JSON files worked fine, allowed us to TDD our way in from the beginning, and also eased bugfixing a lot, as we could just grab any failing job's "metadata" and replicate locally inside a test exactly the same job.



[1] : Actually we failed to achieve the goal, as the platform itself was working really well... but cool technology without money usually doesn't last.

UPDATE: A colleague from that company pointed out that my memory has issues. I corrected the second example to reflect which service we removed the DB from (the demo website); I later proposed doing the same for the template service when we built it, and had mixed both up.


Course Review: Master The English Verb Tenses (Udemy)

Continuing with my "English workout" plan, another Udemy course I've recently finished is Master The English Verb Tenses.

With 2 hours of video and 1 hour of audio exercises, it serves as a good way to reinforce the English verb tenses. It describes them, including handy timelines to more easily understand when the action took/takes/will take place, and gives some examples, common mistakes/misuses of each tense, and some exercises for you to do.

Small but effective, it met my expectations.


Build a Multi-Arcade with a Raspberry Pi 3 and RetroPie

Part I: The Experiment

I love retro systems, and I've been using emulators for as long as they have existed. The main caveat is that many of those emulators tend to get outdated and stop working on newer operating systems. With Windows, each major version scared me, up to the point that my gaming PC is "frozen" on Windows 7 to avoid losing performance and compatibility. Since I switched to Linux, I've been trying to use emulators, with not so good results. So, when I read about the RetroPie project I felt it could be the solution: a dedicated but small gaming machine. I had a Raspberry Pi 2 not being used for anything so... why not?

I used the RPi 2 as a testbed with a spare 8GB SD card. Just following the official instructions and using the official image was enough, and I didn't have a single issue. I love how it allows you to set up a USB drive to add new content just by creating some folder structure on it, so I can simply place new games there and, after plugging it in, they get installed automatically.

As an alternative, Ars Technica has a detailed DIY guide, although as I mention later, be careful if you play PSX or NeoGeo games, as their setup has no cooling at all and the CPU will heat up and lower its speed.

Everything went smooth and I could play some old console games, so I copied some MAME and NeoGeo romsets... and here I got my only issues:

  • For NeoGeo, read the appropriate section and ensure you have the correct BIOS downloaded
  • For MAME, the documentation is not bad but I didn't read much of it and found out the hard way about the concept of ROMset rebuilding. ROMs are dumped and emulated at a certain version, but as time goes by newer dumps might be achieved, and those are usually not backwards compatible. Combine that with the RPi MAME emulator supporting ROMsets only up to 0.78, and you get the answer to why most games didn't work on the system when they had always worked for me on the PC. So either you rebuild the ROMsets for v0.78, or (the way I went, as I'm lazy) you directly search online for a full 0.78 ROMset to download and then filter.

Part II: Going to production

Emulation issues solved, the experiment was a success, so I decided to buy a Raspberry Pi 3. After Nintendo's Mini-NES, there are many clones to use as Raspberry Pi cases. I didn't plan to buy any, but a friend read my tweet about setting up the RetroPie and gifted me not only a case but an actual pack of a Raspberry Pi 3 + NES-like case + SD card + USB gamepad, so I didn't have to buy anything!

It came with some MAME games preinstalled, but I preferred to do a clean install of the latest RetroPie. After installing games, including the oldie but goldie Gran Turismo 2 for the original Sony PlayStation, I noticed that the game slowed down after a few minutes of playing. Some research, and touching the CPU to confirm it, taught me that when it gets hot, the RPi lowers the CPU speed to avoid thermal issues. So the case might look cool, but it wasn't good for CPU-intensive usage, despite having some heatsinks installed like the following:

Raspberry Heatsinks

After some more reading, it looked like the best solution was to add a fan to cool the CPU:

Raspberry Pi fans

The fans can be set up at two different voltages. As they are so small, and I'm already used to a semi-tower gaming PC, I went directly for the 5V connection.

Fan different speeds diagram

I had originally made additional ventilation holes in the case, hoping they would be enough to let the hot air out, but as that didn't work I had to modify those holes to place and hold the fan:

NES clone RPi case with fan

Everything finally assembled, I played through a few levels of Aliens vs Predator (MAME version, good benchmark), a few races in PSX Gran Turismo 2 arcade mode and the first full level of NeoGeo's Metal Slug. Not a single slowdown.

Part III: Replicating the success

I liked the system so much that I actually bought another Raspberry Pi, a 3 B+ model, with an official case. This is the one that I'll keep at home, to squeeze out the extra 200MHz and better network speed it brings. I initially thought about just leaving the case open, but as the fans came in a 2-pack, I decided to install the other one on it too:

Installing the fan into an official RPi case

And after some patience and plastic cutting, this is how it looks fully assembled:

Official RPi case with fan

As you might notice the hole is not centered. This is on purpose to place the fan exactly above the CPU. Less aesthetic but I prefer pragmatism.

Even with the fan, the official case doesn't have any ventilation hole/grid, so when playing I open one of the side lids (the one without connectors) so the air flows out from it instead of being kept inside.

The only thing missing is setting up the second gamepad for the system, as I have a handy original SNES to USB adapter that allows plugging in two pads, but joystick configuration is so easy that I don't expect any trouble.

Closing thoughts

I'll use the NES-like system as a "portable arcade", for vacations, events, etc. as I just need an HDMI plug and I have an extra USB gamepad.

I haven't tested anything more powerful than the Nintendo 64 emulator, but people seem to be using it to play even PSP games, so this 3.5 generation of the Raspberry Pi is already quite powerful, despite not having a decent GPU. RetroPie 4.4 feels quite stable and my plan is to stick with the system for as long as possible. My only wish would be support for more MAME ROMs, but even the current compatibility is fine by me.

I only tried the Raspberry Pi 2 with few games and then gifted it to my sister, so I really don't know the performance cap it has compared with a Pi 3 or 3B+.

Appendix

Other useful software I've installed will get listed here.

  • Kodi: An amazing media player from the creators of XBMC, highly configurable and supporting plugins (although I haven't tried them).

Changelog

Update #1: Added Appendix section with Kodi software.

Update #2: Added a link to an Ars Technica guide which contains very detailed installation instructions.


Course Review: Essential Business English (Udemy)

Last Christmas I decided to try to improve my English, so, work lessons aside, I enrolled in a university course (B2 European level-equivalent), and now I'm also taking some English-related Udemy courses. This is the review of the first one I've fully studied (handy also for my upcoming university exam).

Note: I'm not going to judge the price of courses because we now have at work Udemy for Business accounts so I get access to most of them "for free" and thus I cannot be totally objective regarding price/value. I prefer to focus on a small review of the quality and contents.

So, the first course I've finished is Essential Business English, which contains 2 hours of video with work-related conversations: how to hold a meeting, discuss, interrupt, make business calls, schedule, complain about delivery problems, introduce new employees... It also includes some exercises, both inside the videos (showing you the correct answer after a while) and in companion PDFs, online crosswords, quizzes and other resources. The videos contain funny but well-drawn cartoons, and the pronunciation and audio quality are excellent.

It feels a bit short but well organized and easy to follow and study.

