I'm having so much fun trying to reverse engineer and extract the graphics, that after being able to export the title screens from Mutan Zone Opera Soft game, now I want to get the sprites.
This is far from easy as they are inside the binaries, and the game has 3 of them:
MUTAN_Z.EXE
looks like the only one, but this tiny 2KB executable just does some bootstrapping: Allocates memory, sets the video mode, loads corresponding COM code in memory (more about this in next item) and executes it. It also goes back to text mode and displays error messages if something failed during the bootstrapping.MUTAN_Z1.OVL
Looks like some data file, but actually if you rename it to .COM it shows the lower menu (but doesn't do anything else). It is the actual first level of the game as its own executable.MUTAN_Z2.OVL
Similar to previous one, contains a .COM with the second game level, in this case perfectly booting at least the maze-mini game before the main level.
I already had renamed the .OVL files after checking their contents and other older Opera Soft games (which were always self-contained .COM files), but decided to went on and dissassemble the .EXE file to confirm it.
First, I installed Reko under Windows 7 using one of my Virtual Machines. It generates a decent Assembler code (comments are mine):
l0800_01EF:
mov ax,0013 ;; Set Video Mode 0x13
int 10 ;; Video Services
mov si,04A7
jmp 026A
But to my surprise, it also generates some rough C code automatically translating for you some of the interrupt calls:
else if (ds->b061A != 0x03)
{
bios_video_set_mode(0x13);
__cli();
bios_video_set_block_of_DAC_registers(0x00, 0x10, cs, 1191);
__sti();
}
As the full assembler is just around 400 lines of code (including some data byte blocks), I ended up refreshing some of my assembler and commenting all of the interrupt calls and some code fragments. It is great to see that now you can find even at the Wikipedia lists of MSDOS API/interrupts (int 21h) and BIOS interrupts. I mostly confirmed that the .COM contents are loaded in memory and apparently then executed (have to confirm but probably the program instruction counter etc. are just updated to there, at least is what I'd do).
After this, I decided to try with the first level's .COM file, but Reko didn't liked it, so I searched for alternatives. In the end, I discovered that the acclaimed IDA Pro has not only a freeware version for non-commercial usage, but also that it has Linux binaries and, even better, that any version can understand most MS-DOS and Windows executable formats!
Note the visual minimap of the contents in the upper part, with lots of blue and brown (instructions and similar), but also quite some grey (byte data). Those smelled interesting, so I extracted (as "raw bytes") the blocks to separate files.
I grabbed my existing PIC exporter and run it with the files... and no visible patterns. Then I thought... why the heck would anybody interlace sprite rows (as with the title screen), too much complication for tiny 8x8 or 8x16 blocks... so I changed the code to read sequentially all rows, also added some tiny changes to generate N files when read content surpasses the specified "sprite size", and checked again. Some of the data blocks were still not fully recognizable, but I'm using the following MS-DOS screenshot as the guide to hunt for sprites and I detected some color patterns:
I ran it through all segments... and one of them had displaced but visible numbers from the menu and letters from the intro text:
Checking the pixels this is not an interlacing problem, but a mere offset/displacement issue. Adding to the code the option to skip the first X pixels (remember from my previous post, 1 byte == 4 CGA 2 bit pixels), I fixed it:
I'm not an expert in pixel art, but checking other files coudn't recognize any fragments of enemies, scenery or the player... but I did saw that one block had half glibberish half clearly recognizable the game frame/menu rightmost part:
So, some pixels on the same data block are clearly 8x8, while others are still broken because are not so small. But at least I'm on the right path regarding displaying data. My current bet is that the code contains which offsets contain 8x8 sprites, which 8x16 (or whatever size the entities have) and which the menu parts. My hypothesis about the game menu/frame is that the upper part (and maybe lower too) might be 320 pixels width, read like the title screen. I'll probably configure segments of the byte data to be treated with different sizes, so I can handle 8x8, 8x16 and 320xWHATEVER. Again is what I'd do, not limit myself to handle only 8x8 sprites if I can just know which ones are bigger (and work with a finite group of available sizes).
I've uploaded the code of the yet unfinished sprite exporter to my Github. It really doesn't yet export individual sprites with the constant values I've commited but changing both sizes to 8 will output lots of pixel garbage, as many sprites aren't aligned.
Next steps are trying to guess the other sprites, or probably just checking the dissassembled .COM code, where it reads data from the data blocks and reverse engineer how it does it. I also read that DOSBOX has "debug enabled" compilation flags that allow runtime debugging so I might give it a try to experiment with more disassemblers.
NOTE: If I end up succeeding and extract the sprites, my plan is to build a small Python script that, given the MS-DOS game data files, will extract the byte data blocks, so that afterwards running the sprites script will generate the PNG. Expecting anybody to do a manual extraction of byte blocks is quite time-consuming, but I don't want to include the original game data files (they are trivial to find online).
Tags: Game Dev