Doom Classic with 24bpp lighting

It’s been a while since I pulled an all-night coding binge, but last night that counter was reset to zero. The fruits of that labor are a modestly improved look to the Doom Classic modes under the Doom 3 BFG edition which was recently open-sourced.

Here’s a before/after screenshot pair demonstrating the improved colors for lighting (click for full view):

It takes a keen eye to spot some differences, but the effect should be apparent overall while playing the game for an extended period of time, especially while visiting darker areas in-game. Take a close look at the entryway on the left side and also at the brighter brown wall on the right side.

The Doom Classic modes under BFG are simply ports of the original Doom engine, complete with the old software renderer. It seems they patched up the renderer to scale the original resolution of 320×200 up by a factor of 3x to 960×600. The main game engine (doom3bfg.exe) simply takes the 8bpp palettized framebuffer rendered each frame from the DoomClassic library and updates a texture with its contents, to be presented to the user in the main game window.

While I was perusing the code, I found, by happenstance, this typedef byte lighttable_t; line with these comments above it:

// This could be wider for >8 bit display.
// Indeed, true color support is possible
// precalculating 24bpp lightmap/colormap LUT.
// from darkening PLAYPAL to all black.
// Could even use more than 32 levels.
typedef byte lighttable_t;

This looks like a conversation between developers via code comments (with my own edits to fix spelling), but the way they did the import to git caused all authoring history to be lost, probably on purpose, so we don’t know who’s talking to whom here.

Regardless, what they’re saying here is essentially that lighttable_t, which is used to store palette index lookups based on light levels, could be made to be larger (e.g. 32 bits) to support true color (24bpp with no alpha), with a few additional code changes to generate said light maps and look up the raw RGB colors instead.

The way the engine works is that there is a 256 color palette stored in the main IWAD file in the PLAYPAL lump. All textures and sprites in the game data refer to colors in this main palette. However, there is lighting to be taken into consideration. The engine has to darken the colors referred to in textures and sprites according to the surrounding light level and z-distance. This is done with a light map, from the COLORMAP lump, which is simply an optimized palette lookup table for 32 distinct light levels. Each light level has a 256-entry lookup table which tells it which color from the 256 color palette best matches the original color darkened to the light level. Of course it won’t be perfect since there are only 256 colors able to be displayed on the screen at one time, so you’ll get some color shifting effects and other quantization effects here. But overall, the result is rather impressive for 1994-era technology!

What I’ve done is (mostly) removed the need for the COLORMAP lump and gone straight to calculating the raw RGB colors from the PLAYPAL palette based on the light levels. This way you get direct 24bpp color from the engine. Of course, our colors are still limited to what’s available in the original palette so the source material hasn’t changed, only our rendering is improved.

The light levels available are from 0 to NUMCOLORMAPS-1, where NUMCOLORMAPS is 32. According to some comments in the code, light level 0 is full brightness and level 31 is full darkness. I was able to easily increase NUMCOLORMAPS from 32 up to 64, giving more distinct colors and a smoother lighting look. I was not able to increase NUMLIGHTLEVELS though; there’s something crazy going on with the code related to that constant.

The part that made this all (relatively) easy was that the neo/framework/common_frame.cpp code which projects the 8bpp screen to the 32bpp texture is very simple and does the palette lookup itself. I left this code mostly the same, except I changed the screens array to store larger integers instead of bytes.

I extended the XColorMap array from 256 entries to 256 * NUMCOLORMAPS entries which essentially makes it a larger palette of 16,384 colors instead of just 256 colors. I modified the I_SetPalette method to precalculate all the 16,384 colors based on the original 256 colors.

The rest of the work involved making sure that all the rendering code could handle a wider screen element integer size than byte. There were lots of hard-coded assumptions that the element size would be a byte, apparent in several memcpy and memset calls.

I did encounter some problems that didn’t allow me to fully skip loading the COLORMAP lump.

The primary problem was with the fuzz effect for spectres and your gun (and also other invisible players in network mode). The problem is that the effect uses a specific colormap (#6) from the COLORMAP lump to “dither” the onscreen colors, which produces an effect that isn’t easy to reproduce with a simple calculation. After failing twice or thrice to reproduce this effect, I finally resorted to just bringing back the original COLORMAP and doing a little bit twiddling on the colormapindex_t values read from the screen to keep the light levels consistent.

The other problem was the inverted color effect (only used when the player picks up an invulnerability sphere). I just had to import the colormap at index 32 from the lump to get this to work and also update the INVERSECOLORMAP to be NUMCOLORMAPS since it’s now 64 instead of 32. Just a little table translation there.

There appear to be two extra colormaps in the lump that I’ve not accounted for so I’m just ignoring them. The game plays and looks great now. Admittedly, the red- and green-tint effects don’t look as good as they used to for some reason. I’ll have to check that out. The effect comes across, but it gets too dark further in the distance.

How I fixed the crash in Doom 3 BFG Edition

Merely 10 hours ago, id Software released the GPL source code to Doom 3 BFG Edition. Unfortunately, when I built the game with VS2012 Premium, the Doom Classic modes crash (both Doom 1 and 2) instantly. Here is the small tale of how I fixed that bug.

The obvious thing to do was to fire up the game in Debug mode and see how far I get. The debugger (under default configuration) wasn’t giving me much when the code bombed out due to an unhandled Access Violation Win32 exception. The key was to force the debugger to break when the access violation exception occurs in the first place rather than letting it pass unhandled. VS2012 gives you a check-box labeled “Break when this exception type is thrown” when the unhandled exception is caught. Turn this on and restart the game and try to start up Doom 1 or 2 from the main menu.

Now we get a first-chance exception occurring in r_things.cpp line 196:

intname = (*int *)namelist[i];

A quick check to the Locals debugger window shows that i is 138. The access violation exception is thrown by the OS when the process tries to read memory at namelist[138]. Let’s try reading from namelist[137] using the Watch window to see if index 137 is safe. Okay, everything looks fine there at index 137. It’s just at 138 where it bombs out. Let’s remember this number.

Now let’s step backwards a bit and try to find our place in the code. Where did this namelist pointer originate from? Jumping back to P_Init in the call stack shows us that P_InitSprites was called with sprnames and P_InitSprites hands that off to P_InitSpriteDefs unchanged. Let’s take a look at this sprnames in info.cpp

const char * const sprnames[NUMSPRITES] = { "TROO","SHTG",**...<snip>...**,"TLMP","TLP2" };

That’s it? No NULL terminator there? And there’s this constant array size specifier there: NUMSPRITES. Visual Studio tells me that its value is 138. That sounds familiar…

Let’s go back and take a look at that function where our first access violation occurred to see why it’s trying to read past the bounds of the hard-coded array (whose length is 138 elements).

We can see that the size of namelist (assigned to ::g->numsprites) is calculated to be longer than it should because there is no NULL terminator present. That causes the loop below it to try to access memory beyond what’s allowed. Here’s the simple counting code:

// line 173 in p_thing.cpp:  
check = namelist;  
while (*check != NULL)  
    check++;

Perhaps the original developer assumed that the const memory section would be zeroed out and the counting while-loop would just luckily run into an extra zero that just so happened to be found just past the bounds of the array? I can’t see why this is a safe assumption to make under any context whatsoever. Perhaps a random happy coincidence of memory layout and padding made this work in VS2010?

Based on this analysis, it seems obvious to me that these methods should be passing around the array’s known count (NUMSPRITES) instead of trying to calculate it dynamically by scanning for NULL terminators. A quick search through the code shows me that these functions are only used once from P_Init so this should be a safe change to make.

This particular instance of this class of bug makes me wonder what other instances of this class of bug are lying around the code elsewhere. I think I got extremely lucky in this instance and could pinpoint a root cause because the data was hard-coded.

I’m going out on a limb here, but it seems that VS2012 added some extra protections to make sure that access violations were thrown for access beyond the bounds of statically-allocated memory regions, which makes me doubly lucky to find the bug. I’m not sure exactly how they’ve done that, not being too familiar with the Windows memory management APIs, but I’m sure there are all sorts of caveats and gotchas with protecting fixed-size memory regions (page alignment issues, etc.). I wonder if this bug would reproduce in VS2010, or any other compiler for that matter…

The pull request I’ve submitted just appends the NULL terminator to the hard-coded array. From here, the code works great and Doom 1 and 2 start up just fine.

XP VMs with VirtualBox

For web developers, if you want to test your site on IE7, go download the free XP image from Microsoft here. Once it’s fully set-up, install IE7 on it; the image comes with the installer on the desktop. Don’t bother with the Vista image unless you need to support something OS-specific, which if you do – you should just stop what you’re doing and severely rethink your web dev stragedy.

For use with Oracle VirtualBox, you’ll have issues with networking, which will prevent you from Activating the VM. Follow these steps to resolve the networking situation:

  1. Download the XP image, obviously: Windows_XP_IE6.exe
  2. Run the EXE to extract the VHD file (ignore all other files) to somewhere you like
  3. Fire up VirtualBox
  4. Create a new virtual machine using the existing VHD file you just extracted, obvious settings apply
  5. Go download the Intel PRO driver at http://downloadmirror.intel.com/8659/eng/PRO2KXP.exe
  6. Place that EXE into a new ISO image using whatever ISO tools you wish (cygwin has mkisofs)
  7. Mount the ISO you created on FIRST boot of the VM and install the driver as immediately as you can. This will help you be able to Activate the VM over the Internet.
  8. Open the mounted ISO from within the VM and run the driver EXE installer.
  9. Reboot should be safe at this point.

NOTE: If you don’t Activate after the second boot, your VM is hosed and you have to start from scratch again (just run the XP EXE and replace the VHD file). I did this at least 4 times to try to find the right procedure.

After you finally activate your VM, you should be fine to install IE7. Don’t bother doing that before otherwise you’ll just waste your time because the VM won’t let you log in after three boots without being activated.

Now you’ll probably want some sort of decent JavaScript debugger. Well, I’ve got some good news and some bad.

The good news is you can get a basic JavaScript stack trace when an exception is thrown but only if you install Microsoft Script Debugger. The bad news is that this tool flat-out sucks and you don’t have many other options. If you know of some, please let me know.