Thoughts on Go 1.1

I’d like to share a few thoughts I have about the Go programming language after implementing my very first and currently only project in it. This may be a bit premature since I don’t have much experience with it, so if you have some advice to give or some justifications to make then please comment back. I’m always eager to learn new things!

For future readers, it should be known that at the time of this writing (2013-05-22), Go 1.1 was just recently released, so all of these observations are specific to that version and not to any newer version that obviously doesn’t exist yet.

Fair warning: there are some strong opinions expressed here. I make no apology for having strong opinions, but perhaps the tone in which those opinions are expressed might be offensive and I will preemptively apologize for that. It’s hard for me to decouple the passion from the tone.

Language features:

First off, let’s address the biggest elephants in the room:

  1. usage of nil instead of the much more common null to represent the lack of a value for a reference type
  2. non-nullable strings
  3. import with an unused package is a compiler error
  4. identifier case determines package exposure

I don’t think that nil and null in terms of reference values (or the absence of such) are two different concepts here so there’s really no reason that I can see for going with nil over null. It seems contrarian in nature. I’ll just dispense with the nillity and say null from now on and you should know what I mean.

Strings in the Go language act like reference types, and since all other reference types are nullable, why not strings? The idea that the empty string is equivalent to the null string is utter nonsense. Anyone who preaches or practices this has no appreciation for the real expressive value of nullability or optionality. Having a way to represent a missing value as opposed to an empty value (or zero value) is a good thing.

Now, if strings were non-nullable AND there were a more general optionality feature of the type system to make any non-nullable type into a nullable one, THEN that would be nice. In that case, the nullability of a type would be decoupled from the type itself and I would agree then that string should be non-nullable, like every other basic type should be. I’ve yet to see this kind of clean type system design in the family of curly-brace languages. An example syntax off the top of my head would be string? (nullable string) vs. string (non-nullable default string) and int? vs. int and bool? vs. bool, etc. You see where I’m going.

The most popular complaint that I’ve seen is that all imported packages must be used or you get a compiler error. This compiler error is just downright stupid. I see the intention, and I can kinda get why this was done. But the developers chose to stick to their guns and suggest workarounds for the obvious deficiency, and this is where things get worse. The suggested workaround is to define a dummy variable in your code using some exported member of the package. This workaround is a worse code smell than the original problem of having an “unclean” import section! What were they thinking?! Nonsense. Give me a compiler option to turn that stupidity off at the very least. I should be the one who decides whether an import list should be exact or not, not my compiler nor its over-zealous authors. We’ll revisit this a little bit later in a dumb little narrative.
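For reference, the dummy-variable workaround looks roughly like this. It’s only a sketch: the choice of the time package and the status helper are placeholders of mine, picked just to have something to compile.

```go
package main

import (
	"fmt"
	"time" // nothing below calls into this package yet
)

// Assigning any exported member to the blank identifier counts as a "use",
// so the compiler stops rejecting the otherwise unused time import.
var _ = time.RFC3339

// status is a hypothetical helper that exists only so the program does something.
func status() string {
	return "compiled with a placeholder use of time"
}

func main() {
	fmt.Println(status())
}
```

Delete the var _ line and the build fails again with an unused-import error, which is exactly the smell I’m complaining about.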

Riding on the package import error’s heels is the requirement that public members of packages must start with an uppercase character. Character case should not decide such an important aspect and also somewhat volatile fact of a package’s member. During development you might start out with everything private and then maybe wish to expose things later, or even vice versa. Having to change the exposure of a package member will mean having to rename all instances of its usage. What a needless pain. It also makes the export list of the package less discoverable. An export clause at the top of the package file would do fine and serve as better documentation.

There are other issues with forced character casing that arise in marshalling of data to JSON and XML, for instance. Granted there are “tags” that one can apply to struct members in order to provide marshalling hints but the simple fact that you can’t cleanly represent your struct members as close to how you wish to represent the marshalled data is a shame.
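Here’s roughly what the tag approach looks like in practice. This is a minimal sketch; fileEntry and its fields are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fileEntry is a hypothetical struct used only for illustration.
type fileEntry struct {
	Name    string `json:"name"`    // exported; the tag supplies the lowercase key
	SizeKB  int    `json:"size_kb"` // exported; the tag supplies a different key
	comment string // unexported, so encoding/json skips it entirely
}

// encode marshals one entry and returns the JSON text.
func encode(e fileEntry) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

func main() {
	out, err := encode(fileEntry{Name: "a.txt", SizeKB: 4, comment: "hidden"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The field names have to stay uppercase for the encoder to see them at all, so the names you actually want in the output end up living in the tags.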

Now that the big elephants are out of the way, the rest of the language is more or less competent. The only other major complaint at this point would be the lack of generics. You can’t really cleanly bolt generics onto an already-released language. C# and Java both learned that lesson the hard way. It really has to be baked in from the start. That is, of course, unless you want to just cut a swath of breaking changes in with version 2.0 of your language to get generics in. I guess it depends on the boldness of the language development team. I personally would be fine with breaking changes if they introduced a much more powerful feature that took out a lot of warts and inconsistencies.

There is a bit of silliness that arises from the consequences of how the lexer automatically inserts the semicolons you’re allowed to leave out. For instance, if you separate out a method call expression onto multiple lines where each line is a parameter expression terminated by a comma, then the last parameter line must also terminate with a comma even if the very last line contains the closing paren of the method call expression. Perhaps an example will help:

method(
    param1,
    param2,  // <- this comma is **required**
)

This isn’t a huge deal, but it does sorta make things look messy. Now, I’m all for acceptable usage of extra trailing commas in things like list initializers because they’re useful there, but for a standard method call expression that doesn’t have a variable number of parameters it’s kind of misleading. Your eye parses the last param line expecting another one and gets misdirected to the ending paren unexpectedly. Where’d the last-last param go? Oh, there isn’t one? Hm, okay. Weird.

Don’t forget that this extra comma is only required IF you format your code in this style. Obvious response is “well don’t format it that way”. My obvious response to that would be “Screw you. I’ll format my code how I think my code should be formatted and how I want to read it. Your idiotic lexer hacks to elide semicolons are getting in my way.” After coding for 20+ years with semicolons I have no objections to them and it’s just second nature at this point to type them in anyway.

(Side-note: Yes I’m only 30 years old and yes I’ve been coding for 20+ years since I was 8 years old. Deal with it.)

Go lacks a native enum type. Its replacement is the somewhat less obvious combination of a type declaration with a const section that describes a series of constant values outside the namespace of that new named type that should act as the enum’s type name. Here’s an example:

type sortBy int
const (
    sortByName sortBy = iota
    sortByDate
    sortBySize
)

All that code just to effectively create an enum named sortBy that would’ve been this brief in C# or Java or C++:

enum sortBy {
    Name,
    Date,
    Size
}

Of course we could make both of those even more brief, but the comparison here is fair I think. The Go version is needlessly more wordy for this most common of cases. Granted, I like the iota concept. That’s really cool, but there’s no reason that we can’t get iota into a native enum type in Go. Furthermore, the lack of the namespace for the enum members means that they end up at your package level with pseudo-namespace identifiers which makes things get a bit wordy. At that point you might as well just go back to writing C code with ENUMNAME_MACROS_LIKE_THIS to define enum members.

There’s the horrid syntax of map[K]V. This just makes my eyes bleed, but given the present lack of generics and the inability to design anything less ugly I guess I’ll deal with it. I just can’t bring myself to type that in here again, so let’s just move on.

Why is len a built-in global function and not a built-in method on slice/array types? len(slice) could just as easily be slice.Length() but it’s not. Granted, my syntax is longer, but is obviously more consistent in appearance with other method calls.

I do like Go’s slice support, but I think they didn’t take it far enough. They should’ve taken a leaf from Python’s book and implemented negative end values to denote positions from the end of the slice instead of having to compute that offset yourself. The D programming language almost got there with its $ token to represent the length of the slice e.g. a[0 .. $ - 1], but I think I’ll give the bronze to Python here for a[0:-1]. Go has neither, and forces you to a[0 : len(a) - 1].

The simpleton will say, “but what’s wrong with that?” And I will reply, “Fine, then try this package.GetSomething(lots of parameters here)[0 : len(package.GetSomething(lots of parameters here)) - 4].” Did you get lost? Did you recompute something there that you shouldn’t have? Sure you can just pull it out to a separate variable on the line above and refactor the entire expression you just cooked up. Or you could just say package.GetSomething(lots of parameters here)[0 : -4] and you’re done.
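To be fair to the language, here is what the pull-it-out-to-a-variable refactoring looks like, plus a tiny helper that hides the arithmetic. It’s just a sketch: getSomething stands in for package.GetSomething(lots of parameters here), and dropLast is a name I made up.

```go
package main

import "fmt"

// getSomething is a hypothetical stand-in for package.GetSomething(...).
func getSomething() string {
	return "report-2013.txt"
}

// dropLast returns s without its final n bytes, or "" if s is shorter than n.
// n is assumed to be non-negative.
func dropLast(s string, n int) string {
	if n > len(s) {
		return ""
	}
	return s[:len(s)-n]
}

func main() {
	// Option 1: name the result once so the call isn't repeated inside len().
	s := getSomething()
	fmt.Println(s[:len(s)-4])

	// Option 2: hide the arithmetic behind a helper.
	fmt.Println(dropLast(getSomething(), 4))
}
```

Both work, but neither is as direct as a negative end index would be.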

Now if you’re a Go expert and you know something that I don’t about this, then it’s not in the (rather terse) language specs. I checked.

Interfaces:

I think the most confusing part of the language is that interface implementation is entirely implicit and not discoverable at all. At first I thought this would be kind of cool, but unless you’re intimately familiar with all implementation details of all packages, you’re never going to know what interfaces a given type implements. This makes using the standard library a nightmare.

Okay, this method wants a Reader … do I have a Reader here? What is that? Oh geez, now I have to look at the type the library exposed to me to check if it even implements that interface… Oh of course it doesn’t state it obviously anywhere so I have to read their source code or glean that fact by glancing at ALL their exported methods for ALL their types. If my human-eye parser is off by a token or two then whoops! I guessed wrong. Oh, that interface accepts a POINTER to that type but not a copy of it.
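One way to at least pin the fact down in your own code is a compile-time assertion against the interface. Below is a minimal sketch; shoutReader and shout are hypothetical names of mine, and the pointer receiver is there on purpose to show the pointer-versus-copy trap.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// shoutReader is a hypothetical type that upper-cases the ASCII it reads.
type shoutReader struct {
	src io.Reader
}

// Read has a pointer receiver, so only *shoutReader satisfies io.Reader.
func (r *shoutReader) Read(p []byte) (int, error) {
	n, err := r.src.Read(p)
	for i := 0; i < n; i++ {
		if p[i] >= 'a' && p[i] <= 'z' {
			p[i] -= 'a' - 'A'
		}
	}
	return n, err
}

// Compile-time check: the build fails on this line if *shoutReader ever
// stops implementing io.Reader.
var _ io.Reader = (*shoutReader)(nil)

// shout reads all of s through a shoutReader.
func shout(s string) (string, error) {
	b, err := io.ReadAll(&shoutReader{src: strings.NewReader(s)})
	return string(b), err
}

func main() {
	out, err := shout("do i have a reader here?")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Writing var _ io.Reader = shoutReader{} instead would not compile, because the value type does not carry the pointer-receiver method. That only helps for types you own, of course; for library types you’re still stuck reading the docs or the source.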

All this is fine, of course, but Go(d) forbid you have a dirty import list! THE HORROR! How could you not know that you don’t need that time package despite the fact that the os.FileInfo has a ModTime() that gives you back a time.Time that may or may not require you to use the format string constant from the time package!? If you don’t need that format string then you don’t need the time package and you’re a bad developer for importing it as a precaution. Oh wait, now you do need that format string constant? Well, you should’ve imported that time package! What’s wrong with you?

Let’s not forget about the fact that interface{} is the preferred way to represent the any type. Which makes me wonder… WHY NOT JUST ALIAS IT AS any AND BE DONE WITH IT? I don’t want to type interface{} everywhere when I could just as easily type any. Save the pinkies!

I do understand why that is done and it is pretty cool that the language lets you just embed an unnamed type declaration where a type is required (unless that is false which makes this whole justification section moot), but why not just alias that awful syntax to something much simpler and more meaningful? The fact that interface{} is the catch-all interface is cute and all, but I don’t think we need to encode that fact directly in that representation throughout all code.

Standard Library:

The terminology present in the standard library is just foreign and awkward. Let’s take a few examples:

html.EscapeString. Escape? No, we’re ENCODING HTML here, not escaping. HTML has its own encoding. It is not a string literal to have certain characters escaped with escape characters, like a "C \"string\" does with the \\ backslash escape char". HTML is a different language, not an escaped string. Point made? Okay, moving on.

net.Dial. Dial? I haven’t heard “dial” in serious use since the good old days of dialing into BBSes with my 57.6k baud modem (if I was even lucky enough to get that baud rate). “Hello, operator? Can you dial a TCP address for me? My fingers are too fat to mash the keypad with.” Nowadays we just “Connect” to things. Try to keep up.

rune for characters? What? No. No. No. No no no. Why not char LIKE EVERY OTHER LANGUAGE ON THE PLANET? What new value does the term “rune” bring to the table other than to just be obscurantist and contrarian like with your usage of nil? My keyboard here does not carve runes into stone tablets for archaeologists to unearth and decipher 2,000 years from now. My keyboard is for typing characters. Let’s get with the times here.

Then there’s the complete lack of support for null strings in the JSON encoder. Really? You can’t call that a JSON encoder in my book. This means that you have to design your JSON-friendly structs to have interface{} where you really just mean a string that could sometimes be null? Awful.
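A pointer to string is one possible alternative to interface{} here, if you can stomach it: encoding/json writes a nil pointer as null. The sketch below assumes a made-up profile struct; it’s an illustration, not a recommendation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// profile is a hypothetical struct; Nickname may legitimately be missing.
type profile struct {
	Name     string  `json:"name"`
	Nickname *string `json:"nickname"` // a nil pointer is encoded as JSON null
}

// encode marshals one profile and returns the JSON text.
func encode(p profile) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

func main() {
	nick := "Jim"
	people := []profile{
		{Name: "James", Nickname: &nick},
		{Name: "Ann"}, // Nickname left nil
	}
	for _, p := range people {
		out, err := encode(p)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(out)
	}
}
```

It still means taking addresses and nil-checking everywhere you touch the field, which is a long way from a proper string? type.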

Pile on top of that the idiotic uppercase-letter-means-public decision and you get this rule: “The json package only accesses the exported fields of struct types (those that begin with an uppercase letter). Therefore only the exported fields of a struct will be present in the JSON output.” (emphasis added). That’s quoted right from the JSON documentation.

Pros:

Let me point out some of the features that I really enjoy so that we don’t end on a completely negative note here.

First, the runtime is extremely solid. I haven’t had my HTTP server process that I wrote in Go go down at all, even when it’s faced with boneheaded developer mistakes. I think that says a lot. Good on you guys for a rock solid implementation.

The concurrency model is solid. I don’t have much experience with channels yet, but that’s definitely the right direction to go. I am getting the benefits of the concurrency model with http.Serve and friends without even having to explicitly deal with it in my code at all. I like that. Keep it up.

The multi-valued-return functions are awesome and reduce a lot of unnecessary control flow boilerplate. Combined with the pragmatic if statement, there’s definitely power there, e.g. if v, err := pkg.GetSomething(); err != nil { yay! }.
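Spelled out as a runnable sketch, the pattern looks like this. parsePort is a hypothetical helper of mine; strconv.Atoi is just a convenient standard function that returns a value and an error.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper: it returns the port number, or an
// error describing why the input was rejected.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("bad port %q: %v", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

func main() {
	for _, arg := range []string{"8080", "http", "70000"} {
		// port and err are scoped to this if/else and nowhere else.
		if port, err := parsePort(arg); err != nil {
			fmt.Println("error:", err)
		} else {
			fmt.Println("listening on port", port)
		}
	}
}
```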

Raw string literals are just great. No more really needs to be said here. I like that the back-tick character (not rune) was used for these strings. C# did well enough with @"raw string literals" but the double quote is such a common character that you have to double-up on them to escape them, e.g. @"""". I definitely prefer `back-ticks`. I’m much less likely to require a literal back-tick character in my strings than a double quote character.

Implicit typing is wonderful with the := operator.

Multi-valued assignment is simply awesome, e.g. a, b = b, a to implement a simple a, b swap operation. I need to take more advantage of that in my code.

The lack of required parens for the if statement is great but comes at a high cost of requiring that the statement body be surrounded in curly-braces in all cases. This restriction is a bit annoying for simple for-loop if (filterout) continue; cases.

Grouping function parameters by type is awesome, e.g. func Less(i, j int)

The name-type order rule contrary to the more common type-name rule is a welcome change, e.g. i int vs. int i.

I do agree with Go’s explicit error handling strategy via multi-return values and if statements. I’m mostly against exceptions and their ubiquitous use of handling all error cases. From a reliability standpoint, explicit error handling is far easier to deal with than a virtually unbounded set of exceptions that I can’t easily reason about.

Summary:

Once you get past the warts and big issues and find the workarounds, you can get really productive in this little language. I am mostly impressed at this point and want to see bigger and better things. So far, it’s the best option I have for writing reliable network services with, HTTP or otherwise, and having them execute efficiently.

Home Recording Advice

Here’s a bit of home recording advice I just gave to a fellow YouTuber. If you don’t know, I have a YouTube channel where I post home-recorded guitar cover videos here. And if you do know, good for you buddy. Anyways, I thought this was a valuable collection of knowledge I’ve gained about the subject and summarized fairly well. The question posited was about where to spend your money to get the most bang for your buck, so to speak.

Obviously if you want quality you’ll need to spend a bit of cash, but there are places where you can make acceptable trade-offs. Here’s where you ought to spend your money best, in order of importance:

  1. Guitar instrument, guitar strings, and pick (aka plectrum)
  2. Guitar amplifier (if you don’t like the sound coming out of your amplifier, you won’t like what it sounds like on the recording)
  3. Instrument cables (avoid crackly cabling with poor connectors; Planet Waves is generally good)
  4. Studio monitors (I have Yamaha HS80M pair and HS10W subwoofer, subwoofer is probably optional for starting out)
  5. Recording room treatment (a couple of Auralex foam pads stuck to the wall in strategic locations does wonders)
  6. Microphones ($80 – $100 should suit you fine here, just get a Shure SM57; they’re standard workhorses and sound great on guitar speaker cabinets)
  7. Microphone XLR cables
  8. Computer audio interface (I use Roland’s OCTA-CAPTURE ($800) but there are cheaper variants on that same unit with fewer channels. Check out the DUO-CAPTURE EX)

Disclaimer: This is just my list and there’s nothing inherently right or wrong about it. It’s just a representation of what value I’ve learned to place on things in the chain of everything between your fingers executing a musical performance all the way to the final captured performance in your DAW suitable for mixing with.

These investments will all enable you to capture the sound coming out of your guitar amplifier into some computer software, a digital audio workstation. I’d recommend Cakewalk Sonar X2 since that’s what I use and am most familiar with.

What seems to matter the most to the quality of the final mix is actually what you do in the mixing and mastering phases. You can completely ruin a good recording with bad mixing. I know; I’ve done it too many times. Conversely, you can’t make a good mix with a bad recording. “Get it right at the source” should be your mantra, where the source is any one of: your fingers on the guitar, the guitar itself, the amplifier, the speaker, the room the speaker is in, and the microphone at the speaker, including all cabling involved. I guess “the source” is considered to be anything in the physical realm that is not a part of your DAW software that leads to producing the digital track.

I also recommend dialing the amplifier gain down quite a bit while recording. Most great recorded tones are recorded with significantly less gain than you’d expect. The real trick to getting a huge guitar sound is in layering lots of lower gain sounds on top of and next to each other in the mix. Also roll off a lot of low end, like below 100Hz. That’ll clear up the low end quite a bit to let you have some thundering bass and kick drum down there. Otherwise it’ll get all muddied up and you’ll be sad.

Finally, for when you get really into this sort of thing, I’d recommend picking up a re-amp unit. This unit allows you to record the guitar performance first and play it back through an amplifier to be recorded later, when you dial in all your settings just right and like what you hear. This is what the pros do and I’ve only just started doing it myself.

One final tidbit is perhaps Windows OS specific, and that is regarding driver modes for how your DAW connects to your audio interface. In Windows, with a high quality audio interface, you’re likely to have the option for using ASIO which is an extremely low-latency driver mode that lets your DAW talk directly to the audio interface without going through the Windows kernel as an intermediary. This offers huge benefits in terms of latency and CPU utilization in that the system no longer has to do a lot of extra copying and processing just to get your audio data to where it has to get to anyway.

You only want to use the true ASIO offering from your audio interface driver. Don’t use the ASIO4ALL driver because that one’s a big phony. It won’t give you the true low latency of real ASIO that the manufacturer’s driver would. Now, ASIO4ALL is useful as a compatibility layer if the software you’re using only supports ASIO, but don’t expect it to be low latency because it simply cannot be, by design.

Custom Directory Listing with Nginx and Go

For the last few years, I’ve been maintaining a large repository of files and folders on my website here using lighttpd’s default directory index generator. The generator is fine to get the job done, but offers no extra features. I just recently switched to nginx, and its directory index generator (the autoindex directive) is a bit worse than lighttpd’s. This approach worked fine for a while but I really wanted the option to have a custom file ordering for certain directories, e.g. to order by date descending so newer files would automatically float to the top of the file list. So I wrote an HTTP server in Go to do just that, and a little more!

This project was my first real foray into the Go programming language (which I have a few choice opinions about but I’ll express those in another post later). For the most part, the experience has been pleasant, save for a few language warts. The Go runtime is rock solid and my HTTP server has not gone down at all. I keep it running with upstart on my Ubuntu server. If you’re not managing your daemons with upstart, you definitely should start. It’s far easier than the horrible copy/paste/modify workflow of those awful init.d scripts.

What I do is have nginx act as a reverse proxy for /ftp/ requests to my Go HTTP server which is just listening on a localhost port. I intend to change this over to use local Unix sockets for more security and to save my sanity in dealing with TCP port numbers and remembering which one goes where.

The main features of this directory listing generator are custom ordering of files per directory and slightly advanced symlink support.

To specify a custom ordering for a directory, just create a file named .index-sort in the directory and have its contents be a single line specifying the sort mode. The available sort modes are documented on the GitHub project’s README. To override the default sort order, you can specify the ?sort=mode query string parameter in the request.
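The precedence between the two can be sketched like this. To be clear, this is not the project’s actual code: chooseSort, defaultSort, and the mode names name-asc, date-desc, and size-asc are all hypothetical (the real mode names are in the README), and I’m assuming the query string wins over the file.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultSort is a hypothetical fallback mode name.
const defaultSort = "name-asc"

// chooseSort picks the sort mode for one directory listing.
// indexSort is the raw contents of the directory's .index-sort file
// ("" if the file is absent); querySort is the value of the ?sort=
// query parameter ("" if absent).
func chooseSort(indexSort, querySort string) string {
	// The query string overrides whatever the directory specifies.
	if q := strings.TrimSpace(querySort); q != "" {
		return q
	}
	// Only the first line of .index-sort is meaningful.
	if i := strings.IndexAny(indexSort, "\r\n"); i >= 0 {
		indexSort = indexSort[:i]
	}
	if m := strings.TrimSpace(indexSort); m != "" {
		return m
	}
	return defaultSort
}

func main() {
	fmt.Println(chooseSort("date-desc\n", ""))
	fmt.Println(chooseSort("date-desc\n", "size-asc"))
	fmt.Println(chooseSort("", ""))
}
```

In a real handler the first argument would come from reading the .index-sort file and the second from the request’s URL query.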

The advanced symlink support helps to translate filesystem symlinks into HTTP 302 redirects. This works for both files and directories. If the symlink target path is within the filesystem jail being served up, the request will be served; otherwise a 400 Bad Request error will be presented.

For example, if you have a set of versions of some file and a symlink that always points to the latest version, the directory listing will 302 redirect from the symlink request to the actual target filename that is the specific version. In other words, a request to file-latest.kind might redirect to file-v1.kind. This way, the downloaded filename will represent the symlink target file-v1.kind and you can be sure which specific file your users have downloaded, instead of the file being served up as file-latest.kind and you having no clue which one that represented at the time the user downloaded the file.
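The decision logic can be sketched in a few lines. This is an illustration under assumptions rather than the project’s actual code: relInJail and redirectFor are names I made up, the paths are Unix-style, and the link target is passed in as a string (what os.Readlink would return) so the sketch never touches the filesystem.

```go
package main

import (
	"fmt"
	"net/http"
	"path/filepath"
	"strings"
)

// relInJail returns target's path relative to jail, and whether target
// actually lies inside (or is) the jail directory.
func relInJail(jail, target string) (string, bool) {
	rel, err := filepath.Rel(jail, target)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", false
	}
	return rel, true
}

// redirectFor decides how to answer a request for a symlink.
// linkPath is the symlink's absolute path; linkTarget is what it points to.
// It returns the HTTP status and, for a redirect, the target's path
// relative to the served root.
func redirectFor(jail, linkPath, linkTarget string) (int, string) {
	// Relative link targets are resolved against the symlink's directory.
	if !filepath.IsAbs(linkTarget) {
		linkTarget = filepath.Join(filepath.Dir(linkPath), linkTarget)
	}
	rel, ok := relInJail(jail, linkTarget)
	if !ok {
		return http.StatusBadRequest, ""
	}
	return http.StatusFound, "/" + filepath.ToSlash(rel)
}

func main() {
	status, loc := redirectFor("/srv/ftp", "/srv/ftp/app/file-latest.kind", "file-v1.kind")
	fmt.Println(status, loc)

	status, loc = redirectFor("/srv/ftp", "/srv/ftp/app/escape", "../../../etc/passwd")
	fmt.Println(status, loc)
}
```

A real handler would prepend whatever URL prefix the listing is mounted under (e.g. /ftp/) before handing the location to http.Redirect.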

I’m really pleased with this setup and it took me only a few hours to code up and test. Go does allow one to be productive right off the bat. Best of all, there’s no funny business about threading, concurrency, or reliability like you get with other things like Ruby or Python (mostly the concurrency issue here). There’s just fast, compiled, statically typed code here; just the way I like it. Of course Go isn’t perfect, but we’ll get into that later.

Feel free to use this process for hosting your own directory listings. I look forward to the pull requests!

Goodbye lighttpd; hello nginx

It took me a while (collectively ~8 hours), but I’ve finally replaced lighttpd with nginx on this server!

nginx is already using vastly fewer resources than lighttpd ever did on its best day. I’m happy about that considering the limited resources this server has (MemTotal: 1008568 kB). I’m also pleased with the way nginx handles basic things in a zero downtime manner, e.g. reloading configuration files. I hated that I always had to completely kill lighttpd and restart it just to reload the configuration file for a minor change. nginx reloads the configuration file transactionally and will roll back if issues are found. That alone is worth switching for if you’re on the fence.

Getting nginx to match my existing lighttpd configuration was a bit of a challenge but I got it all sorted out in the end. Some issues I faced were in getting PHP requests through to php-fpm. Those issues were mostly due to nginx’s quirky root and alias directive behavior, especially regarding the request handling cycle and nested location blocks and all the internal redirections and regexes required. (I HATE regexes.)

I settled on a very simple albeit repetitive configuration. There’s no global root directive. All the main location directives are independent of one another, which works best for my setup since I have WordPress as the root / with other sites “grafted” on from there. The PHP-specific location directives are copy/pasted and nested into each main location directive as needed.

The trickiest part was getting PHP requests with PATH_INFO (e.g. /index.php/2013/05/article-name) to work. I found the default example in the nginx documentation for fastcgi_split_path_info and it works great.

For those who are curious and just want to see the nginx.conf details, here you are!

server {
    listen       64.85.164.128:80;
    server_name  bittwiddlers.org;

    location / {
        root   /var/www-bittwiddlers/wordpress;
        index  index.php;

        location ~ ^.+\.php {
            try_files $uri /index.php;

            fastcgi_split_path_info ^(.+\.php)(/?.+)$;
            fastcgi_pass   unix:/tmp/php5-fpm.sock;
            fastcgi_index  index.php;
            include        fastcgi_params;
            fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
        }
    }
}

There are a few other main location directives, but they’re irrelevant to the WordPress setup so I’ve omitted them here.

My fastcgi_params file is almost exactly the default file that comes with nginx, except the SCRIPT_FILENAME line is commented out. I’ve found that the best way is to specify this param per each location directive. $document_root does not work when you only have an alias directive and no root directive. It will only have a value if a root directive exists.

For my configuration I’ve abandoned aliases entirely because of the PHP configuration issues they caused. This is most unfortunate because it should just be a simple thing to set up, but it is not.

Another minor issue that bit me was configuring HTTP Basic Authentication. lighttpd and nginx handle this differently regarding the passwd files that store the username/passwords. nginx is a little more obsecure* (conjunction of obscure and secure, implying security via obscurity) than lighttpd in that it requires that passwords in the htpasswd file are “encrypted” so you have to use the htpasswd tool to create those entries. lighttpd is a little more lax in that it doesn’t care at all.

What also irked me is that nginx has no equivalent to lighttpd’s "require" => "user=username" feature. I was using that feature in lighttpd to “secure” some parts of the site down to specific users while using one common htpasswd file. For nginx I had to separate the htpasswd file into multiple files, one for each section. This was a little annoying but not really a big deal.

What am I doing “securing” things with HTTP Basic Auth, you ask? I’m taking the most primitive security measures to protect access to those things which deserve only such primitive security measures. In other words, the measure is consistent with the value I place on the secured data. :)

Doom Classic with 24bpp lighting

It’s been a while since I pulled an all-night coding binge, but last night that counter was reset to zero. The fruits of that labor are a modestly improved look to the Doom Classic modes under the Doom 3 BFG edition which was recently open-sourced.

Here’s a before/after screenshot pair demonstrating the improved colors for lighting:

It takes a keen eye to spot some differences, but the effect should be apparent overall while playing the game for an extended period of time, especially while visiting darker areas in-game. Take a close look at the entryway on the left side and also at the brighter brown wall on the right side.

The Doom Classic modes under BFG are simply ports of the original Doom engine, complete with the old software renderer. It seems they patched up the renderer to scale the original resolution of 320×200 up by a factor of 3x to 960×600. The main game engine (doom3bfg.exe) simply takes the 8bpp palettized framebuffer rendered each frame from the DoomClassic library and updates a texture with its contents, to be presented to the user in the main game window.

While I was perusing the code, I found, by happenstance, this typedef byte lighttable_t; line with these comments above it:

// This could be wider for >8 bit display.
// Indeed, true color support is possible
// precalculating 24bpp lightmap/colormap LUT.
// from darkening PLAYPAL to all black.
// Could even use more than 32 levels.
typedef byte lighttable_t;

This looks like a conversation between developers via code comments (with my own edits to fix spelling), but the way they did the import to git caused all authoring history to be lost, probably on purpose, so we don’t know who’s talking to whom here.

Regardless, what they’re saying here is essentially that lighttable_t, which is used to store palette index lookups based on light levels, could be made to be larger (e.g. 32 bits) to support true color (24bpp with no alpha), with a few additional code changes to generate said light maps and look up the raw RGB colors instead.

The way the engine works is that there is a 256 color palette stored in the main IWAD file in the PLAYPAL lump. All textures and sprites in the game data refer to colors in this main palette. However, there is lighting to be taken into consideration. The engine has to darken the colors referred to in textures and sprites according to the surrounding light level and z-distance. This is done with a light map, from the COLORMAP lump, which is simply an optimized palette lookup table for 32 distinct light levels. Each light level has a 256-entry lookup table which tells it which color from the 256 color palette best matches the original color darkened to the light level. Of course it won’t be perfect since there are only 256 colors able to be displayed on the screen at one time, so you’ll get some color shifting effects and other quantization effects here. But overall, the result is rather impressive for 1994-era technology!

What I’ve done is (mostly) removed the need for the COLORMAP lump and gone straight to calculating the raw RGB colors from the PLAYPAL palette based on the light levels. This way you get direct 24bpp color from the engine. Of course, our colors are still limited to what’s available in the original palette so the source material hasn’t changed, only our rendering is improved.

The light levels available are from 0 to NUMCOLORMAPS-1, where NUMCOLORMAPS is 32. According to some comments in the code, light level 0 is full brightness and level 31 is full darkness. I was able to easily increase NUMCOLORMAPS from 32 up to 64, giving more distinct colors and a smoother lighting look. I was not able to increase NUMLIGHTLEVELS though; there’s something crazy going on with the code related to that constant.

The part that made this all (relatively) easy was that the neo/framework/common_frame.cpp code which projects the 8bpp screen to the 32bpp texture is very simple and does the palette lookup itself. I left this code mostly the same, except I changed the screens array to store larger integers instead of bytes.

I extended the XColorMap array from 256 entries to 256 * NUMCOLORMAPS entries which essentially makes it a larger palette of 16,384 colors instead of just 256 colors. I modified the I_SetPalette method to precalculate all the 16,384 colors based on the original 256 colors.
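Here is a rough sketch of what that precalculation could look like. It is not the actual I_SetPalette code: the `RGB` struct, the `BuildLitPalette` and `LitIndex` names, the 16-bit width chosen for `colormapindex_t`, and the linear darkening ramp are all assumptions made for illustration. Only the 256-entry source palette and NUMCOLORMAPS = 64 come from the description above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical color type; the engine's own representation may differ.
struct RGB {
    std::uint8_t r, g, b;
};

constexpr int NUMCOLORMAPS = 64;   // raised from the original 32
constexpr int PALETTE_SIZE = 256;  // entries in the PLAYPAL palette

// Build a table of PALETTE_SIZE * NUMCOLORMAPS colors (16,384 entries).
// Level 0 is full brightness and the last level is fully dark.
// ASSUMPTION: a linear ramp; the real engine may use another curve.
std::vector<RGB> BuildLitPalette(const std::array<RGB, PALETTE_SIZE>& playpal) {
    std::vector<RGB> lit(static_cast<std::size_t>(PALETTE_SIZE) * NUMCOLORMAPS);
    const int maxScale = NUMCOLORMAPS - 1;
    for (int level = 0; level < NUMCOLORMAPS; ++level) {
        const int scale = maxScale - level;  // maxScale at level 0, 0 at the end
        for (int c = 0; c < PALETTE_SIZE; ++c) {
            RGB& out = lit[static_cast<std::size_t>(level) * PALETTE_SIZE + c];
            out.r = static_cast<std::uint8_t>(playpal[c].r * scale / maxScale);
            out.g = static_cast<std::uint8_t>(playpal[c].g * scale / maxScale);
            out.b = static_cast<std::uint8_t>(playpal[c].b * scale / maxScale);
        }
    }
    return lit;
}

// A screen element now indexes the larger table, so a byte is too small.
// ASSUMPTION: 16 bits, which is enough for 16,384 entries.
using colormapindex_t = std::uint16_t;

inline colormapindex_t LitIndex(int level, int color) {
    assert(level >= 0 && level < NUMCOLORMAPS);
    assert(color >= 0 && color < PALETTE_SIZE);
    return static_cast<colormapindex_t>(level * PALETTE_SIZE + color);
}
```

With a table like this, the screen projection code can stay a plain lookup: each screen element is an index into the big table rather than into the 256-color palette.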

The rest of the work involved making sure that all the rendering code could handle a wider screen element integer size than byte. There were lots of hard-coded assumptions that the element size would be a byte, apparent in several memcpy and memset calls.
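To illustrate the kind of change involved, here is a small sketch. The `screen_elem_t` type and the `FillColumn` and `CopyRow` helpers are hypothetical names, not functions from the Doom source. The point is only that memset writes a single byte value per byte, and memcpy counts bytes rather than elements, so both need care once an element is wider than a byte.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical wider screen element type replacing `byte`.
using screen_elem_t = std::uint16_t;

// Fill `count` elements with `value`. A memset(dest, value, count) call
// would write only `count` bytes and truncate `value` to one byte, so
// an element-wise fill is used instead.
inline void FillColumn(screen_elem_t* dest, std::size_t count, screen_elem_t value) {
    assert(dest != nullptr);
    std::fill_n(dest, count, value);
}

// Copy `count` elements. memcpy takes a byte count, so the element count
// must be scaled by the element size.
inline void CopyRow(screen_elem_t* dest, const screen_elem_t* src, std::size_t count) {
    assert(dest != nullptr && src != nullptr);
    std::memcpy(dest, src, count * sizeof(screen_elem_t));
}
```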

I did encounter some problems that didn’t allow me to fully skip loading the COLORMAP lump.

The primary problem was with the fuzz effect for spectres and your gun (and also other invisible players in network mode). The problem is that the effect uses a specific colormap (#6) from the COLORMAP lump to “dither” the onscreen colors, which produces an effect that isn’t easy to reproduce with a simple calculation. After failing twice or thrice to reproduce this effect, I finally resorted to just bringing back the original COLORMAP and doing a little bit-twiddling on the colormapindex_t values read from the screen to keep the light levels consistent.

The other problem was the inverted color effect (only used when the player picks up an invulnerability sphere). I just had to import the colormap at index 32 from the lump to get this to work and also update the INVERSECOLORMAP to be NUMCOLORMAPS since it’s now 64 instead of 32. Just a little table translation there.

There appear to be two extra colormaps in the lump that I’ve not accounted for so I’m just ignoring them. The game plays and looks great now. Admittedly, the red- and green-tint effects don’t look as good as they used to for some reason. I’ll have to check that out. The effect comes across, but it gets too dark further in the distance.

How I fixed the crash in Doom 3 BFG Edition

Merely 10 hours ago, id Software released the GPL source code to Doom 3 BFG Edition. Unfortunately, when I built the game with VS2012 Premium, the Doom Classic modes crash (both Doom 1 and 2) instantly. Here is the small tale of how I fixed that bug.

The obvious thing to do was to fire up the game in Debug mode and see how far I get. The debugger (under default configuration) wasn’t giving me much when the code bombed out due to an unhandled Access Violation Win32 exception. The key was to force the debugger to break when the access violation exception occurs in the first place rather than letting it pass unhandled. VS2012 gives you a check-box labeled “Break when this exception type is thrown” when the unhandled exception is caught. Turn this on and restart the game and try to start up Doom 1 or 2 from the main menu.

Now we get a first-chance exception occurring in r_things.cpp line 196:

intname = *(int *)namelist[i];

A quick check of the Locals debugger window shows that i is 138. The access violation exception is thrown by the OS when the process tries to read memory at namelist[138]. Let’s try reading from namelist[137] using the Watch window to see if index 137 is safe. Okay, everything looks fine there at index 137. It’s just at 138 where it bombs out. Let’s remember this number.

Now let’s step backwards a bit and try to find our place in the code. Where did this namelist pointer originate from? Jumping back to P_Init in the call stack shows us that R_InitSprites was called with sprnames and R_InitSprites hands that off to R_InitSpriteDefs unchanged. Let’s take a look at this sprnames in info.cpp:

const char * const sprnames[NUMSPRITES] = { "TROO","SHTG",...<snip>...,"TLMP","TLP2" };

That’s it? No NULL terminator there? And there’s this constant array size specifier there: NUMSPRITES. Visual Studio tells me that its value is 138. That sounds familiar…

Let’s go back and take a look at that function where our first access violation occurred to see why it’s trying to read past the bounds of the hard-coded array (whose length is 138 elements).

We can see that the size of namelist (assigned to ::g->numsprites) is calculated to be longer than it should because there is no NULL terminator present. That causes the loop below it to try to access memory beyond what’s allowed. Here’s the simple counting code:

// line 173 in r_things.cpp:
check = namelist;
while (*check != NULL)
    check++;

Perhaps the original developer assumed that the const memory section would be zeroed out and the counting while-loop would just luckily run into an extra zero that just so happened to be found just past the bounds of the array? I can’t see why this is a safe assumption to make under any context whatsoever. Perhaps a random happy coincidence of memory layout and padding made this work in VS2010?

Based on this analysis, it seems obvious to me that these methods should be passing around the array’s known count (NUMSPRITES) instead of trying to calculate it dynamically by scanning for NULL terminators. A quick search through the code shows me that these functions are only used once from P_Init so this should be a safe change to make.
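A minimal sketch of that count-passing approach follows. The names `PackName`, `PackAllNames`, and the four-entry `demo_sprnames` table are made up for illustration; the real table has NUMSPRITES (138) entries and the real functions do much more than this. The sketch also uses memcpy rather than the pointer cast from the crashing line, which is my own substitution to sidestep alignment and aliasing concerns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the real sprite name table.
constexpr std::size_t NUM_DEMO_SPRITES = 4;
const char* const demo_sprnames[NUM_DEMO_SPRITES] = {"TROO", "SHTG", "TLMP", "TLP2"};

// Pack a four-character sprite name into an integer, which is the job
// the cast on the crashing line performs.
std::uint32_t PackName(const char* name) {
    assert(std::strlen(name) >= 4);
    std::uint32_t packed = 0;
    std::memcpy(&packed, name, sizeof packed);
    return packed;
}

// Takes the known element count from the caller instead of scanning for
// a NULL terminator, so it never reads past the end of the array.
std::vector<std::uint32_t> PackAllNames(const char* const* namelist, std::size_t count) {
    std::vector<std::uint32_t> packed;
    packed.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        packed.push_back(PackName(namelist[i]));
    }
    return packed;
}
```

A caller would pass the array together with its compile-time length, e.g. `PackAllNames(demo_sprnames, NUM_DEMO_SPRITES)`, so the length and the data cannot drift apart the way a forgotten terminator allows.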

This particular instance of this class of bug makes me wonder what other instances of this class of bug are lying around the code elsewhere. I think I got extremely lucky in this instance and could pinpoint a root cause because the data was hard-coded.

I’m going out on a limb here, but it seems that VS2012 added some extra protections to make sure that access violations were thrown for access beyond the bounds of statically-allocated memory regions, which makes me doubly lucky to find the bug. I’m not sure exactly how they’ve done that, not being too familiar with the Windows memory management APIs, but I’m sure there are all sorts of caveats and gotchas with protecting fixed-size memory regions (page alignment issues, etc.). I wonder if this bug would reproduce in VS2010, or any other compiler for that matter…

The pull request I’ve submitted just appends the NULL terminator to the hard-coded array. From here, the code works great and Doom 1 and 2 start up just fine.

XP VMs with VirtualBox

For web developers, if you want to test your site on IE7, go download the free XP image from Microsoft here. Once it’s fully set up, install IE7 on it; the image comes with the installer on the desktop. Don’t bother with the Vista image unless you need to support something OS-specific, and if you do, you should just stop what you’re doing and severely rethink your web dev strategy.

For use with Oracle VirtualBox, you’ll have issues with networking, which will prevent you from Activating the VM. Follow these steps to resolve the networking situation:

  1. Download the XP image, obviously: Windows_XP_IE6.exe
  2. Run the EXE to extract the VHD file (ignore all other files) to somewhere you like
  3. Fire up VirtualBox
  4. Create a new virtual machine using the existing VHD file you just extracted, obvious settings apply
  5. Go download the Intel PRO driver at http://downloadmirror.intel.com/8659/eng/PRO2KXP.exe
  6. Place that EXE into a new ISO image using whatever ISO tools you wish (cygwin has mkisofs)
  7. Mount the ISO you created on the FIRST boot of the VM so you can install the driver as soon as possible. This is what lets you Activate the VM over the Internet.
  8. Open the mounted ISO from within the VM and run the driver EXE installer.
  9. Reboot should be safe at this point.

NOTE: If you don’t Activate after the second boot, your VM is hosed and you have to start from scratch again (just run the XP EXE and replace the VHD file). I did this at least 4 times to try to find the right procedure.

After you finally activate your VM, you should be fine to install IE7. Don’t bother doing that before otherwise you’ll just waste your time because the VM won’t let you log in after three boots without being activated.

Now you’ll probably want some sort of decent JavaScript debugger. Well, I’ve got some good news and some bad.

The good news is you can get a basic JavaScript stack trace when an exception is thrown but only if you install Microsoft Script Debugger. The bad news is that this tool flat-out sucks and you don’t have many other options. If you know of some, please let me know.

Google Calendar API v3 Undocumentation

Over the many collective months that I’ve been working with Google Calendar API v3, I’ve noticed many undocumented “features” and behaviors. I thought I should sum up my experiences here so that someone else struggling to use this “simple” API would have at least some forewarning of the adventure they’re about to embark upon.

The documentation up on Google’s site is woefully incomplete. Yes, it lists all the methods and most of the parameters and such, the reference documentation; that’s great for starters, but it’s the bare minimum amount of useful information. What is completely missing is the documentation of behaviors you will encounter, what I call the narrative documentation. Google seems to be very bad about narrative documentation in general.

Uniqueness Constraints

What seems to be completely missing from the documentation is uniqueness constraints of objects (or resources, as Google calls them).

For example, an event is unique by its iCalUID property. Is this important fact mentioned on the event insert method‘s documentation page? Not at all. In fact, iCalUID is not even mentioned on this page. You have to go to the generic events resource page to find the first mention of iCalUID at all. Is the uniqueness constraint mentioned there either? Nope.

While we’re on the subject of inserting events, there’s also the import method. I have no idea what it does differently from the insert method, other than that they’ve declared that an iCalUID is required for it. The summary documentation is a useless one-liner: “imports an event.” Thanks, Sherlock; that was real helpful.

Furthermore, the only documentation about the iCalUID states that it should be an “Event ID in the iCalendar format.” I’ve found this to be completely untrue. That’s probably what the field is intended for, but there is absolutely no format validation for an iCalUID. You can put anything you want in here. (TWSS)

Colors

Google Calendar API gives you the ability to assign colors to events and also assign colors to whole calendars. What they don’t tell you is that assigning colors to events is completely useless unless the calendar the events are contained within is your own personal calendar. In other words, assigning event colors to a calendar intended to be shared with others is pointless. The only control over colors you have in that circumstance is to assign the whole calendar a single color specific to the user’s calendar list. If you really want colorization of events being shared with multiple users, your only choice is to split events across multiple calendars and assign the colors at the calendar level for each user.

Also, what they don’t tell you about colors is that there is one global palette of two kinds of colors: calendar colors and event colors. They do tell you there are two palettes, but they do not indicate whether they are global palettes or user-specific palettes. The two (calendar and event) palettes are not the same palette and an event colorId is not interchangeable with a calendar colorId and vice versa. Why use the same type name “colorId” to refer to two incompatible types? Why not just call one an “eventColorId” and the other a “calendarColorId”? Would that be so hard? To be fair, the documentation does make the distinction but it’s not obvious at first glance that the distinction is meaningful.

Furthermore, when Google duplicates events on your behalf (and they do; see the Side Effects section below), they don’t necessarily copy every property. The colorId property is one that may not be carried over.

Recurring Events

Creating recurring events is extremely frustrating and fraught with many gotchas and stingers. I don’t even want to go into it here; avoid it at all costs if you value your sanity.

Side Effects

WARNING! Side effects of regular Google Calendar API v3 usage may include:

  • Adding email addresses as attendees copies the event to the attendees’ personal calendars. This creates a completely different event with its own eventId, unrelated to the one you created via the API. As far as I can tell, there is no programmatic way to determine if this duplicated event originated from the event you created via the API.
  • Deleting a user who owns calendars that are shared with other users will create a private copy of the shared calendars in each user’s calendar list and will only delete the original calendars owned by the user being deleted.
  • Deleting an event causes it to be marked as a dual “deleted/cancelled” state. I simply cannot figure out the difference between deleted and cancelled, if there is one.
  • Trying to re-create a previously deleted event will cause a 409 Conflict response. You must instead resurrect the deleted/cancelled event which has the same uniqueness properties as the one you are trying to create (e.g. the iCalUID must match).
    • When fetching the event list for a calendar, always set the “showDeleted” parameter to true. This way you can detect if you’re trying to recreate an already existing yet deleted event.
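The delete-then-recreate behavior above suggests an “upsert” flow: look the event up by iCalUID with deleted events included, and revive it if it exists instead of inserting again. The sketch below models that flow against an in-memory stand-in. `FakeCalendar`, `Event`, `Status`, and `Upsert` are all hypothetical names invented for this example and are not part of the Google Calendar API; the `Insert` failure merely stands in for the 409 Conflict response.

```cpp
#include <cassert>
#include <map>
#include <string>

// In-memory stand-in for a calendar. NOT the Google API.
enum class Status { Confirmed, Cancelled };

struct Event {
    std::string iCalUID;
    std::string summary;
    Status status;
};

enum class UpsertResult { Inserted, Resurrected, Updated };

class FakeCalendar {
public:
    // Returns false if the iCalUID already exists, even when that event
    // is cancelled. This models the 409 Conflict described above.
    bool Insert(const Event& e) { return events_.emplace(e.iCalUID, e).second; }

    // Deleting only marks the event as cancelled; it keeps its iCalUID.
    void Delete(const std::string& uid) {
        auto it = events_.find(uid);
        if (it != events_.end()) it->second.status = Status::Cancelled;
    }

    // Like listing with showDeleted=true: cancelled events are visible.
    const Event* Find(const std::string& uid) const {
        auto it = events_.find(uid);
        return it == events_.end() ? nullptr : &it->second;
    }

    void Update(const Event& e) {
        auto it = events_.find(e.iCalUID);
        if (it != events_.end()) it->second = e;
    }

private:
    std::map<std::string, Event> events_;
};

// Insert the event if it is new; otherwise overwrite the existing one,
// reviving it when it had been deleted.
UpsertResult Upsert(FakeCalendar& cal, Event e) {
    e.status = Status::Confirmed;
    const Event* existing = cal.Find(e.iCalUID);
    if (existing == nullptr) {
        cal.Insert(e);
        return UpsertResult::Inserted;
    }
    const bool wasCancelled = (existing->status == Status::Cancelled);
    cal.Update(e);
    return wasCancelled ? UpsertResult::Resurrected : UpsertResult::Updated;
}
```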

Types

  • /events/list accepts timeMin and timeMax arguments, which are documented only as accepting a ‘datetime’ value. Of the myriad possible standardized date-time formats, I have discovered that this value should be a UTC date-time (with the offset explicitly set to +00:00) formatted as RFC 3339 with a 24-hour clock (yyyy-MM-ddTHH:mm:ss.sss+00:00).
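As a concrete illustration of that format, here is a small sketch that renders a Unix timestamp as a UTC RFC 3339 string with an explicit +00:00 offset. `FormatRfc3339Utc` is a hypothetical helper name, and the approach (std::gmtime plus snprintf) is just one way to do it, assuming a platform where time_t counts seconds since the Unix epoch.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Format `seconds` since the Unix epoch, plus a millisecond part, as
// yyyy-MM-ddTHH:mm:ss.sss+00:00 in UTC.
// NOTE: std::gmtime uses shared static storage and is not thread-safe.
std::string FormatRfc3339Utc(std::time_t seconds, int millis) {
    assert(millis >= 0 && millis < 1000);
    const std::tm* utc = std::gmtime(&seconds);
    assert(utc != nullptr);
    char buf[40];
    std::snprintf(buf, sizeof buf, "%04d-%02d-%02dT%02d:%02d:%02d.%03d+00:00",
                  utc->tm_year + 1900, utc->tm_mon + 1, utc->tm_mday,
                  utc->tm_hour, utc->tm_min, utc->tm_sec, millis);
    return std::string(buf);
}
```

For example, `FormatRfc3339Utc(0, 0)` yields `1970-01-01T00:00:00.000+00:00`.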

There are many more issues than I can list here, but these are fresh in my memory.

Revisions

UPDATE (2012-10-14): removed the bit about calendars being unique by Summary as that was not true.

UPDATE (2012-10-19): added Types section to document timeMin and timeMax arguments

Google Calendar API access with 2LO (2-legged OAuth) and .NET

Getting this scenario to work has been one of the most frustrating experiences of my development career as yet.  I’m writing this incredibly detailed and informative blog post to save others from self-inflicted premature baldness.

Your scenario is that you want to write a background task or service in .NET to communicate with Google’s servers specific to your Google Apps domain for business or education. Your service will not have any external exposure to end-users of your apps domain. You want to synchronize data in bulk to Google on behalf of your apps domain users. You want your service to use the latest and greatest APIs that Google advertises that are not deprecated. At the time of this writing that is the V3 API for things like Calendar (http://code.google.com/apis/calendar/v3/getting_started.html). I will use the Calendar API as my example here since that’s what I was first interested in using for my project.

Google seems to want to make it unnecessarily difficult to find any information on how to handle this relatively simple scenario correctly. All of the documentation does one of the following: (a) ultimately redirects you to a page talking about a V2 API, not a V3 API, (b) does not talk at all about 2LO and instead obsesses over 3LO, (c) is misleading and woefully incomplete with regard to the practical information you need to know to avoid the dreaded 401 Unauthorized response from the API servers.

Let me assume that you have already created a Google Apps domain for business or education and that you have superadmin access to it. If you have not, please create one and find the responsible parties in your organization to grant you superadmin access (if only to a development-specific domain).

Required set-up steps: (every single one is critical; do not skip one thinking you know better or you will fail)

  1. Create an API Project at https://code.google.com/apis/console while logged in as your superadmin account for your apps domain (honestly I’m not sure if it matters which user account you create the project with but it doesn’t hurt to be consistent here just in case).
  2. Go to the API Access section.
  3. Create an OAuth 2 client named “whatever you want” at “https:// wherever you want; it doesn’t matter for this scenario”.
    1. Note that you don’t need to specify an icon and that the name of your client doesn’t matter as no other living soul (a.k.a. end-user in your apps domain) will ever see it.
  4. Copy down the generated client ID (somenumberhere.apps.googleusercontent.com) and the client secret (lots-oflettersanddigits-here).
  5. Go to the Services section while still in the APIs Console and enable your specific services (e.g. Calendar).
  6. Open https://www.google.com/a/cpanel/yourdomainhere.com/ManageOauthClients (need superadmin rights to your google domain here)
  7. Add the client ID (somenumberhere.apps.googleusercontent.com) from step #4 and specify “https://www.google.com/calendar/feeds/” for your scope (assuming you want to work with Calendar API) and click Authorize.
    1. For other APIs, list the proper scopes here, comma-delimited. I just need Calendar API and this works for me. Be sure to specify https, not just http.
  8. Go to https://www.google.com/a/cpanel/yourdomainhere.com/SetupOAuth
  9. Un-check the “Allow access to all APIs” under “Two-legged OAuth access control”.
    1. Yes, un-check it. This is per http://groups.google.com/group/google-tasks-api/msg/c8dd0ac7c8f320dc. As of 2012-01-25, this is still relevant and required. Perhaps this will change in the future but for now it is required.
  10. Save changes.

Now you should have a properly set-up apps domain and API project and the two are linked together.

Let’s move on now to your .NET code that will be your task/service that runs in the background on behalf of your users in your apps domain.

Firstly, I do not recommend using the open-source Google client library for .NET. I’ve had bad experiences with it, namely that it has been known to leak memory. The issue I reported with them on this matter was claimed to be resolved but I haven’t been back to check it out. I had to make progress on my project and waiting for them to resolve the issue was not an option.

I wrote my own T4 template (Visual Studio code generator) to generate a client library for Google (and other RESTful APIs which use OAuth 1.0 or 2.0) that has no memory leaks and is both ridiculously fast and efficient: https://github.com/JamesDunne/RESTful.tt. It supports both synchronous and asynchronous methods of I/O. Its code is up to date as of 2012-11-12.

Check out the project locally and open the TestClient project’s Program.cs file. This is a simple console application designed to demonstrate the simplicity of the code-generated API client for Google and using OAuth 1.0 for shared-secret authentication.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using WellDunne.REST;
using WellDunne.REST.Google;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
 
namespace TestClient
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create the authentication module to use oauth 1.0 2LO:
            // NOTE: Replace 'key' and 'secret' with your testing parameters, else you'll get a 401 (Unauthorized) response.
            var oauth = new OAuth10("myclientidhere.apps.googleusercontent.com", "mysecretkeyhere");
 
            // Create the client against googleapis.com:
            // NOTE: This client instance is reusable across threads.
            var client = new WellDunne.REST.Google.V3.CalendarServiceAsyncClient(new Uri("https://www.googleapis.com"), oauth);
 
            // Create the request to get the user's calendar list:
            // NOTE: Each request object is NOT reusable.
            var req = client.GetMyCalendarList(null, null, null, /*requestorID:*/ "username@example.com");
 
            // Fetch the request synchronously:
            var rsp = req.Fetch();
 
            // Write the response JSON object to Console.Out:
            using (var conWriter = new JsonTextWriter(Console.Out))
                rsp.Response.WriteTo(conWriter);
            Console.WriteLine();
        }
    }
}

NOTE: This test program is specific to Google Calendar. If you are working with a different API, you’ll have to edit the RESTful/Google/Restful.tt T4 template to declare the API methods you need access to. It couldn’t hurt to define some Newtonsoft.Json-enabled classes to deserialize the response data to.

For Google Calendar API testing, simply paste the values from your API Console (client ID and client secret) into the `new OAuth10("myclientidhere.apps.googleusercontent.com", "mysecretkeyhere")` expression.

Then paste an actual user provisioned in your domain into the `client.GetMyCalendarList(null, null, null, /*requestorID:*/ "username@example.com")` expression.

Run the program and you should see a raw JSON dump of the response retrieved from Google.

For example, I get this output (id and summary are sanitized):

{"kind":"calendar#calendarList","etag":"\"bt6uG7OvVvCre70u9H5QXyrDIXY/5P7Dh-jUGpT56O5EBhfgecrj2pU\"","items":[{"kind":"calendar#calendarListEntry","etag":"\"bt6uG7OvVvCre70u9H5QXyrDIXY/HstY1Kh3cCrbvmn0afdroRd44BQ\"","id":"owner@example.org","summary":"owner@example.org","timeZone":"America/Los_Angeles","colorId":"15","backgroundColor":"#9fc6e7","foregroundColor":"#000000","selected":true,"accessRole":"owner","defaultReminders":[{"method":"email","minutes":10},{"method":"popup","minutes":10}]}]}

query.ashx

I put together a web-based SQL query tool written using ASP.NET’s IHttpHandler interface, packaged as a single query.ashx file. Its main feature is that it guarantees that you cannot construct a query that inserts or updates data. The SELECT query form is forced upon you and any means of escaping that form via SQL injection is detected and the query is rejected in that case. This makes it safe to deploy internally so that your developers may query data in a read-only manner.
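To give a flavor of the kind of check involved, here is a deliberately simplified sketch of validating the text a user supplies for the SELECT clause. It is not the code from query.ashx (which is C#); the function names and the keyword lists are assumptions for illustration. It also ignores string literals, so it errs on the side of rejecting input, and it only makes sense if the comment-stripped text is what eventually gets executed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Remove "-- ..." line comments and "/* ... */" block comments.
// SIMPLIFICATION: comment markers inside string literals are not special-cased.
std::string StripSqlComments(const std::string& sql) {
    std::string out;
    std::size_t i = 0;
    while (i < sql.size()) {
        if (sql.compare(i, 2, "--") == 0) {
            while (i < sql.size() && sql[i] != '\n') ++i;  // keep the newline
        } else if (sql.compare(i, 2, "/*") == 0) {
            const std::size_t end = sql.find("*/", i + 2);
            i = (end == std::string::npos) ? sql.size() : end + 2;
            out += ' ';  // keep the tokens on either side separated
        } else {
            out += sql[i++];
        }
    }
    return out;
}

// Collect lower-cased words, optionally only those outside parentheses.
std::vector<std::string> Words(const std::string& text, bool topLevelOnly) {
    std::vector<std::string> words;
    std::string current;
    int depth = 0;
    for (char ch : text) {
        const unsigned char uc = static_cast<unsigned char>(ch);
        if ((depth == 0 || !topLevelOnly) && (std::isalnum(uc) || ch == '_')) {
            current += static_cast<char>(std::tolower(uc));
            continue;
        }
        if (!current.empty()) { words.push_back(current); current.clear(); }
        if (ch == '(') ++depth;
        else if (ch == ')') --depth;
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

bool ContainsAny(const std::vector<std::string>& words,
                 const std::vector<std::string>& banned) {
    for (const std::string& w : words)
        for (const std::string& b : banned)
            if (w == b) return true;
    return false;
}

// True if `rawClause` looks acceptable as the body of a SELECT clause.
bool IsSafeSelectClause(const std::string& rawClause) {
    const std::string clause = StripSqlComments(rawClause);
    int depth = 0;
    for (char ch : clause) {
        if (ch == ';') return false;                 // no statement separators
        if (ch == '(') ++depth;
        if (ch == ')' && --depth < 0) return false;  // no closing our wrapper early
    }
    if (depth != 0) return false;
    // ASSUMPTION: illustrative keyword lists, not the tool's actual lists.
    static const std::vector<std::string> clauseKeywords =
        {"from", "where", "group", "having", "order", "union", "into"};
    static const std::vector<std::string> writeKeywords =
        {"insert", "update", "delete", "drop", "alter", "exec", "execute"};
    return !ContainsAny(Words(clause, true), clauseKeywords) &&
           !ContainsAny(Words(clause, false), writeKeywords);
}
```

With this sketch, `id, (select max(x) from t) as m` is accepted because the FROM sits inside a subquery, while `id from users` is rejected.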

Unfortunately I can’t link you to any public deployment of this tool for a demo because it queries SQL databases. I don’t have any SQL databases out in the wild and if I did I certainly wouldn’t want any random jackhole off the internet querying data from them anyway. You’ll just have to deploy it yourself for your own demo. It’s not hard to set it up since all you need is just some form of ASP.NET host that can compile and execute an ashx file. Virtually any standard IIS host will be able to handle this in its default configuration. The tool runs under the security context of the app pool you place it in so if you use connection strings with Integrated Security=SSPI, beware of that.

Features

  • SQL query builder that allows only SELECT queries (Query tab)
    • Try to break it and find a way to UPDATE or INSERT data!
    • Strips out all SQL comments
    • Actively prevents you from overriding the forced separation of query clauses, e.g. you cannot put a FROM into the SELECT clause unless it’s part of a subquery
  • Results tab
    • Queries are forcibly executed in READ UNCOMMITTED transaction isolation level
    • SET ROWCOUNT is set to 1000 by default but can be overridden with rowlimit=# query string parameter.
    • Dynamic show/hide of SQL column type headers
    • Execution time tracked in msec
    • Results grid with left/right text alignment set per column type (numerals are right-aligned, text is left-aligned, etc.)
    • Binary data is displayed as 0xHEXADECIMALSTRINGS
    • Shows generated SQL query
    • Link to share query with someone else (opening link automatically executes the query)
    • Links to JSON and XML output formats with appropriate Content-Type headers. (Try it in Chrome’s Advanced REST Client app!)
  • Custom Connection String support (Connection tab)
    • The drop-down is sourced from web.config’s connectionStrings section
  • Recording of query history (Query Log tab)
    • Stores host name of client that executed query
    • Paged view
    • Click “GO” link to execute query
  • Parameterized queries
    • Parameter values are saved in the query log
    • Probably want to store a library of queries in addition to the query history log
    • Limited set of parameter types supported, but most common ones used should be there
  • Multiple forms of JSON and XML formatted output for programmatic consumption of query results
    • Add query string params to produce less verbose output: no_query=1, no_header=1
  • Single-file deployment
  • Self-update feature that pulls latest code from github
  • Tip jar!

I rather enjoy writing developer tools like this. I especially enjoy the ashx packaging mechanism. For an ashx file, you write all your tool code in one file that contains a primary class that implements the IHttpHandler interface (or IHttpAsyncHandler). You have complete control over the HTTP response in this way at a very low level yet you still get all the convenience of the HttpContext class with its Request and Response structures that ease the pain of dealing with HTTP headers, URLs, query strings, POST form variables, etc. at such a low level.