Sunday, October 18, 2009

Tangent Basis as a Quaternion

I haven't done much perf testing on this yet, but if you want to save some memory and bandwidth at the cost of a bit of math, try packing your normal+tangent basis into a quaternion instead. It might be a good option versus 8-bit normals. You can even compress the quaternion down to 3 components if you assume dot(q,q)=1: negate the quaternion when w is negative, then reconstruct w as sqrt(1 - x^2 - y^2 - z^2), which works fine as long as it's normalized. I suspect this could be a win for static geometry.
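
Here's a minimal sketch of the packing side (assuming an orthonormal, right-handed basis; mirrored UVs need a flipped bitangent plus a sign bit somewhere, which I'm glossing over), with Vec3/Vec4 standing in for whatever math types you already have:

#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Build a quaternion from the tangent/bitangent/normal columns of the
// rotation matrix, using the standard trace-based conversion.
Vec4 BasisToQuaternion(const Vec3& t, const Vec3& b, const Vec3& n)
{
    // Rotation matrix with columns t, b, n.
    const float m00 = t.x, m01 = b.x, m02 = n.x;
    const float m10 = t.y, m11 = b.y, m12 = n.y;
    const float m20 = t.z, m21 = b.z, m22 = n.z;

    Vec4 q;
    const float trace = m00 + m11 + m22;
    if (trace > 0.0f)
    {
        const float s = 0.5f / std::sqrt(trace + 1.0f);
        q.w = 0.25f / s;
        q.x = (m21 - m12) * s;
        q.y = (m02 - m20) * s;
        q.z = (m10 - m01) * s;
    }
    else if (m00 > m11 && m00 > m22)
    {
        const float s = 2.0f * std::sqrt(1.0f + m00 - m11 - m22);
        q.w = (m21 - m12) / s;
        q.x = 0.25f * s;
        q.y = (m01 + m10) / s;
        q.z = (m02 + m20) / s;
    }
    else if (m11 > m22)
    {
        const float s = 2.0f * std::sqrt(1.0f + m11 - m00 - m22);
        q.w = (m02 - m20) / s;
        q.x = (m01 + m10) / s;
        q.y = 0.25f * s;
        q.z = (m12 + m21) / s;
    }
    else
    {
        const float s = 2.0f * std::sqrt(1.0f + m22 - m00 - m11);
        q.w = (m10 - m01) / s;
        q.x = (m02 + m20) / s;
        q.y = (m12 + m21) / s;
        q.z = 0.25f * s;
    }

    // Force w >= 0 so the shader can reconstruct it as sqrt(1 - dot(q.xyz, q.xyz))
    // and we only need to store x, y, z.
    if (q.w < 0.0f) { q.x = -q.x; q.y = -q.y; q.z = -q.z; q.w = -q.w; }
    return q;
}

The vertex shader then rotates (1,0,0) and (0,0,1) by the quaternion to recover the tangent and normal, and a cross product gives back the bitangent.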

Thursday, October 08, 2009

Color Coding Makefile Output

Quick useful tidbit... color coding makefile output...

NO_COLOR=\x1b[0m
OK_COLOR=\x1b[32;01m
ERROR_COLOR=\x1b[31;01m
WARN_COLOR=\x1b[33;01m

OK_STRING=$(OK_COLOR)[OK]$(NO_COLOR)
ERROR_STRING=$(ERROR_COLOR)[ERRORS]$(NO_COLOR)
WARN_STRING=$(WARN_COLOR)[WARNINGS]$(NO_COLOR)

You can then echo out $(OK_STRING), $(ERROR_STRING) or $(WARN_STRING) depending on whether errors were encountered. Errors are Red, Warnings are Yellow, Success is Green. If you want to know what the goofy strings like "\x1b[32;01m" actually mean, well it really should be obvious, duh! But in case you can't decode that enigma of a color coding scheme you can read about it here.

The only tricky bit then is conditionally echoing the right string. Personally I found the easiest way is to use temporary files to determine whether the build was a success or had errors/warnings. But maybe there is a cleaner solution that doesn't require temp files?

@$(ECHO) -n compiling debug foo.cpp...
@$(CXX) $(CFLAGS) -c foo.cpp -o $@ 2> temp.log || touch temp.errors
@if test -e temp.errors; then $(ECHO) "$(ERROR_STRING)" && $(CAT) temp.log; elif test -s temp.log; then $(ECHO) "$(WARN_STRING)" && $(CAT) temp.log; else $(ECHO) "$(OK_STRING)"; fi;
@$(RM) -f temp.errors temp.log

For those who don't like writing makefiles explicitly, XPJ currently supports this color coding mechanism and leaves the OK/ERROR/WARN strings defined in the user-specified platform file, so you can alter the colors or disable them entirely.

Friday, October 02, 2009

Eye Candy: GTC Demos

Here are some videos of the tech demos used during the GPU Technology Conference Keynote this year which were all simulated and rendered live in real-time. Our goal was to take the bits of tech we were each working on and produce a small demo of it to give the audience a taste. All the demos were presented in 3D Stereo and looked amazing. So, in order of appearance...

- Sarah Tariq made a pretty kick ass demo showing off the turbulence (sorry for the really poor video, a better one will be posted soon I am sure).

Photoshop PSD Loading

A year or so ago I went about improving my tools chain for textures. Previously an artist would have to export their texture from Photoshop to PNG and then import it into a custom engine format. This was problematic because it meant there were useless intermediate files always lying around (the artist's .psd, the intermediate .png and the engine-formatted .xbin). It was yet another piece of the pipeline that could get out of sync and cause problems.

There were two solutions I saw: either write a plugin for Photoshop, or import the PSD files directly in my tools chain and convert to engine format there. Because I have this sort of automagical serialization tech, writing plugins for other tools makes me a bit nervous (everything has to stay in sync), and I also didn't want to have to support 50 versions of Photoshop and other imaging tools, so the latter choice was selected.

After a wee bit of research I found that the PSD format is actually fairly well documented, along with the exact run-length encoding algorithm used internally for compression. The format is incredibly stable too; it appears not to have changed much since version 2.5 and works fine with Elements (the $99 version of Photoshop).

One big concern I had was that I would have to replicate all of Photoshop's layering options and composite the image myself, but luckily PSD actually stores a "preview" image in the file, which is an already-baked single-layer version of the image including the alpha channel, so I didn't need to bother with compositing. Basically all the pieces were there... and it looked incredibly easy to do, so I just did it one lazy Sunday afternoon last year, and thus far it's worked fairly well for me. I still support PNG importing to keep things friendly for non-Photoshop tools, but the majority of my textures now come straight from Photoshop into my own format.

Anyways, below is a download link to some simple source for loading PSD images. It includes some test code for saving back out to PNG to confirm it works.
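
To give a flavor of how little there is to it, here is a rough sketch of the interesting bits: the big-endian header reads and the PackBits RLE decode (names made up for illustration, error handling omitted). A real loader additionally has to skip over the color mode, image resource and layer sections and de-interleave the planar channel data, which I'm leaving out here.

#include <cstdio>
#include <cstdint>
#include <vector>

static uint16_t ReadBE16(FILE* f) { uint8_t b[2]; fread(b, 1, 2, f); return uint16_t((b[0] << 8) | b[1]); }
static uint32_t ReadBE32(FILE* f) { uint8_t b[4]; fread(b, 1, 4, f); return (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16) | (uint32_t(b[2]) << 8) | b[3]; }

struct PsdHeader
{
    uint16_t channels;   // 3 = RGB, 4 = RGBA
    uint32_t height;     // rows
    uint32_t width;      // columns
    uint16_t depth;      // bits per channel (8 for typical textures)
    uint16_t colorMode;  // 3 = RGB
};

// Reads the fixed 26-byte header that starts every .psd file.
static bool ReadPsdHeader(FILE* f, PsdHeader& out)
{
    char signature[4];
    fread(signature, 1, 4, f);
    if (signature[0] != '8' || signature[1] != 'B' || signature[2] != 'P' || signature[3] != 'S')
        return false;
    if (ReadBE16(f) != 1)       // version, always 1
        return false;
    fseek(f, 6, SEEK_CUR);      // 6 reserved bytes
    out.channels  = ReadBE16(f);
    out.height    = ReadBE32(f);
    out.width     = ReadBE32(f);
    out.depth     = ReadBE16(f);
    out.colorMode = ReadBE16(f);
    return true;
}

// PackBits RLE decode, used for the composite image data when the 2-byte
// compression field at the start of the image data section is 1.
static void DecodePackBits(const uint8_t* src, size_t srcSize, std::vector<uint8_t>& dst)
{
    size_t i = 0;
    while (i < srcSize)
    {
        int8_t n = int8_t(src[i++]);
        if (n >= 0)                       // literal run: copy n+1 bytes
        {
            dst.insert(dst.end(), src + i, src + i + n + 1);
            i += size_t(n) + 1;
        }
        else if (n != -128)               // repeat run: next byte repeated (1 - n) times
        {
            dst.insert(dst.end(), size_t(1 - n), src[i++]);
        }                                 // n == -128 is a no-op
    }
}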

Tuesday, September 08, 2009

On Scripting: a rant that will likely annoy some friends

(warning, rant ahead)

It seems to me that a lot of game developers jump headlong into setting up their game engines around writing gameplay code in some scripting language without any sensible reason to. It just seems like the thing to do, right? Everyone does it so obviously we should too! Often these languages offer no real productivity improvement over native C++ but still require significant work creating the system, binding it to the engine and maintaining the interface with all gameplay systems. This also comes at the cost of trading a top notch debugger (like Visual Studio) for one that is either non-existent or inferior, and at the cost of a massive performance hit for any code executed inside script (anywhere between 20x and 200x slower depending on the VM).

So, what are the benefits FOR scripting (feel free to ping me with anything I missed):
1) You can sandbox your gameplay programmers so they don't crash your game. This is nice in theory, but in practice they still manage to crash and/or hang the game just fine. Any language that is flexible enough will also not be foolproof enough. There simply isn't time to make sure all the entry points back to native code can't do anything stupid.
2) It's easier to program for because it's a script. This argument never made sense to me. On top of the inferior debugging environment, no game studio is going to hire a gameplay programmer/scripter who doesn't have significant C++ experience. So on top of throwing away that experience, they most likely get tossed onto a new language they have to learn. Wouldn't it be more efficient to just use the language EVERYONE is already using, and as a bonus have it run at native speeds? Also, I have seen a few game scripting languages that are basically just VM versions of C/C++... which I guess is an attempt at making them more familiar to C++ programmers, but it really just invalidates the argument that scripting is somehow easier, since at that point it's basically the same language.
3) Iteration time is quicker with scripts. This is unfortunately true sometimes, but only because, IMO, the project is not being properly maintained and thus build times are completely absurd. I have seen large projects take 40 minutes to compile and 10 minutes to link... but I have also seen LARGER projects take just a few seconds to incrementally compile and link. To me it seems easier to fix the build than to set up the bindings between native and script. Another option is sectioning off gameplay code into a DLL, which is not ideal, but more ideal than a VM.

And the arguments against:
1) Slow as shart. Even the best VMs can still be 20x slower... think about that, 20 times slower. Where else in game development has such a massive perf hit been acceptable?
2) Memory, this varies a lot, but some off the shelf scripting languages (like python) can be quite costly on this front.
3) VC++ has a better debugger and all your programmers already know C++. There is also loads and loads of 3rd party libraries/middleware available to use directly in C++ with no binding craziness required.
4) Your engine team saves loads of time not having to maintain bindings to the scripting system.
5) You have LESS code without a scripting system. So your build times are shorter, you have less code to maintain and potentially fewer bugs.

So, how does one make such a transition from VM to native without causing complete insanity?
1) Don't do this mid-project, that's a terrible idea, what are you thinking?! This is something to consider for the next project.
2) Instead of making bindings for everything... make clean/safe APIs. This will both improve internal engine productivity/safety and make it easier and safer for game code to access such systems.
3) Have an incredibly anal memory allocator that slaps you when you leak memory, go above budget, or just grow too quickly. Make sure you have tools (either internally developed, like Excel dumps, or externally developed, like RAD's Dresscode or Deja's Insight) to keep tabs on who the big spenders are. Odds are you already have all this infrastructure, and odds are the game scripters can already abuse stuff in script. But if you have a lot of junior coders it's best to keep track of them early and teach them proper usage rather than trying to clean stuff up later.
4) Have strong discipline across your engineering org... even scripters. Stop letting the gameplay guys write terrible code just because it's script and you don't need to look at it. Trust me, it was an unmaintainable mess before it was C++ too, you were just too busy ignoring that problem. Make sure they follow the coding standard, that they don't hack random parts of the code base and that they follow some sort of generalized actor or component paradigm rather than just making shit up on the fly. Remember, they were somewhat constrained before in what they could do in a scripting language; make sure they don't think the transition to C++ is license to do whatever the hell they want. Migrate from technical constraints to manual constraints enforced by the leads. It's not that hard, and they will learn quickly.

Final Thoughts
So, basically this rant was about game studio leads pushing off the responsibility of mentoring and managing junior programmers and instead just tossing them in a sandbox. Sounds harsh, but that is one way of looking at it I suppose. My argument is that with strong discipline and good practices on the API front, lots of time and energy can be saved by ditching a complex scripting language, and in the process you gain untold amounts of performance back.

Not to confuse you all, but there is still probably one potentially good place for "scripting"... and that is per-level sequences. I would prefer not to think of these as scripts/code but rather as data. In essence this is stuff authored by level designers or cutscene animators, not by your coding staff. In Unreal Engine this is normally the territory of Kismet and Matinee, and I think visual sequencing languages are the real king here. If budget or schedule does not allow for the development of such a visual system, I think a simple batch-file-like scripting system is all that is needed. Maybe there is already middleware out there for this though?

Saturday, September 05, 2009

Variadic Macro magic for enums, or so I thought

(this is one of the first images that popped up when searching for "variadic macro" on images.google.com... go figure)

Warning, this is actually a rant, not really any useful information.

A few months ago I thought I would go through and clean up the way I map enum values to strings (used for serialization and for exposing enums to my level editor). Currently I have to define each enum twice, once for the enum and once for the strings; I have some simple error checking, but it's not entirely foolproof and it's tedious to maintain. The solution, I thought, was variadic macros, so I spent the better part of a night coming up with something that worked fairly well on GCC. When I was done I was quite proud of myself: I had a solution that generated fairly nice C++.
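
The gist was something along these lines; this is a simplified sketch rather than my actual implementation, with made-up names and capped at three values to keep it short:

// Per-count stringizing macros; a real version would generate these up
// to some maximum arity.
#define EXPAND_STRINGS_1(a)        #a
#define EXPAND_STRINGS_2(a, b)     #a, #b
#define EXPAND_STRINGS_3(a, b, c)  #a, #b, #c

// Counts the arguments by sliding them past a reversed number list.
#define COUNT_ARGS(...)            COUNT_ARGS_IMPL(__VA_ARGS__, 3, 2, 1)
#define COUNT_ARGS_IMPL(a, b, c, n, ...) n

// Token pasting with an extra level of indirection so arguments expand first.
#define CONCAT(a, b)               CONCAT_IMPL(a, b)
#define CONCAT_IMPL(a, b)          a##b

// Declares the enum once and builds a parallel string table from the
// exact same value list.
#define DECLARE_ENUM(Name, ...)                                         \
    enum Name { __VA_ARGS__ };                                          \
    static const char* Name##Strings[] = {                              \
        CONCAT(EXPAND_STRINGS_, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)   \
    };

DECLARE_ENUM(BlendMode, Opaque, AlphaBlend, Additive)
// On GCC, BlendModeStrings[Additive] is "Additive".  On VC++ the
// __VA_ARGS__ forwarded into COUNT_ARGS arrives as a single argument,
// so the count comes out wrong and the whole thing falls apart.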

But fate was a bitch that day... for when I went to try it on VC++ I found that Microsoft has a fairly hokey implementation of variadic macros. The basic problem is this: VC++ expands __VA_ARGS__ only after the surrounding macros have been expanded, which is not consistent with the way C99 defines the behavior.

example:
#define MY_GREAT_MACRO(...) myGreatFunction(__VA_ARGS__)
MY_GREAT_MACRO(foo, bar)


Now in GCC that would expand to something like...
MY_GREAT_MACRO(foo,bar) -> myGreatFunction(foo, bar)

But in VC++ you would get something like...
MY_GREAT_MACRO(foo,bar) -> myGreatFunction(foo,bar,)
...notice the trailing comma? Basically what happens is VC++ treats __VA_ARGS__ as a single argument and only expands it as the very last step of preprocessing.

So, basically, VC++'s implementation of variadic macros is completely worthless... it's no different than the old-school trick of doing MY_MACRO((my,great,args)).
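
If you want to see the breakage in isolation, a minimal repro of the forwarding case (macro names made up for illustration) looks something like this:

// Forwarding __VA_ARGS__ into a second macro is where things go wrong.
#define FIRST_ARG(a, ...)  a
#define FORWARD(...)       FIRST_ARG(__VA_ARGS__)

// Expected (GCC, C99):  FORWARD(foo, bar)  ->  foo
// Observed (VC++):      __VA_ARGS__ arrives at FIRST_ARG as a single
//                       argument, so the "first argument" is the whole
//                       list and the expansion is effectively  foo, bar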

Microsoft, meanwhile, doesn't appear interested in fixing this pretty much ever... And yes, I realize this is technically a C99 feature and not a C++ feature, but it still doesn't work in C in Visual Studio, and it's quite reasonable to expect C features to work in C++ IMO. Microsoft also obviously did attempt to put variadic macros into VC++, so they can't hide behind the "well it's a C feature not a C++ feature" bullshit, because they did try to implement it, and they did document it!

If you are interested in the technique I came up with you can download a test of it here. If you can figure out how to make this, or something similar, work in VC++ you get major brownie points from me.

libbom: an efficient and simple binary container format



Here is a little story about a little library I wrote so that I could load up game assets lightning fast. If you don't like stories you can just download it from here right now instead.

Background
A while ago I set up this fancy serialization tech inside of my hobby engine that allows easy serialization of arbitrary game classes with virtually no extra effort from the author of said class. The first backend file format was XML, simply because it was easy to debug and fairly easy to map a class hierarchy to, but I always knew I would eventually have to pack the data more efficiently... enter libbom, a dirt simple bit of code for saving and loading arbitrary data to disk that, like XML, maps well to hierarchical data structures.

Now, before I went and wrote this code I poked around on Google for a while to see if someone else had already done it for me, and while I found a bunch of various projects, nothing really fit my requirement of being able to load and traverse binary data without reinterpretation. In other words, I ideally wanted to do one big 'fread' and then start traversing my data. I didn't want a lot of memory allocations or converting of offsets to pointers.

Elements and Attributes
I tried to model the terminology here as closely to XML as possible. So, simply put, an Element contains a Name, a list of Attributes and a list of child Elements. An Attribute contains a Name and some user-specified Data.
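
In cooked form that boils down to a couple of tiny structs. This is a hypothetical sketch rather than the actual libbom headers, but it shows the idea: everything is an offset or an index, never a pointer.

#include <cstdint>

struct BomAttribute
{
    uint32_t nameOffset;   // offset into the ASCII string table
    uint32_t dataOffset;   // offset into the data table
    uint32_t dataSize;     // size of this attribute's data in bytes
};

struct BomElement
{
    uint32_t nameOffset;       // offset into the ASCII string table
    uint32_t firstAttribute;   // index of this element's first attribute
    uint32_t attributeCount;   // attributes are stored contiguously
    uint32_t firstChild;       // index of this element's first child element
    uint32_t childCount;       // children are stored contiguously
};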

Tables
All data, including the names of Elements and Attributes, is stored in tables, and there are two tables.

1) ASCII String Table - this is primarily used for storing Element and Attribute names, but in reality you can store anything you want in there (it's ASCII though, so it's probably not useful for human-readable strings used in-game). For instance, in my hobby project I store all sorts of class and type names inside the BOM string table because it automatically removes duplicate strings (there's a sketch of this after the list). For Unicode strings you will need to use the second table... Also note that strings are optional here; in my engine I only use the strings when the file version doesn't match the engine version.

2) Data Table - this table is not interpreted at all by libbom. The application can simply allocate a block of this table and store whatever it wants in it. Data is stored separately from the DOM tree so that the tree can be traversed efficiently, and it also gives the application a chance to share data between Attributes (the application could, if it wants, remove duplicate entries in the Data Table in order to reduce file size and just reference the same data multiple times).
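
The duplicate stripping mentioned in (1) is nothing fancy; a build-time table along these lines (made-up names, not libbom's actual class) does the trick:

#include <cstdint>
#include <map>
#include <string>
#include <vector>

class StringTable
{
public:
    // Returns the offset of 'text' in the table, adding it only if it
    // has not been seen before.
    uint32_t intern(const std::string& text)
    {
        std::map<std::string, uint32_t>::const_iterator it = m_offsets.find(text);
        if (it != m_offsets.end())
            return it->second;

        const uint32_t offset = uint32_t(m_bytes.size());
        m_bytes.insert(m_bytes.end(), text.begin(), text.end());
        m_bytes.push_back('\0');
        m_offsets[text] = offset;
        return offset;
    }

    const char* c_str(uint32_t offset) const { return &m_bytes[offset]; }

private:
    std::vector<char> m_bytes;                 // the raw table, written to disk as-is
    std::map<std::string, uint32_t> m_offsets; // build-time lookup only
};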

Reading
libbom can read data and traverse it without any conversion process. Basically the pipeline is something like this (there's a sketch of it right after the list):
1) reads a small header that indicates how much data is to be read, and a few statistics like how many elements vs attributes.
2) allocates some memory.
3) reads the rest of the file into said memory.
4) Profit! you can now directly query the data tree!
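
In code, the whole load path amounts to something like this (hypothetical names and header fields, no error handling):

#include <cstdio>
#include <cstdint>
#include <cstdlib>

struct BomFileHeader
{
    uint32_t magic;
    uint32_t totalSize;       // size of everything after the header
    uint32_t elementCount;
    uint32_t attributeCount;
};

// One tiny header read, one allocation, one big read, done.
// (Assumes the file was written for this platform's endianness.)
void* LoadBomFile(const char* path, BomFileHeader& header)
{
    FILE* f = std::fopen(path, "rb");
    std::fread(&header, sizeof(header), 1, f);       // 1) tiny header
    void* data = std::malloc(header.totalSize);      // 2) one allocation
    std::fread(data, 1, header.totalSize, f);        // 3) one big read
    std::fclose(f);
    return data;                                     // 4) traverse in place
}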

Writing
Now, because libbom is optimized for fast loading, the writing phase is not so fast (but not so slow either). Basically libbom contains two parallel sets of classes: the uncooked classes (which are connected by pointers) and the cooked classes (which are contiguous in memory and connected by offsets). To keep life as simple as possible, to write a file you simply set up a DOM tree using the uncooked classes and then call the 'cook' function to bake them into a single contiguous block of memory which can be written to disk.

Compression?
libbom does not contain any compression support at all... it is assumed the application will either put BOM files together into a single compressed archive or compress them individually itself. I don't believe building in compression would be of much benefit, since many applications will have their own scheme and compressing things twice does you no service.

Final Thoughts
So, there you have it: it's simple, fast, and non-intrusive. Now, I have only tested this on a handful of platforms, but thus far it appears to work on both little- and big-endian machines, and the code is so darn simple that it really shouldn't have any problem on any mature C++ compiler. Let me know if you try it out and what you think. And more importantly, let me know if you know of a better solution out there!

Friday, September 04, 2009

stb_truetype: my experiments


(Blogger scaled the image down slightly so click to enlarge)

Recently I discovered that Sean Barrett has put up on his website a nice little TrueType font rasterizer. The keyword here being "little": compared to FreeType it's microscopic, with less than 1600 lines of source code in a single file. FreeType, on the other hand, is about 6.5MB of just source code... I have worked on games with less source code! So the possibility of tossing out that much bloat is really appealing to me.

Anyways, upon finding this I quickly tossed it into my hobby project game engine to compare performance/memory/quality against FreeType, with the assumption that it is probably faster, probably thrashes memory less, and that quality is probably roughly the same. Here are my discoveries:

Performance
This was a big shocker to me... FreeType was a good 3 times faster than stb_truetype at rasterizing a font of about 350 characters at height=14 pixels: 8ms vs 26ms... that's almost double a single frame's budget! Granted, usually you rasterize fonts once during loading, and if you only have 1 or 2 fonts this is probably a non-issue, but if you have lots of fonts and are supporting Asian localizations this could potentially add seconds to your load time, and that might be a deal breaker if you are already barely squeezing into TRC load time requirements. If you are dynamically caching fonts (because you support rich text or something) this could make it quite difficult to keep performance up. Luckily, in my situation I am pre-baking my fonts in my tools chain, not at load time, so performance isn't quite a deal breaker, but I would prefer not to pre-bake if performance were good enough. One thing to note is that with stb_truetype I was able to set it up to rasterize directly into locked texture memory, whereas FreeType has no such API, so I have to rasterize and then blit into texture memory; in a way FreeType was given a small handicap already.
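
For reference, the direct-to-texture path is wired up roughly like this. This is a sketch against stb_truetype's codepoint API as I understand it; lockedTexels and texturePitch stand in for whatever your engine's texture lock hands back, and a real implementation would init the font once rather than per glyph.

#define STB_TRUETYPE_IMPLEMENTATION
#include "stb_truetype.h"

void RasterizeGlyphIntoTexture(const unsigned char* ttfData,
                               int codepoint, float pixelHeight,
                               unsigned char* lockedTexels, int texturePitch,
                               int slotX, int slotY)
{
    stbtt_fontinfo font;
    stbtt_InitFont(&font, ttfData, stbtt_GetFontOffsetForIndex(ttfData, 0));

    const float scale = stbtt_ScaleForPixelHeight(&font, pixelHeight);

    int x0, y0, x1, y1;
    stbtt_GetCodepointBitmapBox(&font, codepoint, scale, scale, &x0, &y0, &x1, &y1);

    // Rasterize straight into the locked texture; the stride argument lets
    // the glyph land at an arbitrary spot in the atlas without an
    // intermediate blit.
    unsigned char* dst = lockedTexels + slotY * texturePitch + slotX;
    stbtt_MakeCodepointBitmap(&font, dst, x1 - x0, y1 - y0, texturePitch,
                              scale, scale, codepoint);
}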

Memory
Because I rasterize glyphs one by one, the peak memory consumption for both FreeType and stb_truetype is quite low (just a few KB). But the number of allocator calls for both is quite high. Oddly enough, stb_truetype again hammers on the allocator more, making ~8,000 allocations versus FreeType's ~3,000 for the same font. Neither particularly makes me happy as a runtime solution. The good news here is that while I would not want to touch FreeType's source code with a 10 foot pole, after looking around in stb_truetype's source I believe I could stick in a relatively small fixed-size pool and make stb_truetype run without any allocations of its own. So there is actually quite a bit of hope here.

Quality
The quality of stb_truetype isn't terrible, but it's not great either. FreeType comes off much cleaner and more closely matches the way Windows rasterizes the same font. stb_truetype always comes off a bit bolder than it should be, a bit blurrier, and with a few artifacts like some characters being 50% transparent. Overall I think the quality is probably acceptable for games and tools, but I wouldn't be shocked to have a few artists complain about it looking different than it does in Windows.

Mikko Mononen brought up www.codinghorror.com/blog/archives/000884.html. I am not so sure yet whether the differences are within the same threshold or not, but I suppose it's possible the quality is no more different than OSX vs Windows (and, btw, I think fonts look better in OSX).

Final Verdict
I am not quite ready to toss out FreeType yet, but I am awfully tempted. I think all the issues with stb_truetype can be resolved without too much effort so I fully expect to be giving FreeType the boot eventually.

Two types of engineers

Type 1) Get shit done fast.
Type 2) Everything you do should be done to make the next thing get done faster.

In a race to see who can get stuff done fastest, Type 1 will win every time when measured on a single task. But if you measure over an extended period of time to see who gets the most done, Type 2 will win every time.

It is always a challenge for management to differentiate between the two types, because most of the time they measure in a way that favors Type 1, but what they really want is a Type 2.

That will conclude today's philosophy lesson.

So I might try this blogging thing again

So I decided I might try this blogging thing that all the cool kids are doing again, but this time, instead of trying to ramble on about my day to day life (which is not particularly interesting), I figured I would use this as an outlet for my ramblings on technology, algorithms and random bits of code.