Wednesday, 27 October 2010

Subnormal floats

Recently at work I had to write a floating point emulator. Although I did have some background knowledge of floating point operations, I still had to do some research on the matter and found out that, after all, it shouldn't be all that difficult.
The emulator was required to perform mathematical operations at a higher resolution. We are using our own custom hardware with 32-bit registers. I decided to expand the significand from the 24 bits of the standard float to 64 bits (more than the 53 of a double) and kept the exponent in the same range (to speed up conversion to and from the simulated float). As a side note, the exponent of a float only extends the range of representable numbers; it is the significand that determines how many numbers can be represented between two consecutive powers of two, and therefore the accuracy.
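To put a number on that side note, here is a small sketch in C. It assumes the host `float` is an IEEE 754 single with a 24-bit significand; the function names are made up for this illustration and are not part of the emulator.

```c
#include <assert.h>
#include <stdint.h>

/* How many distinct values fit between two consecutive powers of two
   (for example in [1, 2)) when the significand has the given width.
   The leading bit is always 1, so only width - 1 bits are free to vary.
   Hypothetical helper, valid for widths from 1 to 64. */
static uint64_t steps_per_power_of_two(int significand_bits)
{
    assert(significand_bits >= 1 && significand_bits <= 64);
    return (uint64_t)1 << (significand_bits - 1);
}

/* Returns 1 if adding 1.0f to 2^24 leaves the value unchanged, which is
   what a 24-bit significand implies: 2^24 + 1 needs 25 bits.
   The volatile stores force the intermediate result down to float. */
static int float_loses_one_at_2_pow_24(void)
{
    volatile float big = 16777216.0f;   /* 2^24 */
    volatile float sum = big + 1.0f;    /* rounds back to 2^24 */
    return sum == big;
}
```

With these, `steps_per_power_of_two(24)` is 8388608 while `steps_per_power_of_two(64)` is 2^63, and the exponent plays no part in either figure.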
All went well, until our test programs decided to throw subnormal floats at the emulator.

All hell broke loose.

The concept of a subnormal float is simple: when the smallest normal float is multiplied by a number less than one, the result is not flushed to zero. Instead, the exponent field is kept at 0 (a value otherwise reserved to represent zero itself), the leading bit is no longer implied, and the significand is shifted right, one position for every halving. This is called "gradual underflow". This obviously caused problems, since I stored the normally implicit leading bit explicitly and used the 0 exponent to check for zero.
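To make that concrete, here is a minimal sketch of unpacking a float into a 64-bit significand with an explicit leading bit, subnormals included. It is not the emulator itself: the `unpacked` struct and the `unpack` and `float_from_bits` names are hypothetical, an IEEE 754 single precision `float` is assumed, and infinities and NaNs are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Hypothetical unpacked form: value = sig * 2^(exp - 63).
   Bit 63 of sig is set for every nonzero number, so zero is
   detected with sig == 0 rather than with the exponent. */
typedef struct {
    int      sign;  /* 1 if negative */
    int      exp;   /* unbiased exponent of the leading bit */
    uint64_t sig;   /* explicit leading bit in bit 63 */
} unpacked;

/* Build a float from its raw bit pattern (handy for testing). */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static unpacked unpack(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    uint32_t biased = (bits >> 23) & 0xFF;   /* 8-bit exponent field */
    uint32_t frac   = bits & 0x7FFFFF;       /* 23-bit fraction field */
    unpacked r = { (int)(bits >> 31), 0, 0 };

    if (biased == 0) {
        if (frac == 0)
            return r;                        /* a true zero */
        /* Subnormal: no implicit bit, and the exponent is pinned to the
           smallest normal one. Shift left until the leading bit reaches
           bit 63, lowering the exponent once per shift. */
        r.exp = -126;
        r.sig = (uint64_t)frac << 40;
        while ((r.sig >> 63) == 0) {
            r.sig <<= 1;
            r.exp--;
        }
        return r;
    }

    /* Normal number: restore the implicit leading bit.
       (biased == 255, infinity or NaN, is not handled in this sketch.) */
    r.exp = (int)biased - 127;
    r.sig = (uint64_t)(frac | 0x800000) << 40;
    return r;
}
```

Unpacking `float_from_bits(0x00000001)`, the smallest subnormal, gives an exponent of -149 with only bit 63 of the significand set. Once normalized like this, a subnormal goes through the same arithmetic as any other value; the special case only comes back when a result has to be packed into a float again.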

The final result was not short on black magic, which was eventually brought to the light side with some assembly-level debugging. Since I failed to find resources on subnormal float emulation, I figured I might as well share the C version of my emulator, developed in my own free time (not the Company's, of course!).

(to be pasted once I figure out how to paste code samples in here)
