Monday, December 04, 2006

Damn you Bill Gates - the saga of the "big bit"

Microsoft threw my current programming project a 6/12 curve ball today.

You see, in the Intel OMF (object module format), which more or less corresponds to the .OBJ files that PC compilers and assemblers spit out, there is a thing called a SEGDEF record. The SEGDEF, not surprisingly, tells what the characteristics of a particular segment are. This includes the segment's name, class, combine, alignment characteristics, and of course the segment's length.

Segments are the stuff that makes up the code and data in a program. How they lay out and get managed is plainly a matter of supreme importance if things are to actually ummmm...work.

In the 16 bit world, segments were of necessity limited to 64K in length because that's the most you can fit in 16 bits worth of addressability.

HOWEVER - it's also quite legal, and happens frequently, that SEGDEFs in OBJs describe segments of ZERO LENGTH. Zero length segments are used for a multitude of purposes: as place holders, to force particular orderings in memory, etc.

Now there arises a problem -- the notorious "fence post" class of issues. With only 16 bits available to tell the length of a segment, the largest number you can have in 16 bits is 65535 (64K-1). That gives 65536 possible values, but there are 65537 legal lengths (0 through 65536). With zero length segments being legal, you can't do something like say a length of zero means a 1-byte segment and 65535 means a full 64K segment.

Intel solved this issue with a thing called the "big bit". Part of the SEGDEF record includes a bit that says if a segment is a full 65536 bytes long. If the 16 bit length is set to zero and the "big bit" is turned on, then the length is really 65536, not zero.
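To make the rule concrete, here is a minimal sketch in Python of how a reader of 16-bit SEGDEF records might work out the real segment length. The function name and the 0x02 mask are my assumptions for illustration (the mask is where I understand the B bit to sit in the SEGDEF attribute byte); this is not code from any actual tool.

```python
# Assumed position of the "big bit" (B) in the SEGDEF attribute byte.
BIG_BIT = 0x02


def segment_length(attr_byte: int, length_field: int) -> int:
    """Return the real length in bytes of a segment described by a 16-bit SEGDEF."""
    if not 0 <= length_field <= 0xFFFF:
        raise ValueError("length field must fit in 16 bits")
    if attr_byte & BIG_BIT:
        # The big bit only makes sense together with a zero length field.
        if length_field != 0:
            raise ValueError("big bit set but length field is nonzero")
        return 0x10000  # a full 64K segment
    # Without the big bit, the field is the length, and zero really means zero.
    return length_field
```

With this, a zero length field decodes to 0 bytes when the bit is clear and to 65536 bytes when it is set, so all 65537 legal lengths are covered.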

Architecturally, this "solved" the problem. Zero length segments are still OK, and you can have full 64K segments as well by using the "big bit".

Alas, along comes the Microsoft Macro Assembler. The versions of MASM I've tested, from 1981 through roughly 1987, don't set the "big bit" when they construct a full 64K length segment. They do, however, set the length field of the SEGDEF that describes the segment to zero.

This bug in MASM caused the OBJ validation phase of my gizmo to choke when it noticed that code/data was being emitted into an (apparently, but incorrectly) zero length segment.

How a bug as egregious as this one somehow managed to persist over several versions of MASM spanning around 6 years is a real mystery. I'll have to look into that aspect further. I'm going to hazard a guess and say it was maybe dealt with in LINK, if it was dealt with at all.

To their credit, Microsoft eventually got this fixed. MASM 6.11 doesn't exhibit the problem and sets the "big bit" correctly.

My gizmo now includes a new command line switch: -Fm. It will "fix" the broken MASM "big bit" when it encounters a zero length segment that subsequently has code/data emitted into it.
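Roughly, the check and the workaround look like the following Python sketch. Everything here (the Segment class, validate_emit, the fix_masm flag standing in for -Fm) is a hypothetical illustration of the idea, not the gizmo's actual code.

```python
from dataclasses import dataclass

FULL_64K = 0x10000


@dataclass
class Segment:
    """Hypothetical in-memory view of one SEGDEF."""
    name: str
    length_field: int  # 16-bit length as stored in the SEGDEF
    big: bool          # state of the "big bit"

    @property
    def length(self) -> int:
        return FULL_64K if self.big else self.length_field


def validate_emit(seg: Segment, offset: int, size: int, fix_masm: bool = False) -> None:
    """Check that size bytes emitted at offset fit inside the segment.

    With fix_masm set (standing in for -Fm), a segment that claims zero
    length without the big bit is assumed to be a full 64K segment from a
    broken MASM, and its big bit is turned on instead of failing.
    """
    end = offset + size
    if end <= seg.length:
        return
    if fix_masm and seg.length_field == 0 and not seg.big and end <= FULL_64K:
        seg.big = True  # repair: treat the segment as a full 64K
        return
    raise ValueError(f"data emitted past end of segment {seg.name}")
```

Without the flag the validator still chokes on these OBJs, which seems like the right default: a zero length segment with data in it is an error unless you have told the tool to expect the MASM quirk.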

I could see this in the first version of MASM. Lots of the 1981 vintage tools and apps were really quite wretched and bug ridden (not just Microsoft's either). But 6+ years? Oy!
