Saturday, December 02, 2006

More fun with Ryan-McFarland FORTRAN compilers

I'm getting deeper now into the stuff this compiler (and a newer version of it) emits. You can tell a lot about the internal organization of a compiler from the nature of the OBJs it spits out.

The RM Fortran compiler was apparently designed to run in extremely modest amounts of memory. Packaging for the earliest version I have indicates it would run with about 128K of free memory -- not 128M -- 128K. That's less memory than the L2 caches on most PCs that have been made in the past decade.

Packaging claims aside, the OBJs it spits out tend to confirm those modest claims. As any compiler is compiling code, it's going to encounter situations (many in Fortran) where a JMP or conditional JMP to a forward memory location is needed. Usually, a compiler would calculate the addresses of these jumps, patch them internally as it goes along, and emit the blob of compiled code in one shot.
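As a point of comparison, here is a minimal Python sketch of that conventional approach: the whole code blob stays in memory, forward jumps get a placeholder byte, and the placeholders are patched before anything is written out. The two-opcode instruction set and the name `assemble_in_memory` are made up for illustration; they are not real x86 encodings or anything taken from an actual compiler.

```python
NOP, JMP = 0x00, 0x01  # toy opcodes (assumption), not real x86 encodings

def assemble_in_memory(program):
    """program is a list of ("nop",), ("jmp", label) or ("label", name) tuples."""
    code = bytearray()
    labels = {}    # label name -> offset in code
    pending = []   # (offset of placeholder byte, label name)
    for item in program:
        if item[0] == "label":
            labels[item[1]] = len(code)
        elif item[0] == "jmp":
            code.append(JMP)
            pending.append((len(code), item[1]))
            code.append(0)  # placeholder for the jump offset
        else:
            code.append(NOP)
    # Every label is known now, so patch each placeholder in memory.
    # Offsets are relative to the byte following the operand.
    for pos, name in pending:
        code[pos] = (labels[name] - (pos + 1)) & 0xFF
    return bytes(code)

program = [("jmp", "done"), ("nop",), ("nop",), ("label", "done"), ("nop",)]
image = assemble_in_memory(program)
```

The cost is that `code` and `pending` both have to stay resident until the last label has been seen, which is exactly the memory a 128K compiler can't spare.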

Not so with the RM Fortran compiler. It spits out code with essentially blank placeholders where the JMP offsets would go. A blob of code is written to the OBJ that doesn't work because it's not complete yet. THEN, after that blob is written out, it emits (many) subsequent records to adjust the JMP offsets in what it just emitted. Essentially, RM Fortran is punting some of the traditional compilation process off onto the linker.

So for Fortran code files that are large, the compiler doesn't really need to swallow them whole -- it takes them a small bite at a time, writes out the code for each small bite, and punts the rest to LINK.
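The bite-at-a-time scheme can be sketched in Python roughly as follows. This is a toy model under stated assumptions, not the real OBJ record layout: the record tuples, the two-opcode instruction set, and the names `compile_to_records` and `link` are all hypothetical, and only forward jumps are handled. The compiler side keeps nothing but a small code buffer and a list of unresolved jump positions; the linker side applies the fixup records to code that was written out earlier.

```python
NOP, JMP = 0x00, 0x01  # toy opcodes (assumption), not real x86 encodings

def compile_to_records(program, chunk_size=4):
    """Yield ("code", start, bytes) and ("fixup", position, value) records."""
    buf = bytearray()   # the only code held in memory at any moment
    start = 0           # image offset of buf[0]
    pending = {}        # label -> operand positions still waiting for it
    for item in program:
        here = start + len(buf)
        if item[0] == "label":
            if buf:
                # Flush first so every fixup follows the code it patches.
                yield ("code", start, bytes(buf))
                start, buf = here, bytearray()
            for pos in pending.pop(item[1], []):
                yield ("fixup", pos, (here - (pos + 1)) & 0xFF)
        elif item[0] == "jmp":
            buf += bytes([JMP, 0])  # 0 is the blank placeholder
            pending.setdefault(item[1], []).append(here + 1)
        else:
            buf.append(NOP)
        if len(buf) >= chunk_size:
            # Write the incomplete blob out early and recycle the buffer.
            yield ("code", start, bytes(buf))
            start, buf = start + len(buf), bytearray()
    if buf:
        yield ("code", start, bytes(buf))

def link(records):
    """Glue the code fragments together and apply the fixups."""
    image = bytearray()
    for kind, pos, payload in records:
        if kind == "code":
            image += payload
        else:
            image[pos] = payload
    return bytes(image)

program = [("jmp", "done"), ("nop",), ("nop",), ("label", "done"), ("nop",)]
records = list(compile_to_records(program, chunk_size=2))
image = link(records)
```

With `chunk_size=2` the jump is written out with a zero offset two records before the fixup that patches it, so the linker ends up doing the work the compiler skipped.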

Obviously, RM was in a hurry to recycle the memory taken up by the largish (in relative terms) blobs of code the compiler was working on at any given point, so it kept only the jump offset information. That lets the compiler puke out each code blob early.

It's an interesting concept born of the era when programmers didn't have hard disks and only had machines with maybe 256K and a couple of floppy disks. In terms of compilation speed there's no real penalty to this approach. The quality of the generated code is going to suffer some, the OBJs will bloat up, and link times will go up some, with the linker having to glue together slews of code fragments and apply fixups to them because the compiler didn't bother to do so. All in all, it was a reasonable approach for the 256K PC era (I remember paying hundreds of dollars for a meg of memory). If people were doing mostly floating point number crunching with this product and had 8087s or 287s, the necessarily lesser quality of the non-NPX code wasn't going to matter all that much anyway, since those NPXs were relatively slow and integer code was usually waiting on the NPX.

3 comments:

Francis W. Porretto said...

This is a variation on the "backpatching" strategy that was developed to reduce the processing demanded by the great number of forward references in large modular programs. RMS did it well; their FORTRAN and COBOL compilers of that vintage were state of the art.

Backpatching originated, believe it or not, in the design of assemblers! I have an old CP/M 8080/Z-80 assembler that implements the technique. Today it might seem "old hat," but to one whose career in software is about to enter its fifth decade, it's endlessly fascinating.

The Merry Widow said...

Now I feel bad. My maiden name is McFarland, great...

tmw

Nikola said...

Beards are the fastest growing hairs on the human body. If the average man never trimmed his beard, it would grow to nearly 30 feet long in his lifetime.