The Case Against 64-bit Builds

I wrote a version of this some time ago to argue against what I considered an ill-considered push, within the company I worked for, to 'modernize' the software on a particular high-availability, high-reliability product line. The push seemed not to take into account the considerable costs and side effects of such a move on the somewhat monolithic products in question. The argument doesn't necessarily apply well to software in general, but it should at least be considered wherever you have a choice of build size.

Leaving aside esoteric number-crunching applications, where native 64-bit integers are required, or at least advantageous, the main reason to go to a '64-bit' build is to gain additional address space. So-called 32-bit builds offer a maximum of 4 GB of address space, and a given platform often delivers less than that. (For performance reasons PPC and Intel Linux platforms often give processes only 3 GB of address space, and MIPS Linux platforms only 2 GB.) It isn't all that hard anymore to craft a large application that starts bumping its head on the ceiling in a 32-bit process. Systems these days often have more than 4 GB of RAM, so a single 32-bit process cannot use all of it even if it wanted to.

64-bit processes offer the advantage of effectively unlimited address space, more than is physically possible anyway, and so give you the advantage of being able to use all the RAM that is available. (The exact same argument that was used for 32-bit processes not all that many years ago.)

Paradoxically, this same 'advantage' can also be a pretty substantial disadvantage! To wit:

A process is able to use all the RAM that is available. What if it's not supposed to? Bugs happen. Quotas set with ulimit can prevent this, but it takes more effort to make sure that each limit is reasonable, and is kept up to date as the code evolves, or else the 'cure' will be worse than the disease. Usually all processes in a 64-bit environment end up being 64-bit processes, and so all of them should have quotas if ultimate system reliability is to be ensured. (This isn't really new.)
You usually don't want to have a single giant process with a bazillion threads in it. The threads are all vulnerable to each other, and the larger the pie gets the more likely someone is to drop it. A large 32-bit multifunction (multithreaded?) process that is straining at the seams should be chopped up, if possible, rather than having its address space made larger.
32-bit processes that are converted to 64-bit processes are immediately larger than they were, even with no functional changes. This is because every pointer, and on LP64 platforms such as 64-bit Linux every long integer, is twice as large as it was before (a plain int usually stays at 32 bits), and structure padding tends to grow to match. With a fixed amount of RAM in the target system that means you're using what you have less efficiently than before, and unless you have a task that itself must access more than 4 GB (3 GB? 2 GB?) of resources there's no inherent value to the move. Such tasks are actually fairly rare, statistically.
This less-semantically-dense code and data not only effectively reduces available RAM, it reduces the effective cache size of the CPU, and the effective speed of the CPU, since the same work moves more bytes but the hardware's fetch rate is unchanged. It also increases the size of a build's on-disk (in-flash) footprint. (Software installation is slower, and may even be prohibited by a lack of space.) These effects are all bad.
Core files can be large. Very large. That makes them difficult and slow to handle, and they're usually full of a lot of uninteresting information too. Huge core files are more likely to be truncated due to a lack of system resources, both on-DUT and off, and truncated core files are usually useless.
Though it is a bad idea, older code often contains a lot of casts, many of which assume that a pointer can be jammed into a U32. Some of these are insidious, and can take a lot of time to shake out of what is otherwise perfectly functional and reliable code. More work, for zero functional gain.
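Upgrades from 32-bit to 64-bit builds may cause problems with saved data structures. Anything the 32-bit build wrote to disk or flash as a raw memory image may no longer match the structure layout that the 64-bit build expects.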
In-service upgrades (from 32-bit to 64-bit) may not even be possible.
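
The size growth, pointer-cast, and saved-data items above can be made concrete with a small C sketch. It is only an illustration under assumptions: the names node, saved_record, survives_u32, survives_uintptr, and report_sizes are hypothetical and do not come from any product discussed here, and the sizes quoted in the comments assume the usual ILP32 and LP64 Linux ABIs.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t, uint64_t, uintptr_t */
#include <stdio.h>

/* Hypothetical in-memory node, written the way older code often is.
   Typically 12 bytes in an ILP32 build and 24 bytes in an LP64 build:
   the pointer and the long double in size, and padding is added. */
struct node {
    struct node *next;
    long         count;
    int          flags;
};

/* Hypothetical saved record that uses only fixed-width fields, so a
   32-bit and a 64-bit build agree on its size. (Byte order is a
   separate concern that this sketch ignores.) */
struct saved_record {
    uint32_t id;
    uint32_t flags;
    uint64_t offset;
};
static_assert(sizeof(struct saved_record) == 16,
              "saved layout must not change between builds");

/* The legacy idiom: squeeze a pointer through a 32-bit integer.
   Returns 1 if the pointer survives the round trip, 0 if bits were lost.
   In a 32-bit build this always succeeds; in a 64-bit build it depends
   on where the object happens to live, which is what makes it insidious. */
int survives_u32(const void *p)
{
    uint32_t narrow = (uint32_t)(uintptr_t)p;   /* may truncate */
    return (const void *)(uintptr_t)narrow == p;
}

/* The portable idiom: uintptr_t is wide enough to hold an object pointer. */
int survives_uintptr(const void *p)
{
    uintptr_t wide = (uintptr_t)p;
    return (const void *)wide == p;
}

/* Print the sizes that differ (or do not) between build sizes. */
void report_sizes(void)
{
    printf("pointer: %zu, long: %zu, int: %zu\n",
           sizeof(void *), sizeof(long), sizeof(int));
    printf("struct node: %zu, struct saved_record: %zu\n",
           sizeof(struct node), sizeof(struct saved_record));
}
```

Calling report_sizes() from a main function, and building once as 32-bit and once as 64-bit (for example with GCC's -m32 and -m64 options on an x86 toolchain that supports both), should show struct node growing while struct saved_record stays put.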

IMHO, the Engineering effort to convert to a 64-bit build might be better spent on slicing the product into multiple 32-bit pieces. That would offer the following real, and potential, advantages:

Separate processes could be implemented as separate programs, which opens up the possibility that these could be individually replaced. (Incremental upgrades? In-service upgrades? Individual subsystem restarts? Such features would require additional work, and are not strictly necessary, but do become possible once the surgery has been done.)
Malfunctioning subsystems don't necessarily take down other features. If one crashed the rest of the system could continue on undisturbed. (At least this would offer the choice to continue running at reduced functionality until a maintenance window opened up, if you wanted to implement this.)
Memory/cache utilization remains as efficient as before, though there would be some per-process penalties that you didn't have before. The system would provide the most perceived functionality this way.
Core files can be smaller! If something crashes you'd only get the core file from the crashing process, and not all the data from every other unrelated process that was part of your product. You could afford to keep more core files than before, and on- and off-DUT resource loads would be minimized.
It becomes unnecessary to rewrite already-functional support functions that assume a 32-bit build, with the attendant risk of collateral damage.
By considering the necessary interactions among major components, and designing interfaces to accomplish them rather than directly reaching out in a large address space, reliability (and test-ability) is ultimately enhanced. Complexity is the bane of reliability, and 'wide' interfaces like shared libraries and direct access to data structures are ultimately more complex, due to the lack of constraints.
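
The isolation argument in the list above can be sketched in C with plain POSIX fork and waitpid. This is a minimal illustration, assuming a POSIX system; run_isolated, healthy_subsystem, crashing_subsystem, and the RUN_FAILED sentinel are hypothetical names, and a real product would more likely exec separate programs and restart them from a supervisor.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>      /* signal numbers such as SIGABRT */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RUN_FAILED (-1000)   /* hypothetical sentinel: fork or wait failed */

typedef int (*subsystem_fn)(void);

/* Run one subsystem in its own process, so that a crash there cannot
   corrupt or take down the caller.
   Returns the subsystem's exit status (0..255) if it exited normally,
   minus the signal number if a signal killed it, or RUN_FAILED. */
int run_isolated(subsystem_fn fn)
{
    assert(fn != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return RUN_FAILED;
    if (pid == 0)
        _exit(fn());             /* child: run the subsystem, then leave */

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)      /* retry only if a signal interrupted us */
            return RUN_FAILED;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return RUN_FAILED;
}

/* Two stand-in subsystems: one behaves, one dies the way a bug would. */
int healthy_subsystem(void)  { return 0; }
int crashing_subsystem(void) { abort(); }
```

A caller that runs crashing_subsystem through run_isolated gets back a negative value naming the signal and can carry on, or restart just that piece. Any core file produced belongs to the small child process, not to the whole product.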
