When we talk about C source code portability, we mean writing code so that it can be easily moved (ported) to another environment and, after recompiling and relinking, behave the same way it did originally. Ideally the port requires no changes to the source code itself; in practice, it should require only minimal changes.
What do we mean by “another environment”?
- Moving to a different operating system on the same hardware.
- Moving to a different version of the same operating system on the same hardware.
- Moving to a different variant/flavor/distribution of an operating system.
- Moving to a different CPU hardware architecture.
- Moving to a different C compiler on the same hardware and operating system.
- Moving to a different version of the same C compiler on the same hardware and operating system. (Yes, code can break between versions of a compiler offered by the same compiler vendor.)
Why do we want portable code?
The act of porting the code takes time and effort (and therefore has a cost), in terms of understanding exactly what has to change, making the change(s), and testing the modified code. If we can reduce the number of required changes to zero, or to a very small number of isolated changes, we can reduce the porting effort and project cost.
How do we strive to achieve portability?
Here is a partial list to give you some idea of what to worry about:
- Don’t assume the size of any data type. Data type sizes can and do vary from one environment to another. For example, an int might be 16 bits, 32 bits, 64 bits, or more. It might vary between compilers for the exact same hardware. It might change from one compiler version to another. The size of an int may or may not have any relationship to the natural word size of the CPU hardware.
- Don’t assume that a pointer (to anything) is the same size as an int or the same size as any other data type. Pointers are sometimes the same size as an int, but are often a different size from an int. For example, in many popular compilers, building for a 32-bit target gives you a 32-bit int and a 32-bit pointer, but building the same code with the same compiler for a 64-bit target gives you a 32-bit int and a 64-bit pointer.
- Don’t make calls directly to the operating system. Instead, use standard library functions.
- Don’t make assumptions about the underlying hardware, speed, memory size, memory map, I/O map, etc.
- Don’t assume a specific endianness (byte ordering) of the target system. Not only can endianness vary from one CPU architecture to another, but some CPU architectures allow switching between big and little endian.
- Avoid the use of bit fields in structures, if you’re relying on a specific packing/ordering of the bit fields. Handling of bit fields varies between implementations.
- Don’t assume that structure packing/padding will be the same in all environments. Packing and padding behaviors can and do vary between compiler implementations, even when targeting the same CPU hardware.
- Don’t embed assembly language in the source code. By definition, the code will break if you try to port it to a different CPU architecture.
- Don’t use compiler intrinsics or compiler-specific keywords and pragmas. Not every compiler implementation will have these features, so for maximum portability, avoid them.
- Avoid the use of newer language features that have not been widely adopted. Some people are really taken aback by this rule, but it has a very practical purpose. For example, variable-length arrays have been part of the C standard since C99. But many compiler implementations have never supported the feature, so porting code that uses this feature becomes a problem. (The C11 standard has demoted this language feature to optional, so it’s even more likely that many compilers will never implement the feature.)
- Avoid all other undefined behavior and implementation-defined behavior. Your code might appear to work in one environment, and fall apart as soon as you try to port it to another environment. This requires some common sense and a knowledge of what the C standard leaves undefined or implementation-defined. Many compilers produce helpful warnings when code ventures into these areas, but many don’t say anything at all.
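Several of the items above (data type sizes, endianness, structure padding) tend to show up together when a program writes binary data to a file or a network. Here is a minimal sketch of one way to sidestep all three, assuming 8-bit bytes and a compiler that provides the exact-width types in <stdint.h> (the standard makes them optional on unusual targets). The function names put_u32_be and get_u32_be are hypothetical, chosen only for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint8_t, uint32_t */

/* State the one assumption this sketch does make, so that a port to an
   unusual target fails at compile time instead of misbehaving at run time. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Hypothetical helper: store v in buf as 4 bytes, most significant first.
   Shifts operate on the value, not on its layout in memory, so the bytes
   come out in the same order on big- and little-endian hosts. */
void put_u32_be(uint8_t buf[4], uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
}

/* Hypothetical helper: rebuild the value from 4 bytes written by
   put_u32_be. Each byte is widened before shifting so the shift never
   happens in a type narrower than 32 bits. */
uint32_t get_u32_be(const uint8_t buf[4])
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because the data is moved one byte at a time through a plain byte array, nothing here depends on sizeof(int), on the host byte order, or on how the compiler pads a structure.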
If you must violate these rules – and there are sometimes excellent reasons to do so – it’s best to isolate that non-portable code into a separate module, so that the porting work is isolated and minimized.
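As a rough sketch of what that isolation can look like, the function below hides a compiler intrinsic behind a single wrapper and falls back to plain C when no known intrinsic is available. The name port_bswap32 is hypothetical. __builtin_bswap32 (GCC and Clang) and _byteswap_ulong (Microsoft's compiler) are real intrinsics, but the set of compilers checked here is an assumption, and a real project would extend it as needed.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <stdlib.h>   /* _byteswap_ulong */
#endif

/* Hypothetical wrapper: reverse the byte order of a 32-bit value.
   All compiler-specific code lives inside this one function, so a port
   to a new compiler means touching this file and nothing else. */
uint32_t port_bswap32(uint32_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(v);                 /* GCC / Clang intrinsic */
#elif defined(_MSC_VER)
    return (uint32_t)_byteswap_ulong((unsigned long)v);  /* MSVC intrinsic */
#else
    /* Portable fallback: standard C only, possibly slower. */
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
#endif
}
```

The rest of the program calls port_bswap32 and never mentions a compiler by name. The fallback branch keeps the code building on compilers the author has never seen, at the possible cost of some speed.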
Even if you don’t ever intend to port your code beyond its initial target, it’s a good idea to keep portability in mind in all projects – you just never know where your code is going to end up.
Portability vs Efficiency – An Ancient Struggle
Portability is not a new concept, nor is it unique to the C programming language. In a source code portability experiment performed in 1960 between two competing COBOL compiler vendors, it was found that only a minimum number of modifications was required, due to slight differences in the two compiler implementations. One member of the team said that COBOL does not simultaneously preserve efficiency and compatibility across machines. The same could be said today, nearly 60 years later, for many general-purpose high-level languages, including C. If you can’t use compiler intrinsics and other non-portable local performance enhancers in your source code, you may be trading efficiency for portability.