Linux binary compatibility explained at 5 levels of difficulty

What would happen if you copy an application compiled on one system and try to run it on another?

Ruvinda Dhambarage
15 min read · Apr 17, 2024
Generated with DALL-E 3

The subtitle of this blog is one of my favorite interview questions. On the surface it is a pretty easy question to answer, but as soon as you start peeling back the layers and coloring outside the lines, this rabbit hole of a subject will test even the most experienced Linux developers.

A real-world example of where this knowledge helps is troubleshooting issues in CI/CD pipelines. There, it’s not uncommon to have to build and run on heterogeneous environments, like when you build using developer Docker images and have to deploy on serverless AWS Lambda instances. I personally find myself having to fall back on this blog’s fundamentals to resolve such issues in my day job.

Inspired by Wired’s “5 Levels” series of explainers, I too will attempt to explain this topic at five levels of increasing complexity.

Level 1: The Linux n00b

A “native” executable on a computer is a file that consists of machine instructions (assembly instructions) that execute on a CPU to perform a task. Therefore, at the very minimum, the binary’s ISA (instruction set architecture) needs to be compatible with the CPU for it to be able to run. That is to say, you can’t run an ARM binary on an x86-64 machine, at least not out of the box. You can use a machine emulator (e.g. QEMU) to translate between ISAs, but that will impose a significant overhead. macOS famously does this very well with Rosetta 2, which lets you run apps compiled for Intel Macs on an Apple Silicon machine. But I digress.

This is what it looks like if you try to run an ARM bin on an x86 machine:

# Run an ARM binary on an x86-64 PC
> ./test.arm
bash: ./test.arm: cannot execute binary file: Exec format error

The error is mostly self explanatory and it will be your clue that there is an incompatibility with the binary’s ISA. To get more information use the file command.

# Using the file command
> file test.arm
test.arm: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), \
dynamically linked, interpreter /lib/ld-linux-armhf.so.3, \
for GNU/Linux 5.10.185, \
with debug_info, not stripped

Here we can see the issue; it is indeed an ARM binary!

If you look closely at the output, you will also see two other dependencies listed. One is the “interpreter”, a.k.a. the “dynamic linker” or “dynamic loader” (I’ll also loosely refer to it as the “C runtime”). It is the runtime counterpart of the compile-time linker. Its job is to find and load the required libraries at runtime. Here it is /lib/ld-linux-armhf.so.3. The second dependency is the minimum kernel version, which is GNU/Linux 5.10.185 here. We can remove the interpreter dependency and significantly improve the portability of our binary by “statically linking” it. Then there would be no runtime library requirement and you can run it on any system with a compatible ISA and Linux kernel.

# Statically linked binaries are more portable
> file test.arm.static
test.arm.static: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), \
statically linked, \
for GNU/Linux 5.10.185, \
with debug_info, not stripped

Pro tip: keep the file command handy in your troubleshooting tool set. It is a versatile tool that can save a lot of time.
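
For the curious, the ISA that file reports comes straight out of the first few bytes of the ELF header. Here is a minimal Python sketch of that lookup. It is my own illustration, not how file itself is implemented; the describe_elf_header name and the small ELF_MACHINES table are assumptions made up for this example, while the byte offsets and e_machine values follow the ELF specification.

```python
import struct

# A handful of e_machine values from the ELF specification (not exhaustive).
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64", 243: "RISC-V"}


def describe_elf_header(header: bytes) -> dict:
    """Decode word size, byte order and ISA from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])       # EI_CLASS
    order = {1: "<", 2: ">"}.get(header[5])    # EI_DATA
    if bits is None or order is None:
        raise ValueError("unrecognised ELF class or byte order")
    (machine,) = struct.unpack_from(order + "H", header, 18)  # e_machine
    return {
        "bits": bits,
        "endian": "LSB" if order == "<" else "MSB",
        "machine": ELF_MACHINES.get(machine, f"unknown ({machine})"),
    }
```

Feeding it the first 20 bytes of a binary, e.g. describe_elf_header(open("test.arm", "rb").read(20)), gives you something you can compare against the host's platform.machine() before you even try to run the file.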

Level 2: The enthusiastic Linux user

Linux executables are formatted according to the “ELF” binary format. This standard defines a structure for the binaries so that the “Linux OS” can bootstrap and run them. The “Linux OS” in this case is a catch-all for the low-level system interfaces, which include the C runtime, its dependencies and the kernel. The CPU ISA along with these system interfaces define the “target triple”. I explain what target triples are in more detail here: @ruvi-d/a-master-guide-to-linux-cross-compiling

The target triple must match between any two machines for a binary to have any chance of running natively. Otherwise you’d need something akin to a hypervisor OS to bridge the gap.

To demonstrate this, I will cross compile a simple test app using a musl libc toolchain. musl is a lib C implementation that is a lightweight alternative to GNU glibc. Let’s see what happens when you try to run it on my laptop, which has glibc.

# Try to run a musl binary on a glibc runtime
> ./test.musl
bash: no such file or directory: ./test.musl

# Details of the incompatibility
> file test.musl
test.musl: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), \
dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, not stripped

Here we get a very misleading “No such file or directory” error. What it means by “no such file” is that the musl interpreter specified in the binary (/lib/ld-musl-x86_64.so.1) is not available; not that the ./test.musl file itself is missing or that it doesn’t have the correct permissions. You’d get the same error, for the same reason, if you try to run 32-bit bins on a 64-bit machine that doesn’t have the 32-bit interpreter installed.
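
To make the “which file is actually missing?” question concrete, here is a rough Python sketch that reads the interpreter path out of the PT_INTERP program header and checks whether it exists. The function names are hypothetical helpers I made up for this example, and it is a simplified illustration rather than a full ELF parser; the offsets follow the ELF32/ELF64 header layouts.

```python
import os
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def elf_interpreter(image: bytes):
    """Return the interpreter path requested by an ELF image, or None if static."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = image[4] == 2               # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    order = "<" if image[5] == 1 else ">"  # EI_DATA: 1 = little endian
    if is_64bit:
        (phoff,) = struct.unpack_from(order + "Q", image, 32)
        phentsize, phnum = struct.unpack_from(order + "HH", image, 54)
    else:
        (phoff,) = struct.unpack_from(order + "I", image, 28)
        phentsize, phnum = struct.unpack_from(order + "HH", image, 42)
    for index in range(phnum):
        entry = phoff + index * phentsize
        (p_type,) = struct.unpack_from(order + "I", image, entry)
        if p_type != PT_INTERP:
            continue
        if is_64bit:
            (p_offset,) = struct.unpack_from(order + "Q", image, entry + 8)
            (p_filesz,) = struct.unpack_from(order + "Q", image, entry + 32)
        else:
            (p_offset,) = struct.unpack_from(order + "I", image, entry + 4)
            (p_filesz,) = struct.unpack_from(order + "I", image, entry + 16)
        return image[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None


def explain_no_such_file(path: str) -> str:
    """Say which file is really missing when exec reports 'No such file'."""
    if not os.path.exists(path):
        return f"{path} itself is missing"
    with open(path, "rb") as handle:
        interpreter = elf_interpreter(handle.read())
    if interpreter is None:
        return f"{path} is statically linked and needs no interpreter"
    if not os.path.exists(interpreter):
        return f"{path} exists, but its interpreter {interpreter} is missing"
    return f"{path} and its interpreter {interpreter} are both present"
```

Running explain_no_such_file("./test.musl") on a glibc-only machine should point the finger at the interpreter rather than at the binary, which is exactly the distinction the shell's error message hides.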

Pro tip: Keep an eye out for this error message in the wild. It can trip you up if you are not paying attention, and you can lose hours on a wild goose chase trying to figure out why a particular file is missing.

Next we need to take a minute to understand how dynamic libraries work. Dynamic libraries are those .so files that you see in system folders. Their primary purpose is code reuse: common functionality used by multiple applications is split out into one shared library, which reduces file sizes. This also allows you to update or fix the common code in the library without having to recompile all the applications that use it.

The information regarding which libraries a particular binary requires is present in the ELF binary data structures. The objdump and readelf commands are some of the more commonly used tools to extract this information off of a binary.

# Using objdump to find the required libraries
> objdump -p test.bin | grep NEEDED
NEEDED libpthread.so.0
NEEDED libc.so.6

# Using readelf to do the same
> readelf -d test.bin | grep 'Shared library'
0x00000001 (NEEDED) Shared library: [libpthread.so.0]
0x00000001 (NEEDED) Shared library: [libc.so.6]

In the example above, we can see that this particular test app needs both the lib C (glibc) and pthread libraries at runtime. The ldd command is a more powerful alternative to objdump/readelf. In addition to simply listing the dependencies, it also tells you whether they can be found and, if found, lists their file paths as well.

# If all dependency libs can be found
> ldd test.bin
libpthread.so.0 => /lib/libpthread.so.0 (0x8badf00d)
libc.so.6 => /lib/libc.so.6 (0x8badf00d)
ld-linux-armhf.so.3 => /lib/ld-linux-armhf.so.3 (0x8badf00d)

# e.g. if pthread is missing
> ldd test.bin
libpthread.so.0 => not found
libc.so.6 => /lib/libc.so.6 (0x8badf00d)
ld-linux-armhf.so.3 => /lib/ld-linux-armhf.so.3 (0x8badf00d)

Pro tip: the ldd command is yet another powerful tool that can help to quickly troubleshoot issues related to missing runtime dependencies.
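
If you want to automate this check, say as a sanity step in a CI pipeline, a small sketch like the one below can pull the missing libraries out of ldd's output. The missing_libraries helper is a name I made up for this example, and it assumes the usual “name => not found” line format that ldd prints.

```python
def missing_libraries(ldd_output: str) -> list:
    """Return the names of dependencies that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved entries look like "name => /path (address)".
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing
```

You could feed it the output of subprocess.run(["ldd", path], capture_output=True, text=True).stdout and fail the build if the returned list is not empty.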

Bonus trivia: what’s the difference between C, C++, Go and Rust “hello world” applications in terms of their runtime dependencies?

# C lang
> ldd hello.c.bin
linux-vdso.so.1 (0x00007ffca8b20000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000706c49c00000)
/lib64/ld-linux-x86-64.so.2 (0x0000706c49ec9000)

# C++
> ldd hello.cpp.bin
linux-vdso.so.1 (0x00007fffcf4e0000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x0000716b98400000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000716b98000000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000716b98685000)
/lib64/ld-linux-x86-64.so.2 (0x0000716b9878f000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000716b98665000)

# Go lang
> ldd hello.go.bin
not a dynamic executable
> file hello.go.bin
hello.go.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), \
statically linked, ..

# Rust
> ldd hello.rs.bin
linux-vdso.so.1 (0x00007ffc5238d000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007295eda94000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007295ed800000)
/lib64/ld-linux-x86-64.so.2 (0x00007295edb2b000)

We can see that Go bins are the most portable in this lineup. They are statically linked with no runtime dependencies. C and Rust are the next most portable; both require only lib C and the “C runtime/interpreter” to function (Rust additionally pulls in libgcc_s). The C++ app is the least portable, with a notable dependency on the C++ standard library. Portability is one of the reasons that Go is such a good fit for cloud native development; it can be easily deployed on lightweight containers.

Level 3: The plucky Linux developer

So far we figured out that we need to match the target triple and then we learnt how to find the required runtime libraries. But finding the “correct” versions of the required libraries is not straightforward thanks to library versioning.

Library versioning is a must-have because software evolves over time. New features get added, bugs get squashed and security holes get plugged. This means that sometimes libraries need to change in ways that break their existing behavior, e.g. older APIs that are insecure could be deprecated in favor of something more secure. This necessitates some form of version control for libraries so that existing apps continue to work correctly and new applications can leverage new features.

In Linux, we get two forms of library version control.

  1. Soname-based versioning
  2. Symbol-based versioning

Soname versioning

Soname versions are the numerical suffixes that you see in library file names, e.g. for the C++ standard lib /lib/libstdc++.so.6.0.28, its version is 6.0.28. That’s a major, minor and patch number. The rule of thumb is that a major number change corresponds to breaking changes that will require apps to be recompiled. Minor version bumps add new features but do not break existing apps, so apps do not need to be recompiled. Patch version bumps only include updates that are completely transparent to app functionality, like bug fixes. During compilation, apps are usually linked against only the major number (e.g. against libstdc++.so.6), so that they break if the major number changes but continue to work across minor and patch updates. Typically you’d find a bunch of symlinks associated with a library to support this form of versioning.

# The C++ std lib and its symlinks
> ls -l /lib/libstdc++*
lrwxrwxrwx 1 ruvi ruvi 19 Apr 2 07:44 libstdc++.so -> libstdc++.so.6.0.28
lrwxrwxrwx 1 ruvi ruvi 19 Apr 2 07:44 libstdc++.so.6 -> libstdc++.so.6.0.28
-rwxr-xr-x 1 ruvi ruvi 11332604 Apr 2 07:44 libstdc++.so.6.0.28

If you scroll back up to our ldd command examples you will see that all the dependencies there had listed the major number.
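
The naming convention behind those symlinks can be sketched in a few lines of Python. This is only an illustration of the conventional major-number scheme described above; the helper names are mine, and libraries that pick their own soname policy will not follow it.

```python
def library_names(real_name: str) -> dict:
    """Derive the conventional symlink names from a fully versioned file name."""
    stem, separator, version = real_name.partition(".so.")
    if not separator or not version:
        raise ValueError(f"{real_name!r} has no version suffix")
    major = version.split(".")[0]
    return {
        "linker_name": stem + ".so",       # what the compile-time linker looks for
        "soname": f"{stem}.so.{major}",    # what apps record and the loader looks for
        "real_name": real_name,            # the actual file on disk
    }


def survives_upgrade(old_real_name: str, new_real_name: str) -> bool:
    """True if an app linked against the old file can still find the new one."""
    return library_names(old_real_name)["soname"] == library_names(new_real_name)["soname"]
```

So under this convention an upgrade from libstdc++.so.6.0.28 to a later 6.x.y release keeps the libstdc++.so.6 name that apps ask for, while a jump to a hypothetical libstdc++.so.7.0.0 would not.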

Some libraries opt not to support backward compatibility and link against the full soname version. Boost is one such example. If we look at a C++ app that uses Boost’s program options lib, we can see below that it is explicitly linked to the full 1.69.0 version, meaning that even a patch update to Boost would break that binary.

# Boost being picky
> ldd boost-po.bin
linux-vdso.so.1 (0x00007ffd74d23000)
libboost_program_options.so.1.69.0 => /lib/x86_64-linux-gnu/libboost_program_options.so.1.69.0 (0x00007ef948000000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007ef947c00000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ef94835f000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ef947800000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ef94835a000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ef947f19000)
/lib64/ld-linux-x86-64.so.2 (0x00007ef9483a8000)

Symbol versioning

A drawback of the soname scheme is that you’d need to keep around multiple versions of the same library if you want to support apps that link against different versions of it. Symbol versions help mitigate this problem by allowing one library to hold multiple versions of a single symbol. You can use the readelf -V command to figure out what symbol versions a particular binary lists. For libraries, the .gnu.version_d section in the output lists the symbol versions it exposes. For executables, check the .gnu.version_r section for a list of the symbol versions it will require at runtime.

To illustrate this point, let’s look at the symbols exposed by the standard C++ library and how they affect the runtime requirements of C++ applications. I’ll build a C++ hello world app with both the GCC 11.4.0 that comes with my OS and a standalone GCC 13.2.0 toolchain.

> g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

> x86_64-unknown-linux-gnu-g++ --version
x86_64-unknown-linux-gnu-g++ (crosstool-NG 1.26.0) 13.2.0

If we try to run the binary built with the newer GCC version on my OS, it will fail complaining about missing symbols, despite my OS’s libraries having the same major number.

# Compile with new GCC
> x86_64-unknown-linux-gnu-g++ hello.cpp -o hello.cpp.g13.bin

# It doesn't run due to missing 3.4.32 GLIBCXX symbol version
> ./hello.cpp.g13.bin
./hello.cpp.g13.bin: /lib/x86_64-linux-gnu/libstdc++.so.6: version \
`GLIBCXX_3.4.32` not found (required by ./hello.cpp.g13.bin)

# Even though the interpreter is able to find a `libstdc++.so.6`
> ldd hello.cpp.g13.bin
./hello.cpp.g13.bin: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32` not found (required by ./hello.cpp.g13.bin)
linux-vdso.so.1 (0x00007ffecb59d000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7746000000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7746299000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7746279000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7745c00000)
/lib64/ld-linux-x86-64.so.2 (0x00007f774639e000)

# We can see that my OS's lib only has GLIBCXX symbols up to 3.4.30
> strings /lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX_
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
....
GLIBCXX_3.4.28
GLIBCXX_3.4.29
GLIBCXX_3.4.30
GLIBCXX_DEBUG_MESSAGE_LENGTH

The takeaway here is that, in general, older binaries will run on newer versions of libraries, but not the other way around. And even then you are at the mercy of the library author’s whims and fancies.
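
The check the loader performs here is really just a membership test: every version tag the binary requires must exist in the library. A small Python sketch of that idea, working on the kind of tag list that strings printed above, could look like this. The function names are hypothetical, and the sketch assumes all tags passed to newest_tag belong to one family (e.g. only GLIBCXX).

```python
import re

# Matches version tags such as GLIBCXX_3.4.30 or GLIBC_2.34, but not
# unrelated strings like GLIBCXX_DEBUG_MESSAGE_LENGTH.
VERSION_TAG = re.compile(r"([A-Z]+)_(\d+(?:\.\d+)*)")


def parse_version_tags(lines):
    """Map each version tag found in lines to a tuple of ints for sorting."""
    tags = {}
    for line in lines:
        match = VERSION_TAG.fullmatch(line.strip())
        if match:
            tags[match.group(0)] = tuple(int(part) for part in match.group(2).split("."))
    return tags


def newest_tag(lines):
    """Return the highest version tag a library exposes, or None if there is none."""
    tags = parse_version_tags(lines)
    return max(tags, key=tags.get) if tags else None


def unmet_requirements(required, provided_lines):
    """Return the required tags that the library does not expose."""
    provided = parse_version_tags(provided_lines)
    return [tag for tag in required if tag not in provided]
```

With the tag list from my OS's libstdc++, newest_tag comes back as GLIBCXX_3.4.30, and a binary that requires GLIBCXX_3.4.32 shows up in unmet_requirements, which mirrors the error we saw at runtime.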

Level 4: The Linux platform engineer

Now it’s time to talk about Linux binary compatibility at the platform level. There are 4 components that combine together to form what is colloquially called the “Linux platform”. These are:

  1. The compiler: GNU GCC
  2. The C library: GNU glibc
  3. The kernel: Linux
  4. The linker and misc tools: GNU Binutils

Each of these components can be upgraded/downgraded independently. But it’s best practice to stick with versions that are all contemporary. Embedded Long-term-support (LTS) distros from Yocto and Buildroot typically keep the major/minor versions locked to prevent library compatibility issues.

Image credit: https://preshing.com/20141119/how-to-build-a-gcc-cross-compiler/

I really like the diagram above. It succinctly describes how these 4 components interact with each other. The right hand side is what we will be focusing our attention on.

The kernel

Starting at the bottom, the kernel exposes system calls. These are the lowest level API that user-space applications in Linux compile down to.

# Kernel version dependency listed by the file command
> file test.arm.static
test.arm.static: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), \
statically linked, \
for GNU/Linux 5.10.185, \
with debug_info, not stripped

In the file command output above, we can see that this statically linked binary has only one dependency, and that is kernel version 5.10.185. The kernel’s user-space APIs are famously always backward compatible. Linus Torvalds has very strong opinions about it. The implication of this is that the binary in the above example will run on any kernel that is 5.10.185 or newer.

Lib C

The next layer up is lib C. It abstracts the kernel APIs and offers a more user friendly API to application developers. It too is backwards compatible and uses symbol versioning to achieve that compatibility. If you try to run an app compiled against a new version of lib C on an older runtime you will get an error as shown below:

# Typical glibc incompatibility error
> ./test.new.c.bin
./test.new.c.bin: /lib/libc.so.6: \
version GLIBC_2.25 not found (required by ./test.new.c.bin)

What if the lib C versions match but the underlying kernel versions are different? This is when you’d get the dreaded “kernel too old” error.

# Kernel incompatibility 
> ./test.new
FATAL: kernel too old

# file command will show the kernel dependency as 6.1.86
> file ./test.new
test.new: ELF 64-bit LSB executable, \
x86-64, version 1 (SYSV), dynamically linked, \
interpreter /lib64/ld-linux-x86-64.so.2, \
for GNU/Linux 6.1.86, with debug_info, not stripped

# But the runtime has an older 5.10.208 kernel
> uname -r
5.10.208-66789-generic

Fortunately you won’t see this error commonly in the wild because glibc has a build-time option (--enable-kernel) that lets you target an old kernel version to improve portability. By default, most toolchains target an ancient kernel version like 2.6.37 so that this is never an issue. It will increase the size of glibc, as it has to include the legacy logic required to work with older kernel APIs. But there is hardly any performance hit, and it is usually worth it for the portability you gain.
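
The comparison behind the “kernel too old” check is a plain version comparison between the minimum kernel a binary was built for and the one that is running. Here is a hedged Python sketch of it; kernel_tuple and kernel_new_enough are names I invented for this example, and the parsing assumes release strings shaped like the uname -r output above.

```python
import re


def kernel_tuple(release: str) -> tuple:
    """Turn a release string like '5.10.208-66789-generic' into (5, 10, 208)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '6.8') is treated as 0.
    return tuple(int(group or 0) for group in match.groups())


def kernel_new_enough(required: str, running: str) -> bool:
    """True if the running kernel is at least the version the binary asks for."""
    return kernel_tuple(running) >= kernel_tuple(required)
```

For the example above, kernel_new_enough("6.1.86", "5.10.208-66789-generic") is False, which is the situation that produces the fatal error. On a live system you could pass platform.release() as the running version.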

C++ standard library

Next layer up is the C++ standard library (libstdc++), which is part of GCC. Similar to lib C, the C++ lib also has symbol versions that compiled applications will require at runtime.

# Symbol versions in a C++ hello world app
> readelf -V hello.cpp.bin

Version symbols section .gnu.version contains 11 entries:
Addr: 0x0000000000000602 Offset: 0x000602 Link: 6 (.dynsym)
000: 0 (*local*) 3 (GLIBC_2.34) 2 (GLIBC_2.2.5) 4 (GLIBCXX_3.4)
004: 4 (GLIBCXX_3.4) 1 (*global*) 1 (*global*) 1 (*global*)
008: 4 (GLIBCXX_3.4) 2 (GLIBC_2.2.5) 4 (GLIBCXX_3.4)

Version needs section .gnu.version_r contains 2 entries:
Addr: 0x0000000000000618 Offset: 0x000618 Link: 7 (.dynstr)
000000: Version: 1 File: libstdc++.so.6 Cnt: 1
0x0010: Name: GLIBCXX_3.4 Flags: none Version: 4
0x0020: Version: 1 File: libc.so.6 Cnt: 2
0x0030: Name: GLIBC_2.34 Flags: none Version: 3
0x0040: Name: GLIBC_2.2.5 Flags: none Version: 2

Here we see that this simple C++ app depends on both GLIBCXX symbols from the C++ std lib and the GLIBC symbols from lib C that it depends on. This is why C++ apps are notorious for being bad at portability. Rust does better in this regard by statically linking the Rust standard library, leaving lib C as the main runtime dependency. As mentioned previously, Go does even better: it statically links its standard library and, for a pure Go program like this one, does not need lib C at all.
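
If you find yourself reading readelf -V output often, it can help to boil the “Version needs” section down to a simple mapping of library to required tags. The sketch below does that with two regular expressions. It is a rough illustration that assumes the output layout shown above and an executable without a version definitions section; version_needs is a made-up helper name.

```python
import re


def version_needs(readelf_output: str) -> dict:
    """Map each library in the 'Version needs' section to the tags required from it."""
    needs = {}
    current = None
    for line in readelf_output.splitlines():
        file_match = re.search(r"File:\s+(\S+)", line)
        if file_match:
            # A new "File:" line starts the list of tags for that library.
            current = needs.setdefault(file_match.group(1), [])
            continue
        name_match = re.search(r"Name:\s+(\S+)", line)
        if name_match and current is not None:
            current.append(name_match.group(1))
    return needs
```

For the hello world app above this should yield GLIBCXX_3.4 for libstdc++.so.6 plus GLIBC_2.34 and GLIBC_2.2.5 for libc.so.6, which is the full list of symbol versions the target system has to provide.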

Level 5: The grey bearded Linux wizard

Right, so what if you really want to run new binaries on an older runtime? What options do we have?

For non-platform libraries like Boost, sqlite3, etc., it’s actually not that hard. 99% of the time you can get away with using LD_LIBRARY_PATH to specify the path to new libraries.

# Run an app that needs new boost version 1.84
> ./boost-po-1.84.bin
./boost-po-1.84.bin: error while loading shared libraries: \
libboost_program_options.so.1.84.0: cannot open shared object file: \
No such file or directory

# Boost 1.84 is not available in the system
> ldd boost-po-1.84.bin
linux-vdso.so.1 (0x00007ffce05d7000)
libboost_program_options.so.1.84.0 => not found
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007faec4400000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007faec470f000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faec4000000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faec4319000)
/lib64/ld-linux-x86-64.so.2 (0x00007faec4758000)

# Use LD_LIBRARY_PATH to assist the interpreter to find the required lib
> LD_LIBRARY_PATH=/tmp/boost_1_84_0/lib/ ./boost-po-1.84.bin
success!
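
If you need to do the same thing from a launcher script rather than a shell, the idea is simply to prepend your directories to whatever LD_LIBRARY_PATH already holds before starting the child process. A minimal sketch, with env_with_library_path being a helper name I made up:

```python
import os


def env_with_library_path(extra_dirs, base_env=None):
    """Return a copy of the environment with extra_dirs prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(extra_dirs) + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env
```

You would then pass the result as the env argument of subprocess.run, e.g. subprocess.run(["./boost-po-1.84.bin"], env=env_with_library_path(["/tmp/boost_1_84_0/lib"])). Returning a copy keeps the override scoped to that one child process instead of leaking into everything else the script launches.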

But this trick can fail spectacularly when you try to do the same for platform libraries.

# Trying to run app that needs newer C++ std libs
> ./hello.cpp.g13.bin
./hello.cpp.g13.bin: /lib/x86_64-linux-gnu/libstdc++.so.6: \
version GLIBCXX_3.4.32 not found (required by ./hello.cpp.g13.bin)

# Things go south fast with LD_LIBRARY_PATH
> LD_LIBRARY_PATH=/tmp/lib/:/tmp/usr/lib \
./hello.cpp.g13.bin
[1] 1786070 segmentation fault (core dumped) \
LD_LIBRARY_PATH= ./hello.cpp.g13.bin

Here is another example of things going badly.

# Trying to run an app that needs newer lib C
> ./test.arm
./test.arm: /lib/libc.so.6: version GLIBC_2.25 not found \
(required by ./test.arm)

# Again things go south
> LD_LIBRARY_PATH=/tmp/lib:/tmp/usr/lib ./test.arm
./test.arm: relocation error: /tmp/lib/libc.so.6: \
symbol _dl_exception_create, version GLIBC_PRIVATE not defined in \
file ld-linux-armhf.so.3 with link time reference

To understand what’s happening here, we need to understand how dynamically linked applications are bootstrapped. When you type a command in a shell and hit enter, the shell forks itself and then makes kernel syscalls to execute the command you entered. The kernel maps the new application into memory and then first executes the “interpreter” (a.k.a. the loader) specified in the executable. It’s the interpreter that loads all the other required libraries. The failures you see in the above two examples come from the interpreter failing while it does this.
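
The shell's side of that story, fork and then exec, can be sketched in a few lines of Python. This is a simplified illustration rather than what any real shell does in full: run_like_a_shell is a made-up name, there is no PATH lookup, and the 127/126 exit codes just mirror the common shell convention for “not found” and “cannot execute”.

```python
import errno
import os


def run_like_a_shell(argv):
    """Fork, exec argv[0] in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: ask the kernel to replace this process with the new program.
        # For a dynamically linked ELF, the kernel starts its interpreter first.
        try:
            os.execv(argv[0], argv)
        except OSError as err:
            # ENOENT covers a missing file as well as a missing ELF interpreter.
            os._exit(127 if err.errno == errno.ENOENT else 126)
    # Parent (the "shell"): wait for the child to finish.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)  # e.g. a segmentation fault
    return os.WEXITSTATUS(status)
```

Note that everything interesting for this blog happens inside that single execv call: that is where the kernel reads the ELF headers, finds the interpreter and hands control over to it.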

The dynamic interpreter is built from the lib C code base and has a tight coupling to the lib C version. As such, when you use LD_LIBRARY_PATH to specify an alternate lib C version, all bets are off for how the interpreter will handle it. As you saw from the two examples I shared above, you can get cryptic missing symbol errors or outright segmentation faults.

The pro move to resolve this issue and achieve maximum portability is to launch the application you want to run via the new system’s interpreter. Basically, you run the new interpreter and pass it the path to the application you want to run as an argument. This short-circuits the application’s bootstrapping process, allowing you to use the new platform libraries to run the new application. Despite the interpreter having a .so file extension, it is actually executable.

# Confirm that binary doesn't run
> ./hello.cpp.g13.bin
./hello.cpp.g13.bin: /lib/x86_64-linux-gnu/libstdc++.so.6: \
version GLIBCXX_3.4.32 not found (required by ./hello.cpp.g13.bin)

# The pro move of calling the interpreter directly to run an application
>/tmp/lib/ld-linux-x86-64.so.2 --library-path /tmp/lib:/tmp/usr/lib \
./hello.cpp.g13.bin
Hello World!
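
If you end up wrapping this trick in a launcher script, the only real work is assembling the argument list in the right order. Here is a small sketch of that, assuming the same --library-path option used above; loader_command is a hypothetical helper name.

```python
def loader_command(loader, library_dirs, app, *app_args):
    """Build the argv that runs app through an explicitly chosen interpreter."""
    return [loader, "--library-path", ":".join(library_dirs), app, *app_args]
```

For the example above, loader_command("/tmp/lib/ld-linux-x86-64.so.2", ["/tmp/lib", "/tmp/usr/lib"], "./hello.cpp.g13.bin") produces the same command line, ready to hand to subprocess.run. Keeping the loader and its library directories together like this makes it harder to accidentally mix a new interpreter with the old system's lib C.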

That’s all folks! Hope you learnt something new!
