How to Build Portable Binaries for Linux

Author: Alexey Milovidov, 2020-07-01.

The Problem

There are many Linux distributions

— Debian, Ubuntu, Red Hat, CentOS, Fedora, Gentoo, Arch, Suse, Amazon...

There are many versions of each distribution

— Ubuntu: 25 versions used in production: 8.04 .. 20.04.

Do we need to build our programs for all of them?

The Problem

Do we need to build our programs for all of them?

— of course not!

— we don't want to make a hundred build variants
  and then debug them all.

Let's just build a binary that works on any Linux.

Let's Try!

#include <vector>
#include <thread>
#include <iostream>

struct S
{
    ~S() { std::cerr << "Thread exited\n"; }
};

thread_local S s;

int main(int, char **)
{
    std::vector<std::thread> threads;
    for (size_t i = 0; i < 10; ++i)
        threads.emplace_back([]{ std::cerr << &s << '\n'; });
    for (auto & thread : threads)
        thread.join();
    return 0;
}

Let's Try!

I'll compile a binary on my machine
and run it on a server with Ubuntu 12.04.

$ g++ -pthread main.cpp
$ ./a.out
$ docker run -it --rm --volume $(pwd):/test ubuntu:precise /test/a.out
/test/a.out: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.7' not found (required by /test/a.out)
/test/a.out: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /test/a.out)
/test/a.out: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by /test/a.out)

Let's Figure This Out

1. How are C++ programs built?

2. How do C++ programs work in the OS?

3. How do programs execute on the CPU?

Image: Malcolm in the Middle S03E06, "Health Scare", Fox Network, 2001.

How C++ Programs are Built

$ g++ -pthread main.cpp

$ g++ -v -pthread main.cpp

/usr/lib/gcc/x86_64-linux-gnu/9/collect2
  -plugin /usr/lib/gcc/x86_64-linux-gnu/9/liblto_plugin.so
  -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
  -plugin-opt=-fresolution=/tmp/ccKFcsUf.res
  -plugin-opt=-pass-through=-lgcc_s
  -plugin-opt=-pass-through=-lgcc
  -plugin-opt=-pass-through=-lpthread
  -plugin-opt=-pass-through=-lc
  -plugin-opt=-pass-through=-lgcc_s
  -plugin-opt=-pass-through=-lgcc
  --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed
  -dynamic-linker /lib64/ld-linux-x86-64.so.2
  -pie -z now -z relro
  /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/Scrt1.o
  /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o
  /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o
  -L/usr/lib/gcc/x86_64-linux-gnu/9
  -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu
  -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib
  -L/lib/x86_64-linux-gnu
  -L/lib/../lib
  -L/usr/lib/x86_64-linux-gnu
  -L/usr/lib/../lib
  -L/usr/lib/gcc/x86_64-linux-gnu/9/../../..
  /tmp/ccZP10ce.o
  -lstdc++ -lm -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc
  /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o
  /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o

What Libraries are Used Implicitly?

1. C++ Standard Library

Example implementation: std::string

Different implementations:
— libstdc++ — default when building with gcc and clang on Linux;
— libc++ — default when building with clang on FreeBSD, Darwin;
— you can choose not to use the C++ standard library at all
  or even write your own.

All variants can be used with both gcc and clang.

You can use different library versions
almost independently of the compiler version:
— example: libc++ 11 (master) can be used with clang-10 and gcc-9.3.

What Libraries are Used Implicitly?

2. C++ ABI Library

Example implementation: __cxa_throw

A layer between the C++ standard library
and the platform-specific implementation of details such as:
— exception handling;
— RTTI (dynamic_cast and everything around it);
— thread_local variables with non-trivial destructors;
— static variables in functions with non-trivial initialization;

This library has a C interface, and different C++ standard library implementations use the same C++ ABI interface.

Different implementations:
— libstdc++ — default when building with gcc and clang on Linux;
— libc++abi — default when building with clang on FreeBSD, Darwin;
— you can choose not to use exception handling, RTTI...

What Libraries are Used Implicitly?

C++ ABI is not just a library.

It defines data structure layouts, calling conventions, interfaces...

Examples:
— how name mangling is done in object files;
— where the virtual function table is located and how it's structured;
— how exactly to perform exception handling;

If you write in C++, your program uses C++ ABI.

There are different C++ ABIs for different platforms.

For Linux amd64 — Itanium* C++ ABI.

* Why is it called that?

What Libraries are Used Implicitly?

The C++ ABI library can use different implementations
of individual details internally.

3. Exception Handling Library.

Example implementation: _Unwind_RaiseException

C++ ABI defines the exception handling mechanism: which functions are called when handling exceptions, where the data for their operation should be located (location of catch blocks, which destructors to call before getting there...), how this data is encoded in the binary...
The exception handling library implements these mechanisms.

Even for Itanium C++ ABI there are different implementations:
— libgcc_eh.a, libgcc_s.so — default when building with gcc and clang on Linux;
— libunwind "nongnu" (HP);
— libunwind from LLVM;
— you can choose not to use exception handling at all.

How are Exceptions Handled?

Itanium C++ ABI — "zero-cost exception handling".

If you don't throw exceptions — it's free*
* — well, almost.

Throwing an exception is very expensive (millions of CPU cycles
+ loading cold information from the binary).

https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html
https://monoinfinito.wordpress.com/series/exception-handling-in-c/

ARM, POWER C++ ABI — also "zero-cost exception handling".

Alternative: SJLJ — setjmp/longjmp exception handling.
— can also be used on amd64 for "self-contained" programs, if exceptions are not thrown between libraries with different EH ABI.

What Libraries are Used Implicitly?

The libunwind library can be used not only for exception handling.

You can call its methods manually for stack unwinding and context switching.

And it can also be built in two ways:
— with C++ ABI functions;
— with different function names so as not to replace the C++ ABI.

You can even use multiple libraries in one program:
— libgcc_eh for exception handling according to C++ ABI;
— libunwind nongnu for manual stack unwinding.
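A minimal sketch of manual stack unwinding. It does not use the unw_* API of libunwind nongnu; instead it assumes the ABI-level _Unwind_Backtrace interface from <unwind.h>, which is provided by whichever unwinder the program is linked with (libgcc or LLVM libunwind). The names TraceState, on_frame and capture_stack are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unwind.h>

namespace {

struct TraceState {
    void** frames;         // output buffer for return addresses
    std::size_t capacity;  // size of the output buffer
    std::size_t count;     // number of frames stored so far
};

// Called by the unwinder once per stack frame, innermost frame first.
_Unwind_Reason_Code on_frame(struct _Unwind_Context* context, void* arg) {
    auto* state = static_cast<TraceState*>(arg);
    if (state->count == state->capacity)
        return _URC_END_OF_STACK;  // buffer is full: stop walking
    state->frames[state->count++] =
        reinterpret_cast<void*>(_Unwind_GetIP(context));
    return _URC_NO_REASON;         // continue with the next frame
}

}  // namespace

// Store up to `capacity` instruction pointers of the current call stack
// into `frames` and return how many were stored.
std::size_t capture_stack(void** frames, std::size_t capacity) {
    assert(frames != nullptr);
    TraceState state{frames, capacity, 0};
    _Unwind_Backtrace(on_frame, &state);
    return state.count;
}
```

The addresses can then be symbolized offline, which is roughly how a stack trace for a crash report or a sampling profiler is produced.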

What Libraries are Used Implicitly?

4. The libc Library.

Implements completely different things:

— implementation of the C standard library
  example: strchr, memcpy, exp;
— implementation of POSIX standard functions
  example: getaddrinfo, pthread_create, dlopen;
— wrappers around system calls
  example: mremap;
— and much more...

Different implementations:
glibc — default when building on Linux;
Musl, DietLibc, UCLibc...
— of course, you can choose not to use libc at all :)
  (but see -fno-builtin-...)

What Libraries are Used Implicitly?

5. Compiler builtins / runtime Library.

Implements functionality required for language features or extensions, but separated from the compiler code for implementation on different platforms.

Example: __int128 division: calling the __divti3 function;
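A small sketch of where such a call comes from. It assumes a 64-bit target where gcc or clang provide the __int128 extension; the function name divide_wide is made up for this illustration.

```cpp
#include <cassert>

// Signed 128-bit division. amd64 has no single instruction for it, so
// gcc and clang typically emit a call to the runtime helper __divti3,
// which has to come from libgcc or compiler-rt at link time.
__int128 divide_wide(__int128 dividend, __int128 divisor) {
    assert(divisor != 0);
    return dividend / divisor;
}
```

Compiling this with -S and searching the assembly for __divti3 shows the implicit dependency, even though the source never mentions the library.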

The interface is compiler-specific. clang and gcc have different,
non-interchangeable compiler builtin libraries.

Implementations:
— libgcc_s;
— libclang_rt.builtins-x86_64.

Separate libraries for supporting functions
of different build modes: coverage, sanitizers.

What Libraries are Used Implicitly?

6. Startup Files.

Scrt1.o, crti.o, crtbeginS.o, crtendS.o, crtn.o

Entry point, global initialization.

Example implementation: _start.

Scrt1.o, crti.o and crtn.o depend on libc and ship with it; crtbeginS.o and crtendS.o ship with the compiler (GCC here).

Let's Figure This Out

1. How are C++ programs built?

2. How do C++ programs work in the OS?

3. How do programs execute on the CPU?

How C++ Programs Work with the OS?

Actually, any programs... it doesn't matter here.

1. System calls.

2. Signal handlers.

3. Virtual memory mechanism.

...

How C++ Programs Work with the OS?

1. System calls.

std::cerr << "Hello, world\n";

— C++

write(2, "Hello, world\n", 13);

— Libc, POSIX

syscall(SYS_write, 2, "Hello, world\n", 13);

— Libc

__asm__("movq ..., %rax ... syscall")
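The same layers written out as compilable functions, as a sketch. The function names are made up for this illustration, and the last variant assumes Linux on amd64, where the system call number goes into rax and the arguments into rdi, rsi and rdx.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Layer 1: the POSIX wrapper provided by libc.
ssize_t write_posix(int fd, const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    return write(fd, data, size);
}

// Layer 2: the generic syscall() helper provided by libc.
long write_syscall(int fd, const void* data, std::size_t size) {
    return syscall(SYS_write, fd, data, size);
}

#if defined(__x86_64__)
// Layer 3: the syscall instruction itself, no libc involved.
// Returns the raw kernel result: a negative errno value on failure.
long write_raw(int fd, const void* data, std::size_t size) {
    long result;
    __asm__ volatile("syscall"
                     : "=a"(result)
                     : "a"(static_cast<long>(SYS_write)),
                       "D"(static_cast<long>(fd)), "S"(data), "d"(size)
                     : "rcx", "r11", "memory");
    return result;
}
#endif
```

The first two variants report errors through errno, while the raw variant returns what the kernel returned, which is one of the details libc normally hides.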

How C++ Programs Work with the OS?

1. System calls, good facts:

System call numbers in Linux do not change.

Old system calls are not removed and continue to work.

Old behavior of system calls is kept, only new features are added.

Old programs should work on new versions of the Linux kernel (the reverse is not true).

How C++ Programs Work with the OS?

2. Signal handlers.

3. Virtual memory mechanism.

Example: page fault causes an interrupt, the OS kernel provides the necessary data for memory mapped files and resumes program execution.

Example: The OS constantly updates the content of the vdso page, which is used by libc functions such as clock_gettime (not always).

Example: page fault causes an interrupt, the OS kernel calls the SEGV signal handler in our program, if it exists.

Puzzle: due to a memory traversal in htop, all other programs suddenly and permanently slowed down.

How are Programs Built and Run?

Static linking vs dynamic linking?

Just two options?

No. There are many ways to build your program...

— statically linked executable;
— dynamically linked executable;
— dynamically linked shared object;
  — can use different dynamic loaders (interp);
  — dependencies can be statically linked into the binary
    or required as dynamic libraries;
— additionally, you can load some libraries
  at runtime using dlopen;

How do Programs Execute on the CPU?

What instruction set do they use?

amd64 (x86_64):

— SSE 2 / SSE 3 / SSSE 3 / SSE 4.1 / SSE 4.2
/ POPCNT / CLMUL / AVX / AVX 2 / AVX 512...

The instruction set is specified by -m compiler flags,
for example -msse4.2.

You can build separate translation units
with different instruction sets and link them together.

You can build separate functions with different instruction sets:
__attribute__((target("avx2")))
_Pragma("GCC target(\"avx2\")")
__attribute__((ifunc("resolve_func")));

Now We Know Everything!

1. How are C++ programs built?

2. How do C++ programs work in the OS?

3. How do programs execute on the CPU?


How to build portable binaries for Linux?

Problems

$ g++ -pthread main.cpp
$ ./a.out
$ docker run -it --rm --volume $(pwd):/test ubuntu:precise /test/a.out
/test/a.out: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.7' not found (required by /test/a.out)
/test/a.out: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /test/a.out)
/test/a.out: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by /test/a.out)

Problems

Reason: different version of the libstdc++ dynamic library
on the target system.

Who said there's even a libstdc++ library there?

Who said that installing from the repository
will give you the right version of this library?

What if we want to use libc++ instead of libstdc++?

Problems

Solution:

Statically link the C++ standard library and the C++ ABI library into the binary.

The binary itself can remain dynamic and depend on other dynamic libraries.

Potential problems:

If another C++ library is dynamically linked, and this library was compiled with a mismatched version of C++ ABI...

Problems

If another C++ library is dynamically linked, and this library was compiled with a mismatched version of C++ ABI, there are the following options:

1. The program won't build due to differences in symbol names.

2. The program will build but will work incorrectly due to using a different ABI implementation in our binary.
Example: std::terminate when throwing an exception from the library.

3. By some miracle everything will be fine.

Problems

Solution:

All C++ libraries must be built from source with the same build settings and statically linked into the binary.

If there are shared C++ libraries that cannot be rebuilt:
— they must have a plain-C interface, all dependencies linked inside, you can't throw exceptions across library boundaries;
— you can also avoid linking with the library altogether and move the interaction to a separate process, example: clickhouse-odbc-bridge.
— non-working options: RTLD_DEEPBIND, dlmopen.
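What a plain-C interface over C++ code can look like, as a sketch. The function lib_parse_int and the status codes are hypothetical; the point is that no C++ type crosses the boundary and every exception is converted into a return code before it can reach the caller.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical status codes for this illustration.
enum LibStatus { LIB_OK = 0, LIB_INVALID_INPUT = 1, LIB_INTERNAL_ERROR = 2 };

// C linkage: the symbol name is not mangled, and the signature uses only
// C types, so the caller does not depend on this library's C++ runtime.
extern "C" int lib_parse_int(const char* text, int* out) noexcept {
    if (text == nullptr || out == nullptr)
        return LIB_INVALID_INPUT;
    try {
        // std::stoi throws std::invalid_argument or std::out_of_range,
        // both derived from std::logic_error.
        *out = std::stoi(text);
        return LIB_OK;
    } catch (const std::logic_error&) {
        return LIB_INVALID_INPUT;
    } catch (...) {
        // Anything else (for example std::bad_alloc) also stays inside.
        return LIB_INTERNAL_ERROR;
    }
}

// Usage from a caller built with any compiler or standard library:
//   int value = 0;
//   assert(lib_parse_int("42", &value) == LIB_OK && value == 42);
```

With this shape the library can keep its own statically linked C++ runtime, and a mismatch with the caller's runtime no longer matters at the boundary.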

Conclusion: shared libraries with C++ interfaces should not exist.

Problems

$ g++ -pthread -static-libstdc++ -static-libgcc main.cpp

$ g++ -nodefaultlibs main.cpp -lstdc++ -lgcc_eh \
    -lgcc -lc -lm -lrt -lpthread -ldl

$ clang++ -std=c++2a -fuse-ld=lld -pthread -nodefaultlibs -nostdinc++ \
    -I ../contrib/libcxx/include/ -I ../contrib/libcxxabi/include/ \
    -L contrib/libcxx-cmake/ -lcxx \
    -L contrib/libcxxabi-cmake/ -lcxxabi \
    -L contrib/libunwind-cmake/ -lunwind \
    -L /usr/lib/clang/9.0.0/lib/linux/ -lclang_rt.builtins-x86_64 \
    -lc -lm -lrt -lpthread -ldl main.cpp

Problems

$ docker run -it --rm --volume $(pwd):/test ubuntu:precise /test/a.out
/test/a.out: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.18' not found (required by /test/a.out)

Simple Solutions

Build our program in Docker
and distribute it for use the same way.

Advantages:

— many are already used to this kind of suffering;

Disadvantages:

— inconvenient to distribute (example: scp to server doesn't work);
— inconvenient to configure system interaction;
— inconvenient to debug;

The task of building a portable binary is not solved this way.

Simple Solutions

Build our program and supply
all shared libraries with it for installation.

— change rpath in the binary or make a script
that sets LD_LIBRARY_PATH before launch.
— change interp in the binary.

https://github.com/orivej/ldcp

Advantages:
— it works;

Disadvantages:
— inconvenient to distribute (example: scp to server doesn't work);

The task of building a portable binary is not solved this way.

Simple Solutions

Build our program as AppImage / Snap / Flatpak.

Advantages:

— the entire environment is packed into one binary;

Disadvantages:

— inconvenient to work with the system;
— the program takes a long time to start;

More suitable for client applications.

Problems

version `GLIBC_2.18' not found (required by /test/a.out)

How are shared libraries versioned?

Why does our program need this version of GLIBC?

Can we link libc statically?

Problems

How are shared libraries versioned?

1. By version number in the name: libc.so.6 - libc version 6.

$ ldd ./a.out
    linux-vdso.so.1 (0x00007ffd639d9000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fddc8404000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fddc82b5000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fddc82aa000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fddc8287000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fddc8281000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fddc8619000)

2. By version numbers for individual symbols.

Problems

How are shared libraries versioned?

2. By version numbers for individual symbols.

The ability to have multiple implementations of one function with slightly different behavior, so that old programs continue to use the old version for compatibility.

Canonical example: memcpy.

Problems

$ ldd -v ./a.out
Version information:
./a.out:
    libc.so.6 (GLIBC_2.14) => /lib/x86_64-linux-gnu/libc.so.6
    libm.so.6 (GLIBC_2.23) => /lib/x86_64-linux-gnu/libm.so.6

$ readelf -s -V ./a.out | grep -P '2\.(14|23)'
    66: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@GLIBC_2.14 (6)
   111: 0000000000000000 0 FUNC GLOBAL DEFAULT UND lgamma@GLIBC_2.23 (7)

Problems

One-way compatibility:

— programs built with old symbol versions
  work on new systems;
— programs built with new symbol versions
  don't work on old systems;

During the build, the newest symbol version is chosen

— because "that's the right way".
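The gap between the build system and the target system can be made visible from inside a program. The sketch below assumes glibc, since gnu_get_libc_version and the __GLIBC__ macros are glibc-specific; the struct and function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <gnu/libc-version.h>  // glibc-specific header

struct GlibcVersion {
    int major_part;
    int minor_part;
};

// Version of the glibc headers this code was compiled against.
GlibcVersion compiled_glibc_version() {
    return {__GLIBC__, __GLIBC_MINOR__};
}

// Version of the glibc shared library loaded into the running process.
GlibcVersion runtime_glibc_version() {
    GlibcVersion version{0, 0};
    int parsed = std::sscanf(gnu_get_libc_version(), "%d.%d",
                             &version.major_part, &version.minor_part);
    assert(parsed == 2);
    (void)parsed;  // silence the unused warning when NDEBUG is defined
    return version;
}

// True when the binary runs on an older glibc than it was built with.
bool runtime_is_older_than_build() {
    GlibcVersion built = compiled_glibc_version();
    GlibcVersion running = runtime_glibc_version();
    return running.major_part < built.major_part ||
           (running.major_part == built.major_part &&
            running.minor_part < built.minor_part);
}
```

This is only a coarse check. What the dynamic loader actually compares is the version of each individual symbol, and a binary that needs a missing symbol version fails to load before main is ever reached.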

Solutions That Don't Work

1. Fully static linking with GLibc.

Just specify -static when building.

— glibc is built on a complex plugin system that dynamically loads other libraries for DNS lookup, locale conversion, and backtrace. With static linking this functionality is completely disabled and you can't use getaddrinfo.

— you can't do dlopen from fully static binaries.

Conclusion:
More or less complex programs cannot be statically built with glibc.

Solutions That Don't Work

2. Using Musl instead of GLibc.

— low performance (malloc, memcpy, ...).

— you can't do dlopen from a fully static binary.

— when linking a dynamic binary with musl, only the ld-musl.so loader can be used, and you have to carry it with you.

— due to small differences, some libraries won't build with Musl without patches.

Conclusion:
More or less complex programs are difficult to build with Musl.

Solutions That Don't Work

3. Asking the linker to bind to old symbol versions.

https://github.com/wheybags/glibc_version_header

__asm__(".symver _Exit,_Exit@GLIBC_2.2.5");
__asm__(".symver _IO_2_1_stderr_,_IO_2_1_stderr_@GLIBC_2.2.5");
__asm__(".symver _IO_2_1_stdin_,_IO_2_1_stdin_@GLIBC_2.2.5");
__asm__(".symver _IO_2_1_stdout_,_IO_2_1_stdout_@GLIBC_2.2.5");
...

If you include this file before any other code,
the binary will reference the old symbol versions...

But some symbols that your program depends on
may be completely absent in the old version of libc.
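A sketch of the technique for a single symbol. It assumes glibc on amd64, where memcpy exists both as memcpy@GLIBC_2.2.5 and as the default memcpy@GLIBC_2.14; on other platforms the directive is skipped. The function name copy_bytes is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__x86_64__) && defined(__GLIBC__)
// Bind every reference to memcpy in this translation unit to the old
// symbol version instead of the default, newest one.
__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");
#endif

void copy_bytes(void* destination, const void* source, std::size_t size) {
    assert(destination != nullptr && source != nullptr);
    // Call through a volatile pointer so the compiler emits a real
    // reference to the memcpy symbol instead of expanding the copy inline.
    void* (*volatile copy)(void*, const void*, std::size_t) = std::memcpy;
    copy(destination, source, size);
}
```

Running readelf -s -V on the resulting binary should then list memcpy@GLIBC_2.2.5 rather than memcpy@GLIBC_2.14. The directive only helps for symbols that already existed in the old glibc.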

Solutions That Don't Work

4. Patch the binary after building,
replacing symbol versions with weak ones.

http://www.lightofdawn.org/wiki/wiki.cgi/NewAppsOnOldGlibc

Doesn't work for the same reason as the previous method.

Solutions That Don't Work

5. Building with an old version of libc.

— in an old system in Docker;
  you'll need to bring a new compiler there... and make it work;

— in chroot;

— just -nodefaultlibs and explicitly specify the path
  to a sufficiently old version of libc;

— variant: --sysroot, or a CMake toolchain file.

Solutions That Don't Work

5. Building with an old version of libc.

Bad because you'll get old, inefficient code.

Doesn't work because the old version is missing some things:

__cxa_thread_atexit_impl

You can't just build a C++11 program
with an old version of libc.

And we're already using C++20 :)

A Solution That Works

To build portable binaries for Linux, you just need to...

1. Build libc++, libc++abi, libunwind from source
  and link statically.

2. Build all third-party libraries except libc from source
  and link statically.

3. Use separate allocator implementations
  instead of libc (jemalloc / tcmalloc).

4. Use a separate memcpy implementation
  instead of libc (ours is "Chinese" memcpy).

5. Use fixed libc header files (-nostdinc).

A Solution That Works

6. Take every function whose symbol version in glibc is too new
  from Musl, put them in a separate library and link it before glibc.

7. Take all functions that are missing in old glibc versions
  from Musl and put them there too.
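As an illustration of points 6 and 7, here is a sketch of a replacement for one function that old glibc versions lack. The glibc wrapper for getrandom only appeared in glibc 2.25, while the system call itself exists since Linux 3.17, so the function can be implemented directly on top of the system call. The name compat_getrandom is hypothetical, and this is not taken from the ClickHouse sources; a real compatibility library would export the function under its standard name.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Fill `buffer` with up to `size` random bytes by calling the kernel
// directly, so the binary needs no getrandom symbol from glibc.
// Returns the number of bytes written, or -1 with errno set
// (for example ENOSYS on kernels older than 3.17).
ssize_t compat_getrandom(void* buffer, std::size_t size, unsigned flags) {
    assert(buffer != nullptr || size == 0);
    return syscall(SYS_getrandom, buffer, size, flags);
}
```

The binary then depends only on syscall, which has been in glibc for a long time, at the price of handling old kernels explicitly.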

8. Configure third-party libraries so they don't have
  dependencies on too new libc functions (C11 threads)
  and too new Linux kernel features.

9. Our binary still remains a dynamically linked executable,
  but depends only on libc and doesn't depend on new symbol versions.

10. We can still do dlopen, but very carefully.

A Solution That Works

11. Compile all code with a sufficiently old instruction set
(SSE 4.2 for ClickHouse), and individual functions and translation units
— with newer ones.

Choose the appropriate implementation at runtime
using cpuid checks.
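A minimal sketch of such runtime dispatch. It assumes gcc or clang, whose __builtin_cpu_supports performs the cpuid check; the function names are made up for this illustration, and AVX2 stands in for any newer instruction set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace {

// Built with the baseline instruction set of the whole program.
std::uint64_t sum_baseline(const std::uint32_t* data, std::size_t size) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < size; ++i)
        total += data[i];
    return total;
}

#if defined(__x86_64__)
// Same code, but the compiler may use AVX2 instructions in this function only.
__attribute__((target("avx2")))
std::uint64_t sum_avx2(const std::uint32_t* data, std::size_t size) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < size; ++i)
        total += data[i];
    return total;
}
#endif

using SumFunction = std::uint64_t (*)(const std::uint32_t*, std::size_t);

// Pick the best implementation the current CPU can execute.
SumFunction choose_sum() {
#if defined(__x86_64__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
#endif
    return sum_baseline;
}

}  // namespace

std::uint64_t sum(const std::uint32_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    // The check runs once; later calls go straight through the pointer.
    static const SumFunction implementation = choose_sum();
    return implementation(data, size);
}
```

The binary still starts on a CPU without AVX2, because the newer instructions are confined to a function that is never called there.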

A Solution That Works

Just link a piece from Musl statically into the program,
and then GLibc dynamically???

... it works.

A single clickhouse binary is built on any Linux,
works the same on Ubuntu 8.04+, CentOS 5+ (12 years), GosLinux, etc...

+ builds and tests with any build options and all sanitizers.

+ simple cross-compilation on Linux amd64 for
  Linux AArch64, FreeBSD and Darwin.

Image: Pusheen the cat, Claire Belton, 2010; Aura, Telegram sticker.

Where to Look

https://github.com/ClickHouse/ClickHouse/tree/master/base/glibc-compatibility

— library with functions to replace glibc symbols.

https://github.com/ClickHouse/ClickHouse/blob/master/cmake/linux/default_libs.cmake

— how to describe the build in CMake (complicated).

https://clickhouse-builds.s3.yandex.net/12043/15cddc3c6e659cec18f4b16f7fc4cd3b3b36a1cb/clickhouse_build_check/report.html

— builds on every commit, build logs.
