Sunday, January 12

Honey, I shrunk {fmt}: bringing binary size to 14k and ditching the C++ runtime

videobacks.net

The {fmt} formatting library is known for its small binary footprint,
often producing code that is several times smaller per function call compared
to alternatives like IOStreams, Boost Format, or, somewhat ironically,
tinyformat. This is mainly achieved through careful application of type
erasure on various levels, which effectively minimizes template bloat.

Formatting arguments are passed via type-erased format_args:

auto vformat(string_view fmt, format_args args) -> std::string;

template <typename... T>
auto format(format_string<T...> fmt, T&&... args) -> std::string {
  return vformat(fmt, fmt::make_format_args(args...));
}

As you can see, format delegates all its work to vformat, which is not a
template.
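
To make the idea concrete, here is a minimal sketch of the same pattern in plain standard C++. Everything in the sketch namespace (arg, args_view and this toy format/vformat pair) is hypothetical and far simpler than what {fmt} actually does: it only handles int and string arguments and bare {} placeholders. The point is just that the variadic template only packs its arguments into an array of tagged values, while the parsing loop lives in a single non-template function.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical names throughout; this is not {fmt}'s real implementation.
namespace sketch {

// One type-erased argument: a tag plus the value it describes.
struct arg {
  enum class kind { none, integer, string };
  kind type = kind::none;
  long long int_value = 0;
  std::string_view str_value;

  arg() = default;
  arg(int v) : type(kind::integer), int_value(v) {}
  arg(const char* s) : type(kind::string), str_value(s) {}
  arg(std::string_view s) : type(kind::string), str_value(s) {}
};

// A non-owning view of the erased arguments (pointer + count).
struct args_view {
  const arg* data;
  std::size_t size;
};

// Not a template: this loop is compiled once, no matter how many
// different argument type combinations the program uses.
std::string vformat(std::string_view fmt, args_view args) {
  std::string out;
  std::size_t next = 0;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] == '{' && i + 1 < fmt.size() && fmt[i + 1] == '}') {
      if (next >= args.size) throw std::out_of_range("too few arguments");
      const arg& a = args.data[next++];
      if (a.type == arg::kind::integer)
        out += std::to_string(a.int_value);
      else
        out.append(a.str_value.data(), a.str_value.size());
      ++i;  // skip the closing '}'
    } else {
      out += fmt[i];
    }
  }
  return out;
}

// The only template: it erases the argument types and forwards.
template <typename... T>
std::string format(std::string_view fmt, const T&... args) {
  // The extra slot avoids a zero-length array when there are no arguments.
  const arg erased[sizeof...(T) + 1] = {arg(args)...};
  return vformat(fmt, args_view{erased, sizeof...(T)});
}

}  // namespace sketch
```

With this in place, sketch::format("The answer is {}.", 42) and sketch::format("{} + {}", "a", 1) instantiate two tiny wrappers but share the same vformat body.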

Output iterators and other output types are also type-erased through a specially
designed buffer API.
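
The following is a rough sketch of how such output type erasure can work, not a copy of {fmt}'s buffer API. The names buffer, iterator_sink, write_answer and write_answer_to are made up for illustration. The formatting code writes into a non-template buffer that holds a pointer to a small staging array and a function pointer for making room; only a thin wrapper knows the real output iterator type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical names throughout; this is not {fmt}'s real buffer API.
namespace sketch {

// Non-template sink: formatting code only ever sees this class.
class buffer {
 public:
  void push_back(char c) {
    if (size_ == capacity_) grow_(*this);  // ask the concrete sink for room
    assert(size_ < capacity_);
    data_[size_++] = c;
  }
  void append(std::string_view s) {
    for (char c : s) push_back(c);
  }

 protected:
  using grow_fn = void (*)(buffer&);
  buffer(char* data, std::size_t capacity, grow_fn grow)
      : data_(data), capacity_(capacity), grow_(grow) {}

  char* data_;
  std::size_t size_ = 0;
  std::size_t capacity_;

 private:
  grow_fn grow_;
};

// Thin template layer: stages chars in a small array and flushes them
// to an arbitrary output iterator whenever the array fills up.
template <typename OutputIt>
class iterator_sink : public buffer {
 public:
  explicit iterator_sink(OutputIt out)
      : buffer(storage_, sizeof(storage_), &flush_cb), out_(out) {}

  // Writes any pending chars and returns the advanced iterator.
  OutputIt finish() {
    flush();
    return out_;
  }

 private:
  void flush() {
    out_ = std::copy(data_, data_ + size_, out_);
    size_ = 0;
  }
  static void flush_cb(buffer& b) { static_cast<iterator_sink&>(b).flush(); }

  char storage_[8];  // deliberately tiny so that flushing is exercised
  OutputIt out_;
};

// Not a template: compiled once, whatever the destination is.
void write_answer(buffer& out, int value) {
  out.append("The answer is ");
  out.append(std::to_string(value));
  out.append(".");
}

// The only per-iterator code is this small wrapper.
template <typename OutputIt>
OutputIt write_answer_to(OutputIt out, int value) {
  iterator_sink<OutputIt> sink(out);
  write_answer(sink, value);
  return sink.finish();
}

}  // namespace sketch
```

Calling sketch::write_answer_to with std::back_inserter of a std::string or of a std::vector<char> instantiates two small sinks, while write_answer itself exists only once in the binary.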

This approach confines template usage to a minimal top-level layer, leading to
both a smaller binary size and faster build times.

For example, the following code:

// test.cc
#include <fmt/base.h>

int main() {
  fmt::print("The answer is {}.", 42);
}

compiles to just

.LC0:
.string "The answer is {}."
main:
sub rsp, 24
mov eax, 1
mov edi, OFFSET FLAT:.LC0
mov esi, 17
mov rcx, rsp
mov rdx, rax
mov DWORD PTR [rsp], 42
call fmt::v11::vprint(fmt::v11::basic_string_view<char>, fmt::v11::basic_format_args<fmt::v11::context>)
xor eax, eax
add rsp, 24
ret

godbolt

It is much smaller than the equivalent IOStreams code and comparable to that
of printf:

.LC0:
.string "The answer is %d."
main:
sub rsp, 8
mov esi, 42
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 8
ret

godbolt

Unlike printf, {fmt} offers full runtime type safety. Errors in format strings
can be caught at compile time, and even when the format string is determined at
runtime, errors are managed through exceptions, preventing undefined behavior,
memory corruption, and potential crashes. Additionally, {fmt} calls are
generally more efficient, particularly when using positional arguments, which C
varargs are not well-suited for.
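
As a rough illustration of what catching format string errors at compile time can look like, here is a small C++17 sketch. The helpers count_placeholders and matches are hypothetical and much cruder than {fmt}'s real checks: they only understand bare {} placeholders, ignore escaping and format specs, and compare the placeholder count with the number of arguments.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers; {fmt}'s actual compile-time checks are far richer.
namespace sketch {

// Counts "{}" placeholders; returns -1 for an unmatched '{' or '}'.
constexpr int count_placeholders(std::string_view fmt) {
  int count = 0;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] == '{') {
      if (i + 1 >= fmt.size() || fmt[i + 1] != '}') return -1;
      ++count;
      ++i;  // skip the closing '}'
    } else if (fmt[i] == '}') {
      return -1;
    }
  }
  return count;
}

// True when the format string has exactly one placeholder per argument.
template <typename... T>
constexpr bool matches(std::string_view fmt) {
  return count_placeholders(fmt) == static_cast<int>(sizeof...(T));
}

// Evaluated by the compiler: a mismatch here would be a build error,
// not a runtime surprise.
static_assert(matches<int>("The answer is {}."));
static_assert(count_placeholders("The answer is {.") == -1);

}  // namespace sketch
```

The same function is still callable at runtime, so a format string that is only known at runtime can go through the identical check and be reported as an error instead of being handed to varargs unchecked.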

Back in 2020, I dedicated some time to optimizing the library size,
successfully reducing it to under 100kB (just ~57kB with -Os -flto).
A lot has changed since then. Most notably, {fmt} now uses the exceptional
Dragonbox algorithm for floating-point formatting, kindly
contributed by its author, Junekey Jeon. Let's explore how these changes have
impacted the binary size and see if further reductions are possible.

But why, some say, the binary size? Why choose this as our goal?

There has been considerable interest in using {fmt} on memory-constrained
devices, see e.g. #758 and #1226 for just two examples from
the distant past. A particularly intriguing use case is retro computing, with
people using {fmt} on systems like Amiga (#4054).

We'll apply the same methodology as in previous work, examining the
executable size of a program that uses {fmt}, as this is most relevant to end
users. All tests will be conducted on an aarch64 Ubuntu 22.04 system with GCC
11.4.0.

First, let's establish the baseline: what is the binary size for the latest
version of {fmt} (11.0.2)?
