What Could C/C++ "Lose" If They Defined a Standard ABI?

What could C/C++ lose if they defined a standard ABI?

The freedom to implement things in the most natural way on each processor.

I imagine that C in particular has conforming implementations on more different architectures than any other language. Abiding by an ABI optimized for the currently common, high-end, general-purpose CPUs would require unnatural contortions on some of the odder machines out there.

Does C have a standard ABI?

C defines no ABI. In fact, it bends over backwards to avoid defining one. Those of us who, like me, have spent most of our programming lives writing C on 16/32/64-bit architectures with 8-bit bytes, two's-complement arithmetic, and flat address spaces will usually be quite surprised on reading the convoluted language of the current C standard.

For example, read the sections about pointers. The standard doesn't say anything as simple as "a pointer is an address", because that would be making an assumption about the ABI. In particular, it allows for pointers being in different address spaces and having varying widths.

An ABI is a mapping from the execution model of the language to a particular machine/operating system/compiler combination. It makes no sense to define one in the language specification because that runs the risk of excluding C implementations on some architectures.

ABI vs C++ Standard

The standard defines what a program should do based on the code you write. The ABI defines how that is implemented for a particular platform, so that code compiled separately (possibly by different compilers or different compiler versions) can interact.

That is, when you write:

#include <iostream>
void f(int i) { std::cout << i; }

The standard defines the behavior: a call to that function will print the value of the argument. The ABI determines how the assembly is generated so that the function can be called: how is the name of f mangled? how is the argument passed (on the stack? in a register?).

As for whether you need to read your platform's ABI... well, it depends. ABIs are heavy reads, and hard to digest in full. But you should at least be familiar with some of the basics, like calling conventions (what is the cost of passing an object of type T?). Beyond that, I would take a reactive approach: profile, and when you need to understand what is going on, the ABI might help.

Most programmers don't know the ABI for their platform and live happily nonetheless. I personally have gone back to it a couple of times to understand some peculiarities in the behavior of programs.

What is ABI, why doesn't C++ have a standard one, and why would it matter if it did?

ABI stands for Application Binary Interface. It describes a standard for how application binaries are organized and accessed.

Standardization would allow multiple compilers to build binaries that were completely compatible with each other, or potentially allow single executables to run on various platforms without recompilation, etc.

What is compatibility for C++ mangling?

So here are some relevant definitions:

  • ABI: Application Binary Interface
  • Name Mangling

Name mangling is the way the compiler represents the function names you define in C++ so that they're qualified per class: for instance, ClassA::method() doesn't clash with ClassB::method(). It also facilitates overloading, so that ClassA::method(String s) doesn't clash with ClassA::method(int i).

Internally these might be represented as something like ClassA_method, ClassA_method^String, and ClassA_method^int (the actual encoding is compiler-specific).

As the second definition above suggests, name mangling is not merely a compiler-internal matter: it becomes visible, for instance, when a public interface for a shared library is being generated.

So if you change a typedef in your own code, all the binaries you generate yourself will be fine, but any pre-existing binaries, such as third-party DLLs, that depend on this typedef will break.

Is a library that relies on undefined behavior, but works on a certain compiler, portable?

Do not look to the C++ standard for all your answers.

The C++ standard does not define the behavior when object modules compiled by different compilers are linked together. The jurisdiction of the C++ standard is solely single C++ implementations whose creators choose to conform to the C++ standard.

Linking together different object modules is covered by an application binary interface (ABI). If you compile two functions with two compilers that both conform to the same ABI and link them with a linker that conforms to the ABI, they generally should work together. There are additional details to consider, such as how various things in the language(s) bind to corresponding things in the ABI. For example, one compiler might map long to some 32-bit integer type in the ABI while the other compiler might map long to some 64-bit integer type, and this would of course interfere with the functions working together unless corresponding adjustments were made.

ABI compatibility of different C/C++ language versions + GNU extensions

In general:

  • No, you can't combine different language versions in the same program; doing so will cause "One Definition Rule" violations in many library headers.

You may find in limited cases that a few classes actually don't change with the language version. However, this is rare: rvalue references, for example, are required by the C++11 standard and are not available in C++03 mode at all.

As for versions with and without support for GNU extensions, you're likely to have more success, but you will still need to run each header through the preprocessor and verify that the exact same sequence of tokens is seen by the compiler using both options.

And that's completely apart from ABI changes, which could cause memory layout or name mangling to differ between compiler variants.

You can also avoid one-definition-rule violations on your own public APIs by avoiding any version- or language-specific features. Essentially, this means exposing a flat C API.
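A minimal sketch of such a flat C API (all names here are hypothetical): extern "C" suppresses name mangling, and an opaque handle hides the class layout so that clients never depend on the C++ ABI:

```cpp
#include <cstdint>

namespace impl {
// Internal C++ class; its layout and mangling never cross the API boundary.
class Widget {
public:
    explicit Widget(std::int32_t v) : value_(v) {}
    std::int32_t value() const { return value_; }
private:
    std::int32_t value_;
};
}  // namespace impl

extern "C" {

// Opaque handle: clients see only a pointer, never the layout.
typedef struct widget widget;

widget* widget_create(std::int32_t v) {
    return reinterpret_cast<widget*>(new impl::Widget(v));
}

std::int32_t widget_value(const widget* w) {
    return reinterpret_cast<const impl::Widget*>(w)->value();
}

void widget_destroy(widget* w) {
    delete reinterpret_cast<impl::Widget*>(w);
}

}  // extern "C"
```

Because every exported symbol has an unmangled C name and takes only C-compatible types, the library can be consumed by code built with any compiler or language version that speaks the platform's C ABI.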


