Program being compiled differently in 3 major C++ compilers. Which one is right?

GCC is correct, at least according to C++11 lookup rules. 3.4.3.1 [class.qual]/2 specifies that if the name following the nested-name-specifier, looked up in that class, is the class's own injected-class-name (as in A::A), it names the constructor rather than the type. It gives these examples:

B::A ba;         // object of type A
A::A a;          // error, A::A is not a type name
struct A::A a2;  // object of type A

It looks like MSVC misinterprets it as a function-style cast expression that creates a temporary C with y as the constructor argument, while Clang misinterprets it as a declaration of a variable called y of type C.
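The program from the question is not reproduced here, so the following is only a minimal sketch of the rule, built around the class names used in the standard's example. The constructor bodies, the static_assert checks, and the commented-out line are illustrative additions.

```cpp
#include <type_traits>

struct A { A() {} };
struct B : public A { B() {} };

B::A ba;         // OK: looked up in B, A is the injected-class-name of A, a type
struct A::A a2;  // OK: an elaborated-type-specifier ignores function names

// A::A a;       // ill-formed per [class.qual]/2: here A::A names the constructor

// Both declarations above produce plain objects of type A.
static_assert(std::is_same<decltype(ba), A>::value, "ba has type A");
static_assert(std::is_same<decltype(a2), A>::value, "a2 has type A");
```

Uncommenting the A::A a; line is a quick way to see which of your compilers diagnoses the constructor name.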

Why is it important for C / C++ Code to be compilable on different compilers?

For most languages I care less about portability and more about conforming to international standards or accepted language definitions, from which properties portability is likely to follow. For C, however, portability is a useful idea, because it is very hard to write a program that is "strictly conforming" to the standard. (Why? Because the standards committees felt it necessary to grandfather some existing practice, including giving compilers some freedom you might not like them to have.)

So why try to conform to a standard or make your code acceptable to multiple compilers as opposed to simply writing whatever gcc (or your other favorite compiler) happens to accept?

  • Likely in 2015 gcc will accept a rather different language than it does today. You would prefer not to have to rewrite your old code.

  • Perhaps your code might be ported to very small devices, where the GNU toolchain is not as well supported.

  • If your code compiles with any ANSI C compiler straight out of the box with no errors and no warnings, your users' lives will be easier and your software may be widely ported and used.

  • Perhaps someone will invent a great new tool for analyzing C programs, refactoring C programs, improving performance of C programs, or finding bugs in C programs. We're not sure what version of C that tool will work on or what compiler it might be based on, but almost certainly the tool will accept standard C.

Of all these arguments, it's the tool argument I find most convincing. People forget that there are other things one can do with source code besides just compile it and run it. In another language, Haskell, tools for analysis and refactoring lagged far behind compilers, but people who stuck with the Haskell 98 standard have access to a lot more tools. A similar situation is likely for C: if I am going to go to the effort of building a tool, I'm going to base it on a standard with a lifetime of 10 years or so, not on a gcc version which might change before my tool is finished.

That said, lots of people can afford to ignore portability completely. For example, in 1995 I tried hard to persuade Linus Torvalds to make it possible to compile Linux with any ANSI C compiler, not just gcc. Linus had no interest whatever—I suspect he concluded that there was nothing in it for him or his project. And he was right. Having Linux compile only with gcc was a big loss for compiler researchers, but no loss for Linux. The "tool argument" didn't hold for Linux, because Linux became so wildly popular; people building analysis and bug-finding tools for C programs were willing to work with gcc because operating on Linux would allow their work to have a big impact. So if you can count on your project becoming a wild success like Linux or Mosaic/Netscape, you can afford to ignore standards :-)

Compiling C code using multiple compilers

Short answer: yes

Long answer:

Yes, provided that (this list is not exhaustive):

  • Your code doesn't use compiler-specific extensions that are not available on the other compiler
  • The libraries your code relies on are available and set up correctly for the other compiler
  • Your code doesn't invoke undefined behavior or rely on implementation-defined behavior
  • The other compiler supports roughly the same C standard as your current compiler.

I'll add more to the list as I think of them.
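To illustrate the first bullet, here is a minimal sketch of hiding a compiler-specific extension behind a macro so that other compilers still accept the code. The names PORTABLE_LIKELY and clamp_to_zero are hypothetical; __builtin_expect is a GCC/Clang builtin, and the fallback simply evaluates the condition. The snippet is written to be valid as both C and C++.

```cpp
/* Use the GCC/Clang branch-prediction hint when available,
   and fall back to the plain condition everywhere else. */
#if defined(__GNUC__) || defined(__clang__)
#  define PORTABLE_LIKELY(cond) __builtin_expect(!!(cond), 1)
#else
#  define PORTABLE_LIKELY(cond) (cond)
#endif

/* Hypothetical helper: returns value unchanged if it is non-negative, else 0. */
int clamp_to_zero(int value) {
    if (PORTABLE_LIKELY(value >= 0)) {
        return value;
    }
    return 0;
}
```

The design choice is that the extension only affects optimization, so the fallback branch behaves identically and the program's meaning does not depend on the compiler.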

Can C++ code be valid in both C++03 and C++11 but do different things?

The answer is a definite yes. On the plus side there is:

  • Code that previously implicitly copied objects will now implicitly move them when possible.
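A small sketch of what that implicit move looks like in practice. The Tracker type and its counters are hypothetical, and because the move constructor uses C++11-only syntax, this snippet only demonstrates the C++11 side of the comparison; under C++03 the same push_back call would have to copy.

```cpp
#include <vector>

// Hypothetical type that counts how it gets copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Pushes one temporary into a vector and returns how many copies were made.
int copies_when_pushing_temporary() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    std::vector<Tracker> v;
    v.reserve(1);            // avoid reallocation so only the push itself is counted
    v.push_back(Tracker());  // binds to push_back(T&&), so the temporary is moved
    return Tracker::copies;
}
```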

On the negative side, several examples are listed in Annex C of the standard. Although the negative changes far outnumber the positive ones, each of them is much less likely to occur in practice.

String literals

#define u8 "abc"
const char* s = u8"def"; // Previously "abcdef", now "def"

and

#define _x "there"
"hello "_x // Previously "hello there", now a user-defined string literal

Type conversions of 0

In C++11, only literals are integer null pointer constants:

void f(void *); // #1
void f(...);    // #2
template<int N> void g() {
    f(0*N); // Calls #2; used to call #1
}

Rounded results after integer division and modulo

In C++03, when an operand was negative, the compiler was allowed to round either towards 0 or towards negative infinity. In C++11 rounding towards 0 is mandatory.

int i = (-1) / 2; // Might have been -1 in C++03, is now ensured to be 0
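As a quick check of the C++11 rule, the sketch below verifies truncation towards zero at compile time. The helper names div_result and mod_result are hypothetical; the assertions rely only on the C++11 guarantee that (a/b)*b + a%b == a with the quotient truncated.

```cpp
// Hypothetical constexpr wrappers so the results can be checked at compile time.
constexpr int div_result(int a, int b) { return a / b; }
constexpr int mod_result(int a, int b) { return a % b; }

// C++11: the quotient is truncated towards zero...
static_assert(div_result(-1, 2) == 0, "quotient truncates towards zero");
static_assert(div_result(-7, 2) == -3, "quotient truncates towards zero");
// ...so the remainder takes the sign of the dividend.
static_assert(mod_result(-1, 2) == -1, "remainder follows the dividend's sign");
static_assert(div_result(-7, 2) * 2 + mod_result(-7, 2) == -7, "(a/b)*b + a%b == a");
```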

Whitespace between nested template closing angle brackets: >> vs > >

Inside a specialization or instantiation the >> might instead be interpreted as a right-shift in C++03. This is more likely to break existing code though: (from http://gustedt.wordpress.com/2013/12/15/a-disimprovement-observed-from-the-outside-right-angle-brackets/)

template< unsigned len > unsigned int fun(unsigned int x);
typedef unsigned int (*fun_t)(unsigned int);
template< fun_t f > unsigned int fon(unsigned int x);

void total(void) {
    // fon<fun<9> >(1) >> 2 in both standards
    unsigned int A = fon< fun< 9 > >(1) >>(2);
    // fon<fun<4> >(2) in C++03
    // Compile time error in C++11
    unsigned int B = fon< fun< 9 >>(1) > >(2);
}

Operator new may now throw other exceptions than std::bad_alloc

struct foo { void *operator new(size_t x) { throw std::exception(); } };
try {
    foo *f = new foo();
} catch (std::bad_alloc &) {
    // c++03 code
} catch (std::exception &) {
    // c++11 code
}

User-declared destructors have an implicit exception specification

(Example from "What breaking changes are introduced in C++11?")

struct A {
    ~A() { throw "foo"; } // Calls std::terminate in C++11
};
//...
try {
    A a;
} catch(...) {
    // C++03 will catch the exception
}
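The implicit specification can be observed directly with a type trait. This is a minimal sketch with hypothetical type names; it also shows that a destructor can opt out with noexcept(false), in which case the exception propagates as it did in C++03.

```cpp
#include <type_traits>

// No exception specification written: C++11 treats the destructor as noexcept.
struct ImplicitSpec {
    ~ImplicitSpec() {}
};

// Explicit opt-out: this destructor is allowed to throw.
struct OptOut {
    ~OptOut() noexcept(false) { throw 42; }
};

static_assert(std::is_nothrow_destructible<ImplicitSpec>::value,
              "user-declared destructor is implicitly noexcept");
static_assert(!std::is_nothrow_destructible<OptOut>::value,
              "noexcept(false) removes the implicit specification");

// Returns true if the exception thrown by OptOut's destructor reaches the handler.
bool throwing_destructor_is_catchable() {
    try {
        OptOut o;
        (void)o;  // silence unused-variable warnings
    } catch (int) {
        return true;
    }
    return false;
}
```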

size() of containers is now required to run in O(1)

std::list<double> list;
// ...
size_t s = list.size(); // Might be an O(n) operation in C++03

std::ios_base::failure does not derive directly from std::exception anymore

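In C++11 std::ios_base::failure derives from std::system_error, which in turn derives from std::runtime_error. The direct base class is new, but std::runtime_error itself is not, so the following is valid under both standards: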

try {
    std::cin >> variable; // exceptions enabled, and error here
} catch(std::runtime_error &) {
    std::cerr << "C++11\n";
} catch(std::ios_base::failure &) {
    std::cerr << "Pre-C++11\n";
}

Why do you need to recompile C/C++ for each OS?

Don't we target the CPU architecture/instruction set when compiling a C/C++ program?

No, you don't.

I mean yes, you are compiling for a CPU instruction set. But that's not all compilation is.

Consider the simplest "Hello, world!" program. All it does is call printf, right? But there's no "printf" instruction set opcode. So... what exactly happens?

Well, that's part of the C standard library. Its printf function does some processing on the string and parameters, then... displays it. How does that happen? Well, it sends the string to standard out. OK... who controls that?

The operating system. And there's no "standard out" opcode either, so sending a string to standard out involves some form of OS call.

And OS calls are not standardized across operating systems. Pretty much every standard library function that does something you couldn't build on your own in C or C++ is going to talk to the OS to do at least some of its work.

malloc? Memory doesn't belong to you; it belongs to the OS, and you maybe are allowed to have some. scanf? Standard input doesn't belong to you; it belongs to the OS, and you can maybe read from it. And so on.

Your standard library is built from calls to OS routines. And those OS routines are non-portable, so your standard library implementation is non-portable. So your executable has these non-portable calls in it.

And on top of all of that, different OSs have different ideas of what an "executable" even looks like. An executable isn't just a bunch of opcodes, after all; where do you think all of those constant and pre-initialized static variables get stored? Different OSs have different ways of starting up an executable, and the structure of the executable is a part of that.
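To make the point about standard output concrete, here is a rough sketch of what "sending a string to standard out" can look like once the standard library is stripped away. The function name write_to_stdout is hypothetical; write is the POSIX call and WriteFile/GetStdHandle are the Win32 calls, and a real standard library does considerably more (buffering, error handling, locale) than this.

```cpp
#include <cstddef>

#ifdef _WIN32
#  include <windows.h>
#else
#  include <unistd.h>
#endif

// Hypothetical helper: writes len bytes to standard output using the OS
// interface directly. Returns the number of bytes written, or -1 on failure.
long write_to_stdout(const char* data, std::size_t len) {
#ifdef _WIN32
    DWORD written = 0;
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (!WriteFile(out, data, static_cast<DWORD>(len), &written, nullptr)) {
        return -1;
    }
    return static_cast<long>(written);
#else
    return static_cast<long>(write(STDOUT_FILENO, data, len));
#endif
}
```

The same source compiles on both systems, but the two branches produce different calls in the binary, which is one reason the resulting executable cannot simply be moved from one OS to another.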

Can code that is valid in both C and C++ produce different behavior when compiled in each language?

The following, valid in C and C++, is going to (most likely) result in different values in i in C and C++:

int i = sizeof('a');

In C, a character constant such as 'a' has type int, whereas in C++ it has type char. See "Size of character ('a') in C/C++" for a fuller explanation of the difference.
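The C++ half of that difference can be checked at compile time. This is a small sketch with a hypothetical helper name; it only demonstrates the C++ behavior, since in C the same expression would yield sizeof(int).

```cpp
#include <cstddef>
#include <type_traits>

// In C++ a character literal has type char, so its size is exactly 1.
static_assert(std::is_same<decltype('a'), char>::value, "'a' is a char in C++");
static_assert(sizeof('a') == 1, "sizeof(char) is 1 by definition");

// Hypothetical helper returning the size the C++ compiler sees for 'a'.
constexpr std::size_t char_literal_size() { return sizeof('a'); }
```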

Another one from this article:

#include <stdio.h>

int sz = 80;

int main(void)
{
    struct sz { char c; };

    int val = sizeof(sz);  // sizeof(int) in C,
                           // sizeof(struct sz) in C++
    printf("%d\n", val);
    return 0;
}

Is a C++ compiler allowed to emit different machine code compiling the same program?

The C++ standard certainly doesn't say anything to prevent this from happening. In reality, however, a compiler is normally deterministic, so given identical inputs it will produce identical output.

The real question is mostly what parts of the environment it considers as its inputs -- there are a few that seem to assume characteristics of the build machine reflect characteristics of the target, and vary their output based on "inputs" that are implicit in the build environment instead of explicitly stated, such as via compiler flags. That said, even that is relatively unusual. The norm is for the output to depend on explicit inputs (input files, command line flags, etc.)

Offhand, I can only think of one fairly obvious thing that changes "spontaneously": some compilers and/or linkers embed a timestamp into their output file, so a few bytes of the output file will change from one build to the next--but this will only be in the metadata embedded in the file, not a change to the actual code that's generated.

What does it mean to say that C was compiled in C?

You start with assembly and use it to write a first version of the compiler. From then on, each new version of the compiler can be compiled with the previous one.

You do not need an existing compiler in order to start writing a compiler in its own language. You just need to write an interpreter for a C subset, then use that subset to write and gradually enhance the compiler.

At the time C was invented (around 1972) nobody thought it was a sane idea to write a compiler or an operating system in a high-level language, except Brian Kernighan and Dennis Ritchie. They proved to be right!

Same program behavior is different in g++ & MSVS 2010

I would argue that g++ and clang (and VS2012) are correct. And that VS2010 is incorrect.

Essentially, to destruct an instance of Derived, the compiler must determine if the destructor of Base is accessible. A private destructor of a base class is not accessible to the derived class, so Derived cannot be destructed.

I would suggest this does not require that the class be instantiated. Under the "separate compilation" model, a compiler must assume that some other compilation unit (invisible to the compiler when compiling this compilation unit) will instantiate and/or destroy an instance of the class. Which means it should check/compile/emit code for the destructor in every compilation unit which has visibility of the definition of that destructor.

Rules related to the "separate compilation" model are among those known informally as "as if" rules - an implementation must act as if it follows each rule, even if it does things differently under the hood.

The thing is, the various "as if" rules are where compiler vendors often take shortcuts - in the interests of speed of compilation, or other things.

This is only one possible explanation of why VS2010 is getting it wrong.

The upshot is that VS2010 is not properly checking accessibility of destructors in such cases. Older Microsoft C++ compilers were quite notorious for being less than fully standard-compliant. In recent years, Microsoft has strategically sought to improve their conformance with the standard, but it takes time for such improvements to filter through. This would be consistent with the observation that VS2010 does not complain, but VS2012 (like other compilers) does.

One possible explanation of this is that VS2010 does not fully comply with the requirements of "separate compilation". Another is that it simply takes shortcuts when checking accessibility of private members in base classes. To determine if either of these, or something else, is the explanation, it would probably be necessary to dig into details of how VS2010 is implemented.
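The program from the question is not shown here, so the following is only a hypothetical sketch of the kind of class pair being discussed: a base class whose destructor is private and a class derived from it. Under C++11 rules the derived class's implicit destructor is defined as deleted because the base destructor is inaccessible, and a type trait makes that visible without creating any object.

```cpp
#include <type_traits>

// Hypothetical base class: members of a class are private by default,
// so this destructor is not accessible from derived classes.
class Base {
    ~Base() {}
};

// Hypothetical derived class with no user-declared destructor.
class Derived : public Base {
};

// Neither type can be destroyed from ordinary code.
static_assert(!std::is_destructible<Base>::value, "Base's destructor is private");
static_assert(!std::is_destructible<Derived>::value,
              "Derived cannot call Base's private destructor");
```

Any attempt to actually create and destroy a Derived object in this sketch should then be rejected by a conforming compiler.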


