What Are the Incompatible Differences Between C(99) and C++(11)

If you start from the common subset of C and C++, sometimes called clean C (which is not quite C90), you have to consider 3 types of incompatibilities:

  1. Additional C++ features which make legal C illegal C++

    Examples for this are C++ keywords which can be used as identifiers in C or conversions which are implicit in C but require an explicit cast in C++.

    This is probably the main reason why Microsoft still ships a C frontend at all: otherwise, legacy code that doesn't compile as C++ would have to be rewritten.

  2. Additional C features which aren't part of C++

    The C language did not stop evolving after C++ was forked. Some examples are variable-length arrays, designated initializers and restrict. These features can be quite handy, but aren't part of C++11, and some of them will probably never make it in. (C++20 has since adopted a restricted form of designated initializers.)

  3. Features which are available in both C and C++, but have different semantics

    An example for this would be the linkage of const objects or inline functions.
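
To make the three categories concrete, here is a small C99 sketch. The function name `incompat_demo` and the type `struct point` are made up for illustration. Each numbered comment marks a line that compiles as C99 but is either rejected by a C++11 compiler or means something different there. The third category is shown with `sizeof 'a'` rather than the linkage examples mentioned above, simply because it fits on one line.

```c
#include <assert.h>
#include <stdlib.h>

struct point { int x, y; };

/* Returns 1 when compiled as C; each numbered line is a C/C++ incompatibility. */
int incompat_demo(void)
{
    /* 1. Legal C, illegal C++: 'new' is a C++ keyword, and C++ has no
     *    implicit conversion from void * to int *. */
    int *new = malloc(sizeof *new);
    if (new == NULL)
        return 0;
    *new = 42;

    /* 2. C99 feature missing from C++11: designated initializers
     *    (deliberately written out of declaration order). */
    struct point p = { .y = 2, .x = 1 };
    assert(p.x == 1 && p.y == 2);

    /* 3. Same syntax, different meaning: a character constant has type
     *    int in C but type char in C++. */
    int same_as_int = (sizeof 'a' == sizeof(int));

    int ok = (*new == 42) && same_as_int;
    free(new);
    return ok;
}
```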

A list of incompatibilities between C99 and C++98 can be found here (which has already been mentioned by Mat).

While C++11 and C11 got closer on some fronts (variadic macros are now available in C++, variable-length arrays are now an optional C language feature), the list of incompatibilities has grown as well (e.g. generic selections in C and the auto type-specifier in C++).
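
As a rough illustration of both directions, the sketch below combines a variadic macro (C99, also accepted by C++11) with a generic selection (C11 only). The names `FORMAT_INTO`, `TYPE_NAME`, and `describe` are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Variadic macro: C99, and also valid in C++11. */
#define FORMAT_INTO(buf, len, ...) snprintf((buf), (len), __VA_ARGS__)

/* Generic selection: C11 only; C++ uses overloading or templates instead. */
#define TYPE_NAME(x) _Generic((x), int: "int", double: "double", default: "other")

/* Writes the type names of an int and a double constant into out. */
void describe(char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    FORMAT_INTO(out, out_len, "%s %s", TYPE_NAME(1), TYPE_NAME(2.5));
}
```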

As an aside, while Microsoft has taken some heat for the decision to abandon C (which is not a recent one), as far as I know no one in the open source community has actually taken steps to do something about it: It would be quite possible to provide many features of modern C via a C-to-C++ compiler, especially if you consider that some of them are trivial to implement. This is actually possible right now using Comeau C/C++, which does support C99.

However, it's not really a pressing issue: Personally, I'm quite comfortable with using GCC and Clang on Windows, and there are proprietary alternatives to MSVC as well, e.g. Pelles C or Intel's compiler.

(struct *) vs (void *) -- Function prototype equivalence in C11/C99

Considering ISO C alone: section 6.3.2.3 specifies which casts among pointer types are required not to lose information:

  • A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
  • A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
  • A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.

(emphasis mine) So, let's look at your code again, adding in some of the declarations from dirent.h:

struct dirent;
typedef /* opaque */ DIR;
extern struct dirent *readdir (DIR *);

struct dirent *(*gl_readdir)(void *);
gl_readdir = (struct dirent *(*)(void *))readdir;
DIR *x = /* ... */;
struct dirent *y = gl_readdir(x);

This casts a function pointer of type struct dirent *(*)(DIR *) to a function pointer of type struct dirent *(*)(void *) and then calls the converted pointer. Those two function pointer types are not compatible (in most cases, two types must be exactly the same to be "compatible"; there are a bunch of exceptions but none of them apply here) so the code has undefined behavior.

I want to emphasize that "they have basically the same storage/representation/alignment requirements" is NOT enough to avoid undefined behavior. The infamous sockaddr mess involves types with the same representation and alignment requirements, and even the same initial common subsequence, but struct sockaddr and struct sockaddr_in are still not compatible types, and reading the sa_family field of a struct sockaddr that was cast from a struct sockaddr_in is still undefined behavior.

In the general case, to avoid undefined behavior due to incompatible function pointer types, you have to write "glue" functions that convert back from void * to whatever concrete type is expected by the underlying procedure:

static struct dirent *
gl_readdir_glue (void *closure)
{
    return readdir((DIR *)closure);
}

gl_readdir = gl_readdir_glue;
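
For a version of this pattern that compiles on its own, here is a minimal sketch. The names `struct counter`, `counter_next`, `next_fn`, and `drain` are hypothetical stand-ins for `DIR`, `readdir`, the type of the `gl_readdir` hook, and the code that calls the hook; none of them belongs to a real API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical concrete type, standing in for DIR. */
struct counter { int remaining; };

/* Hypothetical typed function, standing in for readdir(). */
static int counter_next(struct counter *c)
{
    return c->remaining > 0 ? c->remaining-- : 0;
}

/* Generic hook type, standing in for the type of gl_readdir. */
typedef int (*next_fn)(void *);

/* Glue: has exactly the hook's type and converts void * back itself. */
static int counter_next_glue(void *closure)
{
    return counter_next((struct counter *)closure);
}

/* Generic caller: never inspects the closure, only passes it through. */
static int drain(next_fn next, void *closure)
{
    int calls = 0;
    while (next(closure) != 0)
        calls++;
    return calls;
}

/* Runs the generic caller over a counter holding n items. */
int drain_counter(int n)
{
    assert(n >= 0);
    struct counter c = { n };
    return drain(counter_next_glue, &c);
}
```

The call through `next` always uses a pointer whose type matches the function it points to, so no incompatible function pointer type is ever called.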

GLOB_ALTDIRFUNC is a GNU extension. Its specification was clearly (to me, anyway) written back in the days when nobody worried about the compiler optimizing based on the assumption that undefined behavior could never occur, so I do not think you should assume that the compiler will Do What You Mean with gl_readdir = (struct dirent *(*)(void *))readdir; If you are writing code that uses GLOB_ALTDIRFUNC, write the glue functions.

If you are implementing GLOB_ALTDIRFUNC, just store the void * you get from the gl_opendir hook in a variable of type void *, and pass it directly to the gl_readdir and gl_closedir hooks. Don't try to guess what the caller wants it to be.


EDIT: The code in the link is, in fact, an implementation of glob. What it does is reduce the non-GLOB_ALTDIRFUNC case to the GLOB_ALTDIRFUNC case by setting the hooks itself. And it doesn't have the glue functions I recommended, it has gl_readdir = (struct dirent *(*)(void *))readdir; I wouldn't have done it that way, but it is true that this particular class of undefined behavior is unlikely to cause problems with the compilers and optimization levels that are typically used for C library implementations.

Can C++ code be valid in both C++03 and C++11 but do different things?

The answer is a definite yes. On the plus side there is:

  • Code that previously implicitly copied objects will now implicitly move them when possible.

On the negative side, several examples are listed in Annex C of the standard. Even though there are many more negative ones than positive, each one of them is much less likely to occur.

String literals

#define u8 "abc"
const char* s = u8"def"; // Previously "abcdef", now "def"

and

#define _x "there"
"hello "_x // Previously "hello there", now a user defined string literal

Type conversions of 0

In C++11, only literals are integer null pointer constants:

void f(void *); // #1
void f(...); // #2
template<int N> void g() {
    f(0*N); // Calls #2; used to call #1
}

Rounded results after integer division and modulo

In C++03 the compiler was allowed to round either towards 0 or towards negative infinity. In C++11 it is mandatory to round towards 0.

int i = (-1) / 2; // Might have been -1 in C++03, is now ensured to be 0
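
C went through the same change: C89 left the rounding direction for negative operands implementation-defined, while C99 requires truncation towards zero. A minimal C sketch follows, showing how to get floor-style rounding explicitly when code depends on it. The helper name `floor_div` is hypothetical, and overflow for `INT_MIN / -1` is not handled.

```c
#include <assert.h>

/* Division rounding towards negative infinity, built on the truncating '/'. */
int floor_div(int a, int b)
{
    assert(b != 0);
    int q = a / b;   /* truncates towards zero in C99 and C++11 */
    int r = a % b;   /* remainder takes the sign of the dividend */
    if (r != 0 && ((r < 0) != (b < 0)))
        q--;         /* exact quotient was negative and inexact: step down */
    return q;
}
```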

Whitespace between nested template closing angle brackets >> vs > >

Inside a specialization or instantiation the >> might instead be interpreted as a right-shift in C++03. This is more likely to break existing code though: (from http://gustedt.wordpress.com/2013/12/15/a-disimprovement-observed-from-the-outside-right-angle-brackets/)

template< unsigned len > unsigned int fun(unsigned int x);
typedef unsigned int (*fun_t)(unsigned int);
template< fun_t f > unsigned int fon(unsigned int x);

void total(void) {
    // fon<fun<9> >(1) >> 2 in both standards
    unsigned int A = fon< fun< 9 > >(1) >>(2);
    // fon<fun<4> >(2) in C++03
    // Compile time error in C++11
    unsigned int B = fon< fun< 9 >>(1) > >(2);
}

Operator new may now throw other exceptions than std::bad_alloc

struct foo { void *operator new(size_t x){ throw std::exception(); } };
try {
    foo *f = new foo();
} catch (std::bad_alloc &) {
    // c++03 code
} catch (std::exception &) {
    // c++11 code
}

User-declared destructors have an implicit exception specification
example from What breaking changes are introduced in C++11?

struct A {
    ~A() { throw "foo"; } // Calls std::terminate in C++11
};
//...
try {
    A a;
} catch(...) {
    // C++03 will catch the exception
}

size() of containers is now required to run in O(1)

std::list<double> list;
// ...
size_t s = list.size(); // Might be an O(n) operation in C++03

std::ios_base::failure does not derive directly from std::exception anymore

While the direct base-class is new, std::runtime_error is not. Thus:

try {
    std::cin >> variable; // exceptions enabled, and error here
} catch(std::runtime_error &) {
    std::cerr << "C++11\n";
} catch(std::ios_base::failure &) {
    std::cerr << "Pre-C++11\n";
}

Why the C standard C11 isn't default in gcc?

The answer is in the page you linked:

GCC supports three versions of the C standard, although support for the most recent version is not yet complete.

Support for C99 is substantially complete, but I think there are a couple of minor things that haven't been implemented yet. According to that page, they intend to make C11 with GNU extensions the default in a future version.

Does C++ include C99 or C89?

Currently, C++ includes neither in its entirety, although the common subset of C and C++ is significant. Some features are missing, and there are incompatible differences between the languages. The non-normative sections of the C++ standard "Compatibility / C++ and ISO C [diff.iso]" and "C standard library [diff.library]" list (some of) the differences.

However, C++ is "based" on the C standard and does refer to it. Here is a quote from the latest standard draft:

[intro.scope]

C++ is a general purpose programming language based on the C programming language as described in ISO/IEC 9899:2018 Programming languages — C (hereinafter referred to as the C standard).
C++ provides many facilities beyond those provided by C ...

[intro.refs]

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document.
For dated references, only the edition cited applies.
For undated references, the latest edition of the referenced document (including any amendments) applies.

  • ...
  • ISO/IEC 9899:2018, Programming languages — C
  • ...

The library described in ISO/IEC 9899:2018, Clause 7, is hereinafter called the C standard library.



What version of C++ has what version of C?

  • C++20 is based on C18
  • C++17 is based on C11
  • C++14 is based on C99
  • C++11 is based on C99
  • C++03 is based on C89/C90
  • C++98 is based on C89/C90

From a historical perspective, (pre-standard) C++ existed before C was standardised and thus originally couldn't have been based on any standard. Both languages have evolved from that common root and have diverged into their own directions.

Also, note that even though many C idioms work in C++, they are not idiomatic in C++ and are considered a bad practice. As far as I understand, the reason for the high degree of compatibility (beyond ability to write common headers) is the ease of porting a C program to C++ allowing it to be converted with small incremental steps. For example, there is never a reason to write NULL in C++ and hardly ever a reason to write malloc.

Can two structs in C99/C11 alias?

Your example is very close to creating two structs that are of compatible type (C11 6.2.7). But in order to be compatible, they must have the same members with the same names, and the struct tags must also be the same.

You don't have that, so (TL;DR) the structs in the question cannot alias.

Another thing one can play around with however, is a trick called common initial sequence (C11 6.5.2.3) where you can put both structs inside a union, which is visible in the translation unit. You'd then be allowed to inspect the first sequence of members of each struct type, until the point where they stop being compatible. You could do this:

typedef union
{
    struct hello h;
    struct world w;
} hack_t;

Then access individual members of either struct. Unfortunately, this rule is a bit exotic and compilers don't always support it well apparently - the rule was subject to some Defect Reports (DR). I'm not certain of its status in current C17.

But regardless of that trick, the union still makes it possible to lvalue access a struct hello or a struct world through a hack_t, since it is a union type that includes a compatible type among its members (C11 6.5/7). Mildly useful I suppose.
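
A minimal sketch of the union approach, assuming hypothetical definitions of `struct hello` and `struct world` that share a leading `int` member (the question's actual definitions are not reproduced here). The function names are invented as well.

```c
#include <assert.h>

/* Hypothetical layouts: both start with an int, then diverge. */
struct hello { int kind; float value; };
struct world { int kind; double value; };

/* The union declaration must be visible where the access happens. */
typedef union
{
    struct hello h;
    struct world w;
} hack_t;

/* Reads the shared leading member through the "other" struct. */
int kind_of(const hack_t *u)
{
    /* Covered by the common initial sequence rule (C11 6.5.2.3);
     * only 'kind' is shared, the 'value' members are not. */
    return u->w.kind;
}

/* Stores a struct hello in the union and inspects it via struct world. */
int store_hello_read_kind(int kind)
{
    hack_t u;
    u.h.kind = kind;
    u.h.value = 1.0f;
    assert(u.h.kind == u.w.kind);   /* both views agree on the shared member */
    return kind_of(&u);
}
```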

Apart from those cases, you cannot wildly cast pointer types from one pointer type to the other and de-reference. It would be a strict aliasing violation (C11 6.5/7) even if all individual members are compatible types. (You can however of course access any individual float member by de-referencing a float pointer to that member.)

Your restrict optimization does not apply since the structs can't alias, unless they are truly compatible types. So the compiler will assume they are always in different memory regions.

It doesn't matter if you use float or any other primitive data type, they all behave the same with the exception of the character types. A pointer to a character type can be used to access any data without strict aliasing violation (but not the other way around, reading character type through another incompatible type).
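
A short sketch of the character-type exception, with hypothetical helper names `float_bits` and `nonzero_bytes`. It also shows `memcpy`, which copies the object representation without any aliasing violation. The code assumes a 32-bit `float` and checks that assumption at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the object representation of f into a 32-bit integer. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);    /* byte copy, no aliasing violation */
    return bits;
}

/* Counts the non-zero bytes of *f, reading through a character type. */
int nonzero_bytes(const float *f)
{
    const unsigned char *p = (const unsigned char *)f;   /* always allowed */
    int count = 0;
    for (size_t i = 0; i < sizeof *f; i++)
        if (p[i] != 0)
            count++;
    return count;
}
```

Reading the same `float` through a `uint32_t *` obtained by a cast would be the violation described above.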

Why does C allow conversions between incompatible pointer types?

C99 doesn't permit implicit conversion between pointers of different types (except to/from void*). Here's what the C99 Rationale says:

It is invalid to convert a pointer to an object of any type to a pointer to an object of a different type without an explicit cast.

This is a consequence of the rules for assignment, which has a constraint that one of the following shall hold (when pointers are involved) (C99 6.5.16.1 "Simple assignment"):

  • both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
  • one operand is a pointer to an object or incomplete type and the other is a pointer to a qualified or unqualified version of void, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
  • the left operand is a pointer and the right is a null pointer constant

Passing a pointer as an argument to a prototyped function follows the same rules because (C99 6.5.2.2/7 "Function calls"):

If the expression that denotes the called function has a type that
does include a prototype, the arguments are implicitly converted, as
if by assignment, to the types of the corresponding parameters

Both C90 and C11 have similar wording.
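
The constraint can be seen in a few lines. In this sketch (the function name `sum_via_void` is hypothetical), the conversion from `void *` needs no cast, while the commented-out initialization violates the constraint and requires a diagnostic from a conforming compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints reached through a generic pointer. */
int sum_via_void(void *data, size_t n)
{
    int *ip = data;               /* void * -> int *: implicit, no cast needed */
    /* char *cp = ip; */          /* int * -> char *: constraint violation */
    char *cp = (char *)ip;        /* accepted with an explicit cast */
    assert((void *)cp == data);   /* still refers to the same address */

    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += ip[i];
    return total;
}
```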

I believe that many compilers (including GCC) relax this constraint to issue only a warning because there's too much legacy code that depends on it. Keep in mind that void* was an invention of the ANSI C standard, so pre-standard, and probably a lot of post-standard, code generally used char* or int* as a 'generic' pointer type.

What is the difference between the c99 and gcc commands with appropriate flags?

Try c99 --version on a typical Linux box. You will get the version and name of the compiler which is gcc.

c99 is just a shortcut to the c99 compliant compiler on your machine. That way you don't have to care about the actual compiler used. POSIX also requires some common command line options the compiler has to understand. If that is gcc, it shall enable c99 compliant features. This should be identical to gcc -std=c99.

gcc provides additional features which are enabled by default [1] when it is called by its native name, and by the -std=gnuXX option, on top of the corresponding CXX standard. For older versions, some of these extensions became part of the next C standard either directly or with slightly different syntax. A typical and appreciated extension for C90 is support for C++-style line-comments:

// this is not allowed in pure C90

For c99/gnu99 things are less obvious, but gnu99 might still add some useful features.

On other POSIX systems, e.g. Unix, you may find a different compiler. It shall still be available by the name c99.

Note that the current and only valid C standard is C11 since 2011. So if you want to use the new features (e.g. atomics, thread-support), you have to deviate from the pure POSIX-path. Yet it is likely POSIX might be updated some day.


[1] The default version of the C standard depends on the version of gcc: releases before 5 defaulted to gnu90 (C90 plus GNU extensions), while gcc 5.x defaults to gnu11.
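
To check which language version a given command actually selects, one option is to inspect the standard `__STDC_VERSION__` macro. A minimal sketch follows; the function name `c_standard_name` is hypothetical.

```c
#include <assert.h>

/* Returns a label for the C standard the compiler claims to implement. */
const char *c_standard_name(void)
{
#if !defined(__STDC_VERSION__)
    return "C90";            /* plain C90 does not define the macro */
#elif __STDC_VERSION__ >= 201710L
    return "C17 or later";
#elif __STDC_VERSION__ >= 201112L
    return "C11";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#else
    return "C95";            /* 199409L, Amendment 1 to C90 */
#endif
}
```

Printing the result from a program built once with c99 and once with plain gcc shows what each command selects. The macro does not reveal whether GNU extensions are enabled.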

What C99 features are considered harmful or unsupported

A number of C99 features are optional, so their lack isn't technically non-conforming. I won't distinguish below.

  • Hmm, Windows doesn't have <stdint.h>, although there is an open-source version of stdint.h for Microsoft. Even when the file is implemented, many of the individual types are missing.

  • Complex and imaginary support is often missing or broken.

  • Extended identifiers and wide characters can be problem points.
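
One way to cope with such gaps is to test for the features at compile time. The sketch below uses the standard `__STDC_NO_COMPLEX__` and `__STDC_NO_VLA__` macros. These were only introduced in C11, so a pure C99 compiler will not define them even when the feature is missing; treat the result as a hint. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>   /* itself one of the headers that may be missing */

/* 1 unless the implementation declares complex arithmetic unsupported. */
int claims_complex(void)
{
#ifdef __STDC_NO_COMPLEX__
    return 0;
#else
    return 1;
#endif
}

/* 1 unless the implementation declares variable-length arrays unsupported. */
int claims_vla(void)
{
#ifdef __STDC_NO_VLA__
    return 0;
#else
    return 1;
#endif
}

/* 1 if the optional exact-width type int32_t is provided. */
int has_int32(void)
{
#ifdef INT32_MAX
    return 1;
#else
    return 0;
#endif
}
```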

See this list of C99 feature issues in gcc.


