What Is the Purpose of Using -pedantic in the gcc/g++ Compiler

Why use gcc and g++ compiler drivers for C and C++

The only differences between gcc and g++ are that:

  • when the driver is used to invoke the linker, g++ causes libstdc++ to be linked as part of "stdlibs", while gcc will link only libc.
  • g++ will compile .c, .h and .i files as C++ unless the -x option is specified.

Both drivers will compile C or C++ depending on either the filename extension, or command-line switches. If you invoke the compiler-driver for compilation only and invoke the linker (ld) directly, using gcc or g++ -x, it makes no difference which you use.

Equally, if you invoke the gcc driver for C++ code and explicitly link libstdc++, it also makes no difference, so long as your crt0.o is not C-only (a C++ runtime start-up must invoke global static constructors before main()); this is likely to already be the case.

The definitive word from the documentation:

3.3 Compiling C++ Programs

C++ source files conventionally use one of the suffixes ‘.C’, ‘.cc’, ‘.cpp’, ‘.CPP’, ‘.c++’, ‘.cp’, or ‘.cxx’;
C++ header files often use ‘.hh’, ‘.hpp’, ‘.H’, or (for shared
template code) ‘.tcc’; and preprocessed C++ files use the suffix
‘.ii’. GCC recognizes files with these names and compiles them as C++
programs even if you call the compiler the same way as for compiling C
programs (usually with the name gcc).

However, the use of gcc does not add the C++ library. g++ is a program
that calls GCC and automatically specifies linking against the C++
library. It treats ‘.c’, ‘.h’ and ‘.i’ files as C++ source files
instead of C source files unless -x is used. This program is also
useful when precompiling a C header file with a ‘.h’ extension for use
in C++ compilations. On many systems, g++ is also installed with the
name c++.

When you compile C++ programs, you may specify many of the same
command-line options that you use for compiling programs in any
language; or command-line options meaningful for C and related
languages; or options that are meaningful only for C++ programs. See
Options Controlling C Dialect, for explanations of options for
languages related to C. See Options Controlling C++ Dialect, for
explanations of options that are meaningful only for C++ programs.

If you want to use just one, I suggest you use gcc and separately invoke the linker or explicitly link with -lstdc++. That way the compilation mode will be dependent on the filename extension. Using g++ -x to compile C code is just going to cause confusion.

gcc -g: what will happen

That's kind of right, but incomplete. -g requests that the compiler and linker generate and retain source-level debugging/symbol information in the executable itself.

If...

  • the program happens to later crash and produce a core file (which suggests some problem in the actual code), or
  • a deliberate OS command forced it to core (e.g. kill -SIGQUIT pid), or
  • the program calls a function that dumps core (e.g. abort)

...- none of which are actually caused by the use of -g - then the debugger will know how to read that "-g" symbol information from the executable and cross-reference it with the core. This means you can see the proper names of variables and functions in your stack frames, get line numbers and see the source as you step around in the executable.

That debug information is useful whenever debugging - whether you started with a core or just the executable alone. It even helps produce better output from commands like pstack.

Note that your environment may have other settings to control whether cores are generated (they can be big, and there's no general way to know if/when they can be removed, so they're not always wanted). For example, on UNIX/LINUX shells it's often ulimit -c.

You may also be interested to read about DWARF (see its Wikipedia article), a commonly used debugging information format for encoding the embedded debug/symbol information in executable/library objects (e.g. on UNIX and Linux).

UPDATE per Victor's request in comments...

Symbol information lists identifiers from the source code (usually only after any name mangling needed), the (virtual) memory addresses/offsets at which they'll be loaded in the process memory, and the type (e.g. data vs. code). For example...

$ cat ok.cc
int g_my_num;
namespace NS { int ns_my_num = 2; }
int f() { return g_my_num + NS::ns_my_num; }
int main() { return f(); }

$ g++ -g ok.cc -o ok # compile ok executable with symbol info

$ nm ok # show mangled identifiers
00000000004017c8 d _DYNAMIC
0000000000401960 d _GLOBAL_OFFSET_TABLE_
0000000000400478 R _IO_stdin_used
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
000000000040037c T _Z1fv # this is f()
0000000000401798 D _ZN2NS9ns_my_numE # this is NS::ns_my_num
00000000004017a8 d __CTOR_END__
00000000004017a0 d __CTOR_LIST__
00000000004017b8 d __DTOR_END__
00000000004017b0 d __DTOR_LIST__
0000000000400540 r __FRAME_END__
00000000004017c0 d __JCR_END__
00000000004017c0 d __JCR_LIST__
00000000004017c8 d __TMC_END__
00000000004017c8 d __TMC_LIST__
0000000000401980 A __bss_start
0000000000401788 D __data_start
0000000000400440 t __do_global_ctors_aux
00000000004002e0 t __do_global_dtors_aux
0000000000401790 d __dso_handle
0000000000000000 a __fini_array_end
0000000000000000 a __fini_array_start
w __gmon_start__
0000000000000000 a __init_array_end
0000000000000000 a __init_array_start
00000000004003a0 T __libc_csu_fini
00000000004003b0 T __libc_csu_init
U __libc_start_main
0000000000000000 a __preinit_array_end
0000000000000000 a __preinit_array_start
0000000000401980 A _edata
0000000000401994 A _end
0000000000400494 T _fini
000000000040047c T _init
0000000000400220 T _start
000000000040024c t call_gmon_start
0000000000401980 b completed.6118
0000000000401788 W data_start
0000000000400270 t deregister_tm_clones
0000000000401988 b dtor_idx.6120
0000000000401994 A end
0000000000400350 t frame_dummy
0000000000401990 B g_my_num # our global g_my_num
0000000000400390 T main # the int main() function
00000000004002a0 t register_tm_clones

$ nm ok | c++filt # c++filt "unmangles" identifiers...
00000000004017c8 d _DYNAMIC
0000000000401960 d _GLOBAL_OFFSET_TABLE_
0000000000400478 R _IO_stdin_used
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
000000000040037c T f()
0000000000401798 D NS::ns_my_num
00000000004017a8 d __CTOR_END__
00000000004017a0 d __CTOR_LIST__
00000000004017b8 d __DTOR_END__
00000000004017b0 d __DTOR_LIST__
0000000000400540 r __FRAME_END__
00000000004017c0 d __JCR_END__
00000000004017c0 d __JCR_LIST__
00000000004017c8 d __TMC_END__
00000000004017c8 d __TMC_LIST__
0000000000401980 A __bss_start
0000000000401788 D __data_start
0000000000400440 t __do_global_ctors_aux
00000000004002e0 t __do_global_dtors_aux
0000000000401790 d __dso_handle
0000000000000000 a __fini_array_end
0000000000000000 a __fini_array_start
w __gmon_start__
0000000000000000 a __init_array_end
0000000000000000 a __init_array_start
00000000004003a0 T __libc_csu_fini
00000000004003b0 T __libc_csu_init
U __libc_start_main
0000000000000000 a __preinit_array_end
0000000000000000 a __preinit_array_start
0000000000401980 A _edata
0000000000401994 A _end
0000000000400494 T _fini
000000000040047c T _init
0000000000400220 T _start
000000000040024c t call_gmon_start
0000000000401980 b completed.6118
0000000000401788 W data_start
0000000000400270 t deregister_tm_clones
0000000000401988 b dtor_idx.6120
0000000000401994 A end
0000000000400350 t frame_dummy
0000000000401990 B g_my_num
0000000000400390 T main
00000000004002a0 t register_tm_clones

Notes:

  • our functions f() and main() are type T (which stands for "TEXT" - used for read-only non-zero memory content whether it's actually text or other data or executable code),
  • g_my_num is B being a global with implicitly zero-ed out memory, while
  • NS::ns_my_num is D as the executable has to explicitly provide the value 2 to occupy that memory.

The man/info-page for nm documents these things further....

When to use the -g flag to GCC

The GNU ld documentation says that -g will be ignored, so it doesn't make much sense to pass it. In general you pass -g to gcc (which really is a front-end for the whole compilation process and not just a compiler) and it will take care of it.

What does -g option do in gcc

It makes the compiler add debug information to the resulting binaries. This information allows a debugger to associate the instructions in the code with source code files and line numbers. Having debug symbols makes certain kinds of debugging (like stepping through code) much easier, or possible at all.

The -g option actually has a few tunable parameters, check the manual. Also, it's most useful if you don't optimize the code, so use -O0 or -Og (in newer versions) - optimizations break the connection between instructions and source code. (Most importantly you have to not omit frame pointers from function calls, which is a popular optimization but basically completely ruins the ability to walk up the call stack.)

The debug symbols themselves are written in a standardized language (I think it's DWARF2), and there are libraries for reading that. A program could even read its own debug symbols at runtime, for instance.

Debug symbols (as well as other kinds of symbols like function names) can be removed from a binary later on with the strip command. However, since you'll usually combine debug symbols with unoptimized builds, there's not much point in that - rather, you'd build a release binary with different optimizations and without symbols from the start.

Other compilers such as MSVC don't include debug information in the binary itself, but rather store it in a separate file and/or a "symbol server" -- so if the home user's application crashes and you get the core dump, you can pull up the symbols from your server and get a readable stack trace. GCC might add a feature like that in the future; I've seen some discussions about it.

What is the difference between g++ and gcc?

gcc and g++ are compiler-drivers of the GNU Compiler Collection (which was once upon a time just the GNU C Compiler).

Even though they automatically determine which backends (cc1 cc1plus ...) to call depending on the file-type, unless overridden with -x language, they have some differences.

The probably most important difference in their defaults is which libraries they link against automatically.

According to GCC's online documentation (link options and how g++ is invoked), g++ is equivalent to gcc -xc++ -lstdc++ -shared-libgcc (the first is a compiler option, the latter two are linker options). This can be checked by running both with the -v option (it displays the backend toolchain commands being run).

What's the difference between gcc and g++/gcc-c++?

gcc will compile C source files as C and C++ source files as C++ if the file has an appropriate extension; however it will not link in the C++ library automatically.

g++ will automatically include the C++ library; by default it will also compile files with extensions that indicate they are C source as C++, instead of as C.

From http://gcc.gnu.org/onlinedocs/gcc/Invoking-G_002b_002b.html#Invoking-G_002b_002b:

C++ source files conventionally use one of the suffixes .C, .cc, .cpp, .CPP, .c++, .cp, or .cxx; C++ header files often use .hh, .hpp, .H, or (for shared template code) .tcc; and preprocessed C++ files use the suffix .ii. GCC recognizes files with these names and compiles them as C++ programs even if you call the compiler the same way as for compiling C programs (usually with the name gcc).

However, the use of gcc does not add the C++ library. g++ is a program that calls GCC and treats .c, .h and .i files as C++ source files instead of C source files unless -x is used, and automatically specifies linking against the C++ library. This program is also useful when precompiling a C header file with a .h extension for use in C++ compilations.

For example, to compile a simple C++ program that writes to the std::cout stream, I can use either (MinGW on Windows):

  • g++ -o test.exe test.cpp
  • gcc -o test.exe test.cpp -lstdc++

But if I try:

  • gcc -o test.exe test.cpp

I get undefined references at link time.

And for the other difference, the following C program:

#include <stdlib.h>
#include <stdio.h>

int main()
{
    int* new;
    int* p = malloc(sizeof(int));

    *p = 42;
    new = p;

    printf("The answer: %d\n", *new);

    return 0;
}

compiles and runs fine using:

  • gcc -o test.exe test.c

But gives several errors when compiled using:

  • g++ -o test.exe test.c

Errors:

test.c: In function 'int main()':
test.c:6:10: error: expected unqualified-id before 'new'
test.c:6:10: error: expected initializer before 'new'
test.c:7:32: error: invalid conversion from 'void*' to 'int*'
test.c:10:9: error: expected type-specifier before '=' token
test.c:10:11: error: lvalue required as left operand of assignment
test.c:12:36: error: expected type-specifier before ')' token

Why is GHC distributed with gcc and g++?

GHC is generally compatible with many/several versions of GCC (the incompatibilities appear when using the evil mangler).

If you try using other C compilers, you'll have a few low level issues to contend with (flags, asm formats).

Note that more recent GHCs deprecate the C backend in favor of the LLVM backend, making this somewhat moot for day-to-day Haskell development.

Are there any downsides to compiling with -g flag?

If you use -g (which on recent GCC or Clang can be used with optimization flags like -O2):

  • compilation time is slower (and linking will use a lot more memory)
  • the executable is a bigger file (see elf(5) and use readelf(1)...)
  • the executable carries a lot of information about your source code.
  • you can use GDB easily
  • some interesting libraries, like Ian Taylor's libbacktrace, require DWARF information (e.g. -g)

If you don't use -g it would be harder to use the GDB debugger (but possible).

So if you transmit the binary executable to a partner that should not understand how your source code was written, you need to avoid -g.

See also the strip(1) and strace(1) commands.

Notice that using the -g flag for debugging information is also valid for OCaml and Rust.

PS. Recent GCC versions (e.g. GCC 10 or GCC 11 in 2021) accept many debug flags. With -g3 your executable carries more debug information (e.g. descriptions of C and C++ preprocessor macros and their expansions) than with -g or -g1. Of course, compilation time increases, and so does executable size. In principle, your GCC plugin (perhaps Bismon in 2021, or those inside the source code of the Linux kernel) could add even more debug information. In practice, you won't do that unless you can improve your debugger. However, a GCC plugin (or some #pragmas) can remove some debug information (e.g. remove debug information for a selected set of functions).

Difference between CC, gcc and g++?

The answer to this is platform-specific; what happens on Linux is different from what happens on Solaris, for example.

The easy part (because it is not platform-specific) is the separation of 'gcc' and 'g++':

  • gcc is the GNU C Compiler from the GCC (GNU Compiler Collection).
  • g++ is the GNU C++ Compiler from the GCC.

The hard part, because it is platform-specific, is the meaning of 'CC' (and 'cc').

  • On Solaris, CC is normally the name of the Sun C++ compiler.
  • On Solaris, cc is normally the name of the Sun C compiler.
  • On Linux, if it exists, CC is probably a link to g++.
  • On Linux, cc is a link to gcc.

However, even on Solaris, it could be that cc is the old BSD-based C compiler from /usr/ucb. In practice, that usually isn't installed and there's just a stub that fails, wreaking havoc on those who try to compile and install self-configuring software.

On HP-UX, the default 'cc' is still a K&R-only C compiler installed to permit relinking of the kernel when necessary, and unusable for modern software work because it doesn't support standard C. You have to use alternative compiler names ('acc' IIRC). Similarly, on AIX, the system C compiler goes by names such as 'xlc' or 'xlc32'.

Classically, the default system compiler was called 'cc' and self-configuring software falls back on that name when it doesn't know what else to use.

POSIX attempted to legislate its way around this by requiring the programs c89 (originally) and later c99 to exist; these are the compilers compatible with the ISO/IEC 9899:1989 and 9899:1999 C standards. It is doubtful that POSIX succeeded.


The question asks about the differences in terms of features and libraries. As before, the answer is platform specific in part, and generic in part.

The big divide is between the C compilers and the C++ compilers. The C++ compilers will accept C++ programs and will not compile arbitrary C programs. (Although it is possible to write C in a subset that is also understood by C++, many C programs are not valid C++ programs). Similarly, the C compilers will accept C programs and will reject most C++ programs (because most C++ programs use constructs not available in C).

The set of libraries available for use depends on the language. C++ programs can usually use C libraries on a given platform; C programs cannot usually use C++ libraries. So, C++ has a larger set of libraries available.

Note that if you are on Solaris, the object code produced by CC is not compatible with the object code produced by g++ -- they are two separate compilers with separate conventions for things such as exception handling and name mangling (and the name mangling is deliberately different to ensure that incompatible object files are not linked together!). This means that if you want to use a library compiled with CC, you must compile your whole program with CC. It also means that if you want to use one library compiled with CC and another compiled with g++, you are out of luck. You have to recompile one of the libraries at least.

In terms of quality of assembler generated, the GCC (GNU Compiler Collection) does a very good job. But sometimes the native compilers work a bit better. The Intel compilers have more extensive optimizations that have not yet been replicated in GCC, I believe. But any such pontifications are hazardous while we do not know what platform you are concerned with.

In terms of language features, the compilers all generally hew fairly close to the current standards (C++98, C++03, C99), but there are usually small differences between the standard language and the language supported by the compiler. The older C89 standard support is essentially the same (and complete) for all C compilers. There are differences in the darker corners of the language. You need to understand 'undefined behaviour', 'implementation-defined behaviour' and 'unspecified behaviour'; if you invoke undefined behaviour, you may get different results at different times. There are also many options (especially with the GCC) to tweak the behaviour of the compiler. The GCC has a variety of extensions that make life simpler if you know you are only targeting that compiler family.


