Why Does C++ Need a Separate Header File

Why does C++ need a separate header file?

You seem to be asking about separating definitions from declarations, although there are other uses for header files.

The answer is that C++ doesn't "need" this. If you mark everything inline (which is automatic anyway for member functions defined in a class definition), then there is no need for the separation. You can just define everything in the header files.

The reasons you might want to separate are:

  1. To improve build times.
  2. To link against code without having the source for the definitions.
  3. To avoid marking everything "inline".
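To make the "mark everything inline" option concrete, here is a minimal sketch (file name hypothetical) of a header-only definition. The `inline` keyword relaxes the one-definition rule so the same definition may appear in every translation unit that includes the header:

```cpp
// square.h (hypothetical header-only library)
#pragma once

// Because it is marked inline, this full definition may be included
// into many .cpp files without causing "multiple definition" errors
// at link time.
inline int square(int x) { return x * x; }
```

A member function defined inside a class body gets this treatment implicitly, which is why header-only class libraries work without any explicit `inline` keywords.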

If your more general question is, "why isn't C++ identical to Java?", then I have to ask, "why are you writing C++ instead of Java?" ;-p

More seriously, though, the reason is that the C++ compiler can't just reach into another translation unit and figure out how to use its symbols, in the way that javac can and does. The header file is needed to declare to the compiler what it can expect to be available at link time.

Also, #include is a straight textual substitution. If you define everything in header files, the preprocessor ends up creating an enormous copy-and-paste of every source file in your project and feeding that into the compiler. The fact that the C++ standard was ratified in 1998 has nothing to do with this; it's that the compilation environment for C++ is based so closely on that of C.

Converting my comments to answer your follow-up question:

How does the compiler find the .cpp file with the code in it

It doesn't, at least not at the time it compiles the code that used the header file. The functions you're linking against don't even need to have been written yet, never mind the compiler knowing what .cpp file they'll be in. Everything the calling code needs to know at compile time is expressed in the function declaration. At link time you will provide a list of .o files, or static or dynamic libraries, and the header in effect is a promise that the definitions of the functions will be in there somewhere.
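A single-file sketch (names hypothetical) shows what that "promise" looks like. The compiler can emit the call in `call_site` knowing only the declaration; the definition below could just as well live in a different .cpp file, or in a library, and the linker would resolve the call either way:

```cpp
// What a header would provide: a declaration. It tells the compiler
// the name, return type, and parameter types -- everything needed to
// emit a call, and nothing about where the body lives.
int add(int a, int b);

int call_site() {
    // Compiles against the declaration alone; the linker will find
    // the definition later, wherever it ends up.
    return add(2, 3);
}

// In a real project this definition would typically sit in its own
// .cpp file; it is placed here only so the sketch is self-contained.
int add(int a, int b) { return a + b; }
```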

Why are function bodies in C/C++ placed in separate source code files instead of headers?

Function bodies are placed into .cpp files to achieve the following:

  1. To make the compiler parse and compile them only once, instead of forcing it to compile them again and again everywhere the header file is included. Additionally, with header-only implementations, the linker later has to detect and eliminate the identical external-linkage functions that arrive in different object files.

    Header pre-compilation facilities implemented by many modern compilers might significantly reduce the wasted effort required for repetitive recompilation of the same header file, but they don't entirely eliminate the issue.

  2. To hide the implementations of these functions from the future users of the module or library. Implementation hiding techniques help to enforce certain programming discipline, which reduces parasitic inter-dependencies between modules and thus leads to cleaner code and faster compilation times.

    I'd even say that even if users have access to the full source code of the library (i.e. nothing is really "hidden" from them), a clean separation between what is supposed to be visible through header files and what is not is beneficial to the library's self-documenting properties (although such separation is achievable in header-only libraries as well).

  3. To make some functions "invisible" to the outside world (i.e. internal linkage, not immediately relevant to your example with class methods).

  4. Non-inline functions residing in a specific translation unit can be subjected to certain context-dependent optimizations. For example, two different functions with identical tail portions can end up "sharing" the machine code implementing these identical tails.

    Functions declared as inline in header files are compiled multiple times in different translation units (i.e. in different contexts) and have to be eliminated by the linker later, which makes it more difficult (if at all possible) to take advantage of such optimization opportunities.

  5. Other reasons I might have missed.
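Point 3 above, internal linkage, can be sketched in a few lines (names hypothetical). Both `static` at namespace scope and an anonymous namespace make a name invisible to other translation units, so it can never collide with a same-named symbol elsewhere in the program:

```cpp
// Internal linkage, C style: this variable is private to this
// translation unit.
static int counter = 0;

// Internal linkage, C++ style: everything in an anonymous namespace
// is likewise invisible to other translation units.
namespace {
    int helper(int x) { return x * 2; }
}

// External linkage: this is the only symbol here that other
// translation units (and therefore headers) can see.
int visible_api(int x) {
    ++counter;
    return helper(x);
}
```

Only `visible_api` would ever be declared in a header; `counter` and `helper` are implementation details that the outside world cannot even name.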

Why does C/C++ have header files unlike other languages like C# and Java?

When the C language was designed, computers had very few resources, and it was important to keep processing time and memory usage to a minimum. A design that would enable a compiler to read the source file and compile it "on the fly", in a single pass, with minimal resource usage, was preferred to designs that made the compiler either do multiple passes over the source code or construct a large data structure in memory before emitting compiled code. Header files enable compiling in a single pass: they give you a way to declare and use a symbol while its definition may come later in the source code, or even in a different file or external library.
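The single-pass constraint is easiest to see with mutually recursive functions (a contrived example; names are hypothetical). By the time the compiler reaches a call, it must already have seen a declaration for the callee; without the forward declaration below, the call to `is_even` inside `is_odd` would not compile:

```cpp
// Forward declaration: required, because is_odd calls is_even before
// the compiler has seen its definition.
bool is_even(unsigned n);

bool is_odd(unsigned n)  { return n == 0 ? false : is_even(n - 1); }
bool is_even(unsigned n) { return n == 0 ? true  : is_odd(n - 1); }
```

A header file is essentially this same trick, factored out: a bundle of declarations that lets the compiler proceed in one pass while the definitions live elsewhere.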

The design of newer languages such as C# and Java considered programmer convenience more important than optimizing the compiler's resource usage. Also, many modern C compilers already use multiple passes, because it's impossible to apply many code optimizations in a single pass; a single-pass design for new languages would have little benefit nowadays.

Header files also have other benefits: they allow you to separate interface from implementation, so that the header file serves as API documentation without exposing implementation details.

What is the point of header files in C?

Header files are needed to declare functions and variables that are available. You might not have access to the definitions (=the .c files) at all; C supports binary-only distribution of code in libraries.
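A sketch of what a binary-only library's header might contain (all names hypothetical): declarations only, for both functions and variables. The consumer compiles against these; the definitions ship pre-compiled in a .a/.so/.lib file:

```cpp
// What the shipped header would declare -- the consumer sees only this:
extern int error_count;        // declaration: storage is defined elsewhere
int parse(const char* s);      // declaration: the body is defined elsewhere

// In a real binary-only library the definitions below would have been
// compiled by the vendor and never shipped as source; they appear here
// only so this sketch links and runs.
int error_count = 0;
int parse(const char* s) { return (s && *s) ? 1 : (++error_count, 0); }
```

Note the `extern` keyword on the variable: without it, every including translation unit would try to *define* `error_count`, producing multiple-definition errors at link time.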

Why would I compile 2 C/C++ source files instead of using a header file?

The human mind can't hold very much information at a time, so we chop things up into smaller, logical and coherent pieces.

OK. So: one main.cpp that includes the dozens or hundreds or thousands of files in the program, all implemented in header files of a reasonable size, each covering one concept or aggregating more header files should that one concept be too broad to be easily described1 in a single header. Problem solved, right? Yup. But that's only one problem.

What about resource consumption?

It's helpful to read How does the compilation/linking process work? before continuing.

During preprocessing, every #include directive is replaced with the contents of the included file. The result is one mammoth file that's fed into the compiler, and that takes up a lot of memory. Further, if one file includes everything as headers, then every time you make a change, no matter how small, this one file will need to be rebuilt. It pulls in the whole project's worth of headers, so that one change causes ALL of the files to be recompiled. This gets very time-consuming.

Memory keeps getting cheaper, so constraining the first resource isn't as important as it was in the 1970s when all of this was being invented2. It still rears its ugly head from time to time, though. That's why I cross-compile on a big, fat PC rather than building code directly on a Raspberry Pi.

Time doesn't get cheaper. Never has, never will.

But if you follow the best practices and headers contain an interface (what it does) rather than an implementation (how it does it) you'll find that the header doesn't change much. What changes most of the time is the how-to details in the implementation files. A small change is confined to the one implementation file that provides that changed behaviour. Likely this is the only file that needs to be recompiled. This is a huge improvement from every file every time to one or two files every time.

On the rare occasions where the interface changes, well, you suck it up.

1 Because that's what code is: a description of behaviour. It's not a list of instructions for the computer to execute--that's the compiler's output--it's a description of the behaviour of the program. The compiler's job is to turn your description into those instructions.

2 This also explains why building in more recently created languages is much less complicated3. They don't have baggage left over from having to make things work in the glorious days of 1/2 K of RAM and CPUs clocked in the kilohertz. They also learned from a lot of mistakes.

3 For the end user, anyway. The back end of a modern build system is some crazy code, man.


