Pros & Cons of Putting All Code in Header Files in C++

Pros/Cons of using one big include file

I would say that in general this is a bad idea for several reasons:

  • It gives you poor encapsulation: clients should pull in only the headers they need. With this approach, a single inclusion pulls in everything, which, as Alok mentions, increases build time and sensitivity to rebuilds.
  • There's no distinction between interface classes and implementation classes, i.e. those that clients of your library use versus those used internally by the library that clients don't need to (and perhaps shouldn't) see.
  • If any of your headers define macros, these may now 'leak' into any other code that includes the header, which may be undesirable. Anyone who's ever had to type #undef MIN will know this pain (see the sketch after this list).
  • There's a possibility of recursive inclusion if several classes need to be aware of each other, so you may become sensitive to the order of inclusion or end up with include cycles.
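
For illustration, here is a minimal sketch of the macro-leak problem; legacy_math.h and the names in it are hypothetical:

/* legacy_math.h -- hypothetical header that defines a convenience macro */
#ifndef LEGACY_MATH_H
#define LEGACY_MATH_H
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

// client.cpp -- includes legacy_math.h, perhaps indirectly via one big header
#include "legacy_math.h"

// From here on, the preprocessor rewrites every token spelled MIN, so this
// perfectly legal template cannot even be declared until the macro is removed:
#undef MIN

template <typename T>
T MIN(T a, T b) { return a < b ? a : b; }  // only compiles after the #undef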

I think, though, there is one instance where it may be acceptable: if your library only provides a few classes/functions intended to be called by clients, and all the rest are internal classes used by the implementation, then clients can just include mylib.h and that's all they need to worry about. This also makes distribution easier if you compile your library as a static library, as you can ship the library plus a single header.
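
As a sketch, such a single public header might look like this (mylib.h, the mylib namespace, and Parser are illustrative names); the pimpl idiom keeps the internal classes out of the header entirely:

// mylib.h -- the only header clients ever include (illustrative)
#ifndef MYLIB_H
#define MYLIB_H

namespace mylib {

class Parser {
public:
    Parser();
    ~Parser();
    bool parse(const char* text);
private:
    class Impl;   // internal class: defined only inside the library's .cpp files
    Impl* impl_;  // clients never see the implementation details
};

} // namespace mylib

#endif // MYLIB_H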

What are the advantages and disadvantages of implementing classes in header files?

Possible advantages of putting everything in header files:

  • Less redundancy (which leads to easier changes, easier refactoring, etc.)
  • May give compiler/linker better opportunities for optimization
  • Often easier to incorporate into an existing project

Possible disadvantages of putting everything in header files:

  • Longer compile/link cycles
  • Loss of separation of interface and implementation
  • Could lead to hard-to-resolve circular dependencies
  • Lots of inlining could increase executable size
  • Prevents binary compatibility of shared libraries/DLLs
  • Upsets co-workers who prefer the traditional ways of using C++

Why shouldn't I put everything in headers?

Why can't I just put everything in the header and have no .cpp files?

Well, you must have at least one source file, or else you have nothing to compile.

But, to answer why you shouldn't put everything in a single source file: because the size of that source file grows linearly with the size of the entire program (splitting code into header files does not reduce this; what matters is the size after preprocessing). Therefore (re-)compiling it becomes increasingly slow, and any change to any part of that source file (that is, to any part of the entire program, including the headers it pulls in) requires you to recompile that source file, i.e. the entire program, again.

If you split the program into multiple source files, only the source files that were modified need to be recompiled. For big projects, this can reduce an hour of compilation to a minute, which is a boon when the typical workflow is "edit -> compile -> debug -> edit -> compile -> ...".
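
A minimal sketch of such a split (the names are illustrative): editing widget.cpp triggers recompilation of that one translation unit only, while files that merely include widget.h are untouched as long as the interface stays stable.

// widget.h -- stable interface; many .cpp files may include it
#ifndef WIDGET_H
#define WIDGET_H
class Widget {
public:
    int value() const;   // declaration only: callers need nothing more
private:
    int value_ = 42;
};
#endif

// widget.cpp -- the only file that must be recompiled when the body changes
#include "widget.h"
int Widget::value() const { return value_; }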

Dynamic linking takes this advantage even further: you can simply replace a dynamic library without even re-linking (as long as the new version is ABI compatible).


In fairness, let me also answer why you should put everything in a single source file: because it reduces the compilation time from scratch. If your workflow doesn't benefit from incremental rebuilds, then reducing the full compilation time even a little is better than nothing. It also allows better optimization, because a compiler cannot do inline expansion across source files (link-time optimization may reduce this advantage, if you can rely on it being available).


The ideal solution is probably neither to define all functions in a single massive source file nor to define every function in a separate source file of its own; the best option is probably somewhere in between.

A typical convention is to have a single source file for the member functions of each class, but there is no absolute reason why this convention must be followed. It is completely fine to define member functions of multiple classes in a single source file, and just as fine to divide the definitions of one class's member functions across several files, as long as you have an argument for doing so. A sketch of the first case follows.
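
For instance (the class names are made up for this sketch), two small, closely related classes can share one translation unit:

// shapes.h (illustrative)
class Circle {
public:
    double area() const;
private:
    double r_ = 1.0;
};

class Rectangle {
public:
    double area() const;
private:
    double w_ = 1.0, h_ = 1.0;
};

// shapes.cpp -- one source file covering both classes, because they form
// one small, cohesive unit
#include "shapes.h"

double Circle::area() const    { return 3.141592653589793 * r_ * r_; }
double Rectangle::area() const { return w_ * h_; }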


I thought this approach would be "convenient", because you stop hunting for a function: was it in a .cpp file? Damn, no: it was a template, so it was in the header... it's pretty frustrating!

This is not a strong argument compared to the compile-time considerations. Development environments are available (even for free, and have been for decades) that let you jump to the definition of a function declaration (or invocation) in a fraction of a second.

C - Header Files versus Functions

If you care about speed, you should first write a correct program, care about efficient algorithms (read Introduction to Algorithms), benchmark and profile it (perhaps using gprof and/or oprofile), and focus your efforts mostly on the few percent of the source code that are critical to performance.

You are better off defining these small critical functions as static inline functions in commonly included header files. The compiler is then able to inline every call to them if it wants to (and it needs access to the function's definition in order to inline it).

In general, small inlined functions often run faster, because there is no call overhead in the compiled machine code; sometimes the result might be slightly slower, because inlining increases machine-code size, which is detrimental to CPU cache efficiency (read about locality of reference). Also, a header file with many static inline functions takes more time to compile.


As a concrete example, my Linux system has a header /usr/include/glib-2.0/glib/gstring.h (from GLib, the utility library underneath GTK) containing:

/* -- optimize g_string_append_c --- */
#ifdef G_CAN_INLINE
static inline GString*
g_string_append_c_inline (GString *gstring,
                          gchar    c)
{
  if (gstring->len + 1 < gstring->allocated_len)
    {
      gstring->str[gstring->len++] = c;
      gstring->str[gstring->len] = 0;
    }
  else
    g_string_insert_c (gstring, -1, c);
  return gstring;
}
#define g_string_append_c(gstr,c)  g_string_append_c_inline (gstr, c)
#endif /* G_CAN_INLINE */

The G_CAN_INLINE preprocessor flag would have been enabled by some previously included header file.

It is a good example of an inline function: it is short (a dozen lines), and its own code runs quickly (excluding the time spent inside the call to g_string_insert_c), so it is worth defining as static inline.

It is not worth inlining a function whose body by itself takes significant time to run. There is no point inlining a matrix multiplication, for example: the call overhead is insignificant compared with the time to perform a 100x100, or even an 8x8, matrix multiplication. So choose carefully the functions you want to inline.
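
To illustrate the contrast (both functions are made up for this sketch):

// Worth inlining: a trivial body, so the call overhead would dominate.
static inline double square(double x) { return x * x; }

// Not worth inlining: the loop nest dwarfs any call overhead.
void mat_mul_8x8(const double a[8][8], const double b[8][8], double out[8][8])
{
    for (int i = 0; i < 8; ++i)
        for (int j = 0; j < 8; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 8; ++k)
                sum += a[i][k] * b[k][j];
            out[i][j] = sum;
        }
}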


You should trust the compiler and enable its optimizations (in particular when benchmarking or profiling). For GCC, that would mean compiling with gcc -O3 -march=native (and I also recommend -Wall -Wextra to get useful warnings). You might use link-time optimization by compiling and linking with gcc -flto -O3 -march=native.

Is it a good practice to always create a .cpp for each .h in a C++ project?

I wouldn't add unnecessary .cpp files. Each .cpp file you add must be compiled, which just slows down the build process.

In general, using your class will only require the header file anyway; I see no advantage to an "empty" .cpp file added just for consistency in the project.

Does storing code in header files cause memory management issues in C++?

... one man told me that storing all the code in .h files produces some memory management issues. [...] And I'll avoid this issue if I store the code in .h/.cpp files. Is that true?

Memory management usually refers to the handling of dynamic memory at runtime. For clarity: writing all your code in headers has nothing to do with that. However, doing so may indeed increase the amount of memory that the compiler uses. So, if that's what the man meant, then yes, it could potentially be true, but it's not the biggest problem with the approach.

Because I get too many duplicates of one class

This is a silly argument. In fact, using only header files for definitions means that there is exactly one translation unit where the class definition is included exactly once.

The potential problem is that your single translation unit includes all the definitions from all the header files. Since everything is processed in one go, the maximum memory required by the compiler is potentially higher. That is probably insignificant for most projects, but for something the size of LibreOffice it could well become a problem.


A bigger problem with defining all functions in headers, i.e. using a single translation unit, is that any change to any header, however small, changes that single massive translation unit, and you'll be required to compile it again. With multiple, smaller translation units, only the ones affected by the change need recompilation.

So, the problem with your approach will be that every recompilation is as slow as compiling from scratch. Sure, if the compilation takes a minute, it doesn't matter, but for projects that take hours to compile from scratch, it matters enormously.

Why does C++ need a separate header file?

You seem to be asking about separating definitions from declarations, although there are other uses for header files.

The answer is that C++ doesn't "need" this. If you mark everything inline (which is automatic anyway for member functions defined inside a class definition), then there is no need for the separation; you can simply define everything in the header files.
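
As a sketch, a completely header-only class might look like this (Point and dot are illustrative names):

// point.h -- header-only: declarations and definitions together (illustrative)
#ifndef POINT_H
#define POINT_H

class Point {
public:
    // Member functions defined inside the class body are implicitly inline,
    // so this header can be included from many .cpp files without violating
    // the one-definition rule.
    double x() const { return x_; }
    double y() const { return y_; }
private:
    double x_ = 0.0, y_ = 0.0;
};

// A free function at namespace scope needs an explicit 'inline'.
inline double dot(const Point& a, const Point& b)
{
    return a.x() * b.x() + a.y() * b.y();
}

#endif // POINT_H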

The reasons you might want to separate are:

  1. To improve build times.
  2. To link against code without having the source for the definitions.
  3. To avoid marking everything "inline".

If your more general question is, "why isn't C++ identical to Java?", then I have to ask, "why are you writing C++ instead of Java?" ;-p

More seriously, though, the reason is that the C++ compiler can't just reach into another translation unit and figure out how to use its symbols, in the way that javac can and does. The header file is needed to declare to the compiler what it can expect to be available at link time.

So #include is a straight textual substitution. If you define everything in header files, the preprocessor ends up creating an enormous copy-and-paste of every source file in your project and feeding that into the compiler. The fact that the C++ standard was ratified in 1998 has nothing to do with this; it's the fact that C++'s compilation environment is based so closely on that of C.

Converting my comments to answer your follow-up question:

How does the compiler find the .cpp file with the code in it?

It doesn't, at least not at the time it compiles the code that uses the header file. The functions you're linking against don't even need to have been written yet, never mind the compiler knowing which .cpp file they'll be in. Everything the calling code needs to know at compile time is expressed in the function declaration. At link time you provide a list of .o files, or static or dynamic libraries, and the header is in effect a promise that the definitions of the functions will be in there somewhere.
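
A minimal sketch of that promise in action (file and function names are made up):

// math_utils.h -- the declaration: a promise that a definition exists somewhere
int triple(int x);

// main.cpp -- compiles against the declaration alone; no .cpp is consulted
#include "math_utils.h"
int main() { return triple(2); }

// math_utils.cpp -- could be written later, or shipped inside a library
#include "math_utils.h"
int triple(int x) { return 3 * x; }

Only the link step (for example, g++ main.o math_utils.o) matches the call in main.o to the definition in math_utils.o.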


