C/C++ Header and Implementation Files: How Do They Work

C/C++ header and implementation files: How do they work?

The header file declares functions/classes - i.e. tells the compiler when it is compiling a .cpp file what functions/classes are available.

The .cpp file defines those functions - i.e. the compiler compiles the code and therefore produces the actual machine code to perform those actions that are declared in the corresponding .hpp file.

In your example, main.cpp includes a .hpp file. The preprocessor replaces the #include with the contents of the .hpp file. This file tells the compiler that the function myfunction is defined elsewhere and it takes one parameter (an int) and returns an int.

So when you compile main.cpp into an object file (.o extension), the compiler makes a note in that file that it requires the function myfunction. When you compile myfunction.cpp into an object file, that object file has a note in it that it contains the definition of myfunction.

Then when you come to linking the two object files together into an executable, the linker ties the ends up - i.e. main.o uses myfunction as defined in myfunction.o.
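As a rough illustration of the declare/define/link steps above, here is a minimal single-file sketch. The comments mark which file each part would normally live in; the file names myfunction.hpp and myfunction.cpp, the helper call_from_main, and the body of myfunction are assumptions made for the example, not taken from the original question.

```cpp
// --- myfunction.hpp (assumed name): declaration only ---
#ifndef MYFUNCTION_HPP
#define MYFUNCTION_HPP
int myfunction(int value);  // tells the compiler: takes an int, returns an int
#endif

// --- main.cpp would start with: #include "myfunction.hpp" ---
// This compiles with only the declaration visible; the object file
// records that it still needs a definition of myfunction.
int call_from_main() {      // hypothetical stand-in for code in main()
    return myfunction(20);
}

// --- myfunction.cpp: the definition (body invented for illustration) ---
int myfunction(int value) {
    return value * 2 + 2;
}

// With the parts in separate files, a typical g++ build would be:
//   g++ -c main.cpp -o main.o
//   g++ -c myfunction.cpp -o myfunction.o
//   g++ main.o myfunction.o -o program
```

If myfunction.o were left out of the last command, the compile steps would still succeed and the failure would only show up at link time as an undefined reference.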

I hope that helps

What exactly goes to a header file and what in implementation file in C++?

There is no standard rule on what must be in a header file, except for Standard Library headers. Theoretically speaking, we could avoid header files completely by copying and pasting declarations around in .cpp files.

That said, it's more a matter of common sense and experience. You put things into headers or .cpp files according to your needs. We could list some use cases:

template (as in your example) declarations and implementations usually go in headers. See this thread for more info.
you do put function declarations in a header when you expect them to be used in more than one translation unit
you don't put function declarations in a header when you don't expect them to be used in other translation units
if you want a function to be inlined and want that function to be used in different translation units, then you put its definition in a header (preceded by the inline keyword if it's a non-member function)
you do put a class declaration (aka forward declaration) in a header when you expect objects of that type to be accessed* from different translation units
you don't put a class declaration inside a header when you want that class to be accessed only in one translation unit
you do put a class definition (i.e. the whole class interface) in a header when you expect objects of that type to be created* in different translation units
you don't put a class definition in a header when you want objects of that type to be created only in one translation unit
if you want to define a global variable, you rarely put the definition in a header file (unless that header is included in just a single translation unit); the header gets an extern declaration instead
if you are developing a library, you will put declarations of those functions and classes you want to provide to the user into header files
if you are developing a library, you will find a way to put into .cpp files those implementation details you don't want to make public (as @Joachim Pileborg suggested, see the pimpl idiom)
you usually don't want to put using declarations or using directives in a header, because they will pollute every translation unit that #includes the header
where possible, you avoid #including other headers in yours; you prefer to forward declare just what you need to make your program compile
finally, roughly speaking, the less stuff you put inside a header, the faster your files will compile; and, let me state the obvious, you do want your files to compile fast!
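A few of these rules of thumb can be sketched in one place. The names below (util.hpp, util.cpp, largest, square, g_counter, next_id) are all hypothetical and exist only for this example; the comments mark where each piece would normally go.

```cpp
// --- util.hpp (hypothetical header) ---

// Template: declaration and implementation both stay in the header.
template <typename T>
T largest(T a, T b) {
    return (a < b) ? b : a;
}

// Non-member function defined in a header: mark it inline so that
// several translation units can include it without a link error.
inline int square(int x) {
    return x * x;
}

// Global variable: the header only declares it...
extern int g_counter;

// ...and ordinary functions are only declared here as well.
int next_id();

// --- util.cpp (hypothetical implementation file) ---

// The single definition of the global variable.
int g_counter = 0;

// The single definition of the function.
int next_id() {
    return ++g_counter;
}
```

Every .cpp file that includes util.hpp gets its own copy of the template and inline definitions, while g_counter and next_id exist exactly once, in util.cpp.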

Notes

Above I've talked mainly about classes and functions, but in general these rules of thumb are also valid for enumerations and typedef declarations.

Member function definitions in a header are missing from the list, because that is a matter of defining a function inside or outside the class definition, not of .h vs .cpp files.

* By accessed I mean used through a pointer or a reference, in contrast to created.
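The difference between accessed and created can be shown with a small sketch. Widget and every function name here are hypothetical; the point is only that a forward declaration is enough to declare functions taking a pointer or reference, while creating an object or calling its members needs the full class definition.

```cpp
// --- widget_fwd.hpp (hypothetical): forward declaration only ---
class Widget;

// Declaring functions that take a Widget by reference or pointer
// works while Widget is still an incomplete type.
int size_of(const Widget& w);
void reset(Widget* w);

// --- widget.hpp (hypothetical): the full class definition ---
class Widget {
public:
    explicit Widget(int size) : size_(size) {}
    int size() const { return size_; }
    void set_size(int size) { size_ = size; }

private:
    int size_;
};

// --- widget.cpp: calling members requires the full definition ---
int size_of(const Widget& w) { return w.size(); }
void reset(Widget* w) { w->set_size(0); }

// Creating an object also requires the full definition, because the
// compiler must know the object's size and constructors.
Widget make_default() { return Widget(8); }
```

A translation unit that only passes Widget pointers around can include the small forward-declaring header and never see the class body.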

Where is the implementation of included C++/C header files?

In general, the implementation is distributed in the form of pre-compiled libraries. You need to tell the compiler where they are located.

For example, for gcc, quoting the online manual:

-llibrary
-l library
Search the library named library when linking. [...]

and,

-Ldir
Add directory dir to the list of directories to be searched for -l.

Note: you don't need to explicitly specify the standard libraries; they are linked automatically. If you don't want them to be linked with your binary, you need to tell the compiler by passing the -nostdlib option.
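To make the split between a shipped header and a pre-compiled library concrete, here is a sketch built around an invented library called libgreet. The library, its header greet.hpp, and the functions greet and welcome are all assumptions for illustration; only the -L and -l options and the ar command in the comments are real tools.

```cpp
#include <string>

// --- greet.hpp: the header shipped with the hypothetical library ---
std::string greet(const std::string& name);

// --- greet.cpp: compiled once by the library author, for example
//       g++ -c greet.cpp -o greet.o
//       ar rcs libgreet.a greet.o
std::string greet(const std::string& name) {
    return "Hello, " + name + "!";
}

// --- user code: sees only greet.hpp, never greet.cpp. It would be
//     built against the pre-compiled library with something like
//       g++ main.cpp -L/path/to/lib -lgreet -o app
std::string welcome() {  // hypothetical caller
    return greet("world");
}
```

Here -lgreet asks the linker for a library file named libgreet.a (or a shared libgreet.so), and -L adds the directory in which to look for it.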

How can a C++ header file include implementation?

Ok, not a C/C++ expert by any means, but I thought the point of a header file was to declare the functions, then the C/CPP file was to define the implementation.

The true purpose of a header file is to share code amongst multiple source files. It is commonly used to separate declarations from implementations for better code management, but that is not a requirement. It is possible to write code that does not rely on header files, and it is possible to write code that is made up of just header files (the STL and many of the Boost libraries are good examples of that). Remember, when the preprocessor encounters an #include directive, it replaces the directive with the contents of the file being referenced, so the compiler only sees the completed preprocessed code.

So, for example, if you have the following files:

Foo.h:

#ifndef FooH
#define FooH

class Foo
{
public:
    UInt32 GetNumberChannels() const;

private:
    UInt32 _numberChannels;
};

#endif

Foo.cpp:

#include "Foo.h"

UInt32 Foo::GetNumberChannels() const
{
    return _numberChannels;
}

Bar.cpp:

#include "Foo.h"

Foo f;
UInt32 chans = f.GetNumberChannels();

The preprocessor processes Foo.cpp and Bar.cpp separately and produces the following code that the compiler then parses:

Foo.cpp:

class Foo
{
public:
    UInt32 GetNumberChannels() const;

private:
    UInt32 _numberChannels;
};

UInt32 Foo::GetNumberChannels() const
{
    return _numberChannels;
}

Bar.cpp:

class Foo
{
public:
    UInt32 GetNumberChannels() const;

private:
    UInt32 _numberChannels;
};

Foo f;
UInt32 chans = f.GetNumberChannels();

Bar.cpp compiles into Bar.obj and contains a reference to call into Foo::GetNumberChannels(). Foo.cpp compiles into Foo.obj and contains the actual implementation of Foo::GetNumberChannels(). After compiling, the linker then matches up the .obj files and links them together to produce the final executable.

So why is there an implementation in a header?

A method whose implementation is written inside the class definition is implicitly declared inline (there is an actual inline keyword that can be used explicitly as well). Marking a function inline is only a hint to the compiler; it does not guarantee that calls to the function will actually be inlined. What it does guarantee is that the definition may appear in every translation unit that includes the header without causing a multiple-definition error at link time. If a call does get inlined, the contents of the function are copied directly into the call site, instead of generating a CALL instruction to jump into the function and jump back to the caller upon exiting. The compiler can then take the surrounding code into account and optimize the copied code further, if possible.
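As a hedged sketch of what such a header can look like, the class below defines one method inside the class body and one outside it. It reuses the Foo name from the example above, but std::uint32_t stands in for the UInt32 type, and the constructor and the DoubledChannels method are additions invented for this illustration.

```cpp
#include <cstdint>

// --- Foo.h, with implementations in the header ---
class Foo {
public:
    // Constructor added for the example so the member has a known value.
    explicit Foo(std::uint32_t channels) : _numberChannels(channels) {}

    // Body written inside the class definition: implicitly inline.
    std::uint32_t GetNumberChannels() const { return _numberChannels; }

    // Only declared here; defined below, still inside the header.
    std::uint32_t DoubledChannels() const;

private:
    std::uint32_t _numberChannels;
};

// Defined outside the class but in a header: the inline keyword must
// be written explicitly, otherwise every .cpp file that includes the
// header would emit its own definition and the link would fail.
inline std::uint32_t Foo::DoubledChannels() const {
    return 2 * _numberChannels;
}
```

Whether the compiler actually inlines calls to either method remains its own decision; the keyword mainly makes the repeated definition legal.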

Does it have to do with the const keyword?

No. The const keyword merely indicates to the compiler that the method will not alter the state of the object it is being called on at runtime.
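A small sketch of what const on a method does and does not do; the Counter class and the read_only function are hypothetical names used only here.

```cpp
class Counter {
public:
    // const method: promises not to modify the object, so it can be
    // called through a const reference or on a const object.
    int value() const { return count_; }

    // Non-const method: modifies the object's state.
    void increment() { ++count_; }

private:
    int count_ = 0;
};

// Takes the counter by const reference, so only const methods may be
// called on it inside this function.
int read_only(const Counter& c) {
    // c.increment();  // would not compile: increment() is not const
    return c.value();
}
```

Note that const says nothing about where the method body lives; a const method can be defined in the header or in the .cpp file.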

What exactly is the benefit/point of doing it this way vs. defining the implementation in the CPP file?

When used effectively, it usually allows the compiler to produce faster, better-optimized machine code.

C++ header and implementation files: what to include?

The simple answer is that you almost always want to include .h files, and compile .cpp files. CPP files are (usually) the true code, and H files are (usually) declarations.

The longer answer is that you may be able to include either, and it might work for you, but both will give slightly different results.

What "include" does is basically copy/paste the file in at that line. It doesn't matter what the extension is, it will include the contents of the file the same way.

But C++ code is, by convention, usually written this way:

SomeClass.cpp -

#include "SomeClass.h"
#include <iostream>

void SomeClass::SomeFunction()
{
  std::cout << "Hello world\n";
}

SomeClass.h -

class SomeClass
{
  public:
    void SomeFunction();
};

If you include either of those, you can use the code from it. However, if multiple files include the same .cpp file, you will usually get errors about redefinition, because each translation unit then contains its own copy of the function definitions. Header files (.h files) usually contain only declarations, and no function implementations, so including them in multiple source files won't give you errors about redefinition.

If you somehow manage to build without errors when including .cpp files from other .cpp files, you can still end up with duplicate code. This happens if you include the same .cpp file in multiple other files; it's as if you wrote the function twice. This will make your program bigger on disk and slower to compile, and it may run a bit slower too.

The main caveat is that this implementation/declaration convention doesn't hold true for code that uses templates. Template code will still be handed to you as .h files, but it is (usually) implemented directly in the .h file, and won't have accompanying .cpp files.

Can someone help clarify how header files work?

The other answers here have effectively explained the way header files and the preprocessor work. The biggest problem you have is circular dependencies, which I know from experience can be a royal pain. When that starts happening, the compiler starts to behave in odd ways and produces error messages that aren't very helpful. The method I was taught by a C++ guru in college was to start each file (a header file, for instance) with

//very beginning of the file
#ifndef HEADER_FILE_H //use a name that is unique though!!
#define HEADER_FILE_H
...
//code goes here
...
#endif
//very end of the file

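This uses preprocessor directives to make sure the contents of a header are processed only once per translation unit, which also stops two headers that include each other from recursing forever. Basically, I always use an all-uppercase version of the file name. custom-vector.h becomes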

#ifndef CUSTOM_VECTOR_H
#define CUSTOM_VECTOR_H

This allows you to include files willy-nilly without worrying about repeated inclusion: if a file is included multiple times, its guard macro is already defined, so the preprocessor skips the file's contents. Note that guards alone do not resolve a genuine circular dependency between two classes; for that you also need a forward declaration. It also makes it easier later on to work with the code because you don't have to sift through your old header files to make sure you haven't already included something. I'll repeat, though: make sure the macro names you use in your #define directives are unique, otherwise you could run into problems where something doesn't get included properly ;-).
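The effect of the guard can be seen by simulating a double inclusion inside one file. In this sketch the contents of a hypothetical custom-vector.h are pasted twice, as the preprocessor would do if the header were reached through two different #include paths; CustomVector, sum, Department, and Employee are invented names. The second half shows a forward declaration, which is what two classes that refer to each other usually need in addition to guards.

```cpp
// First inclusion of the hypothetical custom-vector.h:
#ifndef CUSTOM_VECTOR_H
#define CUSTOM_VECTOR_H
struct CustomVector {
    int x;
    int y;
};
#endif

// Second inclusion, e.g. pulled in again by another header.
// CUSTOM_VECTOR_H is already defined, so everything up to #endif is
// skipped; without the guard this would be a redefinition error.
#ifndef CUSTOM_VECTOR_H
#define CUSTOM_VECTOR_H
struct CustomVector {
    int x;
    int y;
};
#endif

int sum(const CustomVector& v) { return v.x + v.y; }

// Two classes that point at each other: the forward declaration lets
// Department mention Employee before Employee has been defined.
class Employee;

class Department {
public:
    Employee* head = nullptr;
};

class Employee {
public:
    Department* dept = nullptr;
};
```

Because each class holds only a pointer to the other, neither needs the other's full definition, and the cycle never reaches the compiler.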

Good luck!
