Why Are Forward Declarations Necessary

Why are forward declarations necessary?

The short answer is that computing power and resources advanced enormously between the time C was defined and the time Java came along more than two decades later.

The longer answer...

The maximum size of a compilation unit -- the block of code that a compiler processes in a single chunk -- is going to be limited by the amount of memory that the compiling computer has. In order to process the symbols that you type into machine code, the compiler needs to hold all the symbols in a lookup table and reference them as it comes across them in your code.

When C was created in 1972, computing resources were much scarcer and came at a high premium: the memory required to store a complex program's entire symbol table at once simply wasn't available on most systems. Fixed storage was also expensive and extremely slow, so ideas like virtual memory or storing parts of the symbol table on disk simply wouldn't have allowed compilation in a reasonable timeframe.

The best solution to the problem was to chunk the code into smaller pieces by having a human sort out which portions of the symbol table would be needed in which compilation units ahead of time. Imposing a fairly small task on the programmer of declaring what he would use saved the tremendous effort of having the computer search the entire program for anything the programmer could use.

It also saved the compiler from having to make two passes on every source file: the first one to index all the symbols inside, and the second to parse the references and look them up. When you're dealing with magnetic tape where seek times were measured in seconds and read throughput was measured in bytes per second (not kilobytes or megabytes), that was pretty meaningful.

C++, although it appeared more than a decade later, was designed as a near-superset of C and therefore had to use the same mechanism.

By the time Java rolled around in 1995, average computers had enough memory that holding a symbol table, even for a complex project, was no longer a substantial burden. And Java wasn't designed to be backwards-compatible with C, so it had no need to adopt a legacy mechanism. C# was similarly unencumbered.

As a result, their designers chose to shift the burden of compartmentalizing symbolic declaration back off the programmer and put it on the computer again, since its cost in proportion to the total effort of compilation was minimal.

What is the significance of forward declaration in C programming?

Forward declarations of functions in C typically have two different uses.

Modules

The prototypes of exported functions are declared in a header file, which is included by each client module.
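
A minimal sketch of that layout, collapsed into a single listing so it can be compiled as one file. The file names counter.h and counter.c and the functions counter_reset, counter_next and take_three are hypothetical, chosen only for illustration:

```cpp
// ---- counter.h (hypothetical header): prototypes only ----
void counter_reset(void);
int  counter_next(void);

// ---- client code: compiles against the prototypes alone ----
int take_three(void)
{
    counter_reset();
    counter_next();
    counter_next();
    return counter_next();   // third value handed out after a reset
}

// ---- counter.c (hypothetical module): the definitions ----
static int current = 0;      // private to the module

void counter_reset(void) { current = 0; }
int  counter_next(void)  { return ++current; }
```

In a real project the three parts would live in separate files, and the client would see only the header.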

Mutual Recursion

In mutual recursion two functions call each other repeatedly. Without a forward declaration one of the two functions will be undeclared in the body of the other.

Example:

int Odd(int n);

int Even(int n)
{
    return (n == 0)? 1: Odd(n - 1);
}

int Odd(int n)
{
    return (n == 0)? 0: Even(n - 1);
}

With a function pointer though, we can do without a forward declaration:

int (*odd)(int n);

int Even(int n)
{
    return (n == 0)? 1: odd(n - 1);
}

int Odd(int n)
{
    return (n == 0)? 0: Even(n - 1);
}

void Init(void)
{
    odd = Odd;
    ...
}

What are forward declarations in C++?

Why forward declarations are necessary in C++

The compiler wants to ensure you haven't made spelling mistakes or passed the wrong number of arguments to a function. So it insists on seeing a declaration of a function such as add (or of any other type, class, or function) before it is used.

This really just allows the compiler to do a better job of validating the code and allows it to tidy up loose ends so it can produce a neat-looking object file. If you didn't have to forward declare things, the compiler would produce an object file that would have to contain information about all the possible guesses as to what the function add might be. And the linker would have to contain very clever logic to try and work out which add you actually intended to call, when the add function may live in a different object file that the linker is joining with the one that uses add to produce a DLL or EXE. It's possible that the linker would pick the wrong add. Say you wanted to use int add(int a, float b) but forgot to write it, and the linker found an already existing int add(int a, int b), decided that was the right one, and used it instead. Your code would compile, but wouldn't be doing what you expected.

So, just to keep things explicit and avoid guessing, etc, the compiler insists you declare everything before it is used.
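
As a small sketch of the check described above (use_add is a hypothetical caller, and the body of add is an assumption made only so the example links), the declaration is all the compiler needs to validate the call:

```cpp
// Declaration: tells the compiler the name, parameter types and return type.
int add(int a, float b);

int use_add()
{
    // Checked against the declaration above: two arguments, int and float.
    return add(2, 3.5f);
    // add(2);        // would be rejected: too few arguments
    // ad(2, 3.5f);   // would be rejected: 'ad' was never declared
}

// Definition; it could equally live in another source file.
int add(int a, float b)
{
    return a + static_cast<int>(b);
}
```

Uncommenting either of the two commented-out calls should produce a compile-time error rather than a surprise at link time or run time.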

Difference between declaration and definition

As an aside, it's important to know the difference between a declaration and a definition. A declaration just gives enough code to show what something looks like, so for a function, this is the return type, calling convention, method name, arguments, and their types. However, the code for the method isn't required. For a definition, you need the declaration and then also the code for the function too.
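
To make the distinction concrete, here is a short sketch; the names square, Matrix and call_count are hypothetical:

```cpp
// Declarations: may be repeated any number of times.
int square(int x);
int square(int x);
class Matrix;                 // declares the class name only
extern int call_count;        // declares a variable without defining it

// Definitions: exactly one of each in the whole program.
int call_count = 0;

int square(int x)
{
    ++call_count;
    return x * x;
}
```

Repeating a declaration is harmless, whereas a second definition of square or call_count would be rejected.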

How forward-declarations can significantly reduce build times

You can get the declaration of a function into your current .cpp or .h file by #include'ing the header that already contains a declaration of the function. However, this can slow down your compile, especially if you #include a header into a .h instead of a .cpp of your program, as everything that #includes the .h you're writing would end up #include'ing all the headers you wrote #includes for too. Suddenly, the compiler has #included pages and pages of code that it needs to compile even when you only wanted to use one or two functions. To avoid this, you can use a forward declaration and just type the declaration of the function yourself at the top of the file. If you're only using a few functions, this can really make your compiles quicker compared to always #include'ing the header. For really large projects, the difference could be an hour or more of compile time brought down to a few minutes.

Break cyclic references where two definitions both use each other

Additionally, forward declarations can help you break cycles. This is where two definitions both try to use each other. When this happens (and it is a perfectly valid thing to do), you may #include one header file, but that header file tries to #include the header file you're currently writing... which then #includes the other header, which #includes the one you're writing. You're stuck in a chicken-and-egg situation with each header file trying to re-#include the other. To solve this, you can forward-declare the parts you need in one of the files and leave the #include out of that file.

Eg:

File Car.h

#include "Wheel.h"  // Include Wheel's definition so it can be used in Car.
#include <vector>

class Car
{
    std::vector<Wheel> wheels;
};

File Wheel.h

Hmm... the declaration of Car is required here as Wheel has a pointer to a Car, but Car.h can't be included here as it would result in a compiler error. If Car.h was included, that would then try to include Wheel.h which would include Car.h which would include Wheel.h and this would go on forever, so instead the compiler raises an error. The solution is to forward declare Car instead:

class Car;     // forward declaration

class Wheel
{
    Car* car;
};

If class Wheel had methods which need to call methods of Car, those methods could be defined in Wheel.cpp and Wheel.cpp is now able to include Car.h without causing a cycle.
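
A rough sketch of that arrangement, with the three files collapsed into one listing so it compiles on its own. The members AddWheel, WheelCount, FirstWheel and OwnerWheelCount are hypothetical additions, not part of the classes shown above:

```cpp
#include <vector>

// ---- Wheel.h ----
class Car;                    // forward declaration is enough for a pointer

class Wheel
{
public:
    explicit Wheel(Car* owner) : car(owner) {}
    int OwnerWheelCount() const;   // declared only; the body needs Car's definition
private:
    Car* car;
};

// ---- Car.h (would #include "Wheel.h") ----
class Car
{
public:
    void AddWheel() { wheels.push_back(Wheel(this)); }
    int WheelCount() const { return static_cast<int>(wheels.size()); }
    const Wheel& FirstWheel() const { return wheels.front(); }
private:
    std::vector<Wheel> wheels;
};

// ---- Wheel.cpp (would #include "Wheel.h" and "Car.h") ----
int Wheel::OwnerWheelCount() const
{
    return car->WheelCount();  // fine: Car is a complete type here
}
```

The point of the split is that only Wheel.cpp, not Wheel.h, depends on Car's full definition, so the headers never include each other.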

Why does C++ need a forward declaration, either through a header or a statement, when Java doesn't?

Object files

Java compilers rely on extra information that is compiled into the bytecode (.class) files they produce and which your code uses, while C++ compilers do not store this kind of information in the compiled object code (.obj) files they produce.

Being an older language, C++ relies on the "C model" of program compilation and linkage. This means that object files contain mostly code and data, with only enough metadata needed for object files to be linked together (either statically or dynamically at runtime). Consequently, the compiler relies on header files included while parsing your source code to know about the external class types, functions, and variables your code refers to.

This also means that clients of your C++ code need both the object files (.obj, .lib, or .dll) and header files for your code in order to use it themselves.

Java, being a more modern language, stores a lot of symbolic metadata into the object (.class) files it produces, which allows the compiler to extract the external class types, methods, and variables your code refers to during later compiles.

This also means that clients of your Java code only need the object files (.class or .jar) for your code in order to use it.

Forward declarations

Java compilers make multiple passes over the source code, which means that you only need to declare/define a variable or method in one place within your code; forward declarations are not necessary because the compiler works harder to determine type information.

C++, in contrast, is defined so that the compiler needs to know all of the relevant type information about a class, variable, or method at the point it is used in your source code. This means that you have to provide that type declaration information before the actual object definition in your code. This is why header files are used.

When can I use a forward declaration?

Put yourself in the compiler's position: when you forward declare a type, all the compiler knows is that this type exists; it knows nothing about its size, members, or methods. This is why it's called an incomplete type. Therefore, you cannot use the type to declare a member, or a base class, since the compiler would need to know the layout of the type.

Assuming the following forward declaration.

class X;

Here's what you can and cannot do.

What you can do with an incomplete type:

Declare a member to be a pointer or a reference to the incomplete type:
```
class Foo {
    X *p;
    X &r;
};
```
Declare functions or methods which accept/return incomplete types:
```
void f1(X);
X    f2();
```
Define functions or methods which accept/return pointers/references to the incomplete type (but without using its members):
```
void f3(X*, X&) {}
X&   f4()       {}
X*   f5()       {}
```

What you cannot do with an incomplete type:

Use it as a base class
```
class Foo : X {}; // compiler error!
```

Use it to declare a member:
```
class Foo {
    X m; // compiler error!
};
```
Define functions or methods using this type:
```
void f1(X x) {} // compiler error!
X    f2()    {} // compiler error!
```
Use its methods or fields; in fact, any attempt to dereference a variable of incomplete type fails:
```
class Foo {
    X *m;
    void method()
    {
        m->someMethod();      // compiler error!
        int i = m->someField; // compiler error!
    }
};
```

When it comes to templates, there is no absolute rule: whether you can use an incomplete type as a template parameter is dependent on the way the type is used in the template.

For instance, std::vector<T> required its parameter to be a complete type before C++17 (and still requires it before any member of the vector is used), while boost::container::vector<T> does not. Sometimes a complete type is required only if you use certain member functions; this is the case for std::unique_ptr<T>, for example.

A well-documented template should indicate in its documentation all the requirements of its parameters, including whether they need to be complete types or not.
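
The std::unique_ptr<T> case can be sketched as follows. Widget, Impl and Value are hypothetical names, the header and source file are collapsed into one listing, and std::make_unique assumes C++14 or later:

```cpp
#include <memory>

// ---- Widget.h (hypothetical) ----
class Widget
{
public:
    Widget();
    ~Widget();                // declared here, defined where Impl is complete
    int Value() const;
private:
    struct Impl;              // incomplete at this point
    std::unique_ptr<Impl> impl;
};

// ---- Widget.cpp (hypothetical) ----
struct Widget::Impl
{
    int value = 42;
};

Widget::Widget() : impl(std::make_unique<Impl>()) {}
Widget::~Widget() = default;  // unique_ptr's deleter is instantiated here
int Widget::Value() const { return impl->value; }
```

Holding the member is fine while Impl is incomplete; it is the destructor (and anything else that deletes or dereferences the pointer) that needs the complete type, which is why it is defined after Impl.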

Should one use forward declarations instead of includes wherever possible?

The forward-declaration method is almost always better. (I can't think of a situation where including a file where you can use a forward declaration is better, but I'm not gonna say it's always better just in case).

There are no downsides to forward-declaring classes, but I can think of some downsides for including headers unnecessarily:

longer compilation time, since every translation unit that includes your header (say C.h) will also include the header it pulls in (say A.h), although it might not need it.
possibly including other headers you don't need indirectly
polluting the translation unit with symbols you don't need
you might need to recompile source files that include that header if it changes (@PeterWood)

Why must we Forward Declare a class and include the corresponding header file in a header file

Here are the basics:

For any type A, if you declare a variable of type A&, A*, A**, A***, etc., then the compiler does not need to know the complete definition of A at the site of the variable declaration. All it needs to know is that A is a type; that is it. So a forward declaration is enough:
```
class A; //forward declaration

class B
{
   A * pA;  //okay - compiler knows A is a type
   A & refA;// okay - compiler knows A is a type
};
```
The complete definition is not required because the compiler can still compute sizeof(B), which depends only on the size of a pointer and on the storage a reference member needs (typically the same as a pointer); both are known to the compiler even though it doesn't know sizeof(A). Note that sizeof(A*) is just the size of a pointer on that platform (usually 4 bytes on a 32-bit system or 8 bytes on a 64-bit system).
For any type A, if you declare a variable of type A, A[N], A[M][N], etc., then the compiler needs to know the complete definition of type A at the site of the variable declaration. A forward declaration is not enough in this case.
```
class A; //forward declaration
class B
{
   A a;  //error - the compiler only knows A is a type
         //it doesn't know its size!
};
```
But this is correct:
```
#include "A.h" //which defines A

class B
{
   A a;  //okay
};
```
The complete definition is required so that the compiler can compute sizeof(A), which is not possible if the compiler doesn't know the definition of A.
Note that the definition of a class means "the complete specification of the class members, their types, and whether the class has virtual function(s) or not". If the compiler knows these, it can compute the size of the class.
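
The size argument in these basics can be checked with a short sketch. The constant names kPointerSize and kSizeOfB and the constructor of B are hypothetical additions for illustration:

```cpp
#include <cstddef>

class A;                      // forward declaration only

class B
{
    A* pA;                    // pointer to an incomplete type
    A& refA;                  // reference to an incomplete type
public:
    B(A* p, A& r) : pA(p), refA(r) {}
};

// Both sizes are known even though A is still incomplete here.
constexpr std::size_t kPointerSize = sizeof(A*);
constexpr std::size_t kSizeOfB     = sizeof(B);

// constexpr std::size_t kSizeOfA = sizeof(A);   // error: A is incomplete
```

Uncommenting the last line should fail to compile until the full definition of A is visible.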

Knowing these basics, you can decide whether to include a header in another header or whether a forward declaration would be enough. If the forward declaration is enough, that is the option you should choose. Include a header only if it is required.

However, if you provide a forward declaration of A in the header B.h, then you have to include the header file A.h in the implementation file of B, which is B.cpp, because in the implementation file of B you need to access the members of A, for which the compiler requires the complete definition of A. Well again, include only if you need to access the members of A. :-)

Sorry, I didn't see the last paragraph of your answer. What is confusing me is why I need the forward declaration as well. Doesn't including the header file A.h alone provide the complete definition of class A?

I don't know what is there in the header file. Also, if in spite of including the header file, you also need to provide the forward declaration, then it implies that the header is implemented incorrectly. I suspect that there is a circular dependency:

Make sure that no two header files include each other. For example, if A.h includes B.h, then B.h must not include A.h, directly or indirectly.
Use a forward declaration and a pointer declaration to break such a circular dependency. The logic is pretty straightforward: if you cannot include A.h in B.h, then you cannot declare A a in B.h (because that requires including the header A.h). But even though you cannot declare A a, you can still declare A *pA, and for this a forward declaration of A is enough. That way you break the circular dependency.

Hope that helps.
