What Are Declarations and Declarators and How Are Their Types Interpreted by the Standard

I refer to the C++11 standard in this post.

Declarations

Declarations of the type we're concerned with are known as simple-declarations in the grammar of C++, which are of one of the following two forms (§7/1):

decl-specifier-seq(opt) init-declarator-list(opt) ;

attribute-specifier-seq decl-specifier-seq(opt) init-declarator-list ;

The attribute-specifier-seq is a sequence of attributes ([[something]]) and/or alignment specifiers (alignas(something)). Since these don't affect the type of the declaration, we can ignore them and the second of the above two forms.

Declaration specifiers

So the first part of our declaration, the decl-specifier-seq, is made up of declaration specifiers. These include some things that we can ignore, such as storage specifiers (static, extern, etc.), function specifiers (inline, etc.), the friend specifier, and so on. However, the one declaration specifier of interest to us is the type specifier, which may include simple type keywords (char, int, unsigned, etc.), names of user-defined types, cv-qualifiers (const or volatile), and others that we don't care about.

Example: So a simple example of a decl-specifier-seq which is just a sequence of type specifiers is const int. Another one could be unsigned int volatile.

You may think "Oh, so something like const volatile int int float const is also a decl-specifier-seq?" You'd be right that it fits the rules of the grammar, but the semantic rules disallow such a decl-specifier-seq. Only one type specifier is allowed, in fact, except for certain combinations (such as unsigned with int or const with anything except itself) and at least one non-cv-qualifier is required (§7.1.6/2-3).

Quick Quiz (you might need to reference the standard)

  1. Is const int const a valid declaration specifier sequence or not? If not, is it disallowed by the syntactic or semantic rules?


    Invalid by semantic rules! const cannot be combined with itself.

  2. Is unsigned const int a valid declaration specifier sequence or not? If not, is it disallowed by the syntactic or semantic rules?


    Valid! It doesn't matter that the const separates the unsigned from int.

  3. Is auto const a valid declaration specifier sequence or not? If not, is it disallowed by the syntactic or semantic rules?


    Valid! auto is a declaration specifier but changed category in C++11. Before it was a storage specifier (like static), but now it is a type specifier.

  4. Is int * const a valid declaration specifier sequence or not? If not, is it disallowed by the syntactic or semantic rules?


    Invalid by syntactic rules! While this may very well be the full type of a declaration, only the int is the declaration specifier sequence. The declaration specifiers only provide the base type, not compound modifiers such as pointers, references, arrays, etc.

Declarators

The second part of a simple-declaration is the init-declarator-list. It is a sequence of declarators separated by commas, each with an optional initializer (§8). Each declarator introduces a single variable or function into the program. The simplest form of declarator is just the name you're introducing, the declarator-id. The declaration int x, y = 5; has a declaration specifier sequence that is just int, followed by two declarators, x and y, the second of which has an initializer. We will, however, ignore initializers for the rest of this post.

A declarator can have a particularly complex syntax because this is the part of the declaration that allows you to specify whether the variable is a pointer, reference, array, function pointer, etc. Note that these are all part of the declarator and not the declaration as a whole. This is precisely the reason why int* x, y; does not declare two pointers - the asterisk * is part of the declarator of x and not part of the declarator of y. One important rule is that every declarator must have exactly one declarator-id - the name it is declaring. The rest of the rules about valid declarators are enforced once the type of the declaration is determined (we'll come to it later).
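To make the int* x, y; point concrete, here is a minimal sketch that lets the compiler confirm which declarator the asterisk belongs to. The namespace name declarators is just something I made up for illustration; the checks use std::is_same and decltype from C++11.

```cpp
#include <type_traits>

namespace declarators {

// One decl-specifier-seq (int) shared by two declarators: "* x" and "y".
// The * is part of x's declarator only.
int* x, y;

static_assert(std::is_same<decltype(x), int*>::value,
              "x is a pointer to int");
static_assert(std::is_same<decltype(y), int>::value,
              "y is a plain int, not a pointer");

}  // namespace declarators
```

If both names had been pointers, the second static_assert would fail to compile.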

Example: A simple example of a declarator is *const p, which declares a const pointer to... something. The type it points to is given by the declaration specifiers in its declaration. A more terrifying example is the one given in the question, (*(*(&e)[10])())[5], which declares a reference to an array of function pointers that return pointers to... again, the final part of the type is actually given by the declaration specifiers.

You're unlikely to ever come across such horrible declarators but sometimes similar ones do appear. It's a useful skill to be able to read a declaration like the one in the question and is a skill that comes with practice. It is helpful to understand how the standard interprets the type of a declaration.

Quick Quiz (you might need to reference the standard)

  1. Which parts of int const unsigned* const array[50]; are the declaration specifiers and the declarator?


    Declaration specifiers: int const unsigned
    Declarator: * const array[50]

  2. Which parts of volatile char (*fp)(float const), &r = c; are the declaration specifiers and the declarators?


    Declaration specifiers: volatile char
    Declarator #1: (*fp)(float const)
    Declarator #2: &r

Declaration Types

Now that we know a declaration is made up of a declaration specifier sequence and a list of declarators, we can begin to think about how the type of a declaration is determined. For example, it might be obvious that int* p; defines p to be a "pointer to int", but for other types it's not so obvious.

A declaration with multiple declarators, let's say 2 declarators, is considered to be two declarations of particular identifiers. That is, int x, *y; is a declaration of identifier x, int x, and a declaration of identifier y, int *y.

Types are expressed in the standard as English-like sentences (such as "pointer to int"). The interpretation of a declaration's type in this English-like form is done in two parts. First, the type of the declaration specifier is determined. Second, a recursive procedure is applied to the declaration as a whole.

Declaration specifiers type

The type of a declaration specifier sequence is determined by Table 10 of the standard. It lists the types of the sequences given that they contain the corresponding specifiers in any order. So for example, any sequence that contains signed and char in any order, including char signed, has type "signed char". Any cv-qualifier that appears in the declaration specifier sequence is added to the front of the type. So char const signed has type "const signed char". This makes sure that regardless of what order you put the specifiers, the type will be the same.

Quick Quiz (you might need to reference the standard)

  1. What is the type of the declaration specifier sequence int long const unsigned?


    "const unsigned long int"

  2. What is the type of the declaration specifier sequence char volatile?


    "volatile char"

  3. What is the type of the declaration specifier sequence auto const?


    It depends! auto will be deduced from the initializer. If it is deduced to be int, for example, the type will be "const int".
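If you want to check these answers with a compiler rather than by reading Table 10, something like the following sketch should do. The namespace spec_order and the variable deduced are names I invented for the example; the initializer 42 is just an assumption chosen so that auto deduces int.

```cpp
#include <type_traits>

namespace spec_order {

// The order of specifiers within the decl-specifier-seq does not matter.
static_assert(std::is_same<char const signed, const signed char>::value,
              "char const signed is const signed char");
static_assert(std::is_same<int long const unsigned,
                           const unsigned long int>::value,
              "int long const unsigned is const unsigned long int");
static_assert(std::is_same<char volatile, volatile char>::value,
              "char volatile is volatile char");

// With auto, the base type comes from the initializer (int here),
// and the cv-qualifier is then added to it.
auto const deduced = 42;
static_assert(std::is_same<decltype(deduced), const int>::value,
              "auto const deduced from an int is const int");

}  // namespace spec_order
```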

Declaration type

Now that we have the type of the declaration specifier sequence, we can work out the type of an entire declaration of an identifier. This is done by applying a recursive procedure defined in §8.3. To explain this procedure, I'll use a running example. We'll work out the type of e in float const (*(*(&e)[10])())[5].

Step 1 The first step is to split the declaration into the form T D where T is the declaration specifier sequence and D is the declarator. So we get:

T = float const
D = (*(*(&e)[10])())[5]

The type of T is, of course, "const float", as we determined in the previous section. We then look for the subsection of §8.3 that matches the current form of D. You'll find that this is §8.3.4 Arrays, because it states that it applies to declarations of the form T D where D has the form:

D1 [ constant-expression(opt) ] attribute-specifier-seq(opt)

Our D is indeed of that form where D1 is (*(*(&e)[10])()).

Now imagine a declaration T D1 (we've gotten rid of the [5]).

T D1 = const float (*(*(&e)[10])())

Its type is "<some stuff> T". This section states that the type of our identifier, e, is "<some stuff> array of 5 T", where <some stuff> is the same as in the type of the imaginary declaration. So to work out the remainder of the type, we need to work out the type of T D1.

This is the recursion! We recursively work out the type of an inner part of the declaration, stripping a bit of it off at every step.

Step 2 So, as before, we split our new declaration into the form T D:

T = const float
D = (*(*(&e)[10])())

This matches paragraph §8.3/6 where D is of the form ( D1 ). This case is simple: the type of T D is just the type of T D1:

T D1 = const float *(*(&e)[10])()

Step 3 Let's call this T D now and split it up again:

T = const float
D = *(*(&e)[10])()

This matches §8.3.1 Pointers where D is of the form * D1. If T D1 has type "<some stuff> T", then T D has type "<some stuff> pointer to T". So now we need the type of T D1:

T D1 = const float (*(&e)[10])()

Step 4 We call it T D and split it up:

T = const float
D = (*(&e)[10])()

This matches §8.3.5 Functions where D is of the form D1 (). If T D1 has type "<some stuff> T", then T D has type "<some stuff> function of () returning T". So now we need the type of T D1:

T D1 = const float (*(&e)[10])

Step 5 We can apply the same rule we did for step 2, where the declarator is simply parenthesised to end up with:

T D1 = const float *(&e)[10]

Step 6 Of course, we split it up:

T = const float
D = *(&e)[10]

We match §8.3.1 Pointers again with D of the form * D1. If T D1 has type "<some stuff> T", then T D has type "<some stuff> pointer to T". So now we need the type of T D1:

T D1 = const float (&e)[10]

Step 7 Split it up:

T = const float
D = (&e)[10]

We match §8.3.4 Arrays again, with D of the form D1 [10]. If T D1 has type "<some stuff> T", then T D has type "<some stuff> array of 10 T". So what is T D1's type?

T D1 = const float (&e)

Step 8 Apply the parentheses step again:

T D1 = const float &e

Step 9 Split it up:

T = const float
D = &e

Now we match §8.3.2 References where D is of the form & D1. If T D1 has type "<some stuff> T", then T D has type "<some stuff> reference to T". So what is the type of T D1?

T D1 = const float e

Step 10 Well it's just "T" of course! There is no <some stuff> at this level. This is given by the base case rule in §8.3/5.

And we're done!

So now if we look at the type we determined at each step, substituting the <some stuff>s from each level below, we can determine the type of e in float const (*(*(&e)[10])())[5]:

<some stuff> array of 5 T
│ └──────────┐
<some stuff> pointer to T
│ └────────────────────────┐
<some stuff> function of () returning T
│ └──────────┐
<some stuff> pointer to T
│ └───────────┐
<some stuff> array of 10 T
│ └────────────┐
<some stuff> reference to T
│ │
<some stuff> T
<some stuff> T

If we combine this all together, what we get is:

reference to array of 10 pointer to function of () returning pointer to array of 5 const float
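One way to double-check that English reading is to rebuild the type out of alias declarations, one layer per step, and ask the compiler whether it matches. This is only a sketch: the namespace big_decl, the alias names and the storage array are all hypothetical, and storage exists only because a reference needs something to bind to.

```cpp
#include <type_traits>

namespace big_decl {

using Arr5     = const float[5];  // array of 5 const float
using Fn       = Arr5* ();        // function of () returning pointer to Arr5
using FnPtr    = Fn*;             // pointer to that function type
using Arr10    = FnPtr[10];       // array of 10 such pointers
using Expected = Arr10&;          // reference to that array

Arr10 storage = {};               // ten null function pointers for e to refer to

float const (*(*(&e)[10])())[5] = storage;

static_assert(std::is_same<decltype(e), Expected>::value,
              "e is a reference to array of 10 pointer to function of () "
              "returning pointer to array of 5 const float");

}  // namespace big_decl
```

Reading the aliases from bottom to top gives the same sentence as the recursion does.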

Nice! So that shows how the compiler deduces the type of a declaration. Remember that this is applied to each declaration of an identifier if there are multiple declarators. Try figuring out these:

Quick Quiz (you might need to reference the standard)

  1. What is the type of x in the declaration bool **(*x)[123];?


    "pointer to array of 123 pointer to pointer to bool"

  2. What are the types of y and z in the declaration int const signed *(*y)(int), &z = i;?


    y is a "pointer to function of (int) returning pointer to const signed int"

    z is a "reference to const signed int"
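The same kind of compiler check works for these quiz answers. In this sketch the namespace quiz and the alias names are made up, and i is assumed to be a plain int, since the quiz declaration refers to it without showing it.

```cpp
#include <type_traits>

namespace quiz {

// Question 1
bool **(*x)[123];
using PtrPtrBool = bool**;           // pointer to pointer to bool
using Arr123     = PtrPtrBool[123];  // array of 123 of those
static_assert(std::is_same<decltype(x), Arr123*>::value,
              "x is a pointer to array of 123 pointer to pointer to bool");

// Question 2 (i is assumed to be an int)
int i = 0;
int const signed *(*y)(int), &z = i;
using Fn = const int* (int);         // function of (int) returning pointer to const int
static_assert(std::is_same<decltype(y), Fn*>::value,
              "y is a pointer to function of (int) returning pointer to const signed int");
static_assert(std::is_same<decltype(z), const int&>::value,
              "z is a reference to const signed int");

}  // namespace quiz
```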

If anybody has any corrections, please let me know!

In the standard, what is derived-declarator-type ?

It's being defined right there and then. It's a way of carrying whatever comes before T across to the next type, similar to:

<some stuff> T
<some stuff> reference to T

It's just whatever comes before T in the type of T D1.

For example, if you have the declaration int& (*const * p)[30], T is int, D is & (*const * p)[30] and D1 is (*const * p)[30]. The type of T D1 is "pointer to const pointer to array of 30 int". And so, according to the rule you quoted, the type of p is "pointer to const pointer to array of 30 reference to int".

Of course, this declaration is then disallowed by §8.3.2/5:

There shall be no references to references, no arrays of references, and no pointers to references.

I think the informal terminology of it being a derived declarator type list comes from the C standard's definition of a derived type (similar to a compound type in C++):

Any number of derived types can be constructed from the object, function, and
incomplete types, as follows:

  • An array type [...]
  • A structure type [...]
  • A union type [...]
  • A function type [...]
  • A pointer type [...]

In response to the comments: It seems you're getting confused between the type and the declarator. For example, if int* p is the declaration, then the declarator is * p and the type of p is "pointer to int". The type is expressed as these English-like sentences.

Example 1: int *(&p)[3]

This is a declaration T D where (§8.3.1 Pointers):

  • T -> int
  • D -> *(&p)[3]

D has the form:

* attribute-specifier-seq(opt) cv-qualifier-seq(opt) D1

where D1 is (&p)[3]. That means T D1 is of the form int (&p)[3] which has type "reference to array of 3 int" (you work this out recursively, next step using §8.3.4 Arrays and so on). Everything before the int is the derived-declarator-type-list. So we can infer that p in our original declaration has type "reference to array of 3 pointer to int". Magic!
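Here is a small sketch of that substitution, assuming C++11. The derived-declarator-type-list ("reference to array of 3") comes from T D1, and "pointer to" is slotted in just before T. The namespace ex1, the aliases and the target array are hypothetical names added so the reference has something to bind to.

```cpp
#include <type_traits>

namespace ex1 {

// T D1: int (&q)[3] has type "reference to array of 3 int".
int ints[3] = {1, 2, 3};
int (&q)[3] = ints;
static_assert(std::is_same<decltype(q), int (&)[3]>::value,
              "q is a reference to array of 3 int");

// T D: int *(&p)[3] keeps "reference to array of 3" and
// replaces the trailing "int" with "pointer to int".
using PtrToInt = int*;
using Arr3     = PtrToInt[3];
Arr3 target = {};
int *(&p)[3] = target;
static_assert(std::is_same<decltype(p), Arr3&>::value,
              "p is a reference to array of 3 pointer to int");

}  // namespace ex1
```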

Example 2: float (*(*(&e)[10])())[5]

This is a declaration T D where (§8.3.4 Arrays):

  • T -> float
  • D -> (*(*(&e)[10])())[5]

D is of the form:

D1 [ constant-expression(opt) ] attribute-specifier-seq(opt)

where D1 is (*(*(&e)[10])()). This means T D1 is of the form float (*(*(&e)[10])()) which has type "reference to array of 10 pointer to function of () returning pointer to float" (which you work out by applying §8.3/6 and then §8.3.1 Pointers and so on). Everything before the float is the derived-declarator-type-list. So we can infer that e in our original declaration has type "reference to array of 10 pointer to function of () returning pointer to array of 5 float". Magic again!

C declarator understanding

You omitted an important previous paragraph:

4 In the following subclauses, consider a declaration

    T D1

where T contains the declaration specifiers that specify a type T (such as int) and D1 is a declarator that contains an identifier ident. The type specified for the identifier ident in the various forms of declarator is described inductively using this notation.

So, when we get to paragraphs 5 and 6, we know the declaration we are considering contains within it some identifier which we label ident. E.g., in int foo(void), ident is foo.

Paragraph 5 says that if the declaration “T D1” is just “T ident”, it declares ident to be of type T.

Paragraph 6 says that if the declaration “T D1” is just “T (ident)”, it also declares ident to be of type T.

These are just establishing the base cases for a recursive specification of declaration. Clause 6.7.5.1 goes on to say that if the declaration “T D1” is “T * some-qualifiers D” and the same declaration without the * and the qualifiers, “T D”, would declare ident to be “some-derived-type T” (like “array of T” or “pointer to T”), then the declaration with the * and the qualifiers declares ident to be “some-derived-type some-qualifiers pointer to T”.

For example, int x[3] declares x to be “array of 3 int”, so this rule in 6.7.5.1 tells us that “int * const x[3]” declares x to be “array of 3 const pointer to int”: it takes the “array of 3” that must have been derived previously and appends “const pointer to” to it.
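The question is about C, but C++ reads this particular declarator the same way, so a C++ compiler can confirm the result. Treat this as a sketch under that assumption: the namespace c_rule and the alias ConstPtr are invented for the example, and the = {} initializer is only there because C++ requires a const object at namespace scope to be initialized.

```cpp
#include <type_traits>

namespace c_rule {

// "array of 3" comes from the inner declarator x[3];
// "* const" then contributes "const pointer to" in front of int.
int * const x[3] = {};

using ConstPtr = int* const;  // const pointer to int
static_assert(std::is_same<decltype(x), ConstPtr[3]>::value,
              "x is an array of 3 const pointer to int");

// The array is the outermost layer; const applies to each element pointer,
// not to the int being pointed at.
static_assert(std::extent<decltype(x)>::value == 3, "three elements");
static_assert(std::is_same<std::remove_extent<decltype(x)>::type,
                           int* const>::value,
              "each element is a const pointer to non-const int");

}  // namespace c_rule
```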

Similarly, clauses 6.7.5.2 and 6.7.5.3 tell us to append array and function types to declarators with brackets (for subscripts) and postfix parentheses.

Is there a mistake in the C standard description of pointer declarators?

For the record, there are three answers that were deleted by their authors, supporting a conclusion that this aspect of the C standard is tricky. I do not believe the other current answer correctly matches the example source text (int * const * foo) to the symbols in the passage from the standard (T, D1, type-qualifier-list, and so on).

Thus I conclude this is indeed a mistake in the standard.

I believe a fix is simply to remove the last sentence, “For each type qualifier in the list, ident is a so-qualified pointer.” Any qualifier in the type-qualifier list is already incorporated, correctly, into the previous sentence, and any qualifier inside D is already incorporated in that declarator. So this seems like just a superfluous sentence that may have arisen inadvertently in some edit.

Declaration with multiple declarators - Definition?

Each declarator is individually considered to define or merely declare its identifier.

f() is only declared. There should be a definition somewhere else.

i is defined. A subsequent declaration would need to use extern to avoid being a redefinition.

§3.1 Declarations and definitions in the C++14 standard says,

A declaration is a definition unless it declares a function without specifying the function’s body, it contains the extern specifier or…

The paragraph goes on and on with quite a few rules and exceptions. It may perhaps be a defect in the standard that it fails to mention declarators there, despite discussing features that do not immediately appertain to entire declarations.

We also have §8/3,

Each init-declarator in a declaration is analyzed separately as if it was in a declaration by itself.

This could be interpreted to override the "contagious" formulation of rules in §3.1/2.
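As an illustration, assume the declaration in question looks something like int f(), i; (the exact form is my guess; only the names f and i appear above). Under the per-declarator reading, this sketch compiles and links even though f never gets a body, because f is never called. The namespace multi is a made-up name.

```cpp
#include <type_traits>

namespace multi {

// One declaration, two init-declarators, analyzed separately:
//   f() declares a function without a body -> declaration only
//   i   names an object without extern     -> definition
int f(), i;

// Redeclaring i needs extern; without it this would be a redefinition.
extern int i;

static_assert(std::is_function<decltype(f)>::value,
              "f is declared as a function");
static_assert(std::is_same<decltype(f), int()>::value,
              "f is a function of () returning int");
static_assert(std::is_same<decltype(i), int>::value,
              "i is an int object");

}  // namespace multi
```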

C11: how does fixed-length array declaration fit in the C11 standard's specification?

5 is an assignment-expression.

If you look at the definition of assignment-expression, one of its alternatives is conditional-expression, and one alternative for that is logical-OR-expression. By following this chain of definitions, you'll eventually reach primary-expression, for which one alternative is constant.

Understanding declarations in the same scope

Case 7

It is explained by 6.7.2.3 paragraphs 1, 4 and 5 (page 137) (emphasis is mine)

1 A specific type shall have its content defined at most once.

4 All declarations of structure, union, or enumerated types that
have the same scope and use the same tag declare the same type.

Irrespective of whether there is a tag or what other declarations of
the type are in the same translation unit, the type is incomplete
[footnote 129)] until immediately after the closing brace of the list
defining the content, and complete thereafter.

5 Two declarations of structure, union, or enumerated types which
are in different scopes or use different tags declare distinct types.
Each declaration of a structure, union, or
enumerated type which does not include a tag declares a distinct type.

So an example of identical types of enums [if not for paragraph 1] would be like

enum TagNameA
{
    a
};
enum TagNameA
{
    a
};

Case 8
It is explained by 6.7.2.2 paragraph 3 (page 136) (emphasis is mine)

The identifiers in an enumerator list are declared as constants that
have type int and may appear wherever such are permitted [footnote:
127)]

...

[footnote 127)] Thus, the identifiers of enumeration constants
declared in the same scope shall all be distinct from each other and
from other identifiers declared in ordinary declarators.

where in Case 8

const unsigned char a;

is an ordinary declarator for a that is not distinct from the enumeration constant identifier a.


