Why am I Being Told That an Array Is a Pointer? What Is the Relationship Between Arrays and Pointers in C++

Arrays and pointers in C++

I often hear that the name of an array is constant pointer to a block of memory

You've often been misled, or you've simply misunderstood. An array is not a constant pointer to a block of memory. An array is an object that contains a sequence of sub-objects. All objects are blocks of memory. A pointer is an object that contains the address of another object, i.e. it points to that object.

So in the following quote, a is an array, p points to the first sub-object within a.

int a[10];

and

int * const p= a;

must be equal in a sense that p is pointer that points to the same block of memory as array a[] and also it may not be changed to point to another location in memory.

If that is your definition of equal, then that holds for non-array objects as well:

char c;
char * const p = &c;

Here p "points to the same memory as c" and may not be changed to point to another location in memory. Does that mean that char objects are "equal" to pointers? No. And arrays aren't either.

But isn't a (the name of the array), a constant pointer that points to the same element of the array?

No, the name of the array isn't a constant pointer, just as the name of the char isn't a constant pointer.

the name of an array holds the address of the first element in the array, right?

Let's be more general, this is not specific to arrays. The name of a variable "holds the address" of the object that the variable names. The address is not "held" in the memory at run time. It's "held" by the compiler at compile time. When you operate on a variable, the compiler makes sure that operations are done to the object at the correct address.

The address of the array is always the same address as where the first element (sub-object) of the array is. Therefore, the name indeed does - at least conceptually - hold the same address.

And if i use *(a+1), this is the same as a[1], right?

Right. I'll elaborate: for pointers, one is just another way of writing the other. Oh, but a isn't a pointer! Here is the catch: the array operand is implicitly converted to a pointer to its first element. This implicit conversion is called decaying. It is a special feature of array types, and it is probably the feature that makes the difference between pointers and arrays hardest to understand.

So, even though the name of the array isn't a pointer, it can decay into a pointer. The name doesn't always decay into a pointer, just in certain contexts. It decays when you use operator[], and it decays when you use operator+. It decays when you pass the array to a function that accepts a pointer to the type of the sub-object. It doesn't decay when you use sizeof and it doesn't decay when you pass it to a function that accepts an array by reference.

Are arrays Pointers?

Let's get the important stuff out of the way first: arrays are not pointers. Array types and pointer types are completely different things and are treated differently by the compiler.

Where the confusion arises is from how C treats array expressions. N1570:

6.3.2.1 Lvalues, arrays, and function designators

...

3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.

Let's look at the following declarations:

int arr[10] = {0,1,2,3,4,5,6,7,8,9};
int *parr = arr;

arr is a 10-element array of int; it refers to a contiguous block of memory large enough to store 10 int values. The expression arr in the second declaration is of array type, but since it is not the operand of & or sizeof and it isn't a string literal, the type of the expression becomes "pointer to int", and the value is the address of the first element, or &arr[0].

parr is a pointer to int; it refers to a block of memory large enough to hold the address of a single int object. It is initialized to point to the first element in arr as explained above.

Here's a hypothetical memory map showing the relationship between the two (assuming 16-bit ints and 32-bit addresses):


Object  Address       0x00  0x01  0x02  0x03
------  ----------    ----------------------
arr     0x10008000    0x00  0x00  0x00  0x01
        0x10008004    0x00  0x02  0x00  0x03
        0x10008008    0x00  0x04  0x00  0x05
        0x1000800c    0x00  0x06  0x00  0x07
        0x10008010    0x00  0x08  0x00  0x09
parr    0x10008014    0x10  0x00  0x80  0x00

The types matter for things like sizeof and &; sizeof arr == 10 * sizeof (int), which in this case is 20, whereas sizeof parr == sizeof (int *), which in this case is 4. Similarly, the type of the expression &arr is int (*)[10], or a pointer to a 10-element array of int, whereas the type of &parr is int **, or pointer to pointer to int.

Note that the expressions arr and &arr will yield the same value (the address of the first element in arr), but the types of the expressions are different (int * and int (*)[10], respectively). This makes a difference when using pointer arithmetic. For example, given:

int arr[10] = {0,1,2,3,4,5,6,7,8,9};
int *p = arr;
int (*ap)[10] = &arr;

printf("before: arr = %p, p = %p, ap = %p\n", (void *) arr, (void *) p, (void *) ap);
p++;
ap++;
printf("after: arr = %p, p = %p, ap = %p\n", (void *) arr, (void *) p, (void *) ap);

the "before" line should print the same values for all three expressions (in our hypothetical map, 0x10008000). The "after" line should show three different values: 0x10008000, 0x10008002 (base plus sizeof (int)), and 0x10008014 (base plus sizeof (int [10])).

Now let's go back to the second paragraph above: array expressions are converted to pointer types in most circumstances. Let's look at the subscript expression arr[i]. Since the expression arr is not appearing as an operand of either sizeof or &, and since it is not a string literal being used to initialize another array, its type is converted from "10-element array of int" to "pointer to int", and the subscript operation is being applied to this pointer value. Indeed, when you look at the C language definition, you see the following language:

6.5.2.1 Array subscripting
...

2 A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

In practical terms, this means you can apply the subscript operator to a pointer object as though it were an array. This is why code like

#include <stddef.h>  /* size_t */

int foo(int *p, size_t size)
{
    int sum = 0;
    size_t i;  /* same type as size, avoiding a signed/unsigned comparison */
    for (i = 0; i < size; i++)
    {
        sum += p[i];
    }
    return sum;
}

int main(void)
{
    int arr[10] = {0,1,2,3,4,5,6,7,8,9};
    int result = foo(arr, sizeof arr / sizeof arr[0]);
    ...
}

works the way it does. main is dealing with an array of int, whereas foo is dealing with a pointer to int, yet both are able to use the subscript operator as though they were both dealing with an array type.

It also means array subscripting is commutative: assuming a is an array expression and i is an integer expression, a[i] and i[a] are both valid expressions, and both will yield the same value.

Is an array name a pointer?

An array is an array and a pointer is a pointer, but in most cases array names are converted to pointers. A term often used is that they decay to pointers.

Here is an array:

int a[7];

a contains space for seven integers, and you can put a value in one of them with an assignment, like this:

a[3] = 9;

Here is a pointer:

int *p;

p doesn't contain any spaces for integers, but it can point to a space for an integer. We can, for example, set it to point to one of the places in the array a, such as the first one:

p = &a[0];

What can be confusing is that you can also write this:

p = a;

This does not copy the contents of the array a into the pointer p (whatever that would mean). Instead, the array name a is converted to a pointer to its first element. So that assignment does the same as the previous one.

Now you can use p in a similar way to an array:

p[3] = 17;

The reason that this works is that the subscript operator in C, [ ], is defined in terms of pointers. x[y] means: start with the pointer x, step y elements forward from what the pointer points to, and then take whatever is there. Using pointer arithmetic syntax, x[y] can also be written as *(x+y).

For this to work with a normal array, such as our a, the name a in a[3] must first be converted to a pointer (to the first element in a). Then we step 3 elements forward, and take whatever is there. In other words: take the element at position 3 in the array. (Which is the fourth element in the array, since the first one is numbered 0.)

So, in summary, array names in a C program are (in most cases) converted to pointers. One exception is when we use the sizeof operator on an array. If a were converted to a pointer in this context, sizeof a would give the size of a pointer and not of the actual array, which would be rather useless, so in that case a means the array itself.

Understanding differences/similarities between arrays and pointers in C++

The main difference between arrays and pointers is that they are completely different things.

An array is a collection of objects laid out contiguously in memory. For example, int x[5] defines an array named x, which is a collection of 5 integers, laid out side by side in memory. Individual elements in the array may be accessed using "array syntax" of the form x[i], where i is an integral value between 0 and 4. (Other values of i will result in undefined behaviour.)

A pointer is a variable which holds a value that is an address in memory. For example, int *p defines p as a pointer to an int, and it can be initialised with the address of a variable of type int. For example, p = &some_int causes p to contain the address of some_int. When that is done, the notation *p (called dereferencing) provides access to the pointed-to variable. For example, *p = 42 will set some_int to have the value 42.

You'll notice, in the description above, I have not used the word "pointer" in describing an array, nor have I used the word "array" to describe a pointer. They are completely different things.

However, they can be used in ways that make them seem the same, because of a few rules in the language. Firstly, there is a conversion called the "array-to-pointer" conversion. Because of this, it is possible to do

  int x[5];
  int *p = x;

The initialisation of p actually works by using the array-to-pointer conversion. Because it is being used to initialise a pointer, the compiler implicitly converts x to a pointer, equal to the address of x[0]. To do this explicitly (without the compiler silently and sneakily doing a conversion) you could have written

  int *p = &x[0];

and got exactly the same effect. Either way, the assignment *p = 42 will subsequently have the effect of assigning 42 to x[0].

That suggests there is a relationship between expressions involving pointers and expressions involving (the name of) arrays. If p is equal to &x[0], then

  • p + i is equivalent to &x[i]; AND
  • *(p + i) is equivalent to x[i].

The language rules of C and C++ make these relationships symmetric, so (look carefully here)

  • x + i is equivalent to &x[i]; AND
  • *(x + i) is equivalent to x[i]

and, with pointers

  • p + i is equivalent to &p[i]; AND
  • *(p + i) is equivalent to p[i]

Which basically means that pointer syntax can be used to work with arrays (thanks to the array-to-pointer conversion) AND array syntax can be used to work with pointers.

Really bad textbooks then go on from this and conclude that pointers are arrays and that arrays are pointers. But they are not. If you find textbooks which say such things, burn them. Arrays and pointers are different things entirely. What we have here is a syntactic equivalence - even though arrays and pointers are different things entirely, they can be worked on using the same syntax.

One of the differences - where the syntactic equivalence does not apply - is that arrays cannot be reassigned. For example:

  int x[5];
  int y[5];
  int *p = y; // OK - array-to-pointer conversion

  x = y; // error since x is an array
  x = p; // error since x is an array

The last two statements will be diagnosed by a C or C++ compiler as an error, because x is an array.

Your example

 int *pointer = new int[10];

is a little different again. pointer is still not an array. It is a pointer, initialised with a "new expression", which dynamically allocates an array of 10 integers. But because of the syntactic equivalence of pointers and arrays, pointer can be treated syntactically AS IF it is an array of 10 elements.

Note: the above is concerned with raw arrays. The C++ standard library also has a type named std::array, a data structure that contains an array but behaves somewhat differently from what is described here.

Are pointers and arrays any different in C?

Your code snippet is correct. However, pointers and arrays in C are indeed different. Put simply "the pointer to type T" is not same as "the array of type T".

Please have a look at the C FAQ section on pointers and arrays to get a better understanding of this.

C: differences between char pointer and array

True, but it's a subtle difference. Essentially, the former:

char amessage[] = "now is the time";

Defines an array whose members live in the current scope's stack space, whereas:

char *pmessage = "now is the time";

Defines a pointer that lives in the current scope's stack space, but that references memory elsewhere (in this one, "now is the time" is stored elsewhere in memory, commonly a string table).

Also, note that because the data belonging to the second definition (the explicit pointer) is not stored in the current scope's stack space, it is unspecified exactly where it will be stored. It should not be modified: attempting to write to a string literal is undefined behavior.

Edit: As pointed out by Mark, GMan, and Pavel, there is also a difference when the address-of operator is used on either of these variables. For instance, &pmessage returns a pointer of type char**, or a pointer to a pointer to chars, whereas &amessage returns a pointer of type char(*)[16], or a pointer to an array of 16 chars (which, like a char**, needs to be dereferenced twice as litb points out).

difference between pointer to an array and pointer to the first element of an array

At runtime, a pointer is "just a pointer" regardless of what it points to. The difference is a semantic one: pointer-to-array conveys a different meaning (to the compiler) compared with pointer-to-element.

When dealing with a pointer-to-array, you are pointing to an array of a specified size, and the compiler will ensure that you can only point to an array of that size.

i.e. this code will compile

int theArray[5];
int (*ptrToArray)[5];
ptrToArray = &theArray; // OK

but this will break:

int anotherArray[10];
int (*ptrToArray)[5];
ptrToArray = &anotherArray; // ERROR!

When dealing with a pointer-to-element, you may point to any object in memory with a matching type. (It doesn't necessarily even need to be in an array; the compiler will not make any assumptions or restrict you in any way)

i.e.

int theArray[5];
int* ptrToElement = &theArray[0]; // OK - Pointer-to element 0

and..

int anotherArray[10];
int* ptrToElement = &anotherArray[0]; // Also OK!

In summary, the data type int* does not imply any knowledge of an array, whereas the data type int (*)[5] implies an array, which must contain exactly 5 elements.


