Standard Conversions: Array-To-Pointer Conversion

In both C and C++, an array can be used as if it were a pointer to its first element. Effectively, given an array named x, you can replace most uses of &x[0] with just x.

This is what makes subscripting work on array objects:

int x[5];
x[2]; // this is the same as (&x[0])[2]

This is also how an array can be passed to a function that has a parameter of pointer type:

void f(int* p);

int x[5];
f(x); // this is the same as f(&x[0])

There are several contexts in which the array-to-pointer conversion does not take place. Examples include when an array is the operand of sizeof or the unary-& (the address-of operator), when a string literal is used to initialize an array, and when an array is bound to a reference to an array.

When are arrays converted to pointers?

I know of the following expressions in which an array is not converted/decayed to a pointer.

  1. When used as the operand of sizeof: sizeof(array)
  2. When used as the operand of the address-of operator: &array
  3. When used to bind a reference to an array: int (&ref)[3] = array;
  4. When deducing a template type argument for a parameter of reference type, e.g. template<class T> void f(T&); called as f(array). (With a by-value parameter T, the array does decay and T is deduced as a pointer.)
  5. When used in decltype: decltype(array)
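The five contexts listed above can be checked at compile time. The following is a minimal sketch, not an exhaustive test; the names size_by_ref, size_by_value and check_contexts are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int array[3] = {1, 2, 3};

// 1. sizeof sees the whole array, not a pointer.
static_assert(sizeof(array) == 3 * sizeof(int), "no decay under sizeof");

// 2. Unary & yields a pointer to the whole array.
static_assert(std::is_same<decltype(&array), int (*)[3]>::value,
              "no decay under unary &");

// 3. Binding a reference to an array keeps the array type.
int (&ref)[3] = array;
static_assert(std::is_same<decltype(ref), int (&)[3]>::value,
              "reference to array");

// 4. Deduction against a reference parameter deduces the array type...
template <class T>
constexpr std::size_t size_by_ref(T&) { return sizeof(T); }
// ...whereas a by-value parameter deduces the decayed pointer type.
template <class T>
constexpr std::size_t size_by_value(T) { return sizeof(T); }

// 5. decltype of the array name is the array type.
static_assert(std::is_same<decltype(array), int[3]>::value,
              "no decay under decltype");

// Hypothetical helper: runtime check of the two deduction cases.
void check_contexts() {
    assert(size_by_ref(array) == 3 * sizeof(int));  // T deduced as int[3]
    assert(size_by_value(array) == sizeof(int*));   // T deduced as int*
}
```

Calling check_contexts() from main should pass silently; the by-value case is included only as a contrast to show where the decay does happen.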

Wording of array-to-pointer conversion and undefined behaviour

The paragraph is just in general imprecise, in my opinion. It doesn't say what "the array" refers to at all. No array has been introduced before, only array types.

I guess it should probably state explicitly that it refers to the array object result of the glvalue, after temporary materialization if applicable.

Then I think it should also have a requirement that the type of that result object be similar to that of the original expression type. That way the result object will always be an array object and it can't have a "wrong" type in the same sense as for pointer arithmetic which already applies in a manual &pa[0] "decay". (see [expr.add]/6)

But that is my own interpretation of what seems a reasonable interpretation/improvement. I don't think the current wording makes that clear.


This is not the only part of the standard with imprecise wording in regard to lvalue type mismatches like this. See for example CWG 2535 for a similar situation with member access, where a similar resolution is suggested.

Array-to-pointer conversion in VC++

Paragraph 4 of 5.16 [expr.cond] in the standard says:

If the second and third operands are glvalues of the same value category and have the same type, the result
is of that type and value category [...]

So the result should be an lvalue of type "array of 1 int" in your case.

Paragraph 5 starts with "Otherwise, the result is a prvalue" which is obviously the other case to the "If..." of paragraph 4.

In my reading paragraph 6 also doesn't apply.

Lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are performed on the second and third operands.

If we've determined that the result is a glvalue, then applying an lvalue-to-rvalue conversion could not leave us with a valid object to use as the result, so paragraph 6 must follow on from the "Otherwise..." started in paragraph 5. To be fair, I don't think that this is completely unambiguous, even if an alternative reading to mine would result in greater inconsistency.

tl;dr: I think that gcc is correct in this case.

A workaround could be to use: *(true ? &ary1 : &ary2) instead.
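A minimal sketch of that workaround, assuming ary1 and ary2 are both of type int[1] as in the discussion above. The helper pick is a hypothetical name introduced only for illustration.

```cpp
#include <cassert>
#include <type_traits>

int ary1[1] = {10};
int ary2[1] = {20};

// Taking the addresses first makes both operands prvalues of type
// int (*)[1], so no array-to-pointer conversion is involved. Dereferencing
// the selected pointer gives back an lvalue of array type.
static_assert(
    std::is_same<decltype(*(true ? &ary1 : &ary2)), int (&)[1]>::value,
    "workaround yields an lvalue of type int[1]");

// Hypothetical helper: selects one of the two arrays by reference.
int (&pick(bool first))[1] {
    return *(first ? &ary1 : &ary2);
}
```

Because the result is still an array lvalue, sizeof(pick(true)) is sizeof(int[1]) rather than the size of a pointer, and &pick(true) is the address of ary1 itself.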

Is the array to pointer decay changed to a pointer object?

"But a itself is not pointing to another region of memory, it IS the region of memory itself.

"So when the compiler converts it to a pointer, does it save it (like p) somewhere in memory or it's an implicit conversion?"

It is an implicit conversion. The compiler does not create a separate pointer object in memory (one that you could, for example, reassign to a different address) to hold the address of the first element.

The standard states (emphasis mine):

"Except when it is the operand of the sizeof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined."

Source: ISO/IEC 9899:2018 (C18), 6.3.2.1/4

The array is converted to an expression of pointer type; the result is not an lvalue.

The compiler just evaluates a to &a[0] (pointer to a[0]).



"I understand that array names are converted to pointers."

An array does not always convert to a pointer to its first element. Look at the first part of the quote above. For example, when used as &a, a does not decay to a pointer to its first element. Rather, &a yields a pointer to the whole array, of type int (*)[3].
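The difference between a and &a can be illustrated with a short sketch, written here in C++ so the types can be checked at compile time. The array a of three ints matches the int (*)[3] type mentioned above; the helper name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

int a[3] = {1, 2, 3};

// In most expressions a converts to a pointer to its first element...
static_assert(std::is_same<decltype(a + 0), int*>::value,
              "a decays to int*");
// ...but as the operand of unary & it does not: the result points to the
// whole array.
static_assert(std::is_same<decltype(&a), int (*)[3]>::value,
              "&a has type int (*)[3]");

// Hypothetical helper: both pointers hold the same address, but pointer
// arithmetic on them uses different strides.
bool same_address_different_stride() {
    int* first = a;          // implicit array-to-pointer conversion
    int (*whole)[3] = &a;    // no conversion, pointer to the array itself
    bool same_start =
        static_cast<void*>(first) == static_cast<void*>(whole);
    // whole + 1 steps over the entire array, i.e. as far as first + 3.
    bool same_end =
        static_cast<void*>(first + 3) == static_cast<void*>(whole + 1);
    return same_start && same_end;
}
```

No pointer object is declared for the conversion itself; first and whole exist only because the sketch stores the results in named variables.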

Can a pointer convert to an array during a function call?

Pointers are pointers, and arrays are arrays. However, arrays naturally decay to pointers to their first element. So when you pass the array arr to any of your functions, it will decay to &arr[0].

Also note that when declaring function parameters, array notation (like int arr[]) doesn't mean that the parameter is an array; the compiler still treats it as a pointer (i.e. int* arr).

Regarding the decay from array to pointer, it can't happen the other way. Once you have a pointer, all you have is that pointer and the single element it points to.
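A small sketch of the parameter adjustment described above. The function names are hypothetical and serve only to compare the declared types.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Declarations only: they are compared by type and never called.
void takes_array_notation(int arr[]);
void takes_sized_array_notation(int arr[5]);
void takes_pointer(int* arr);

// All three declare the same function type, void(int*).
static_assert(std::is_same<decltype(takes_array_notation),
                           decltype(takes_pointer)>::value,
              "int arr[] is adjusted to int*");
static_assert(std::is_same<decltype(takes_sized_array_notation),
                           decltype(takes_pointer)>::value,
              "int arr[5] is adjusted to int* as well");

// Hypothetical helper: reports the size of the parameter's type as seen
// inside the callee. The bound 5 is not part of that type.
std::size_t size_inside(int arr[5]) {
    return sizeof(decltype(arr));  // decltype(arr) is int*
}
```

In the caller, sizeof applied to a real int[5] still gives 5 * sizeof(int); only the callee has lost the bound, which is why it cannot recover the array from the pointer.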

Overload resolution and array-to-pointer decay - why is int (&a)[2] and int* a considered equally exact regarding overload resolution

@463035818_is_not_a_number already explained in their answer why void foo(int* a); is a better match than template<std::size_t N> void foo(int (&a)[N]).

This answer only applies to the second example in the question:

void foo(int (&a)[2]);
void foo(int* a);

// why is this call ambiguous?
int arr[2];
foo(arr);


1. Finding the best matching function

Both functions can be found by name lookup, are candidate functions and viable functions.

So which function will be called (or whether the call is ambiguous) depends on whether either of them is a better viable function than the other.

As per 12.4.3 Best viable function [over.match.best] (2):

(2) Given these definitions, a viable function F1 is defined to be a better function than another viable function F2 if for all arguments i, ICSi(F1) is not a worse conversion sequence than ICSi(F2), and then

[...]

(2.1) for some argument j, ICSj(F1) is a better conversion sequence than ICSj(F2), [...]

So to determine if any of the two functions is better we need to check the conversion sequences of their arguments and compare them.



2. Comparing the conversion sequences

2.1 Required conversion sequences

Let's first determine which conversion sequences we need for both calls:

  • void foo(int (&a)[2]);

    only requires the identity conversion sequence, as per 12.4.3.1.4 Reference binding [over.ics.ref] (1):

    (1) When a parameter of reference type binds directly to an argument expression, the implicit conversion sequence is the identity conversion [...]

  • void foo(int* a);

    requires a conversion sequence consisting of an Array-to-pointer conversion, as per 7.3.2 Array-to-pointer conversion [conv.array] (1):

    (1) An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to a prvalue of type “pointer to T”. [...]

2.2 Which conversion sequence is better?

To determine which conversion sequence is better we need to refer to 12.4.3.2 Ranking implicit conversion sequences [over.ics.rank], in this case mainly paragraphs (3) and (4).

(1) This subclause defines a partial ordering of implicit conversion sequences based on the relationships better conversion sequence and better conversion. If an implicit conversion sequence S1 is defined by these rules to be a better conversion sequence than S2, then it is also the case that S2 is a worse conversion sequence than S1. If conversion sequence S1 is neither better than nor worse than conversion sequence S2, S1 and S2 are said to be indistinguishable conversion sequences.

2.3 Is one of the conversion sequences a subsequence of the other?

Note: we'll skip (3.1) because it only applies to list-initialization.

So let's start with the first condition, (3.2.1):

(3.2) Standard conversion sequence S1 is a better conversion sequence than standard conversion sequence S2 if:

  • (3.2.1) S1 is a proper subsequence of S2 (comparing the conversion sequences in the canonical form defined by [over.ics.scs], excluding any Lvalue Transformation; the identity conversion sequence is considered to be a subsequence of any non-identity conversion sequence) or, if not that, [...]

Note: Due to questions about this clause in the comments, I'll explain this rule in detail.

tl;dr: Neither conversion sequence is a subsequence of the other.

To fully understand this rule we need to first define a few terms:

  • sequence (as in standard conversion sequence) refers to a mathematical sequence; in essence, a sequence in maths is an ordered list of elements in which the order matters and repetition of elements is allowed.
  • proper subsequence (or strict subsequence)

    Given sequences A and B, if A is a subsequence of B, and A is not equal to B, then A is a proper subsequence of B.

    (see subset / proper subset - it's the same for sequences)
    A few examples to illustrate the principle:
    • (A, B) is a proper subsequence of (A, B, C, D) (we can remove elements)
    • (B, C) is a proper subsequence of (A, B, C, D) (removing is allowed at any position)
    • (A, B, C) is not a proper subsequence of (A, C, B, D) (order matters)
    • (A) is not a proper subsequence of (A) (if both sequences are equal, neither is a proper subsequence of the other)
    • () is a proper subsequence of (A, B, C) (the empty sequence is a proper subsequence of all non-empty sequences)
  • identity (as in identity conversion / identity conversion sequence) refers to the mathematical identity element (often shortened to just identity)
    • The terms identity conversion & identity conversion sequence in this case refer to an empty sequence of conversions (()): applying no conversions to a value always results in the original value: value ∘ () = value
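The proper-subsequence examples above can be reproduced with a short sketch. This only illustrates the mathematical definition, not how any compiler implements overload resolution; Seq, is_subsequence and is_proper_subsequence are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Seq = std::vector<std::string>;

// Hypothetical helper: true if a can be obtained from b by deleting zero or
// more elements of b without reordering the rest.
bool is_subsequence(const Seq& a, const Seq& b) {
    std::size_t i = 0;  // index of the next element of a still to be matched
    for (std::size_t j = 0; i < a.size() && j < b.size(); ++j) {
        if (a[i] == b[j]) ++i;
    }
    return i == a.size();
}

// A proper subsequence is a subsequence that is not equal to the original.
bool is_proper_subsequence(const Seq& a, const Seq& b) {
    return a != b && is_subsequence(a, b);
}
```

With these definitions, the empty sequence is a proper subsequence of (A, B, C), while (A) is not a proper subsequence of (A), matching the examples in the list.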

So with this we can break down the (3.2.1) rule:


  • comparing the conversion sequences in the canonical form defined by [over.ics.scs], excluding any Lvalue Transformation

    • [over.ics.scs] describes the way we need to order the conversions (if we have more than one) - because remember ordering does matter for sequences.
    • If a conversion sequence contains a Lvalue Transformation we need to remove it before checking for proper subsequences.

  • the identity conversion sequence is considered to be a subsequence of any non-identity conversion sequence

    • This is a roundabout way of saying that an empty conversion sequence is considered a subsequence of any non-empty conversion sequence (one of the rules for proper subsequences).

  • Standard conversion sequence S1 is a better conversion sequence than standard conversion sequence S2 if:

    • S1 is a proper subsequence of S2
    • So we need to check if S1 is a proper subsequence of S2, and if it is then S1 is better than S2

So now let's apply this to our specific case:

  • We have sequence S1 for foo(int (&a)[2]): () (identity conversion sequence)
  • and sequence S2 for foo(int* a);: ("Array-to-pointer conversion")

Now we need to remove Lvalue Transformations.
In 12.4.3.1.1 Standard conversion sequences [over.ics.scs] (3), Array-to-pointer conversion is listed as an Lvalue Transformation, therefore we must remove it from S2.

So S2 = ()

Now we need to check if S1 is a proper subsequence of S2.

It can't be a proper subsequence because S1 = S2, therefore this rule does not apply.

2.4 Ranking the conversion sequences

Next we need to check the rank of the conversion sequences as per (3.2.2)

(3.2.2) the rank of S1 is better than the rank of S2, or S1 and S2 have the same rank and are distinguishable by the rules in the paragraph below, or, if not that,

There are three possible ranks for conversions: Exact Match, Promotion or Conversion, as per 12.4.3.1.1 Standard conversion sequences [over.ics.scs] (3):

(3) Each conversion [...] also has an associated rank (Exact Match, Promotion, or Conversion). These are used to rank standard conversion sequences. The rank of a conversion sequence is determined by considering the rank of each conversion in the sequence and the rank of any reference binding. If any of those has Conversion rank, the sequence has Conversion rank; otherwise, if any of those has Promotion rank, the sequence has Promotion rank; otherwise, the sequence has Exact Match rank.


Conversion                      | Category                 | Rank
No conversions required         | Identity                 | Exact Match
Lvalue-to-rvalue conversion     | Lvalue Transformation    | Exact Match
Array-to-pointer conversion     | Lvalue Transformation    | Exact Match
Function-to-pointer conversion  | Lvalue Transformation    | Exact Match
Qualification conversions       | Qualification Adjustment | Exact Match
Function pointer conversion     | Qualification Adjustment | Exact Match
Integral promotions             | Promotion                | Promotion
Floating-point promotion        | Promotion                | Promotion
Integral conversions            | Conversion               | Conversion
Floating-point conversions      | Conversion               | Conversion
Floating-integral conversions   | Conversion               | Conversion
Pointer conversions             | Conversion               | Conversion
Pointer-to-member conversions   | Conversion               | Conversion
Boolean conversions             | Conversion               | Conversion
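Both the identity conversion and the Array-to-pointer conversion have Exact Match rank in this table, so rank alone does not separate the two overloads, which is consistent with the ambiguity the question asks about. The following sketch detects that ambiguity at compile time without triggering a hard error. The trait foo_callable and the overload set bar are hypothetical names introduced for illustration, and the code assumes C++17 for std::void_t.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// The two overloads from the question (declarations only, never called).
void foo(int (&a)[2]);
void foo(int* a);

// Hypothetical trait: true if the call foo(arg) resolves to exactly one best
// overload. An ambiguous call is a substitution failure here, not an error.
template <class Arg, class = void>
struct foo_callable : std::false_type {};
template <class Arg>
struct foo_callable<Arg, std::void_t<decltype(foo(std::declval<Arg>()))>>
    : std::true_type {};

// Passing an int[2] lvalue: both overloads are viable and neither is better.
static_assert(!foo_callable<int (&)[2]>::value, "foo(arr) is ambiguous");
// Passing an int*: only foo(int*) is viable.
static_assert(foo_callable<int*>::value, "foo(ptr) is fine");
// Passing an int[3] lvalue: the reference overload is not viable.
static_assert(foo_callable<int (&)[3]>::value, "only foo(int*) is viable");

// For comparison, the first example mentioned at the top of this answer:
// a function template competing with a non-template pointer overload.
template <std::size_t N>
char bar(int (&a)[N]);
int bar(int* a);

// The conversion sequences are again indistinguishable, but the
// non-template function is preferred, so bar(int*) is selected.
static_assert(
    std::is_same<decltype(bar(std::declval<int (&)[2]>())), int>::value,
    "non-template bar(int*) wins");
```

The distinct return types of the two bar overloads exist only so that decltype can reveal which one was selected.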

convert int array to pointer in C

It's perfectly valid in C. The argument of foo is a pointer that can point to any type. When you pass an array, it decays to a pointer to the first element of the array (i.e., the address of the first element is passed). So,

 ptr -> &buf[0] ;
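A minimal sketch of that call, assuming foo takes a void* parameter (the original declaration is not shown here, so this signature is an assumption). It is written so that it compiles as C++; the same call is valid in C.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's foo: it returns the address it
// received so that the caller can inspect it.
void* foo(void* ptr) { return ptr; }

// Hypothetical helper: passes an int array and checks what foo received.
bool receives_first_element_address() {
    int buf[4] = {1, 2, 3, 4};
    // buf decays to int*, which then converts implicitly to void*.
    void* received = foo(buf);
    return received == static_cast<void*>(&buf[0]);
}
```

The callee sees only the address of buf[0]; the element count has to be passed separately if foo needs it.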

Is converting between pointer-to-T, array-of-T and pointer-to-array-of-T ever undefined behaviour?

This is a C-only answer.

C11 (n1570) 6.3.2.3 p7

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned*) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

*) In general, the concept “correctly aligned” is transitive: if a pointer to type A is correctly aligned for a pointer to type B, which in turn is correctly aligned for a pointer to type C, then a pointer to type A is correctly aligned for a pointer to type C.

The standard is a little vague about what happens if we use such a pointer (strict aliasing aside) for anything other than converting it back, but the intent and widespread interpretation is that such pointers should compare equal (and have the same numerical value, e.g. they should also be equal when converted to uintptr_t). As an example, think about (void *)array == (void *)&array (converting to char * instead of void * is explicitly guaranteed to work).

T(*pa1)[6] = (T(*)[6])a;

This is fine, the pointer is correctly aligned (it’s the same pointer as &a).

T(*pa2)[3][2] = (T(*)[3][2])a; // (i)
T(*pa3)[1][2][3] = (T(*)[1][2][3])a; // (ii)

Iff T[6] has the same alignment requirements as T[3][2], and the same as T[1][2][3], then (i) and (ii) are safe, respectively. It sounds strange to me that they might not, but I cannot find a guarantee in the standard that they have the same alignment requirements.

T *p = a; // safe, of course
T *p1 = *pa1; // *pa1 has type T[6], after array-to-pointer conversion it's T*, OK
T *p2 = **pa2; // **pa2 has type T[2], or T* after conversion, OK
T *p3 = ***pa3; // ***pa3, has type T[3], T* after conversion, OK

Ignoring the UB caused by passing int * where printf expects void *, let’s look at the expressions in the arguments for the next printf, first the defined ones:

a[5] // OK, of course
(*pa1)[5]
(*pa2)[2][1]
(*pa3)[0][1][2]
p[5] // same as a[5]
p1[5]

Note, that strict aliasing isn’t a problem here, no wrongly-typed lvalue is involved, and we access T as T.

The following expressions depend on the interpretation of out-of-bounds pointer arithmetic. The more relaxed interpretation (allowing container_of, array flattening, the “struct hack” with char[], etc.) allows them as well; the stricter interpretation (allowing a reliable run-time bounds-checking implementation for pointer arithmetic and dereferencing, but disallowing container_of, array flattening (but not necessarily array “lifting”, what you did), the struct hack, etc.) renders them undefined:

p2[5] // UB, p2 points to the first element of a T[2] array
p3[5] // UB, p3 points to the first element of a T[3] array

