Is Adding to a "char *" Pointer UB When It Doesn't Actually Point to a char Array?

Is adding to a char * pointer UB, when it doesn't actually point to a char array?

See CWG 1314

According to 6.9 [basic.types] paragraph 4,

The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T).


and 4.5 [intro.object] paragraph 5,

An object of trivially copyable or standard-layout type (6.9 [basic.types]) shall occupy contiguous bytes of storage.


Do these passages make pointer arithmetic (8.7 [expr.add] paragraph 5) within a standard-layout object well-defined (e.g., for writing one's own version of memcpy)?

Rationale (August, 2011):

The current wording is sufficiently clear that this usage is permitted.

I strongly disagree with CWG's statement that "the current wording is sufficiently clear", but nevertheless, that's the ruling we have.

I interpret CWG's response as suggesting that a pointer to unsigned char into an object of trivially copyable or standard-layout type, for the purposes of pointer arithmetic, ought to be interpreted as a pointer to an array of unsigned char whose size equals the size of the object in question. I don't know whether they intended that it would also work using a char pointer or (as of C++17) a std::byte pointer. (Maybe if they had decided to actually clarify it instead of claiming the existing wording was clear enough, then I would know the answer.)

(A separate issue is whether std::launder is required to make the OP's code well-defined. I won't go into this here; I think it deserves a separate question.)
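For concreteness, the kind of code CWG 1314 asks about can be sketched roughly as follows. This is only an illustration: the names my_memcpy and struct point are hypothetical, and the explicit casts are there so the same text compiles as C or C++. It shows the pattern under discussion and does not settle whether the C++ wording actually blesses the arithmetic.

```c
#include <stddef.h>

/* Hypothetical example type: a plain standard-layout struct. */
struct point {
    int x;
    int y;
};

/* Byte-wise copy in the style of memcpy (hypothetical name). */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    for (size_t i = 0; i < n; ++i) {
        /* s[i] is *(s + i): arithmetic on an unsigned char pointer
           that walks the object representation of *src. */
        d[i] = s[i];
    }
    return dst;
}
```

Copying one struct point into another with my_memcpy(&b, &a, sizeof a) is exactly the case where the pointer never pointed into a declared unsigned char array.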

What is the difference between char array and char pointer in C?

char* and char[] are different types, but it's not immediately apparent in all cases. This is because arrays decay into pointers, meaning that if an expression of type char[] is provided where one of type char* is expected, the compiler automatically converts the array into a pointer to its first element.

Your example function printSomething expects a pointer, so if you try to pass an array to it like this:

char s[10] = "hello";
printSomething(s);

The compiler pretends that you wrote this:

char s[10] = "hello";
printSomething(&s[0]);
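A small sketch of that decay, assuming a hypothetical count_chars function standing in for printSomething (whose definition is not shown here). It also checks that sizeof still sees the whole array, which is one place where the two types visibly differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a function that expects a pointer. */
size_t count_chars(const char *p)
{
    return strlen(p);
}

/* Returns 1 if the array decays to a pointer to its first element. */
int decays_to_first_element(void)
{
    char s[10] = "hello";
    const char *a = s;      /* implicit array-to-pointer conversion */
    const char *b = &s[0];  /* what the compiler effectively passes  */

    return a == b
        && sizeof s == 10          /* the array itself is still 10 bytes */
        && count_chars(s) == 5;    /* passing s passes &s[0]             */
}
```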

What is the difference between char s[] and char *s?

The difference here is that

char *s = "Hello world";

will place "Hello world" in a read-only part of memory and make s a pointer to it, so any write through s is illegal (undefined behavior in C).

While doing:

char s[] = "Hello world";

puts the literal string in read-only memory and copies it into a newly allocated array on the stack (for a local variable), which makes

s[0] = 'J';

legal.
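The two declarations can be compared side by side in a short sketch. The function name is hypothetical, and the pointer is declared const char * here (an addition to the original snippet) so that an accidental write through it is rejected at compile time rather than being undefined behavior at run time.

```c
#include <string.h>

/* Returns 1 if only the array copy was modified. */
int modify_array_copy(void)
{
    char s[] = "Hello world";       /* array: its own writable copy   */
    const char *p = "Hello world";  /* pointer: refers to the literal */

    s[0] = 'J';                     /* fine: s is a separate array    */
    /* p[0] = 'J';  would not compile; without const it would be UB  */

    return strcmp(s, "Jello world") == 0
        && strcmp(p, "Hello world") == 0
        && sizeof s == 12;          /* 11 characters plus the '\0'    */
}
```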

Why is it not allowed to assign const char * to const variable?

Because name is non-const, its type implies that you are allowed to change the characters it points to.

For example:

*name = 'S'; // Change from "something" to "Something"

But _name was declared const, meaning you cannot change it.

You cannot take fixed, constant data and assign it to a variable whose type says "It's OK if you change this".
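A sketch of one way around this, under assumptions: the names _name and name mirror the answer, but their real declarations are not shown, so the types below are guesses, and capitalize_copy is a hypothetical helper. The idea is to keep the pointer const and make a writable copy when a change is actually needed.

```c
#include <string.h>

/* Copies the constant name into out and capitalizes the copy.
   Returns 1 on success, 0 if out is too small. */
int capitalize_copy(char *out, size_t out_size)
{
    const char *_name = "something";  /* assumed: constant data          */
    const char *name = _name;         /* OK: both promise not to write   */
    /* char *bad = _name;                discards const; compilers
                                         diagnose this conversion        */

    if (strlen(name) + 1 > out_size)
        return 0;

    strcpy(out, name);                /* writable copy owned by caller   */
    out[0] = 'S';                     /* "something" -> "Something"      */
    return 1;
}
```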

Array of pointers in C: how is the [] operator used?

In the first code snippet, you are initializing an array of pointers with character constants. This results in an integer-to-pointer conversion of those constants. So for example the first element of the array contains the address 97 (assuming ASCII encoding).

When you later attempt to print, you are passing a char * where a char is expected. Using the wrong format specifier triggers undefined behavior. One of the ways that UB can manifest is that things appear to work properly, which is the case here.

What probably happened is that pointers and integers get passed to functions in the same manner. And if your system uses little-endian byte representation (which it appears it does), it will end up reading the value used to initialize the array.

Regarding the array index operator [], the expression E1[E2] is exactly the same as *((E1) + (E2)). In the first code snippet array[i] has type char * while in the second code snippet it has type char because that is the type of the respective array elements.
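The equivalence between E1[E2] and *((E1) + (E2)) can be checked with a short sketch. The two arrays below are hypothetical stand-ins for the two snippets in the question, which are not reproduced here; the pointer array is initialized with string literals rather than character constants, so every element is a valid pointer.

```c
/* Returns 1 when indexing and pointer arithmetic agree. */
int index_is_pointer_arithmetic(void)
{
    const char *words[] = { "a", "b", "c" };  /* elements are const char * */
    char letters[] = { 'a', 'b', 'c' };       /* elements are char         */
    int i = 1;

    return words[i] == *(words + i)       /* same pointer either way       */
        && letters[i] == *(letters + i)   /* same char either way          */
        && letters[i] == i[letters]       /* E1[E2] is symmetric           */
        && words[i][0] == 'b';            /* index twice to reach a char   */
}
```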


