What Is the Type of String Literals in C and C++

What is the type of string literals in C and C++?

In C, the type of a string literal is char[N], an array of char sized to hold the characters plus the terminating null. The type is not const-qualified, but modifying the contents is undefined behavior. Also, two different string literals that have the same content (or overlapping content, such as one literal matching the tail of another) might or might not share the same array elements.

From the C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
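
A minimal sketch of what the "unspecified" wording means in practice, written in C++ (the language used for the other examples on this page); the same reasoning applies to C. The helper names same_storage and same_contents are made up for illustration. No result is promised for same_storage, because either answer is conforming.

```cpp
// Hypothetical helper: do two identical literals occupy the same array?
// The standard leaves this unspecified, so either answer is conforming.
inline bool same_storage() {
    const char* a = "shared text";
    const char* b = "shared text";
    return a == b;
}

// Portable code therefore compares contents, never addresses.
constexpr bool same_contents(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return *a == *b;
}

static_assert(same_contents("shared text", "shared text"), "equal contents");
static_assert(!same_contents("shared text", "shared"), "different contents");

// Holding literals through const char* lets the compiler reject writes,
// which would otherwise be undefined behavior at run time.
```

The content comparison is the part a program can rely on; the address comparison only reports what one particular compiler happened to do.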

In C++, "An ordinary string literal has type 'array of n const char'" (from 2.13.4/1 "String literals"; section numbers here follow C++03). But there's a special case in that standard that lets string literals convert easily to non-const-qualified pointers (4.2/2 "Array-to-pointer conversion"):

A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”.

As a side note - because arrays in C/C++ convert so readily to pointers, a string literal can often be used in a pointer context, much as any array in C/C++.


Additional editorializing: what follows is really mostly speculation on my part about the rationale for the choices the C and C++ standards made regarding string literal types. So take it with a grain of salt (but please comment if you have corrections or additional details):

I think that the C standard chose to give string literals non-const types because there was (and is) so much code that expects to be able to use non-const-qualified char pointers that point to literals. The const qualifier was added late (if I'm not mistaken, around ANSI standardization time, long after K&R C had been around to accumulate a ton of existing code). If string literals had been made assignable only to char const* without a cast, nearly every program in existence would have required changing. Not a good way to get a standard accepted...

I believe the change to C++ that string literals are const qualified was done mainly to support allowing a literal string to more appropriately match an overload that takes a "char const*" argument. I think that there was also a desire to close a perceived hole in the type system, but the hole was largely opened back up by the special case in array-to-pointer conversions.

Annex D of the standard (C++98/C++03) indicates that the "implicit conversion from const to non-const qualification for string literals (4.2) is deprecated", but I thought so much code would still break that it would be a long time before compiler implementers or the standards committee were willing to actually pull the plug (unless some other clever technique could be devised - but then the hole would be back, wouldn't it?). C++11 did remove the conversion from the language, although many compilers still accept it with a warning.

What is the type of string literal in C?

String literals in C are not pointers; they are arrays of char. You can tell this by looking at sizeof("hello, world"), which is 13, because the null terminator is included in the size of the literal.

C99 allows a string literal to be assigned to a char *, unlike C++, which requires const char *.

What is the type of a string literal in C++?

The type of the string literal "Hello" is "array of 6 const char".

Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string [...]

It can, however, be converted to a const char* by array-to-pointer conversion. Array-to-pointer conversion results in a pointer to the first element of the array.
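
These claims can be checked at compile time. The sketch below assumes a C++11 or later compiler; first_char is a hypothetical helper added only to show the pointer that results from the conversion.

```cpp
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to the array.
static_assert(std::is_same<decltype("Hello"), const char (&)[6]>::value,
              "\"Hello\" is an array of 6 const char");

// Array-to-pointer conversion (std::decay models it) gives const char*.
static_assert(std::is_same<std::decay<decltype("Hello")>::type, const char*>::value,
              "the literal decays to const char*");

// The resulting pointer addresses the first element of the array.
constexpr char first_char(const char* s) { return *s; }
static_assert(first_char("Hello") == 'H', "pointer to the first element");
```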

What is the data type of a string literal in C++?

Expressions have type. String literals have type if they are used as an expression. Yours isn't.

Consider the following code:

#include <stdio.h>

#define STR "HelloHelloHello"

char global[] = STR;

int main(void)
{
    char local[] = STR;
    puts(STR);
}

There are three string literals in this program formed using the same tokens, but they are not treated the same.

The first, the initializer for global, is part of static initialization of an object with static lifetime. By section 3.6.2, static initialization doesn't have to take place at runtime; the compiler can arrange for the result to be pre-formatted in the binary image so that the process starts execution with the data already in place, and it has done so here. It would also be legal to initialize this object in the same fashion as local[], as long as it was performed before the beginning of dynamic initialization of globals.

The second, the initializer for local, is a string literal, but it isn't really an expression. It is handled under the special rules of 8.5.2, which states that the characters within the string literal are independently used to initialize the array elements; the string literal is not used as a unit. This object has dynamic initialization, resulting in loading the value at runtime.

The third, an argument to the puts() call, actually does use the string literal as an expression, and it will have type const char[N], which decays to const char* for the call. If you really want to study object code used to handle the runtime type of a string literal, you should be using the literal in an expression, like this function call does.
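
The difference between a literal used as an initializer and a literal used as an expression also shows up in the types. This is a compile-time sketch reusing the STR macro and the global array from the code above; it says nothing about the object code a given compiler emits.

```cpp
#include <type_traits>

#define STR "HelloHelloHello"   // 15 characters plus the terminating NUL

char global[] = STR;            // literal used as an array initializer

// The initialized object is a plain, writable array of 16 char.
static_assert(std::is_same<decltype(global), char[16]>::value,
              "global is char[16]");

// Used as an expression, the literal is an lvalue of type const char[16].
static_assert(std::is_same<decltype(STR), const char (&)[16]>::value,
              "the literal expression is const char[16]");
```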

What is a literal string & char array in C?

A string literal is an unnamed string constant in the source code. E.g. "abc" is a string literal.

If you do something like char str[] = "abc";, then you could say that str is initialized with a literal. str itself is not a literal, since it's not unnamed.

A string (or C-string, rather) is a contiguous sequence of bytes, terminated with a null byte.

A char array is not necessarily a C-string, since it might lack a terminating null byte.
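
A small sketch of that last distinction, in C++14 or later. The helper holds_c_string is hypothetical; it only checks whether a null byte appears anywhere in the array.

```cpp
#include <cstddef>

// Hypothetical helper: true if the array contains a null byte somewhere,
// which means it holds a C-string.
template <std::size_t N>
constexpr bool holds_c_string(const char (&arr)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        if (arr[i] == '\0') return true;
    }
    return false;
}

constexpr char from_literal[] = "abc";           // 4 elements: 'a','b','c','\0'
constexpr char raw_chars[3] = {'a', 'b', 'c'};   // 3 elements, no terminator

static_assert(sizeof(from_literal) == 4, "the literal supplies the null byte");
static_assert(holds_c_string(from_literal), "safe to pass to strlen or puts");
static_assert(!holds_c_string(raw_chars), "not a C-string: no null byte");
```

Passing raw_chars to a function that expects a C-string would read past the end of the array, which is why the distinction matters.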

What is the datatype of string literal in C++?

It is a const char[N] (which is the same thing as char const[N]), where N is the length of the string plus one for the terminating NUL (or just the length of the string if you define "length of a string" as already including the NUL).

This is why you can do sizeof("hello") - 1 to get the number of characters in the string (including any embedded NULs); if it were a pointer, that wouldn't work, because the result would always be the size of a pointer on your system (minus one).
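
The same idea can be wrapped in a function that deduces N from the array type. The name literal_length is an assumption made for this sketch, not a standard facility.

```cpp
#include <cstddef>

// Hypothetical helper: number of characters in a literal, excluding the
// terminating NUL. N is deduced from the array type.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

static_assert(sizeof("hello") == 6, "five characters plus the NUL");
static_assert(literal_length("hello") == 5, "same as sizeof(\"hello\") - 1");

// An embedded NUL is counted, whereas strlen would stop at it and report 1.
static_assert(literal_length("a\0b") == 3, "embedded NUL is included");
```

Because the parameter is a reference to an array, passing a plain const char* fails to compile instead of silently returning the pointer size minus one.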

Understanding C-strings & string literals in C++

In the first case you are creating an actual array of characters, whose size is determined by the size of the literal you are initializing it with (8+1 bytes). The cstr variable is allocated memory on the stack, and the contents of the string literal (which is stored somewhere else, possibly in a read-only part of memory) are copied into this variable.

In the second case, the local variable p is allocated memory on the stack as well, but its contents will be the address of the string literal you are initializing it with.

Thus, since the string literal may be located in read-only memory, it is in general not safe to try to change it via the p pointer (you may get away with it, or you may not). On the other hand, you can do whatever you like with the cstr array, because that is your local copy that just happens to have been initialized from the literal.

(Just one note: the cstr variable has type array of char, and in most contexts it converts to a pointer to the first element of that array. One exception is the sizeof operator, which computes the size of the whole array, not just of a pointer to the first element.)
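
The two cases can be sketched as follows. The declarations from the question are not reproduced here, so the 8-character literal "a string" is an assumption chosen only to match the "8+1 bytes" remark, and both function names are hypothetical. In current C++ the pointer has to be a const char*.

```cpp
// Case 1: an array initialized from the literal is a private, writable copy.
constexpr char edit_local_copy() {
    char cstr[] = "a string";      // array of 9 char, copied from the literal
    cstr[0] = 'A';                 // allowed: cstr is our own copy
    return cstr[0];
}

// Case 2: a pointer only stores the address of the literal itself.
constexpr bool array_vs_pointer_sizes() {
    char cstr[] = "a string";
    const char* p = "a string";    // assumed literal, see the note above
    // sizeof sees the whole array for cstr, but only a pointer for p.
    return sizeof(cstr) == 9 && sizeof(p) == sizeof(const char*);
}

static_assert(edit_local_copy() == 'A', "the array copy is writable");
static_assert(array_vs_pointer_sizes(), "array size vs pointer size");
```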

Question about the type of `&hello` and `hello`

"hello" is a string literal of type const char[6], which decays to const char* through array-to-pointer conversion.

Now let's see what is happening for each of the statements in your program.

Case 1

Here we consider the statement:

char* ptr = &"hello";

As I said, "hello" is of type const char[6]. So, applying the address-of operator & to it gives us a const char (*)[6], which is read as a pointer to an array of size 6 with elements of type const char.

This means that there is a mismatch between the type on the right-hand side (which is const char (*)[6]) and the type on the left-hand side (which is char*). And since there is no implicit conversion from a const char (*)[6] to a char*, the compiler gives the mentioned error saying:

cannot convert 'const char (*)[6]' to 'char*'

Case 2

Here we consider the statement:

char* ptr1 = "hello"; //invalid C++

Since "hello" is of type const char[6], the char elements inside the array are immutable. As I said, the type const char[6] decays to const char*, so on the right-hand side we have a const char*. If we were allowed to write char* ptr1 = "hello";, we could change the elements of the array through ptr1, because ptr1 has no low-level const. That would allow modifying data marked const, which should not happen. This is why the compiler's diagnostic says:

ISO C++ forbids converting a string constant to 'char*'

So to prevent this from happening we have to add a low-level const as shown below:

//vvvvv---------------------------> note the low-level const
const char* ptr1 = "hello"; //valid C++

so that ptr1 cannot be used to change the const-marked data. In other words, the low-level const highlighted above means that we are not allowed to change the underlying characters of the array through ptr1.
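
Both cases can be verified at compile time. This sketch assumes C++11 or later; the variable names whole_array and first_element are made up for illustration.

```cpp
#include <type_traits>

// Taking the address of the literal yields a pointer to the whole array.
static_assert(std::is_same<decltype(&"hello"), const char (*)[6]>::value,
              "&\"hello\" is const char (*)[6]");

// Declarations whose types match the right-hand side compile cleanly.
const char (*whole_array)[6] = &"hello";  // pointer to array of 6 const char
const char* first_element = "hello";      // pointer to const char (low-level const)
```

Declaring either variable as char* instead reproduces the diagnostics quoted above.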


