C++ Strings: [] VS. *

Difference between char[] and strings in C

C is a relatively low-level, statically typed programming language.

char c = 'c';
const char* s = "s";

The statements above differ not only in the storage behind the literal constant (c: a single byte; s: two bytes for 's' and the terminating '\0', plus a 4- or 8-byte pointer), but also in the type of the variables (c: a single-byte integer type with integer arithmetic; s: a 4- or 8-byte pointer with pointer arithmetic).

I posit that the latter difference is the more important one: literal constants exist to make working with variables, function arguments, struct members, etc. easier.

Furthermore, the typical problems solved in C are low-level ones where the logical difference between a single character and a string matters, for example GPIO, serial ports, or substring search algorithms.

[Of course, C is also used in other domains; you are less likely to see much character vs. string distinction in higher-level projects like GLib or Enlightenment.]

Python is a high-level, dynamically typed language.

c = 'c'
s = "s"

In the statements above, the names c and s refer to objects whose type is determined dynamically, at runtime. Thus a distinction between a "character" and a "string" is simply not needed.

Problems solved in Python are usually of a much higher level: typically you'd deal with JSON blobs, HTTP requests, database queries, virtual machines, etc. Even if you need to deal with single characters, a length-1 string is an acceptable approximation.

[If you used NumPy or cffi, you would have to care about how characters and strings are stored, and those modules provide mechanisms to do so.]

String literals vs array of char when initializing a pointer

I think you're confused because char *p = "ab"; and char p[] = "ab"; have similar syntax, but different meanings.

I believe that the latter case (char p[] = "ab";) is best regarded as a short-hand notation for char p[] = {'a', 'b', '\0'}; (initializes an array with the size determined by the initializer). Actually, in this case, you could say "ab" is not really used as a string literal.

However, the former case (char *p = "ab";) is different in that it simply initializes the pointer p to point to the first element of the read-only string literal "ab".

I hope you see the difference. While char p[] = "ab"; can be written as an array initialization like the one you described, char *p = "ab"; cannot: pointers are, well, not arrays, and initializing one with an array initializer does something entirely different. A compiler that accepts it at all typically warns and uses only the first element (0x61 in your case) as the pointer's value.

Long story short, C compilers only "replace" a string literal with a char array initializer if it is suitable to do so, i.e. it is being used to initialize a char array.

In C, can I initialize a string in a pointer declaration the same way I can initialize a string in a char array declaration?

No, those two lines do not achieve the same result.

char s[] = "string" results in a modifiable array of 7 bytes, initially filled with 's' 't' 'r' 'i' 'n' 'g' '\0'. For an array with automatic storage, the contents are typically copied in from the string literal each time the declaration is reached at runtime.

char *s = "string" results in a pointer to the string literal "string" itself, which is usually placed in read-only memory. Attempting to modify it is undefined behavior.

If you want to modify the contents of your string, then the first is the only way to go. If you only need read-only access to a string, then the second one will be slightly faster because the string does not have to be copied.


In both cases, there is no need to specify a null terminator in the string literal. The compiler will take care of that for you when it encounters the closing ".

How do strings and char arrays work in C?

What is the difference between an allocated char* and char[25]?

The lifetime of a malloc-ed string is not limited by the scope of its declaration. In plain language, you can return a malloc-ed string from a function; you cannot do the same with a char[25] allocated in automatic storage, because its memory is reclaimed upon return from the function.

Can literals be manipulated?

String literals cannot be manipulated in place: they are typically placed in read-only storage, and modifying one is undefined behavior in any case. You need to copy them into modifiable space (static, automatic, or dynamic storage) in order to manipulate them. This cannot be done:

char *str = "hello";
str[0] = 'H'; // <<== WRONG! This is undefined behavior.

This will work:

char str[] = "hello";
str[0] = 'H'; // <<=== This is OK

This works too:

char *str = malloc(6); // 5 characters + '\0'; needs <stdlib.h> and <string.h>
if (str != NULL) {
    strcpy(str, "hello");
    str[0] = 'H'; // <<=== This is OK too
}

How do you take care of null termination of string literals?

The C compiler takes care of null termination for you: every string literal has an extra character at the end, set to \0.

What's the difference between String.Count and String.Length?

On the surface they would seem functionally identical, but the main difference is:

  • Length is a property defined on strings and is the usual way to find the length of a string

  • .Count() is implemented as an extension method. That is, what string.Count() really does is call Enumerable.Count(this IEnumerable<char>), a System.Linq extension method, given that string is really a sequence of chars.

Performance concerns of LINQ enumerable methods notwithstanding, use Length instead, as it's built right into strings.


