Modifying a Char *Const String

Modifying a char *const string

t points to a string literal, and it is undefined behavior to modify a string literal. The C++ draft standard, section 2.14.5 String literals, paragraph 12, says:

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined. The effect of attempting to modify a string literal is undefined.

The relevant section of the C99 draft standard is 6.4.5 String literals, paragraph 6, which says:

It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.

On a typical modern Unix platform, string literals are placed in a read-only segment, so an attempt to modify one results in an access violation. We can use objdump to inspect the read-only section as follows:

objdump -s -j .rodata

The output shows that the string literal is indeed found in the read-only section. Note that I had to add a printf, otherwise the compiler would optimize out the string literal. Sample objdump output:

Contents of section .rodata:
400668 01000200 776f726c 64002573 0a00 ....world.%s..

An alternative approach would be to have t point to an array with a copy of a string literal like so:

char r[] = "world";
char *const t = r;

How to change content of char * const after it has been set (C)

Well, the data that a constant pointer points at can change, but not when you initialize the pointer with a string literal. String literals have the rather curious property of having a non-const type in C (array of char, which decays to char *) while not being modifiable.

So, you can do:

char data[10] = "foobar";
char * const ptr = data;

printf("%s\n", ptr); // prints foobar
*ptr = 'z';
printf("%s\n", ptr); // prints zoobar

modify const char * vs char * content in easy way

That is because your variables are just pointers. You're not modifying their contents, only where they point.

char * a = "asd";
char * b = "qwe";
a = b;

Now you have thrown away the address that a held, and a and b point to the same place. Anything changed through one pointer is seen through the other.

In other words, pointers are (mostly) never constant. The const qualifier in const char * says nothing about the pointer itself.

The real difference is that the pointer (which is not const) points to a const object, and when you change the pointer it simply points to another const object. That is why const has no effect on reassigning simple pointers.

Note: more complex declarations combine pointers and const in other ways. In a case as simple as this one, though, const mostly has no effect on the pointer.

How would one modify a constant string?

how would one modify a constant string (for example, by casting)?

If by this you mean, how would one attempt to modify it, you don't even need a cast. Your sample code was:

char *string1 = "Hello";
string1[0] = 'a'; // This will give a bus error

If I compile and run it, I get a bus error, as expected, and just like you did. But if I compile with -fwritable-strings, which causes the compiler to put string constants in read/write memory, it works just fine.

I suspect you were thinking of a slightly different case. If you write

const char *string1 = "Hello";
string1[0] = 'a'; // This will give a compilation error

the situation changes: you can't even compile the code. You don't get a Bus Error at run-time, you get a fatal error along the lines of "read-only variable is not assignable" at compile time.

Having written the code this way, one can attempt to get around the const-ness with an explicit cast:

((char *)string1)[0] = 'a';

Now the code compiles, and we're back to getting a Bus Error. (Or, with -fwritable-strings, it works again.)

is that considered bad practice, or is it something that is commonly done in C programming

I would say it is considered bad practice, and it is not something that is commonly done.

I'm still not sure quite what you're asking, though, or if I've answered your question. There's often confusion in this area, because there are typically two different kinds of "constness" that we're worried about:

  1. whether an object is stored in read-only memory

  2. whether a variable is not supposed to be modified, due to the constraints of a program's architecture

The first of these is enforced by the OS and by the MMU hardware. It doesn't matter what programming-language constructs you did or didn't use -- if you attempt to write to a readonly location, it's going to fail.

The second of these has everything to do with software engineering and programming style. If a piece of code promises not to modify something, that promise may let you make useful guarantees about the rest of the program. For example, the strlen function promises not to modify the string you hand it; all it does is inspect the string in order to compute its length.

Confusingly, in C at least, the const keyword has mostly to do with the second category. When you declare something as const, it doesn't necessarily (and in fact generally does not) cause the compiler to put the something into read-only memory. All it does is let the compiler give you warnings and errors if you break your promise -- if you accidentally attempt to modify something that elsewhere you declared as const. (And because it's a compile-time thing, you can also readily "cheat" and turn off this kind of constness with a cast.)

But there is read-only memory, and these days, compilers typically do put string constants there, even though (equally confusingly, but for historical reasons) string constants do not have the type const char [] in C. But since read-only memory is a hardware thing, you can't "turn it off" with a cast.

is it ok to change const char* variable?

This declaration:

const char* ch = "text"; 

Only states that what ch points to is const. It doesn't say that ch itself is const.

What ch is being initialized with is the address of a string literal, and string literals are read-only. When you then do this:

ch  = "Long text";

You're assigning to ch the address of a different string literal. So what you're doing is well defined.

Had you attempted to do this:

ch[0] = 'X';

You would get a compiler error because you're trying to modify something that is const. Had you left off the const qualifier and done this, your code would most likely crash because you're attempting to modify a string literal which is read-only.

const char* allows to modify the string?

No string was replaced; you just reassigned str to point to a different string literal. The literals themselves have static storage duration; only the pointer variable lives on the stack.

Try this snippet to see that the address changed:

#include <stdio.h>

int main(void)
{
    const char *ptr = "Hello";
    printf("Before: %p\n", (void *)ptr); // %p expects a void *

    ptr = "World";
    printf("After: %p\n", (void *)ptr);
}

Output [RESULT MAY VARY]:

Before: 0x402004
After: 0x402016

Modifying string before storing in array of const chars

You probably need this:

int no = 0;
const char *values[1000];
for (int i = 0; i < num_of_values; i++) {
    if (/*is invalid*/) continue;
    char tempbuffer[100]; // provide enough space for the worst case
    snprintf(tempbuffer, sizeof tempbuffer, "Value: %s", list_of_objs[i]->value); // never writes past the buffer
    values[no++] = strdup(tempbuffer); // each element gets its own heap copy
}

but you need to free the pointers in values once you're done with them:

for (int i = 0; i < number_of_values_added; i++)
    free((void *)values[i]); // index with i; the cast drops const so free accepts the pointer

If strdup is not available on your platform:

char *strdup(const char *string)
{
    char *newstring = malloc(strlen(string) + 1); // +1 for the terminating '\0'
    if (newstring)
        strcpy(newstring, string);
    return newstring;
}

Disclaimer: no error checking is done here for brevity.

Disclaimer 2: there may be other ways to resolve the problem, but as it stands here the question is not clear enough.

Your naive attempt and why it is wrong:

char buf[MAX_STRING_LENGTH];  // buf is just a buffer, it's not at all a string in terms of C#
for (int i = 0; i < num_of_values; i++) {
    if (/*is invalid*/) continue;
    sprintf(buf, "Value: %s", list_of_objs[i]->value);
    values[no++] = buf; // << you store the same buffer address in all elements of values
}

Modify const char * in C

If you want the reversed str2 back in main(), you will either need to pass an adequately sized buffer to reverse_const to hold the reversed string, or you will need to dynamically allocate storage for it in reverse_const (a local variable length array won't do):

#include <stdlib.h>
...
void reverse_const (const char **str_const)
{
    int c_size = strlen (*str_const);
    char *str = calloc (c_size + 1, sizeof *str);   /* writable copy of the string */
    strcpy (str, *str_const);
    char *c_begin = str, *c_end = str + (c_size - 1);

    int i;
    for (i = 0; i < c_size / 2; i++) {
        /* XOR swap of the characters at both ends */
        *c_begin ^= *c_end;
        *c_end ^= *c_begin;
        *c_begin ^= *c_end;

        c_begin++;
        c_end--;
    }

    *str_const = str;   /* hand the allocated copy back to the caller */
    printf ("%s\n", *str_const);
}

int main (void) {

    char str1[] = "Indiana";
    char *str2 = "Kentucky";

    printf ("TESTS:\nString 1 pre-reversal: %s\n", str1);

    reverse (str1);

    printf ("String 1 post-reversal: %s\n", str1);
    printf ("Constant string 2 pre-reversal: %s\n", str2);

    reverse_const ((const char **)&str2);

    printf ("Constant string 2 post-reversal: %s\n", str2);

    free (str2);   /* str2 now points to the calloc'd copy */

    return 0;
}

Output

$ ./bin/revconststr
TESTS:
String 1 pre-reversal: Indiana
String 1 post-reversal: anaidnI
Constant string 2 pre-reversal: Kentucky
ykcutneK
Constant string 2 post-reversal: ykcutneK

Returning The Pointer

You have an additional option: return the pointer to str and assign it to str2 in main(). This is more what you would normally expect to see:

char *reverse_const2 (const char **str_const)
{
    int c_size = strlen (*str_const);
    char *str = calloc (c_size + 1, sizeof *str);   /* writable copy of the string */
    strcpy (str, *str_const);
    char *c_begin = str, *c_end = str + (c_size - 1);

    int i;
    for (i = 0; i < c_size / 2; i++) {
        /* XOR swap of the characters at both ends */
        *c_begin ^= *c_end;
        *c_end ^= *c_begin;
        *c_begin ^= *c_end;

        c_begin++;
        c_end--;
    }

    //*str_const = str;
    printf ("%s\n", *str_const);   /* still the original: the caller's pointer is untouched */

    return str;
}

int main (void)
{

    char str1[] = "Indiana";
    char *str2 = "Kentucky";

    printf ("TESTS:\nString 1 pre-reversal: %s\n", str1);

    reverse (str1);

    printf ("String 1 post-reversal: %s\n", str1);
    printf ("Constant string 2 pre-reversal: %s\n", str2);

    str2 = reverse_const2 ((const char **)&str2);

    printf ("Constant string 2 post-reversal: %s\n", str2);

    free (str2);

    return 0;
}

Why am I able to modify char * in this example?

For a function declaration such as

void sortString(char string[50]);

the compiler adjusts the parameter of array type to a pointer to the element type:

void sortString(char *string);

So, for example, the following declarations are equivalent and declare one and the same function:

void sortString(char string[100]);
void sortString(char string[50]);
void sortString(char string[]);
void sortString(char *string);

The function

void sortString(char *string)

is called with the character array buf, which holds a copy of the passed array (or of the string literal that s1 points to):

char buf[50];
strcpy(buf, s1);
sortString(buf);

So there is no problem. s1 can be a pointer to a string literal, because the content of the string literal is copied into the character array buf, and it is buf that gets changed.

As for this code snippet

char * read = "Hello";
read[0]='B';
printf("%s\n", read); // still prints "Hello"

it has undefined behavior, because you may not change a string literal.

From the C Standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.

Note that in C++, unlike in C, string literals have types of constant character arrays. In C, too, it is advisable to declare pointers to string literals with the const qualifier, so that an accidental write is rejected at compile time instead of causing undefined behavior, for example:

const char * read = "Hello";

By the way, the function sortString performs redundant swaps of elements in the passed string. It is better to declare and define it the following way:

// Selection sort
char * sortString( char *s )
{
    for ( size_t i = 0, n = strlen( s ); i != n; i++ )
    {
        size_t min_i = i;

        for ( size_t j = i + 1; j != n; j++ )
        {
            if ( s[j] < s[min_i] )
            {
                min_i = j;
            }
        }

        if ( i != min_i )
        {
            char c = s[i];
            s[i] = s[min_i];
            s[min_i] = c;
        }
    }

    return s;
}

Modifying pointer to string literal in separate function

Apparently g++ is auto-converting *p to a const?

Quite the opposite. The string "abc" will be in your binary, and that is supposed to be readonly for your program. Therefore, that string should only be read, and the value you get when assigning the string literal in this situation is of type const char*. You get the error because you're assigning it to a non-const char*. Try this instead:

const char *p = "abc";

Also, you'll have to change the function, too:

void test(const char *ptr)
{
ptr = "test";
}

It's still going to print abc, however. That's because you're only modifying a copy of the value that you're passing. But C++ lets you pass a reference instead, which you can do like this:

void test(const char *&ptr)
{
ptr = "test";
}

Now that's a reference to a pointer pointing to a const char... whew! Both the "abc" and "test" will be in the program's binary when it is compiled. When the program is run, the address of "abc" is assigned to p, and then the function is called to change p to hold the address of "test" instead. The & tells the function to work with the actual p and not just a copy of it that gets lost when the function finishes.


