C++: Why Does Space Always Terminate a String When Read

C++: Why does space always terminate a string when read?

I'd guess you're reading your string with something like your_stream >> your_string;. Operator >> for strings is defined to work (about) the same as scanf's %s conversion, which reads up until it encounters whitespace -- therefore, operator>> does the same.

You can read an entire line of input instead with std::getline. You might also want to look at an answer I posted to a previous question (provides some alternatives to std::getline).

Reading string from input with space character?

Use:

fgets (name, 100, stdin);

100 is the max length of the buffer. You should adjust it as per your need.

Use:

scanf ("%[^\n]%*c", name);

The [] is the scanset character. [^\n] tells that while the input is not a newline ('\n') take input. Then with the %*c it reads the newline character from the input buffer (which is not read), and the * indicates that this read in input is discarded (assignment suppression), as you do not need it, and this newline in the buffer does not create any problem for next inputs that you might take.

Read here about the scanset and the assignment suppression operators.

Note you can also use gets but ....

Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.

Why/when to include terminating '\0' character for C Strings?

In case 1, you are creating a string literal (a constant which will be on read only memory) which will have the \0 implicitly added to it.

Since \0's position is relied upon to find the end of string, your stringLength() function prints 5.

In case 2, you are trying to initialise a character array of size 5 with a string of 5 characters leaving no space for the \0 delimiter. The memory adjacent to the string can be anything and might have a \0 somewhere. This \0 is considered the end of string here which explains those weird characters that you get. It seems that for the output you gave, this \0 was found only after 3 more characters which were also taken into account while calculating the string length. Since the contents of the memory change over time, the output may not always be the same.

In case 3, you are initialising a character array of size 6 with a string of size 5 leaving enough space to store the \0 which will be implicitly stored. Hence, it will work properly.

Case 4 is similar to case 3. No modification is done by

char stack4[5] = '\0';

because size of stack4 is 6 and hence its last index is 5. You are overwriting a variable with its old value itself. stack4[5] had \0 in it even before you overwrote it.

In case 5, you have completely filled the character array with characters without leaving space for \0. Yet when you print the string, it prints right. I think it is because the memory adjacent to the memory allocated by malloc() merely happened to be zero which is the value of \0. But this is undefined behavior and should not be relied upon. What really happens depends on the implementation.

It should be noted that malloc() will not initialise the memory that it allocates unlike calloc().

Both

char str[2]='\0';

and

char str[2]=0;

are just the same.

But you cannot rely upon it being zero. Memory allocated dynamically could be having zero as the default value owing to the working of the operating system and for security reasons. See here and here for more about this.

If you need the default value of dynamically allocated memory to be zero, you can use calloc().

Case 6 has the \0 in the end and characters in the other positions. The proper string should be displayed when you print it.

How can I read a string with spaces in it in C?

use fgets with STDIN as the file stream. Then you can specify the amount of data you want to read and where to put it.

Why are strings in C++ usually terminated with '\0'?

The title of your question references C strings. C++ std::string objects are handled differently than standard C strings. \0 is important when using C strings, and when I use the term string in this answer, I'm referring to standard C strings.

\0 acts as a string terminator in C. It is known as the null character, or NUL, and standard C strings are null-terminated. This terminator signals code that processes strings - standard libraries but also your own code - where the end of a string is. A good example is strlen which returns the length of a string: strlen works using the assumption that it operates on strings that are terminated using \0.

When you declare a constant string with:

const char *str = "JustAString";

then the \0 is appended automatically for you. In other cases, where you'll be managing a non-constant string as with your array example, you'll sometimes need to deal with it yourself. The docs for strncpy, which is used in your example, are a good illustration: strncpy copies over the null terminator character except in the case where the specified length is reached before the entire string is copied. Hence you'll often see strncpy combined with the possibly redundant assignment of a null terminator. strlcpy and strcpy_s were designed to address the potential problems that arise from neglecting to handle this case.

In your particular example, array[s.size()] = '\0'; is one such redundancy: since array is of size s.size() + 1, and strncpy is copying s.size() characters, the function will append the \0.

The documentation for standard C string utilities will indicate when you'll need to be careful to include such a null terminator. But read the documentation carefully: as with strncpy the details are easily overlooked, leading to potential buffer overflows.

Space for Null character in c strings

When is it necessary to explicitly provide space for a NULL character in C strings?

Always. Not having that \0 character there will make functions like strcpy, strlen and printing via %s behave wrong. It might work for some examples (like your own) but I won't bet anything on that.

On the other hand, if your string is binary and you know the length of the packet you don't need that extra space. But then you cannot use str* functions. And this is not the case of your question, anyway.

Remove spaces from a string in C

Easiest and most efficient don't usually go together…

Here's a possible solution for in-place removal:

void remove_spaces(char* s) {
char* d = s;
do {
while (*d == ' ') {
++d;
}
} while (*s++ = *d++);
}


Related Topics



Leave a reply



Submit