Can a std::string Contain Embedded Nulls

Can a std::string contain embedded nulls?

Yes, you can have embedded nulls in your std::string.

Example:

std::string s;
s.push_back('\0');
s.push_back('a');
assert(s.length() == 2);

Note: std::string's c_str() member always returns a buffer that ends with a null character. Before C++11, data() was not required to append one; since C++11, data() returns the same null-terminated buffer as c_str().
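A small sketch of the terminator behaviour, assuming a C++11 or later compiler (where data() is terminated as well). The helper names is_terminated, c_length and make_example are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns true if the buffers from c_str() and data() both hold '\0'
// at index size(). This is guaranteed since C++11.
bool is_terminated(const std::string& s) {
    return s.c_str()[s.size()] == '\0' && s.data()[s.size()] == '\0';
}

// Length as a C function sees it: counting stops at the first '\0'.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Builds the two-character string from the example: '\0' followed by 'a'.
std::string make_example() {
    std::string s;
    s.push_back('\0');
    s.push_back('a');
    return s;
}
```

For the string built by make_example(), size() reports 2 while c_length() reports 0, because strlen stops at the embedded null in the first position.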

Be careful with operator+=

One thing to look out for: do not use operator+= with a char* on the right-hand side. It only appends the characters up to the first null character.

For example:

std::string s = "hello";
s += "\0world";
assert(s.length() == 5);

The correct way:

std::string s = "hello";
s += std::string("\0world", 6);
assert(s.length() == 11);
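Counting the characters by hand is easy to get wrong. One possible alternative, sketched here with a hypothetical helper named append_literal (not part of the standard library), lets the compiler deduce the literal's size from its array type:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends every character of a string literal,
// including embedded '\0' characters. N counts the literal's implicit
// trailing terminator, so N - 1 characters are appended.
template <std::size_t N>
void append_literal(std::string& dest, const char (&literal)[N]) {
    dest.append(literal, N - 1);
}

// Same result as the "correct way" example, without a hand-written length.
std::string build_greeting() {
    std::string s = "hello";
    append_literal(s, "\0world");  // appends 6 characters
    return s;
}
```

This only works when the argument is an actual array such as a string literal; a plain const char* carries no size information, so the length still has to be passed explicitly in that case.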

Storing binary data: std::vector is more common

Generally it's more common to use std::vector to store arbitrary binary data.

std::vector<char> buf;
buf.resize(1024);
char *p = &buf.front();

It is probably more common because, before C++17, std::string's data() and c_str() members both returned const pointers, so the memory could not be modified through them (C++17 added a non-const data() overload). With &buf.front() you are free to modify the contents of the buffer directly.
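As a rough sketch of both options, assuming C++11 or later: the function names to_buffer and to_string_buffer are invented for this example. The second one writes through &s[0], which relies on std::string storage being contiguous (guaranteed since C++11).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies raw bytes (which may include '\0') into a vector-backed buffer.
std::vector<char> to_buffer(const char* bytes, std::size_t count) {
    std::vector<char> buf(count);
    if (count != 0) {
        std::memcpy(buf.data(), bytes, count);  // vector::data() is writable
    }
    return buf;
}

// Same idea with std::string: size it first, then write through &s[0].
std::string to_string_buffer(const char* bytes, std::size_t count) {
    std::string s(count, '\0');
    if (count != 0) {
        std::memcpy(&s[0], bytes, count);
    }
    return s;
}
```

Either container keeps all the bytes; the choice is mostly about which interface the surrounding code expects.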

C++: How do null characters work in std::string?

std::string supports embedded NUL characters*. Your example code doesn't produce the expected result because you are constructing a std::string from a pointer to a zero-terminated string. There is no length information, so the constructor stops at the first NUL character. s contains Hello, hence the output.

If you want to construct a std::string with an embedded NUL character, you have to use a constructor that takes an explicit length argument:

std::string s("Hello\0, World", 13);
std::cout << s << std::endl;

produces this output (the embedded NUL is written too, but most terminals display nothing for it):

Hello, World



* std::string maintains an explicit length member, so it doesn't need to reserve a character to act as the end-of-string sentinel.

How do you construct a std::string with an embedded null?

Since C++14

we have been able to create std::string literals with the s suffix:

#include <iostream>
#include <string>

int main()
{
    using namespace std::string_literals;

    std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
                                 // This is a std::string literal not
                                 // a C-String literal.
    std::cout << s << "\n";
}

Before C++14

The problem is that the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0-terminated, so parsing stops when it reaches the \0 character.

To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:

std::string x("pq\0rs");    // Two characters because the input is assumed to be a C-String
std::string y("pq\0rs", 5); // 5 characters as the input is now a char array with 5 characters.

Note: a C++ std::string does NOT rely on a \0 terminator to mark its end (contrary to what other posts suggest); its length is stored separately. However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().

Also check out Doug T's answer below about using a vector<char>.

Also check out RiaD for a C++14 solution.

std::string equivalent for data with null characters?

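std::string should be safe for this. You only have to be careful with the .c_str() method, because code that treats the returned pointer as a C-string stops at the first null. Use .data() together with .size() instead.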
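A minimal sketch of passing the pointer together with the length, using std::ostream::write as a stand-in for whatever API consumes the bytes. The names write_all and serialize are made up for this example:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Writes all bytes of s to the stream, including embedded '\0' characters.
// ostream::write takes a pointer and a count, so no terminator is consulted.
void write_all(std::ostream& out, const std::string& s) {
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Round-trips a string through an in-memory stream.
std::string serialize(const std::string& s) {
    std::ostringstream out;
    write_all(out, s);
    return out.str();
}
```

The same pattern applies to any interface that accepts a pointer and a byte count: the count should come from size(), never from strlen.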

string with embedded null characters

Just initialize the string with the actual size of the char array. The rest will follow naturally.

#include <sstream>
#include <string>
#include <cstring>
#include <iostream>
#include <iomanip>
int main() {
    const char array[] = "125 320 512 750 333\0 xyz";

    // to get the string after the null, just add strlen
    const char *after_the_null_character = array + strlen(array) + 1;
    std::cout << "after_the_null_character:" << after_the_null_character << std::endl;

    // initialized with array and proper, actual size of the array
    std::string str{array, sizeof(array) - 1};
    std::istringstream ss{str};
    std::string word;
    while (ss >> word) {
        std::cout << "size:" << word.size() << ": " << word.c_str() << " hex:";
        for (auto&& i : word) {
            std::cout << std::hex << std::setw(2) << std::setfill('0') << (unsigned)i;
        }
        std::cout << "\n";
    }
}

would output:

after_the_null_character: xyz
size:3: 125 hex:313235
size:3: 320 hex:333230
size:3: 512 hex:353132
size:3: 750 hex:373530
size:4: 333 hex:33333300
size:3: xyz hex:78797a

Note the zero byte after reading 333.

Does std::string have a null terminator?

Not as part of its contents: size() does not count one. But if you call temp.c_str(), the buffer returned by this method does end with a null terminator.

It's also worth saying that you can include a null character in a string just like any other character.

string s("hello");
cout << s.size() << ' ';
s[1] = '\0';
cout << s.size() << '\n';

prints

5 5

and not 5 1 as you might expect if null characters had a special meaning for strings.

Why can std::string::find() handle '\0'?

The std::string class is a C++ class that represents a string which can contain a null character. Its member functions, like find, are designed to handle those embedded nulls.

strstr (a function from C) works with char* pointers, which point to C-style strings. Because C-style strings are null-terminated, they cannot handle embedded nulls. To this effect, strstr is documented as follows:

Locate substring

Returns a pointer to the first occurrence of str2 in str1, or a null pointer if str2 is not part of str1.

The matching process does not include the terminating null-characters, but it stops there.

The final clause, that matching stops at the terminating null character, is the relevant part here.
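The difference can be seen in a short sketch. The wrapper names found_with_find and found_with_strstr are invented for this comparison:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Searches with std::string::find, which uses the stored length.
bool found_with_find(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos;
}

// Searches with strstr, which stops at the first '\0' in the haystack.
bool found_with_strstr(const std::string& haystack, const char* needle) {
    return std::strstr(haystack.c_str(), needle) != nullptr;
}
```

With a haystack built as std::string("abc\0def", 7), searching for "def" succeeds with find but fails with strstr, since strstr never looks past the embedded null.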

Why doesn't setting null in the middle of a std::string have any effect?

A std::string is not like a usual C string and can contain embedded NUL characters without problems. However, if you do this, the string will appear prematurely terminated to any code that reads it through the const char * returned by .c_str().
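One place this shows up is when a string is copied through its C-string pointer. A small sketch, with the function names copy_via_c_str and copy_with_length chosen only for illustration:

```cpp
#include <cassert>
#include <string>

// Copy through a C-string pointer: the constructor scans for '\0',
// so everything after the first embedded null is lost.
std::string copy_via_c_str(const std::string& s) {
    return std::string(s.c_str());
}

// Copy through pointer plus length: every character is preserved.
std::string copy_with_length(const std::string& s) {
    return std::string(s.data(), s.size());
}
```

For a five-character string such as std::string("ab\0cd", 5), the first copy holds only "ab" while the second compares equal to the original.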

Protobuf: C++ string with null characters inside

First, you need to construct the string appropriately. You cannot construct it using a constructor that looks for a NUL terminator, which is what string(const char *) does.

You have to use the constructor that takes a pointer and a length.

string s("name\0first", 10);

If you have already constructed a string and want to append data that has embedded NULs, you can use the append() method.

string s;
s.append("name\0first", 10);

