Is it legal to modify the result of std::string::op[]?
The quote means that you cannot modify the return of operator[]( size() ), even if the value is well defined. That is, you must not modify the NUL terminator in the string, even through the non-const overload.
This is basically your first option, i.e. pos >= size(), but because of the requirement pos <= size(), the only possible value for that condition is pos == size().
The English wording of the clause can be read as ambiguous (at least to me), but Appendix C, and in particular C.2.11, deals with changes in semantics in the string library, and it makes no mention of this change, which would break user code. In C++03 the "referenced value shall not be modified" wording is not present and there is no ambiguity. The lack of a mention in C.2.11 is not normative, but it can be taken as a hint that the authors of the standard did not intend to change this particular behavior.
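As a rough illustration of the rule above, here is a minimal sketch. The helper names read_terminator and set_char are made up for this example and are not part of any library. Reading the element at index size() is fine; writing is restricted to indices strictly below size().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: reads the element at index size() through the
// non-const operator[]. Since C++11 this yields the NUL terminator.
char read_terminator(std::string& s) {
    return s[s.size()];
}

// Hypothetical helper: overwrites one character in place.
// The index must be strictly less than size(); the element at size()
// may be read but must not be changed.
void set_char(std::string& s, std::size_t i, char c) {
    assert(i < s.size() && "index size() is readable but must not be modified");
    s[i] = c;
}
```

Calling set_char(s, s.size(), 'x') would trip the assertion in a debug build, which is the case the quoted clause forbids.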
Effects of modifying std::string using op[] beyond its size?
Although you already got a correct comment saying the behaviour is undefined, there is something worthy of an actual answer too.
A C++ string object can contain any sequence of characters you like. A C-style string is terminated by the first '\0'. Consequently, a C++ string object must store the size somewhere other than by searching for the '\0': it may contain embedded '\0' characters.
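#include <string>
#include <iostream>

int main() {
    std::string s = "abc";
    s += '\0';
    s += "def";
    std::cout << s << std::endl;          // prints all 7 characters, including the embedded '\0'
    std::cout << s.c_str() << std::endl;  // printing as a C string stops at the first '\0'
}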
Running this, and piping the output through cat -v to make control characters visible, I see:
abc^@def
abc
This explains what you're seeing: you're overwriting the '\0' terminator, but you're not overwriting the size, which is stored separately.
As pointed out by kec, you might have seen garbage, but you were lucky enough to have an additional zero byte after your extra characters.
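The difference between the stored size and the C-style length can also be checked directly. The following is only a sketch; the function names make_with_embedded_nul and compare_lengths are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds the same string as in the example above.
std::string make_with_embedded_nul() {
    std::string s = "abc";
    s += '\0';
    s += "def";
    return s;
}

// The asserts document what each notion of "length" reports.
void compare_lengths() {
    const std::string s = make_with_embedded_nul();
    assert(s.size() == 7);                // size is stored, embedded '\0' counts
    assert(std::strlen(s.c_str()) == 3);  // strlen stops at the first '\0'
}
```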
Is it safe to pass std::string to C style APIs?
If the C API function requires read-only access to the contents of the std::string, then use the std::string::c_str() member function to pass the string. This is guaranteed to be a null-terminated string.
If you intend to use the std::string as an out parameter, C++03 doesn't guarantee that the stored string is contiguous in memory, but C++11 does. With the latter it is OK to modify the string via operator[] as long as you don't modify the terminating NUL character.
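For the read-only case, a small sketch might look like this. It uses std::strtol as a stand-in for "some C API that only reads its input"; the wrapper name parse_long is made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical wrapper: hands the string to a C function that only reads it.
long parse_long(const std::string& text) {
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    // 'end' points into the buffer returned by c_str(); it is only inspected
    // here and must never be written through.
    assert(end >= text.c_str() && end <= text.c_str() + text.size());
    return value;
}
```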
Can you avoid using temporary buffers when using std::string to interact with C style APIs?
In C++11 you can simply pass a pointer to the first element of the string (&str[0]): its elements are guaranteed to be contiguous.
Before C++11, you could use .data() or .c_str(), but the string is not mutable through these.
Otherwise, yes, you must perform a copy. But I wouldn't worry about this too much until profiling indicates that it's really an issue for you.
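Here is one way the C++11 approach could look in practice, as a sketch only. std::snprintf plays the role of the C API, and format_id is a name invented for this example. The string is sized one element larger than the text so that the C function's terminating NUL lands inside the string's own elements, and is then shrunk to the real length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: lets a C function write straight into a std::string,
// without a separate temporary char array.
std::string format_id(int id) {
    // First pass: ask snprintf how many characters it would write
    // (the count excludes the terminating NUL).
    const int needed = std::snprintf(nullptr, 0, "id-%d", id);
    assert(needed >= 0);

    // Reserve one extra element for the NUL that snprintf appends.
    std::string out(static_cast<std::size_t>(needed) + 1, '\0');

    // Second pass: write into the string's contiguous storage (C++11).
    std::snprintf(&out[0], out.size(), "id-%d", id);

    // Drop the extra element so size() matches the text length.
    out.resize(static_cast<std::size_t>(needed));
    return out;
}
```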
Is it safe to ever cast the result of string's c_str to a char*?
Yes, it's safe as long as the function you're passing it to does not attempt to modify the contents of the string.
You can even avoid the const_cast by using
    c_api_lib_func(&str[0]);
Note that this is technically not safe with a pre-C++11 compiler because std::string was not required to have contiguous storage for its internal buffer.
Using &str[0], the function may even modify the contents of the string's internal buffer as long as it leaves the terminating NUL character alone.
How to replace all occurrences of a character in string?
std::string doesn't have a member function for this, but you can use the stand-alone std::replace function from the <algorithm> header.
#include <algorithm>
#include <string>

void some_func() {
    std::string s = "example string";
    std::replace(s.begin(), s.end(), 'x', 'y'); // replace every 'x' with 'y'
}
Why std::string object when constructed with default constructor behaves differently?
You have a string of length 0 and then you try to modify its contents using the subscript operator. That's undefined behavior, so at this point, no particular outcome is guaranteed. If you used at() instead, it would have exposed the mistake and thrown an exception instead.
why the length is returning as 0
It started out as 0 and you didn't do anything to add to it (such as push_back or +=). But then again, since what you did earlier was undefined behavior, anything could have happened here.
In addition, I didn't get any kind of exception.
You can try std::string::at instead, which will throw an std::out_of_range exception when you try that.
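A minimal sketch of the bounds-checked variant follows. The helper name try_set_first is made up for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes to index 0 using the bounds-checked at().
// Returns false instead of invoking undefined behavior on an empty string.
bool try_set_first(std::string& s, char c) {
    try {
        s.at(0) = c;  // throws std::out_of_range when 0 >= size()
        return true;
    } catch (const std::out_of_range&) {
        assert(s.empty());  // at(0) only throws when there is no element 0
        return false;
    }
}
```

With a default-constructed string the call reports failure and the length stays 0; with a non-empty string the first character is replaced.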
Are the days of passing const std::string & as a parameter over?
The reason Herb said what he said is because of cases like this.
Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.
Let's say that A is defined as follows:
void A()
{
    B("value");
}
If B and C take the string by const&, then it looks something like this:
void B(const std::string &str)
{
    C(str);
}
void C(const std::string &str)
{
    //Do something with `str`. Does not store it.
}
All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.
Now, I want to make one simple change: C needs to store the string somewhere.
void C(const std::string &str)
{
    //Do something with `str`.
    m_str = str;
}
Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.
Except it can't. Because it takes a const&.
If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.
So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
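The by-value chain described above could be sketched roughly as follows. The class name Holder and its stored() accessor are invented for this example so that m_str has somewhere to live; the original answer does not say where m_str is declared.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class wrapping B and C so that m_str has an owner.
class Holder {
public:
    // C takes ownership of its argument and moves it into the member.
    void C(std::string str) {
        m_str = std::move(str);
    }

    // B forwards its by-value parameter with std::move; no copy is made.
    void B(std::string str) {
        C(std::move(str));
    }

    const std::string& stored() const { return m_str; }

private:
    std::string m_str;
};

// A passes a temporary: one std::string is constructed from the literal
// and then only moved on its way into m_str.
void A(Holder& h) {
    h.B("value");
    assert(h.stored() == "value");
}
```

A caller that still needs its own string afterwards can pass it without std::move, in which case exactly one copy is made at the call to B.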
Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?
It depends on your use case. How much do you hate memory allocations?
Function for both C-style strings and c++ std::string
As long as you know how big the output buffer needs to be, you can create a std::string and resize it to the buffer size. You can then pass a pointer to the std::string buffer into the C-style overload.
#include <cstring>
#include <iostream>
#include <string>

// C-style overload: writes one '*' per input character, then "abc" and a NUL.
// The caller must provide room for strlen(in_c_string) + 4 bytes.
void TransformString(const char *in_c_string, char *out_c_string) {
    size_t length = strlen(in_c_string);
    for (size_t i = 0; i < length; ++i)
        out_c_string[i] = '*';
    out_c_string[length] = 'a';
    out_c_string[length+1] = 'b';
    out_c_string[length+2] = 'c';
    out_c_string[length+3] = '\0';
}

std::string TransformString(const std::string &in_string) {
    std::string out;
    out.resize(100);  // assumes the result, including its NUL, fits in 100 bytes
    TransformString(in_string.c_str(), &out[0]);
    out.resize(strlen(&out[0]));  // shrink to the length actually written
    // Since C++11, 'out' is treated as an rvalue here, so it is moved
    // if the copy is not elided.
    return out;
}

int main() {
    std::string string_out = TransformString(std::string("hello world"));
    char charstar_out[100];
    TransformString("hello world", charstar_out);
    std::cout << string_out << "\n";
    std::cout << charstar_out << "\n";
    return 0;
}
Here is a live example: http://ideone.com/xwVWCh.