Writing directly to std::string internal buffers
I'm not sure the standard guarantees that the data in a std::string is stored as a char*. The most portable way I can think of is to use a std::vector, which is guaranteed to store its data in a contiguous chunk of memory:
std::vector<char> buffer(100);
FunctionInDLL(&buffer[0], buffer.size());
std::string stringToFillIn(&buffer[0]);
This will of course require the data to be copied twice, which is a bit inefficient.
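A minimal sketch of the vector-as-buffer pattern, for illustration only. `fill_from_c_api` is a hypothetical stand-in for `FunctionInDLL` and is assumed to write a NUL-terminated string; the real function's contract may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for FunctionInDLL: writes a NUL-terminated
// string into `out`, truncating it to fit in `size` bytes.
void fill_from_c_api(char* out, std::size_t size) {
    assert(size > 0);
    std::strncpy(out, "hello from C", size - 1);
    out[size - 1] = '\0';
}

std::string read_via_vector() {
    std::vector<char> buffer(100);  // zero-initialized, contiguous storage
    fill_from_c_api(buffer.data(), buffer.size());
    // Constructing from a char* stops at the first NUL, so the unused
    // tail of the buffer is not copied into the string.
    return std::string(buffer.data());
}
```

Calling `read_via_vector()` yields a string holding only the characters the C function wrote, not all 100 bytes of the buffer.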
Directly write into char* buffer of std::string
C++98/03
Impossible. String can be copy on write so it needs to handle all reads and writes.
C++11/14
In [string.require]:
The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
So &str.front() and &str[0] should work.
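As a sketch of what this permits in C++11, the following writes into a string through `&s[0]`. `format_answer` is a hypothetical C-style function invented for the example; the point is only that every write stays inside `[&s[0], &s[0] + s.size())`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical C-style API: formats into `out` and returns the number of
// characters written (excluding the terminator), as snprintf does.
int format_answer(char* out, std::size_t size, int value) {
    return std::snprintf(out, size, "answer=%d", value);
}

std::string format_into_string(int value) {
    std::string s(32, '\0');  // size(), not capacity(), bounds what we may write
    // snprintf writes at most s.size() bytes including its terminator,
    // so nothing is written past the last element of the string.
    int n = format_answer(&s[0], s.size(), value);
    assert(n >= 0 && static_cast<std::size_t>(n) < s.size());
    s.resize(static_cast<std::size_t>(n));  // keep only the characters written
    return s;
}
```

The string is sized first and shrunk afterwards, so the C function never touches memory beyond `size()`.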
C++17
str.data(), &str.front() and &str[0] work.
Here it says:
charT* data() noexcept;
Returns: A pointer p such that p + i == &operator[](i) for each i in [0, size()].
Complexity: Constant time.
Requires: The program shall not alter the value stored at p + size().
The non-const .data() just works.
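A small sketch of the C++17 form, assuming a compiler and standard library that provide `<charconv>`. `int_to_string` is a name made up for this example; it passes the non-const `data()` pointer to `std::to_chars`.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Requires C++17: non-const std::string::data() returns char*.
std::string int_to_string(int value) {
    std::string s(16, '\0');
    // std::to_chars writes into [first, last) and does not NUL-terminate.
    auto result = std::to_chars(s.data(), s.data() + s.size(), value);
    assert(result.ec == std::errc());
    // Shrink the string to the characters that were actually written.
    s.resize(static_cast<std::size_t>(result.ptr - s.data()));
    return s;
}
```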
The recent draft has the following wording for .front():
const charT& front() const;
charT& front();
Requires: !empty().
Effects: Equivalent to operator[](0).
And the following for operator[]:
const_reference operator[](size_type pos) const;
reference operator[](size_type pos);
Requires: pos <= size().
Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.
Throws: Nothing.
Complexity: Constant time.
So it uses iterator arithmetic, which means we need to look at what the standard says about the string's iterators. Here it says:
3 A basic_string is a contiguous container ([container.requirements.general]).
So we need to go here:
A contiguous container is a container that supports random access iterators ([random.access.iterators]) and whose member types iterator and const_iterator are contiguous iterators ([iterator.requirements.general]).
Then here:
Iterators that further satisfy the requirement that, for integral values n and dereferenceable iterator values a and (a + n), *(a + n) is equivalent to *(addressof(*a) + n), are called contiguous iterators.
Apparently, contiguous iterators are a C++17 feature which was added in these papers.
The requirement can be rewritten as:
assert(*(a + n) == *(&*a + n));
So, in the second part, we dereference the iterator, take the address of the value it points to, do pointer arithmetic on that address, and dereference the result; this yields the same object as advancing the iterator and then dereferencing it. This means that a contiguous iterator points to memory where each value is stored right after the other, hence contiguous. Since functions that take char* expect contiguous memory, you can pass the result of &str.front() or &str[0] to these functions.
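The identity can be checked directly for a given string. The sketch below is illustrative only; both function names are made up for the example, and a conforming C++11 (or later) library should always return true from the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if, for every valid index n, advancing the iterator and
// advancing a raw pointer reach the same object.
bool is_laid_out_contiguously(const std::string& s) {
    if (s.empty()) return true;
    const char* base = &*s.begin();
    for (std::size_t n = 0; n < s.size(); ++n) {
        auto step = static_cast<std::string::difference_type>(n);
        if (&*(s.begin() + step) != base + n) return false;
    }
    return true;
}

// Reads the n-th character through plain pointer arithmetic on &s[0].
char nth_via_pointer(const std::string& s, std::size_t n) {
    assert(n < s.size());
    return *(&s[0] + n);
}
```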
Are there downsides to using std::string as a buffer?
Don't use std::string as a buffer.
It is bad practice to use std::string as a buffer, for several reasons (listed in no particular order):
- std::string was not intended for use as a buffer; you would need to double-check the description of the class to make sure there are no "gotchas" which would prevent certain usage patterns (or make them trigger undefined behavior).
  - As a concrete example: Before C++17, you can't even write through the pointer you get with data() - it's const charT*; so your code would cause undefined behavior. (But &(str[0]), &(str.front()), or &(*(str.begin())) would work.)
- Using std::strings for buffers is confusing to readers of your function's definition, who assume you would be using std::string for, well, strings. In other words, doing so breaks the Principle of Least Astonishment.
  - Worse yet, it's confusing for whoever might use your function - they too may think what you're returning is a string, i.e. valid human-readable text.
- std::unique_ptr would be fine for your case, or even std::vector. In C++17, you can use std::byte for the element type, too. A more sophisticated option is a class with an SSO-like feature, e.g. Boost's small_vector (thank you, @gast128, for mentioning it).
- (Minor point:) libstdc++ had to change its ABI for std::string to conform to the C++11 standard, so in some cases (which by now are rather unlikely), you might run into some linkage or runtime issues that you wouldn't with a different type for your buffer.
Also, your code may make two instead of one heap allocations (implementation dependent): once upon string construction and another when resize()ing. But that in itself is not really a reason to avoid std::string, since you can avoid the double allocation using the construction in @Jarod42's answer.
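As an illustration of the std::vector with std::byte alternative mentioned in the list, here is a minimal sketch. `read_packet` is a hypothetical producer invented for the example, standing in for whatever fills the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical producer: writes up to `size` raw bytes into `out`
// and reports how many it wrote.
std::size_t read_packet(void* out, std::size_t size) {
    static const unsigned char packet[] = {0xDE, 0xAD, 0x00, 0xBE, 0xEF};
    std::size_t n = sizeof(packet) < size ? sizeof(packet) : size;
    std::memcpy(out, packet, n);
    return n;
}

// Requires C++17 for std::byte.
std::vector<std::byte> receive() {
    std::vector<std::byte> buffer(64);
    std::size_t n = read_packet(buffer.data(), buffer.size());
    assert(n <= buffer.size());
    buffer.resize(n);  // keep only the bytes received, embedded zeros included
    return buffer;
}
```

The return type tells the caller that this is raw data rather than text, which is the readability point made above.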
Is there a way to get std::string's buffer
Use std::vector<char> if you want a real buffer.
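#include <windows.h> // MAX_PATH, GetCurrentDirectoryA
#include <string>
#include <vector>
int main(){
    std::vector<char> buff(MAX_PATH+1);
    // Returns the number of characters written (excluding the terminator),
    // 0 on failure, or the required size if the buffer is too small.
    DWORD len = ::GetCurrentDirectoryA(static_cast<DWORD>(buff.size()), &buff[0]);
    if (len == 0 || len >= buff.size()) return 1;
    // Copy only the characters written, not the zero-filled rest of the buffer.
    std::string path(buff.begin(), buff.begin() + len);
}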
Is it permitted to modify the internal std::string buffer returned by operator[] in C++11
operator[]
operator[] returns a reference to the character. So if the string is NOT const, you can modify it safely.
In C++11, the characters are stored contiguously, so you can take &str[0] as the beginning of the underlying array whose size is str.size(). And you can modify any element in [ &str[0], &str[0] + str.size() ), if the string is NOT const. E.g. you can pass &str[0] and str.size() to void func(char *arr, size_t arr_size): func(&str[0], str.size())
data() and c_str() members
In C++11 and C++14, both data() and c_str() return const CharT*, so you CANNOT modify elements through the returned pointer. However, from C++17, data() returns CharT* if the string is NOT const, and data() is then equivalent to &str[0].
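These return types can be checked at compile time. The sketch below is illustrative; `data_aliases_first_element` is a name made up for the example, and the C++17 check is guarded by `__cplusplus` so the block also compiles as C++11 or C++14.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// c_str() returns a pointer to const in every standard.
static_assert(std::is_same<decltype(std::declval<std::string&>().c_str()),
                           const char*>::value,
              "c_str() is read-only");

// data() on a const string returns a pointer to const in every standard.
static_assert(std::is_same<decltype(std::declval<const std::string&>().data()),
                           const char*>::value,
              "data() on a const string is read-only");

#if __cplusplus >= 201703L
// From C++17, data() on a non-const string returns a writable pointer.
static_assert(std::is_same<decltype(std::declval<std::string&>().data()),
                           char*>::value,
              "non-const data() is writable in C++17");
#endif

// At run time, data() and &s[0] refer to the same address.
bool data_aliases_first_element(const std::string& s) {
    assert(!s.empty());
    return s.data() == &s[0];
}
```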
Can you avoid using temporary buffers when using std::string to interact with C style APIs?
In C++11 you can simply pass a pointer to the first element of the string (&str[0]): its elements are guaranteed to be contiguous.
Before C++11, you could use .data() or .c_str(), but the string is not mutable through these.
Otherwise, yes, you must perform a copy. But I wouldn't worry about this too much until profiling indicates that it's really an issue for you.
Is there a way to set the length of a std::string without modifying the buffer content?
You should be using resize(), not reserve(), and then resize() again to set the final length.
Otherwise, when you resize() from zero to the result returned by strlen(), the array will be filled with zero characters, overwriting what you wrote into it. The string is allowed to do that, because it (correctly) assumes that everything from the current size to the current reserved capacity is uninitialized data that doesn't contain anything.
In order for the string to know that the characters are actually valid and their contents should be preserved, you need to use resize() initially, not reserve(). Then, when you resize() again to make the string smaller, it only truncates the unwanted end of the string and adds a null terminator; it won't overwrite what you wrote into it.
N.B. the initial resize() will zero-fill the string, which is not strictly necessary in your case because you're going to overwrite the portion you care about and then discard the rest anyway. If the strings are very long and profiling shows the zero-filling is a problem, then you could do this instead:
std::unique_ptr<char[]> str(new char[SOME_MAX_VALUE]);
some_C_API_func(str.get());
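The resize-then-resize pattern can be sketched as follows. `some_c_api` is a hypothetical stand-in for the C function in the question; unlike the call shown above it also takes a size, which is an assumption made here to keep the example bounds-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C API: writes a NUL-terminated string of at most
// `size - 1` characters into `out`.
void some_c_api(char* out, std::size_t size) {
    assert(size > 0);
    std::strncpy(out, "result", size - 1);
    out[size - 1] = '\0';
}

std::string call_with_resize() {
    std::string s;
    s.resize(64);                      // size() is now 64; every element is zero-filled
    some_c_api(&s[0], s.size());       // write inside [0, size())
    s.resize(std::strlen(s.c_str()));  // shrinking keeps the written prefix intact
    return s;
}
```

Had the first call been `reserve(64)`, `size()` would still be 0 when the C function writes, so the write would be out of bounds and the final `resize()` would zero-fill over it.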
How do I perform string formatting to a static buffer in C++?
My thanks to all that posted suggestions (even in the comments).
I appreciate the suggestion by SJHowe, being the briefest solution to the problem, but one of the things I am looking to do with this attempt is to start coding for the C++ of the future, and not use anything deprecated.
The solution I decided to go with stems from the comment by Remy Lebeau:
#include <iostream> // For std::ostream and std::streambuf
#include <cstring> // For std::memset
#include <string> // For std::string
template <int bufferSize>
class FixedBuffer : public std::streambuf
{
public:
    FixedBuffer()
        : std::streambuf()
    {
        std::memset(buffer, 0, sizeof(buffer));
        setp(buffer, &buffer[bufferSize-1]); // Remember the -1 to preserve the terminator.
        setg(buffer, buffer, &buffer[bufferSize-1]); // Technically not necessary for an std::ostream.
    }
    std::string get() const
    {
        return buffer; // Copies up to the first '\0' into a std::string.
    }
private:
    char buffer[bufferSize];
};
//...
constexpr int BUFFER_SIZE = 200;
FixedBuffer<BUFFER_SIZE> buffer;
std::ostream ostr(&buffer);
ostr << "PartA: " << intA << std::endl << "PartB: " << intB << std::endl << std::ends;
Use an external buffer for a string without copying
Yes, copying is always happening. BTW, you don't need to wrap std::string(buffer), as the constructor std::string(char const*) is implicit and a simple
foo(buffer);
will implicitly copy the buffer into the string. If you are the author of foo, you can add an overload
void foo(char const*)
that avoids the copying. However, C strings suffer from the problem that the null terminator is part of the string APIs, and so you can't easily create substrings without mutating the underlying string (a la strtok).
The Library Fundamentals Technical Specification contains a string_view class that eliminates the copying like char const* does, but preserves the substring capability of std::string:
#include <iostream>
#include <experimental/string_view>
void foo(std::experimental::string_view v) { std::cout << v.substr(2,8) << '\n'; }
int main()
{
    char const* buffer = "war and peace";
    foo(buffer);
}
Live Example (requires libstdc++ 4.9 or higher in C++14 mode).
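Since C++17, the same class is available in the standard library as std::string_view in the `<string_view>` header. A minimal sketch under that assumption, with `middle` as a made-up function name:

```cpp
#include <cassert>
#include <string_view>

// Requires C++17. The returned view points into the caller's buffer;
// no characters are copied and the buffer is not modified.
std::string_view middle(std::string_view v) {
    assert(v.size() >= 2);
    return v.substr(2, 8);  // up to 8 characters starting at index 2
}
```

Because the result only refers to the original characters, the buffer passed in must outlive the returned view.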