Does the C++ Standard Mandate Poor Performance for Iostreams, or Am I Just Dealing With a Poor Implementation?

How does std::ios_base::sync_with_stdio impact stream buffering?

std::fputc(c, stdout); could simply be implemented in terms of std::cout << c; or the other way round (or using a common primitive)

It is actually "the other way round". The synchronized std::cout is an unbuffered stream, and each std::cout << c; immediately executes std::fputc(c, stdout);.

synchronized streams would even have an advantage over non-synchronized ones! Fewer buffers, fewer cache misses

It's just one buffer either way: stdout's when synchronized or std::cout's when not. On my gcc/libstdc++, the main difference is that one is 1024 bytes and the other is 8191 (seriously). It might be interesting to profile the three existing implementations of the standard library (libstdc++, libc++, and MSVC) to spot the differences and what causes them. It may very well be that they are "imperfect implementations": there is no reason unsynchronized std::cout << c; should ever be slower than (always synchronized) std::fputc(c, stdout);.
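As a rough sketch of what switching synchronization off looks like in code (the helper names disable_stdio_sync and write_chars are made up for illustration, they are not part of any library):

```cpp
#include <iostream>

// Hypothetical helper: switch off synchronization with C stdio and return
// the synchronization state that was in effect before the call.
// This should run before any I/O on the standard streams; calling it after
// I/O has already happened has implementation-defined behavior.
bool disable_stdio_sync() {
    return std::ios_base::sync_with_stdio(false);
}

// Hypothetical helper: write `count` copies of `c` through std::cout only.
// Once synchronization is off, avoid mixing this with std::fputc(c, stdout),
// because the relative order of the two outputs is no longer guaranteed.
void write_chars(char c, int count) {
    for (int i = 0; i < count; ++i) std::cout << c;
    std::cout << '\n';
}
```

Calling disable_stdio_sync() once at the top of main and then using only std::cout lets the library use std::cout's own buffer instead of forwarding each character to stdout.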

Does the C++ standard guarantee that when the return value of 'rdbuf' is passed to the stream output operator, the content of the buffer gets printed out?

A std::basic_stringbuf is derived from a std::basic_streambuf. Cppreference describes its use:

The I/O stream objects std::basic_istream and std::basic_ostream, as well as all objects derived from them (std::ofstream, std::stringstream, etc), are implemented entirely in terms of std::basic_streambuf.

What does that mean? Well, let's take a look at the overload set for std::basic_ostream::operator<< here:

basic_ostream& operator<<( std::basic_streambuf<CharT, Traits>* sb );

Behaves as an UnformattedOutputFunction. After constructing and checking the sentry object, checks if sb is a null pointer. If it is, executes setstate(badbit) and exits. Otherwise, extracts characters from the input sequence controlled by sb and inserts them into *this until one of the following conditions are met:
end-of-file occurs on the input sequence;
inserting in the output sequence fails (in which case the character to be inserted is not extracted);
an exception occurs (in which case the exception is caught).
If no characters were inserted, executes setstate(failbit). If an exception was thrown while extracting, sets failbit and, if failbit is set in exceptions(), rethrows the exception.

So, yes, it's guaranteed by the standard that std::cout << ss.rdbuf(); will have the effect you observed.
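A small sketch of that behavior, using a std::ostringstream as the destination instead of std::cout so the result can be inspected. The helper names drain and streaming_sets_failbit are hypothetical:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: copy everything still readable from `src` into a
// string by inserting its stream buffer into an output stream.
std::string drain(std::stringstream& src) {
    std::ostringstream out;
    out << src.rdbuf();  // extracts from src's buffer until end-of-file
    return out.str();
}

// Hypothetical helper: report whether inserting `src`'s buffer set failbit
// on the destination, which happens when no characters were inserted.
bool streaming_sets_failbit(std::stringstream& src) {
    std::ostringstream out;
    out << src.rdbuf();
    return out.fail();
}
```

The second helper illustrates the "If no characters were inserted, executes setstate(failbit)" clause: streaming an empty buffer leaves the destination stream in a failed state, which is worth checking before continuing to write to it.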

Best practice when dealing with C++ iostreams

I would probably make it into

std::istream& open_for_read(std::ifstream& ifs, const std::string& filename) {
    return filename == "-" ? std::cin : (ifs.open(filename), ifs);
}

and then supply an ifstream to the function.

std::ifstream ifs;
auto& is = open_for_read(ifs, the_filename);

// now use `is` everywhere:
if(!is) { /* error */ }

while(std::getline(is, line)) {
    // ...
}

ifs will, if it was opened, be closed when it goes out of scope as usual.

A throwing version might look like this:

std::istream& open_for_read(std::ifstream& ifs, const std::string& filename) {
    if(filename == "-") return std::cin;
    ifs.open(filename);
    if(!ifs) throw std::runtime_error(filename + ": " + std::strerror(errno));
    return ifs;
}
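For completeness, here is a sketch of how a caller might use the throwing version, with the headers it needs (<cerrno>, <cstring>, <stdexcept>) spelled out. The count_lines function is a made-up caller for illustration. Note that the standard does not promise that a failed open sets errno, so the message text is best-effort:

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>

// Same function as above, repeated so this example compiles on its own.
std::istream& open_for_read(std::ifstream& ifs, const std::string& filename) {
    if (filename == "-") return std::cin;
    ifs.open(filename);
    // errno is usually set by the underlying open call, but that is not
    // guaranteed by the C++ standard.
    if (!ifs) throw std::runtime_error(filename + ": " + std::strerror(errno));
    return ifs;
}

// Hypothetical caller: count the lines of a file (or of stdin for "-").
// Returns -1 if the file cannot be opened.
long count_lines(const std::string& filename) {
    std::ifstream ifs;  // closed automatically at scope exit if it was opened
    try {
        std::istream& is = open_for_read(ifs, filename);
        long n = 0;
        std::string line;
        while (std::getline(is, line)) ++n;
        return n;
    } catch (const std::runtime_error& e) {
        std::cerr << e.what() << '\n';
        return -1;
    }
}
```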

Why is istream/ostream slow

Actually, IOStreams don't have to be slow! It is a matter of implementing them in a reasonable way to make them fast, though. Most standard C++ library implementations don't seem to pay much attention to implementing IOStreams well. A long time ago, when my CXXRT was still maintained, it was about as fast as stdio when used correctly!

Note that there are a few performance traps laid out for users of IOStreams, however. The following guidelines apply to all IOStream implementations but especially to those which are tailored to be fast:

When using std::cin, std::cout, etc. you need to call std::ios_base::sync_with_stdio(false)! Without this call, any use of the standard stream objects is required to synchronize with C's standard streams. Of course, when using std::ios_base::sync_with_stdio(false) it is assumed that you don't mix std::cin with stdin, std::cout with stdout, etc.
Do not use std::endl as it mandates many unnecessary flushes of any buffer. Likewise, don't set std::ios_base::unitbuf or use std::flush unnecessarily.
When creating your own stream buffers (OK, few users do), make sure they do use an internal buffer! Processing individual characters jumps through multiple conditions and a virtual function which makes it hideously slow.
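To make the last two guidelines concrete, here is a minimal sketch of a stream buffer with an internal buffer that also counts how often it is asked to flush. CountingBuf and the two helper functions are invented for this illustration, and the std::string member merely stands in for a real device such as a file or socket:

```cpp
#include <array>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stream buffer: collects characters in a fixed internal buffer
// and hands them to the "device" only when the buffer fills or on a flush.
class CountingBuf : public std::streambuf {
public:
    CountingBuf() { setp(buf_.data(), buf_.data() + buf_.size()); }

    std::string sink;  // stands in for the real output device
    int flushes = 0;   // number of sync() calls, i.e. explicit flushes

protected:
    // Called when the put area is full: empty it, then store the character.
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called by std::flush and std::endl via pubsync().
    int sync() override {
        ++flushes;
        drain();
        return 0;
    }

private:
    // Move buffered characters to the sink and reset the put area.
    void drain() {
        sink.append(pbase(), pptr());
        setp(buf_.data(), buf_.data() + buf_.size());
    }

    std::array<char, 256> buf_;
};

// Write `lines` short lines ending in '\n'; return the number of flushes.
int flushes_with_newline(int lines) {
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < lines; ++i) os << "line" << '\n';
    return buf.flushes;
}

// Same output, but ending each line with std::endl.
int flushes_with_endl(int lines) {
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < lines; ++i) os << "line" << std::endl;
    return buf.flushes;
}
```

With '\n' the buffer is never asked to flush, while std::endl requests one flush per line. On a real device each of those flushes would typically be a system call.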

What does C++ iostreams have to offer in comparison with the C stdio library?

There are several advantages, mostly with the << and >> operators. Getting a line isn't all that different, although being able to read it into a std::string is a considerable advantage.

C++ I/O has type safety. You don't write your parameter list as a quoted string, and then again as variables and such. You write what you're going to print once, and C++ figures out how many parameters and what type they are. When you have type mismatches, C I/O might get the I/O wrong, or even try to access protected memory.

C++ I/O is easy to extend. You can write operator<<() and operator>>() easily, once you've got a sample to copy. printf() and friends cannot be extended. You have a fixed list of format types.
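Here is one such sample to copy, as a sketch. The Point type, its "(x,y)" text format, and the to_text helper are all made up for the example:

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type.
struct Point {
    int x;
    int y;
};

// Writes the point as "(x,y)".
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x << ',' << p.y << ')';
}

// Reads a point written as "(x,y)". On malformed punctuation it sets
// failbit and leaves `p` unchanged.
std::istream& operator>>(std::istream& is, Point& p) {
    char open = 0, comma = 0, close = 0;
    int x = 0, y = 0;
    if (is >> open >> x >> comma >> y >> close) {
        if (open == '(' && comma == ',' && close == ')') {
            p.x = x;
            p.y = y;
        } else {
            is.setstate(std::ios_base::failbit);
        }
    }
    return is;
}

// Convenience helper: format a point into a std::string.
std::string to_text(const Point& p) {
    std::ostringstream os;
    os << p;
    return os.str();
}
```

Once these two operators exist, Point works with std::cout, std::cin, file streams, and string streams without any further code, and the compiler picks the right overload from the argument type.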

C++ I/O, while it looks fairly simple at first, has a lot of programmer-accessible structure, and therefore a good C++ programmer can modify it to cover cases that C I/O can't. (Don't overuse this.)

Will C++0x RValue references or other features have an impact on streams performance?

You might be interested in some of the performance comparisons in my question here. Even the lowest level functions in the C++ standard library streams API are incredibly slow under common implementations, and looking through the source code of e.g. Visual C++'s stringbuf class, I don't see copying of small temporary objects. So rvalue-references are not likely to help much.

AFAICT, the main reason for slowness of C++ iostreams is that library developers are stuck with a mindset that I/O is the bottleneck, so there's no point in worrying about performance of the I/O library. But I/O is decidedly not the bottleneck.

Should I switch to C++ I/O streams?

I'm not a big user of streams myself, so I'll only list what I think about them. This is really subjective, I'll understand if my answer is voted for deletion.

I like : homogeneity

I may have an enum, a class, or anything else; making my user-defined type printable is always done by providing the same operator<< next to my type:

std::ostream &operator<<(std::ostream &, const MyType &);

You may ask yourself if a type is printable, but never how it is printable.

I like : abstraction

Obviously, it is incredibly easy to provide 'streaming capacities' to a user-defined type. It's also great to be able to provide our own implementation of a stream and have it fit transparently into existing code. Once your operator<< overloads are appropriately defined, switching between writing to standard output, a memory buffer, or a file is trivial.
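A short sketch of what I mean: the function below knows only about std::ostream, so the caller decides where the text ends up. The names write_report and report_as_string are invented for the example:

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical report writer: depends only on the std::ostream interface.
void write_report(std::ostream& os, const std::vector<std::string>& items) {
    for (std::size_t i = 0; i < items.size(); ++i)
        os << i + 1 << ". " << items[i] << '\n';
}

// The same writer targeting a memory buffer instead of a terminal or file.
std::string report_as_string(const std::vector<std::string>& items) {
    std::ostringstream os;
    write_report(os, items);
    return os.str();
}
```

Passing std::cout or a std::ofstream to write_report instead of the std::ostringstream requires no change to the function itself.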

I dislike : formatting

I've always thought iomanip to be a mess. I hate writing things such as (I'm just throwing random manipulators here) :

std::cout << std::left << std::fixed << std::setprecision(0) << f << std::endl;

I think it was much easier with printf, but Boost.Format is helpful here.
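For comparison, a sketch of the same formatting done both ways. The function names are made up, and the 64-byte buffer is an arbitrary choice for the example:

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// iostream version: fixed notation with zero digits after the decimal point.
// std::left has no visible effect here because no field width is set.
std::string format_with_iomanip(double f) {
    std::ostringstream os;
    os << std::left << std::fixed << std::setprecision(0) << f;
    return os.str();
}

// printf-style version of the same formatting. std::snprintf truncates
// safely if the value would not fit into the buffer.
std::string format_with_snprintf(double f) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.0f", f);
    return buf;
}
```

Keep in mind that std::fixed and std::setprecision stay in effect on a stream after the statement that set them, which is part of why the manipulator style can feel messy on a shared stream like std::cout.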