C++ streams confusion: istreambuf_iterator vs istream_iterator

IOstreams use streambufs as their source or target of input and output. Effectively, the streambuf family does all the actual I/O work, and the IOstream family only handles formatting and the to-string / from-string conversions.

Now, istream_iterator takes a template argument that says what the unformatted character sequence from the streambuf should be parsed as. For example, istream_iterator<int> interprets all incoming text as whitespace-delimited ints.

On the other hand, istreambuf_iterator only cares about the raw characters and iterates directly over the associated streambuf of the istream that it gets passed.

Generally, if you're only interested in the raw characters, use an istreambuf_iterator. If you're interested in the formatted input, use an istream_iterator.
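As a rough illustration of the difference, here is a minimal sketch that reads the same text both ways. The helper names read_ints and read_chars are made up for this example, and a std::istringstream stands in for whatever stream you actually have.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted view: each element is produced by operator>>, so the text is
// parsed as whitespace-delimited ints. (read_ints is a hypothetical helper.)
std::vector<int> read_ints(const std::string& text)
{
    std::istringstream in(text);
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}

// Raw view: each element is one character taken straight from the stream
// buffer, whitespace included. (read_chars is a hypothetical helper.)
std::string read_chars(const std::string& text)
{
    std::istringstream in(text);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Given the text "10 20\n30", read_ints should yield the three ints 10, 20, 30, while read_chars should hand back the text unchanged, spaces and newline included.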

All of what I said also applies to ostream_iterator and ostreambuf_iterator.
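The output side can be sketched the same way. The function names below are invented for the example, and std::ostringstream is used only so the result is easy to inspect.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted output: every int goes through operator<<, and the delimiter
// string is written after each element.
std::string write_formatted(const std::vector<int>& values)
{
    std::ostringstream out;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<int>(out, " "));
    return out.str();
}

// Raw output: characters are written directly into the stream buffer,
// with no formatting step in between.
std::string write_raw(const std::string& text)
{
    std::ostringstream out;
    std::copy(text.begin(), text.end(),
              std::ostreambuf_iterator<char>(out));
    return out.str();
}
```

Note that ostream_iterator writes the delimiter after every element, so the formatted result for {1, 2, 3} ends with a trailing space.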

C++ istreambuf_iterator template parameter

Yes, you can only use the streambuf iterators to read characters, since they get characters directly from the buffer. No formatted input is involved, which means they cannot convert the data to another type.
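In other words, the template parameter is the character type of the stream, not the type you would like to extract. A small sketch, assuming a wide stream and a made-up helper name read_wide:

```cpp
#include <iterator>
#include <sstream>
#include <string>

// The template argument must match the stream's character type:
// wchar_t for std::wistringstream, char for std::istringstream.
// (read_wide is a hypothetical helper for this example.)
std::wstring read_wide(const std::wstring& text)
{
    std::wistringstream in(text);
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

The digits in L"4 2" come back as the characters '4', ' ' and '2', not as numbers; turning them into ints would need a formatted extraction or manual conversion.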

Confused about usage of `std::istreambuf_iterator`

When you use istreambuf_iterator, you are manipulating the underlying streambuf object of the istream object. The streambuf object doesn't know anything about its owner (the istream object), so calling functions on the streambuf does not change the state of the istream. That's why the flags in the istream object are not set when you reach the end of the input.

Do something like this:

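std::istream& operator>>(std::istream& is, Foo& f)
{
    // read() takes a char*, so members of any other type need a cast.
    is.read(reinterpret_cast<char*>(&f.a), sizeof(f.a));
    is.read(reinterpret_cast<char*>(&f.b), sizeof(f.b));
    return is;
}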
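To see the point about the stream flags in isolation, here is a small sketch that drains a stream through istreambuf_iterator and then records the state of the istream. DrainResult and drain_via_streambuf are names invented for this example.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical record of what the stream looks like after draining it.
struct DrainResult {
    std::string data;  // characters pulled from the buffer
    bool eof_after;    // in.eof() once the buffer is exhausted
    bool good_after;   // in.good() once the buffer is exhausted
};

DrainResult drain_via_streambuf(const std::string& text)
{
    std::istringstream in(text);
    // The extra parentheses keep this from parsing as a function declaration.
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    // The iterators only talked to the streambuf, so the istream's
    // state bits have not been touched.
    return DrainResult{data, in.eof(), in.good()};
}
```

After the call, data holds the whole input, yet eof_after is false and good_after is true, because nothing ever called setstate() on the istream.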

Edit

I was stepping through the code in my debugger, and this is what I found. In this implementation, istream_iterator has two internal data members: a pointer to the associated istream object, and an object of the template type (Foo in this case). When you call ++it, it calls this function:

void _Getval()
{   // get a _Ty value if possible
    if (_Myistr != 0 && !(*_Myistr >> _Myval))
        _Myistr = 0;
}

_Myistr is the istream pointer, and _Myval is the Foo object. If you look here:

!(*_Myistr >> _Myval)

That's where it calls your operator>> overload, and then it calls operator! on the returned istream object. operator! only returns true if failbit or badbit is set; eofbit alone doesn't do it.

So, if either failbit or badbit is set, the istream pointer gets set to null. The next time you compare the iterator to the end iterator, it compares the istream pointers, which are null in both of them.
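The observable consequence can be sketched without looking at any library internals. The Pair type and count_pairs function below are hypothetical; the operator>> uses formatted extraction so that failbit is set when a value cannot be read.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical record type read as two whitespace-delimited ints.
struct Pair {
    int a;
    int b;
};

// Formatted extraction sets failbit when a value cannot be parsed,
// which is what makes istream_iterator turn into the end iterator.
std::istream& operator>>(std::istream& is, Pair& p)
{
    return is >> p.a >> p.b;
}

// Counts how many complete Pair objects can be extracted from the text.
std::size_t count_pairs(const std::string& text)
{
    std::istringstream in(text);
    std::size_t n = 0;
    for (std::istream_iterator<Pair> it(in), end; it != end; ++it)
        ++n;
    return n;
}
```

With the input "1 2 3 4", reading the 4 already hits end of input and sets eofbit, but the extraction succeeded, so the second Pair still counts. Only the next attempt fails and ends the loop. With "1 2 3", the second extraction fails halfway and only one Pair is counted.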

overriding `istream operator` vs using `sscanf`

Since you are reading from a file, the performance is going to be I/O-bound. Almost no matter what you do in memory, the effect on the overall performance is not going to be detectable.

I would prefer the operator>> route, because this would let me use the input iterator idiom of C++:

std::istream_iterator<Person> eos;
std::istream_iterator<Person> iit(inputFile);
std::copy(iit, eos, std::back_inserter(person_vector));

or even

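// Braces on the iterators avoid the "most vexing parse", which would
// otherwise turn this statement into a function declaration.
std::vector<Person> person_vector(
    std::istream_iterator<Person>{inputFile},
    std::istream_iterator<Person>{}
);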
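For completeness, here is one way the pieces could fit together. The layout of Person (a single-word name followed by an age) is an assumption made for this sketch, as are the helper names load_people and parse_people; the original question's record format may well differ.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Assumed record layout: one whitespace-free name, then an age.
struct Person {
    std::string name;
    int age = 0;
};

std::istream& operator>>(std::istream& is, Person& p)
{
    return is >> p.name >> p.age;
}

// Reads Person records until extraction fails or the input ends.
std::vector<Person> load_people(std::istream& input)
{
    return std::vector<Person>(std::istream_iterator<Person>(input),
                               std::istream_iterator<Person>());
}

// Convenience wrapper for in-memory text, handy for trying things out.
std::vector<Person> parse_people(const std::string& text)
{
    std::istringstream in(text);
    return load_people(in);
}
```

Because load_people takes a std::istream&, the same function works for a std::ifstream, std::cin, or a string stream.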

How do I properly check that an istreambuf_iterator has reached end-of-stream

isn't there any predefined symbol or equivalent that could help us here?

Yes.

you have to create a default iterator to be able to know if the iterator has reached end-of-stream

Here it is!

Why doesn't the iterator have a function named, e.g., hasNext() or isEndOfStream()?

The default-constructed iterator as sentinel is actually quite elegant. It doesn't introduce any new names into the library, and it allows us to write generic code with start and end iterators that doesn't care what kind of thing you're iterating over. That's really important for the standard algorithms.

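#include <iostream>
#include <iterator>
#include <vector>

void bar(const char c);  // defined elsewhere; consumes one character

// Works with any iterator pair, whatever the underlying sequence is.
template <typename Iterator>
void foo(Iterator start, Iterator end)
{
    for (Iterator it = start; it != end; ++it)
        bar(*it);
}

void version_1()
{
    std::istreambuf_iterator<char> start(std::cin);
    std::istreambuf_iterator<char> end;  // default-constructed: end-of-stream
    foo(start, end);
}

void version_2()
{
    std::vector<char> v{0, 1, 2, 3, 4, 5};
    foo(v.begin(), v.end());
}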

Imagine if for containers you had start and end, but for streams you had start and a call to start.hasNext()? Now, that's not very elegant. That's messy!
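The same idea carries over to the standard algorithms directly. As a minimal sketch, std::count can be handed either kind of iterator pair; the function names here are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Generic: only needs a pair of iterators over characters.
template <typename Iterator>
std::size_t count_newlines(Iterator first, Iterator last)
{
    return static_cast<std::size_t>(std::count(first, last, '\n'));
}

// Stream flavour: the default-constructed iterator plays the role of end().
std::size_t count_newlines_in_stream(std::istream& in)
{
    return count_newlines(std::istreambuf_iterator<char>(in),
                          std::istreambuf_iterator<char>());
}
```

count_newlines does not know or care whether it is walking a vector or a stream buffer, which is exactly what the sentinel design buys.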

Using stream to treat received data

What you want to create is a class that acts like a std::istream. Of course you can write your own class from scratch, but I prefer to implement std::streambuf, for two reasons.

First, people using your class will already know how to use it, since it behaves the same as std::istream if you derive from and implement std::streambuf and std::istream.

Second, you don't need to write extra methods or overload operators. They are already provided at the std::istream level.

To implement a std::streambuf, derive from it, override underflow(), and set the get pointers with setg().
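Here is a minimal sketch of that recipe. The class name ReceivedDataBuf and the idea of holding the received data as a vector of string chunks are assumptions made for illustration; a real receiver would refill the buffer from its own source inside underflow().

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stream buffer that serves chunks of already received data.
class ReceivedDataBuf : public std::streambuf {
public:
    explicit ReceivedDataBuf(std::vector<std::string> chunks)
        : chunks_(std::move(chunks)) {}

protected:
    // Called by the stream whenever the get area is exhausted.
    int_type underflow() override
    {
        // Skip chunks that carry no data.
        while (next_ < chunks_.size() && chunks_[next_].empty())
            ++next_;
        if (next_ == chunks_.size())
            return traits_type::eof();  // nothing left to serve

        std::string& chunk = chunks_[next_++];
        // Point the get area (begin, current, end) at the chunk.
        setg(&chunk[0], &chunk[0], &chunk[0] + chunk.size());
        // Return the current character without consuming it.
        return traits_type::to_int_type(*gptr());
    }

private:
    std::vector<std::string> chunks_;
    std::size_t next_ = 0;
};
```

To use it, construct the buffer and hand its address to a std::istream, for example ReceivedDataBuf buf(chunks); std::istream in(&buf);. After that, operator>>, std::getline, and the stream iterators all work on in without any further code.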

std::basic_istream or std::basic_streambuf

std::basic_istream defines the user interface: operator>>, read, etc. That's what you call when you want to do input.

std::basic_streambuf defines virtual member functions: underflow, sync, etc. That's what you derive from when you want to write your own input class. Boost.Iostreams makes it easy.

std::istream_iterator calls operator>> (so it interprets the input as a sequence of objects of some type for which operator>> is defined, goes through the locale, skips whitespace, etc.).

std::istreambuf_iterator accesses a streambuf directly (so it can only read characters, no locale is involved, and whitespace isn't special).
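The whitespace difference is easy to see when both iterators are instantiated for char. This is a small sketch with made-up function names, assuming the stream's default skipws flag is left on.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// operator>> for char skips leading whitespace (skipws is on by default),
// so spaces, tabs and newlines never show up in the result.
std::string formatted_chars(const std::string& text)
{
    std::istringstream in(text);
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// The streambuf iterator returns every character exactly as stored.
std::string raw_chars(const std::string& text)
{
    std::istringstream in(text);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For the input "a b\tc\n", formatted_chars yields "abc" while raw_chars returns the input unchanged.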


