Testing stream.good() or !stream.eof() reads last line twice

You very, very rarely want to check bad, eof, and good. In particular for eof (as !stream.eof() is a common mistake), the stream currently being at EOF does not necessarily mean the last input operation failed; conversely, not being at EOF does not mean the last input was successful.

All of the stream state functions – fail, bad, eof, and good – tell you the current state of the stream rather than predicting the success of a future operation. Check the stream itself (which is equivalent to an inverted fail check) after the desired operation:

if (getline(stream, line)) {
    use(line);
}
else {
    handle_error();
}

if (stream >> foo >> bar) {
    use(foo, bar);
}
else {
    handle_error();
}

if (!(stream >> foo)) {  // operator! is overloaded for streams
    throw SomeException();
}
use(foo);

To read and process all lines:

for (std::string line; getline(stream, line);) {
    process(line);
}
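As a rough illustration of why this loop shape copes with the end of input, the sketch below wraps it in a helper. The name read_lines and the use of std::istringstream as a stand-in for a file are assumptions made for the example, not part of the original answer.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line that getline reads successfully.
std::vector<std::string> read_lines(std::istream& stream) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(stream, line);) {
        lines.push_back(line);  // only reached when the read succeeded
    }
    return lines;
}
```

Called on a std::istringstream holding "a\nb\nc\n" or "a\nb\nc", it should return the same three lines either way, and an empty stream yields no lines at all.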

Pointedly, good() is misnamed and is not equivalent to testing the stream itself (which the above examples do): good() also returns false when only the eof bit is set, even if the last read succeeded.
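To see the difference between good(), eof() and testing the stream itself, here is a minimal sketch that takes a snapshot of the state right after one extraction. The struct and function names are hypothetical, and an in-memory std::istringstream is assumed in place of a real file.

```cpp
#include <sstream>
#include <string>

// Snapshot of the stream state taken right after one extraction.
struct StreamState {
    bool stream_ok;  // result of testing the stream itself, i.e. !fail()
    bool good;
    bool eof;
    bool fail;
};

// Hypothetical helper: extracts a single int from `text` and reports the state.
StreamState state_after_reading_int(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return {static_cast<bool>(in), in.good(), in.eof(), in.fail()};
}
```

For the input "42" (no trailing whitespace) the read succeeds, yet eof() is already true and good() is false. For "42\n" the read succeeds and eof() is still false. For an empty input both eof() and fail() are set.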

Reading from text file until EOF repeats last line

Just follow the chain of events closely.

  • Grab 10
  • Grab 20
  • Grab 30
  • Grab EOF

Look at the second-to-last iteration. You grabbed 30, then went on to check for EOF. You haven't reached EOF yet, because the end-of-file condition hasn't been read (conceptually, it sits just after the line containing 30). So you carry on to the next iteration, where x is still 30 from the previous one. Now you read from the stream and hit EOF: x remains 30 and ios::eofbit is raised. You output x to stderr (still 30, just as in the previous iteration). Then you check for EOF in the loop condition, and this time you leave the loop.

Try this:

while (true) {
    int x;
    if (!(iFile >> x)) break;  // stop on EOF or on any other failed read
    cerr << x << endl;
}

By the way, there is another bug in your code. Did you ever try to run it on an empty file? The behaviour you get is for the exact same reason.
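The chain of events can be reproduced with a short sketch that runs both loop shapes over an in-memory stream. The function names are hypothetical, the question's loop is assumed to have the form while (!iFile.eof()), and x is initialized here only so that the empty-input case has defined behaviour.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// The anti-pattern: test eof() before reading, then use x without checking the read.
// Note: this loops forever on non-numeric input, so only feed it numbers.
std::vector<int> read_checking_eof_first(std::istream& in) {
    std::vector<int> seen;
    int x = 0;  // initialized only so that empty input has defined behaviour
    while (!in.eof()) {
        in >> x;            // on the final trip this fails and leaves x unchanged
        seen.push_back(x);  // so the stale value is recorded once more
    }
    return seen;
}

// The fix: let the read itself be the loop condition.
std::vector<int> read_checking_the_read(std::istream& in) {
    std::vector<int> seen;
    int x = 0;
    while (in >> x) {
        seen.push_back(x);
    }
    return seen;
}
```

With the input "10\n20\n30\n" the first function should record 10, 20, 30, 30 and the second 10, 20, 30. With empty input the first still records one bogus value, which is the "other bug" mentioned above.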

Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?

Because iostream::eof only returns true after a read has tried to go past the end of the stream. It does not indicate that the next read will hit the end of the stream.

Consider this (and assume the next read will be at the end of the stream):

while(!inStream.eof()){
    int data;
    // yay, not end of stream yet, now read ...
    inStream >> data;
    // oh crap, now we read the end and *only* now the eof bit will be set (as well as the fail bit)
    // do stuff with (still uninitialized) data
}

Against this:

int data;
while(inStream >> data){
    // when we land here, we can be sure that the read was successful.
    // if it wasn't, the returned stream from operator>> would be converted to false
    // and the loop wouldn't even be entered
    // do stuff with correctly initialized data (hopefully)
}

And on your second question: Because

if(scanf("...",...)!=EOF)

is the same as

if(!(inStream >> data).eof())

and not the same as

if(!inStream.eof())
    inStream >> data;

Repeated value at the end of the file

This is wrong:

while(!inFile.eof())
{
    inFile>>grades;
    cout<<grades<<endl;
}

This is right:

while (inFile >> grades)
{
    cout << grades << endl;
}

This must be the single most common error on this forum. eof() does not tell you that the next read will hit end of file; it tells you that a previous read hit end of file. So if you must use eof(), use it after you read, not before.
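One place where eof() is useful after a read is telling a clean end of input apart from malformed data. The following is a minimal sketch under that assumption; the enum and function names are made up for illustration.

```cpp
#include <istream>
#include <sstream>

// Hypothetical result type: why the read loop stopped.
enum class ReadEnd { EndOfFile, BadInput };

// Sums ints until a read fails, then asks eof() why it failed.
ReadEnd sum_ints(std::istream& in, long& sum) {
    sum = 0;
    int x = 0;
    while (in >> x) {
        sum += x;
    }
    // Only now, after a failed read, is eof() a meaningful question.
    return in.eof() ? ReadEnd::EndOfFile : ReadEnd::BadInput;
}
```

For "1 2 3\n" this should report EndOfFile with a sum of 6; for "1 2 oops 3\n" it should stop at the bad token and report BadInput with a sum of 3. It is only a sketch: a token that is cut off by the end of input sets both bits and would be reported as EndOfFile.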

Same word added twice

This is because reading the last word, when it is followed by whitespace such as a trailing newline, sets neither the eof bit nor the fail bit. Only when a read tries to go past the end is the eof bit set, and only then does eof() return true.

while (!inputStream.eof())  // past the last word, but eof() is still false
{
    inputStream >> x;  // this fails, and you end up using the previous x again
    next = next + " " + x;
}

Change it to:

while (inputStream >> x) {
    // inputStream >> x;  don't call this again!
    next = next + " " + x;
}

C99: Is it standard that fscanf() sets eof earlier than fgetc()?

In my experience, when working with <stdio.h> the precise semantics of the "eof" and "error" bits are very, very subtle, so much so that it's not usually worth it (it may not even be possible) to try to understand exactly how they work. (The first question I ever asked on SO was about this, although it involved C++, not C.)

I think you know this, but the first thing to understand is that the intent of feof() is very much not to predict whether the next attempt at input will reach the end of the file. The intent is not even to say that the input stream is "at" the end of the file. The right way to think about feof() (and the related ferror()) is that they're for error recovery, to tell you a bit more about why a previous input call failed.

And that's why writing a loop involving while(!feof(fp)) is always wrong.

But you're asking about precisely when fscanf hits end-of-file and sets the eof bit, versus getc/fgetc. With getc and fgetc, it's easy: they try to read one character, and they either get one or they don't (and if they don't, it's either because they hit end-of-file or encountered an i/o error).

But with fscanf it's trickier, because depending on the input specifier being parsed, characters are accepted only as long as they're appropriate for the input specifier. The %s specifier, for example, stops not only if it hits end-of-file or gets an error, but also when it hits a whitespace character. (And that's why people were asking in the comments whether your input file ended with a newline or not.)

I've experimented with the program

#include <stdio.h>

int main()
{
    char buffer[100];
    FILE *stream = stdin;

    while(!feof(stream)) {
        fscanf(stream,"%s",buffer);
        printf("%s\n",buffer);
    }
}

which is pretty close to what you posted. (I added a \n in the printf so that the output was easier to see, and better matched the input.) I then ran the program on the input

This
is
a
test.

and, specifically, where all four of those lines ended in a newline. And the output was, not surprisingly,

This
is
a
test.
test.

The last line is repeated because that's what (usually) happens when you write while(!feof(stream)).

But then I tried it on the input

This\n
is\n
a\n
test.

where the last line did not have a newline. This time, the output was

This
is
a
test.

This time, the last line was not repeated. (The output was still not identical to the input, because the output contained four newlines while the input contained three.)

I think the difference between these two cases is that in the first case, when the input contains a newline, fscanf reads the last line, reads the last \n, notices that it's whitespace, and returns, but it has not hit EOF and so does not set the EOF bit. In the second case, without the trailing newline, fscanf hits end-of-file while reading the last line, and so does set the eof bit, so feof() in the while() condition is satisfied, and the code does not make an extra trip through the loop, and the last line is not repeated.

We can see a bit more clearly what's going on if we look at fscanf's return value. I modified the loop like this:

while(!feof(stream)) {
    int r = fscanf(stream,"%s",buffer);
    printf("fscanf returned %2d: %5s (eof: %d)\n", r, buffer, feof(stream));
}

Now, when I run it on a file that ends with a newline, the output is:

fscanf returned  1:  This (eof: 0)
fscanf returned  1:    is (eof: 0)
fscanf returned  1:     a (eof: 0)
fscanf returned  1: test. (eof: 0)
fscanf returned -1: test. (eof: 1)

We can clearly see that after the fourth call, feof(stream) is not true yet, meaning that we'll make that last, extra, unnecessary, fifth trip through the loop. But we can see that during the fifth trip, fscanf returns -1, indicating (a) that it did not read a string as expected and (b) it reached EOF.

If I run it on input not containing the trailing newline, on the other hand, the output is like this:

fscanf returned  1:  This (eof: 0)
fscanf returned  1:    is (eof: 0)
fscanf returned  1:     a (eof: 0)
fscanf returned  1: test. (eof: 1)

Now, feof is true immediately after the fourth call to fscanf, and the extra trip is not made.
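The two runs can be reproduced without preparing input files by hand. The sketch below is written as C++ over <cstdio> and makes a few assumptions of its own: the helper name is hypothetical, std::tmpfile() stands in for stdin, and %99s is used instead of %s so the 100-byte buffer cannot overflow.

```cpp
#include <cstdio>

struct LoopCounts {
    int trips;  // times the loop body ran
    int words;  // times fscanf actually converted a string
};

// Hypothetical helper: writes `text` to a temporary file, rewinds it, and runs
// the while (!feof(fp)) loop over it. Returns {-1, -1} if no temporary file
// could be created.
LoopCounts count_feof_loop(const char* text) {
    std::FILE* fp = std::tmpfile();
    if (fp == nullptr) {
        return {-1, -1};
    }
    std::fputs(text, fp);
    std::rewind(fp);

    char buffer[100];
    LoopCounts counts = {0, 0};
    while (!std::feof(fp)) {  // the anti-pattern being measured
        if (std::fscanf(fp, "%99s", buffer) == 1) {
            ++counts.words;
        }
        ++counts.trips;
    }
    std::fclose(fp);
    return counts;
}
```

For "This\nis\na\ntest.\n" this should report five trips but only four words, and for the same text without the final newline four trips and four words, matching the outputs above.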

Bottom line: the moral is (the morals are):

  1. Don't write while(!feof(stream)).
  2. Do use feof() and ferror() only to test why a previous input call failed.
  3. Do check the return value of scanf and fscanf.

And we might also note: Do beware of files not ending in newline! They can behave surprisingly differently.


Addendum: Here's a better way to write the loop:

int r;
while((r = fscanf(stream,"%s",buffer)) == 1) {
    printf("%s\n", buffer);
}

When you run this, it always prints exactly the strings it sees in the input. It doesn't repeat anything; it doesn't do anything significantly differently depending on whether the last line does or doesn't end in a newline. And -- significantly -- it doesn't (need to) call feof() at all!
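Putting the morals together, a loop driven by fscanf's return value can still consult feof() and ferror() afterwards to learn why it stopped. This is a sketch only, written as C++ over <cstdio>; the enum, the function name and the %99s width limit are assumptions added for the example.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical result type: why the read loop stopped.
enum class StopReason { EndOfFile, IoError };

// Reads whitespace-separated words until fscanf stops converting, then uses
// ferror() to explain the failure after the fact.
StopReason read_words(std::FILE* fp, std::vector<std::string>& words) {
    char buffer[100];
    while (std::fscanf(fp, "%99s", buffer) == 1) {  // width keeps buffer safe
        words.push_back(buffer);
    }
    // With %s the only ways to stop are end-of-file or a read error.
    return std::ferror(fp) ? StopReason::IoError : StopReason::EndOfFile;
}
```

Given a stream holding the four-line test input, this should collect exactly four words and report EndOfFile, whether or not the last line ends in a newline.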


Footnote: In all of this I've ignored the fact that %s with *scanf reads strings, not lines, and also that %s tends to behave very badly if it encounters a string that's larger than the buffer that's to receive it.


