Skipping Expected Characters Like Scanf() with Cin

You could create your own stream manipulator. It is fairly easy.

#include <ios>
#include <iostream>
using namespace std;

// skips the number of characters equal to the length of given text
// does not check whether the skipped characters are the same as it
struct skip
{
    const char * text;
    skip(const char * text) : text(text) {}
};

std::istream & operator >> (std::istream & stream, const skip & x)
{
    ios_base::fmtflags f = stream.flags();
    stream >> noskipws;

    char c;
    const char * text = x.text;
    while (stream && *text++)
        stream >> c;

    stream.flags(f);
    return stream;
}

int main()
{
    int a, b;
    cin >> a >> skip(" # ") >> b;
    cout << a << ", " << b << endl;
    return 0;
}

Why does scanf appear to skip input?

When you enter a, cin >> i fails because i is an int and a is not the start of a valid number. The extraction sets failbit and leaves a in the stream, so every later cin >> i fails immediately until you clear the error state and discard the offending input.

Why i prints 0 is a different story. Since C++11, a failed extraction stores 0 in i. Before C++11 the variable was left untouched, so an uninitialized i could print anything. scanf follows the older rule: on a matching failure it leaves the argument unassigned.

The proper way to write it is this:

do
{
    ++j;
    if (!(cin >> i))
    {
        // handle error, maybe you want to break the loop here?
    }
    cout << i << " " << j << "\n";
}
while ((i != 8) && (j < 10));

Or simply this (if you want to exit loop if error occurs):

int i = 0, j = 0;
while ((i != 8) && (j < 10) && (cin >> i))
{
    ++j;
    cout << i << " " << j << "\n";
}

Take only digit as input by scanf() and avoid other characters

std::cin solution

Since you are already using std::cout (see note 1 below), you can use std::cin instead of scanf. Now, read the number first, and handle the sign after:

std::vector<int> numbers;
for (int n; std::cin >> n;) {
    numbers.push_back(n);
    if (std::cin.peek() == '+') { // if the next character is '+'
        std::cin.get();           // then discard it
    }
}

scanf solution

If you insist on using scanf (in which case you probably should be using printf as well), you can use the ignore functionality of the control string of scanf:

std::vector<int> numbers;
for (int n; std::scanf("%d%*1[+]", &n) == 1;) { // optionally discard '+' after number
    numbers.push_back(n);
}

Let's break the %*1[+] conversion specification down:

  • % — the beginning of every conversion specification;

  • * — the result of this conversion should be discarded, rather than assigned to an argument;

  • 1 — the width of the conversion (i.e., the maximum number of characters to convert); and

  • [+] — the format specifier which means to match the longest string comprising + exclusively (subject to the width constraint, of course).

Altogether, this conversion specification means "discard at most one + character".

If no + sign is present, the %*1[+] directive matches zero characters, which counts as a matching failure, so scanf stops at that point. Because %d has already been assigned, the return value is still 1, and the next character is left unconsumed (which is what we want).

Note that scanf returns the number of receiving arguments (arguments after the format string) successfully assigned, which may be zero if a matching failure occurred before the first receiving argument was assigned, or EOF if an input failure occurs before the first receiving argument was assigned; see the cppreference page.


Note 1: You are actually writing cout, which is hopefully std::cout brought into scope by using std::cout; or using namespace std;.

Scanf does not work as expected

scanf("%d\n",&i); is equivalent to std::cin >> i >> std::ws;.

If you want the same behaviour for scanf, remove \n: scanf("%d",&i);

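This happens because any whitespace character in a scanf format string means "skip input until a non-whitespace character is found". On interactive input, scanf therefore keeps waiting after the number until you type something that is not whitespace.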

Why do cin and getline exhibit different reading behavior?

reading behavior of cin and getline.

cin does not "read" anything. cin is an input stream. cin is getting read from. getline reads from an input stream. The formatted extraction operator, >>, reads from an input stream. What's doing the reading is >> and std::getline. std::cin does no reading of its own. It's what's being read from.

first cin read up until the "\n". once it hit that "\n", it increments the
cursor to the next position

No it doesn't. The first >> operator reads up until the \n, but does not read it. \n remains unread.

The second >> operator starts reading with the newline character. The >> operator skips all whitespace in the input stream before it extracts the expected value.

The detail that you're missing is that >> skips whitespace (if there is any) before it extracts the value from the input stream, and not after.

Now, it is certainly possible that >> finds no whitespace in the input stream before extracting the formatted value. If >> is tasked with extracting an int, and the input stream has just been opened and it's at the beginning of the file, and the first character in the file is a 1, well, the >> just doesn't skip any whitespace at all.

Finally, std::getline does not skip any whitespace. It just reads from the input stream until it reads a \n (or reaches the end of the input stream). The \n itself is extracted and discarded, not stored in the string.

When and why do I need to use cin.ignore() in C++?

ignore does exactly what the name implies.

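It does not decide for you what is unneeded. It extracts characters from the stream and discards them, up to the count you pass, and it stops early right after the delimiter character you pass (the delimiter is discarded too).

It is a member of std::istream, so it applies to input streams such as std::cin, not to output streams.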

Essentially, you call ignore between a formatted read (std::cin >> x) and a getline call. When the user types a value and hits Enter, the '\n' stays in the input buffer after >> has extracted the value. A following getline then stops at that newline immediately and returns an empty string instead of the line you want. Calling std::cin.ignore(1000, '\n') discards the rest of the current line, newline included, so getline starts at the next line. (The 1000 is the maximum number of characters to discard while looking for the '\n'; passing std::numeric_limits<std::streamsize>::max() removes that limit.)

Using scanf() in C++ programs is faster than using cin?

Here's a quick test of a simple case: a program to read a list of numbers from standard input and XOR all of the numbers.

iostream version:

#include <iostream>

int main(int argc, char **argv) {

    int parity = 0;
    int x;

    while (std::cin >> x)
        parity ^= x;
    std::cout << parity << std::endl;

    return 0;
}

scanf version:

#include <stdio.h>

int main(int argc, char **argv) {

    int parity = 0;
    int x;

    while (1 == scanf("%d", &x))
        parity ^= x;
    printf("%d\n", parity);

    return 0;
}

Results

Using a third program, I generated a text file containing 33,280,276 random numbers. The execution times are:

iostream version:  24.3 seconds
scanf version:      6.4 seconds

Changing the compiler's optimization settings didn't seem to change the results much at all.

Thus: there really is a speed difference.


EDIT: User clyfish points out below that the speed difference is largely due to the iostream I/O functions maintaining synchronization with the C I/O functions. We can turn this off with a call to std::ios::sync_with_stdio(false);:

#include <iostream>

int main(int argc, char **argv) {

    int parity = 0;
    int x;

    std::ios::sync_with_stdio(false);

    while (std::cin >> x)
        parity ^= x;
    std::cout << parity << std::endl;

    return 0;
}

New results:

iostream version:                      21.9 seconds
scanf version:                          6.8 seconds
iostream with sync_with_stdio(false):   5.5 seconds

C++ iostream wins! It turns out that this internal syncing / flushing is what normally slows down iostream i/o. If we're not mixing stdio and iostream, we can turn it off, and then iostream is fastest.

The code: https://gist.github.com/3845568

Behavior of std::cin on failure

The very link you cited explains what's happening:

https://www.learncpp.com/cpp-tutorial/stdcin-and-handling-invalid-input/

When the user enters input in response to an extraction operation,
that data is placed in a buffer inside of std::cin.

When the extraction operator is used, the following procedure happens:

  • If there is data already in the input buffer, that data is used for extraction.
  • If the input buffer contains no data, the user is asked to input data for extraction (this is the case most of the time). When the user
    hits enter, a ‘\n’ character will be placed in the input buffer.
  • operator>> extracts as much data from the input buffer as it can into the variable (ignoring any leading whitespace characters, such as
    spaces, tabs, or ‘\n’).
  • Any data that can not be extracted is left in the input buffer for the next extraction.

So far, so good. The article continues:

[Upon an input error] std::cin goes immediately into “failure mode”,
but also assigns the closest in-range value to the variable.
Consequently, x is left with the assigned value of 32767.

Additional inputs are skipped, leaving y with the initialized value of
0.

This explains the "0" you're seeing. It also explains why "z" wasn't replaced with "a".


