Getchar_Unlocked( ) VS Scanf() VS Cin

Using scanf() in C++ programs is faster than using cin?

Here's a quick test of a simple case: a program to read a list of numbers from standard input and XOR all of the numbers.

iostream version:

#include <iostream>

int main(int argc, char **argv) {

int parity = 0;
int x;

while (std::cin >> x)
parity ^= x;
std::cout << parity << std::endl;

return 0;
}

scanf version:

#include <stdio.h>

int main(int argc, char **argv) {

int parity = 0;
int x;

while (1 == scanf("%d", &x))
parity ^= x;
printf("%d\n", parity);

return 0;
}

Results

Using a third program, I generated a text file containing 33,280,276 random numbers. The execution times are:

iostream version:  24.3 seconds
scanf version: 6.4 seconds

Changing the compiler's optimization settings didn't seem to change the results much at all.

Thus: there really is a speed difference.


EDIT: User clyfish points out below that the speed difference is largely due to the iostream I/O functions maintaining synchronization with the C I/O functions. We can turn this off with a call to std::ios::sync_with_stdio(false);:

#include <iostream>

int main(int argc, char **argv) {

int parity = 0;
int x;

std::ios::sync_with_stdio(false);

while (std::cin >> x)
parity ^= x;
std::cout << parity << std::endl;

return 0;
}

New results:

iostream version:                       21.9 seconds
scanf version: 6.8 seconds
iostream with sync_with_stdio(false): 5.5 seconds

C++ iostream wins! It turns out that this internal syncing / flushing is what normally slows down iostream i/o. If we're not mixing stdio and iostream, we can turn it off, and then iostream is fastest.

The code: https://gist.github.com/3845568

undefined reference to 'getchar_unlocked' error

Quoting this answer on SO

getchar_unlocked is deprecated in Windows because it is thread unsafe
version of getchar().

getchar_unlocked which has less overheads as compared to scanf or cin is not a standard feature of c or c++ , you can always use getchar() for that purposes here .

or you can write a function getchar_unlocked() to return the values of getchar() for on machine testing purposes if you are bound to use it in your online question .

Input 300 000 sets of numbers with cin or scanf

How do you know the std::cin part is the problem? Did you profile your code? If not, I suggest doing that, it's often surprising which part of the code is taking up most time. See e.g. How can I profile C++ code running on Linux?.

You're doing a lot of unnecessary work in various parts of the code. For example, your max function does at least 7 comparissons, and looks extremely error prone to write. You could simply replace the whole function by:

std::max({ a, b, c })

I would also take a look at your min_replacements function and see if it can be simplified. Unfortunately, you're using variable names which are super vague, so it's pretty much impossible to understand what the code should be doing. I suggest using much more descriptive variable names. That way the code will become much easier to reason about. The way it's currently written, there's a very good change even you yourself won't be able to make sense of it in a month's time.

Just glacing over the min_replacements function though, there's definitely a lot more work going on than necessary. E.g. the last for-loop:

   for (int i = 0; i < n; ++i)
{
if (ot(*(ds + i), *(ps + i), *(rs + i)) || ooo(*(ds + i), *(ps + i), *(rs + i)) || to(*(ds + i), *(ps + i), *(rs + i)))
{
loop = false;
}
else
{
loop = true;
}
}

Each loop iterator sets the loop variable. Assuming this code is correct, you don't need the loop at all, just do the check only once for i = n - 1. That's already O(n) changed to O(1).

Why is getchar_unlocked() faster than alternatives?

scanf("%d\n", &x) has to parse the format string and lock the stream before and after the reading.

std::cin >> x might do locking too, and it might have to sync with stdin, and it might need to go through some abstraction layers.

With the above, you only do one type of input parsing (so no need to parse a format string and decide what to do based on that) and most importantly, you don't lock the stream.

Locking streams is mandated by POSIX, and glibc uses recursive mutexes to prevent multiple threads (even in a single-threaded environment) from accessing the stdin FILE simultaneously (which would corrupt it).
These mutexes are quite expensive (your read_int should be several (fivish?) times faster than scanf("%d",&x)).


Regarding your implementation, apart from fixing the magic number issue,
you should probably detect failures in getchar_unlocked too and report those failures through a separate channel -- e.g., by returning the parsed integer through a passed-in pointer and using the return status for error reporting.

If you want thread safety, you can still use getchar_unlocked to get a speedup compared to getchar, but you have to flockfile(stdin); and funlock(stdin); at the beginning and end (respectively) of your read_int function.

Faster than scanf?

The overhead I was encountering was not the parsing itself but the many calls to fread (same with fgetc, and friends). For each call, the libc has to lock the input stream to make sure two threads aren't stepping on each other's feet. Locking is a very costly operation.

What we're looking for is a function that gives us buffered input (reimplementing buffering is just too much effort) but avoids the huge locking overhead of fgetc.

If we can guarantee that there is only a single thread using the input stream, we can use the functions from unlocked_stdio(3), such as getchar_unlocked(3). Here is an example:

static int parseint(void)
{
int c, n;

n = getchar_unlocked() - '0';
while (isdigit((c = getchar_unlocked())))
n = 10*n + c-'0';

return n;
}

The above version doesn't check for errors. But it's guaranteed to terminate. If error handling is needed it might be enough to check feof(stdin) and ferror(stdin) at the end, or let the caller do it.

My original purpose was submitting solutions to programming problems at SPOJ, where the input is only whitespace and digits. So there is still room for improvement, namely the isdigit check.

static int parseint(void)
{
int c, n;

n = getchar_unlocked() - '0';
while ((c = getchar_unlocked()) >= '0')
n = 10*n + c-'0';

return n;
}

It's very, very hard to beat this parsing routine, both performance-wise and in terms of convenience and maintainability.



Related Topics



Leave a reply



Submit