C++ High Precision Time Measurement in Windows

C++ high precision time measurement in Windows

If you have a threaded application running on a multicore computer, QueryPerformanceCounter can (and will) return different values depending on which core the code is executing on. See this MSDN article. (rdtsc has the same problem.)

This is not just a theoretical problem; we ran into it with our application and had to conclude that the only reliable time source is timeGetTime, which has only millisecond precision (fortunately sufficient in our case). We also tried pinning the thread affinity of our threads to guarantee that each thread always got a consistent value from QueryPerformanceCounter. This worked, but it absolutely killed the performance of the application.
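
For reference, here is a minimal sketch of that affinity-pinning workaround (simplified, no error handling; the helper name is mine). The two SetThreadAffinityMask calls around every read are exactly the per-call overhead that hurt us:

#include <windows.h>

// Read QueryPerformanceCounter while temporarily pinned to core 0,
// so successive reads always come from the same core's counter.
LONGLONG read_qpc_pinned()
{
    HANDLE thread = GetCurrentThread();
    DWORD_PTR old_mask = SetThreadAffinityMask(thread, 1); // pin to CPU 0
    LARGE_INTEGER counter;
    QueryPerformanceCounter(&counter);
    SetThreadAffinityMask(thread, old_mask);               // restore mask
    return counter.QuadPart;
}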

To sum things up, there isn't a reliable timer on Windows that can be used to time things with microsecond precision (at least not when running on a multicore computer).

Measuring time with a resolution of microseconds in C++?

Use QueryPerformanceCounter and QueryPerformanceFrequency for the finest-grain timing on Windows.

There is an MSDN article on code timing with these APIs (the sample code is in VB, sorry).

Measure time, milliseconds or microseconds for Windows C++

You can use the standard C++ <chrono> library:

#include <iostream>
#include <chrono>

// long operation to time
long long fib(long long n) {
    if (n < 2) {
        return n;
    } else {
        return fib(n-1) + fib(n-2);
    }
}

int main() {
    auto start_time = std::chrono::high_resolution_clock::now();

    long long input = 32;
    long long result = fib(input);

    auto end_time = std::chrono::high_resolution_clock::now();
    auto time = end_time - start_time;

    std::cout << "result = " << result << '\n';
    std::cout << "fib(" << input << ") took " <<
        time/std::chrono::milliseconds(1) << "ms to run.\n";
}

One thing to keep in mind is that using <chrono> enables type-safe, generic timing code, but to get that benefit you have to use it a bit differently than you would use dumb, type-unsafe timing libraries that store durations and time points in types like int. Here's an answer that explains some specific usage scenarios and the differences between using untyped libraries and best practices for using chrono: https://stackoverflow.com/a/15839862/365496
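
As a small illustration of the type-safe style (a sketch; choose the duration_cast target to match the precision you want to report):

#include <chrono>
#include <iostream>

int main() {
    using namespace std::chrono;

    auto start = high_resolution_clock::now();
    // ... work to measure ...
    auto end = high_resolution_clock::now();

    // Keep the result in a typed duration instead of a raw int;
    // duration_cast makes the truncation to microseconds explicit.
    auto elapsed = duration_cast<microseconds>(end - start);
    std::cout << elapsed.count() << " us\n";
}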


The maintainer of Visual Studio's standard library implementation has indicated that the low resolution of high_resolution_clock has been fixed in VS2015 via the use of QueryPerformanceCounter().

precise time measurement

I usually use the QueryPerformanceCounter function.

example:

LARGE_INTEGER frequency;  // ticks per second
LARGE_INTEGER t1, t2;     // ticks
double elapsedTime;

// get ticks per second
QueryPerformanceFrequency(&frequency);

// start timer
QueryPerformanceCounter(&t1);

// do something
// ...

// stop timer
QueryPerformanceCounter(&t2);

// compute and print the elapsed time in millisec
elapsedTime = (t2.QuadPart - t1.QuadPart) * 1000.0 / frequency.QuadPart;
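
For microsecond precision, scale by 1,000,000 instead of 1,000:

elapsedTime = (t2.QuadPart - t1.QuadPart) * 1000000.0 / frequency.QuadPart;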

C++ Cross-Platform High-Resolution Timer

For C++03:

Boost.Timer might work, but it depends on the C function clock and so may not have good enough resolution for you.

Boost.Date_Time includes a ptime class that's been recommended on Stack Overflow before. See its docs on microsec_clock::local_time and microsec_clock::universal_time, but note its caveat that "Win32 systems often do not achieve microsecond resolution via this API."
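
A minimal sketch of timing with Boost.Date_Time's microsec_clock (assuming Boost is available, and subject to the Win32 resolution caveat just quoted):

#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main() {
    using namespace boost::posix_time;

    ptime start = microsec_clock::universal_time();
    // ... work to measure ...
    ptime end = microsec_clock::universal_time();

    // time_duration supports microsecond-level queries
    time_duration elapsed = end - start;
    std::cout << elapsed.total_microseconds() << " us\n";
}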

STLsoft provides, among other things, thin cross-platform (Windows and Linux/Unix) C++ wrappers around OS-specific APIs. Its performance library has several classes that would do what you need. (To make it cross platform, pick a class like performance_counter that exists in both the winstl and unixstl namespaces, then use whichever namespace matches your platform.)
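
For illustration, a sketch of that pattern (hedged: the header paths and the start()/stop()/get_microseconds() member names are from my recollection of STLsoft 1.9; verify them against the release you use):

#if defined(_WIN32)
# include <winstl/performance/performance_counter.hpp>
  typedef winstl::performance_counter counter_t;   // Windows implementation
#else
# include <unixstl/performance/performance_counter.hpp>
  typedef unixstl::performance_counter counter_t;  // Unix implementation
#endif

#include <iostream>

int main() {
    counter_t counter;
    counter.start();
    // ... work to measure ...
    counter.stop();
    std::cout << counter.get_microseconds() << " us\n";
}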

For C++11 and above:

The std::chrono library has this functionality built in. See this answer by @HowardHinnant for details.

Microsecond resolution timestamps on Windows

I believe this is still useful: System Internals: Guidelines For Providing Multimedia Timer Support.

It does a good job of explaining the various timers available and their limitations. It might be that your archenemy is not so much resolution as latency.

QueryPerformanceCounter will not always run at CPU speed. In fact, it might try to avoid RDTSC, especially on multi-processor (multi-core) systems: it will use the HPET on Windows Vista and later if it is available, or else the ACPI/PM timer. On my system (Windows 7 x64, dual-core AMD) the timer runs at 14.31818 MHz.

The same is true for earlier systems:

"By default, Windows Server 2003 Service Pack 2 (SP2) uses the PM timer for all multiprocessor APIC or ACPI HALs, unless the check process to determine whether the BIOS supports the APIC or ACPI HALs fails."

The problem arises when the check fails; that simply means your computer/BIOS is broken in some way. You might then either fix your BIOS (recommended), or at least switch to the ACPI timer (/usepmtimer) for the time being.

From C#, it is easy to check for high-resolution timer support without P/Invoke: look at Stopwatch.IsHighResolution and then peek at Stopwatch.Frequency. Stopwatch makes the necessary QueryPerformanceCounter call internally.
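
The native equivalent is a one-call check; QueryPerformanceFrequency reports both the support and the tick rate (a minimal sketch):

#include <windows.h>
#include <stdio.h>

int main() {
    LARGE_INTEGER frequency;
    // Returns nonzero when a high-resolution counter is available;
    // on Windows XP and later it is documented never to fail.
    if (QueryPerformanceFrequency(&frequency) && frequency.QuadPart != 0) {
        printf("high-resolution timer at %lld ticks/s\n",
               (long long)frequency.QuadPart);
    } else {
        printf("no high-resolution timer\n");
    }
}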

Also consider that if the timers are broken, the whole system will go haywire and, in general, behave strangely: reporting negative elapsed times, slowing down, etc. - and not just in your application.

This means that you can actually rely on QueryPerformanceCounter.

... and contrary to popular belief, QueryPerformanceFrequency() "cannot change while the system is running".

Edit: As the documentation on QueryPerformanceCounter() states, "it should not matter which processor is called" - and in fact the whole hack of fiddling with thread affinity is only needed if the APIC/ACPI detection fails and the system resorts to using the TSC. That is a last resort that should not happen. If it happens on older systems, there is likely a BIOS update/driver fix from the manufacturer; if there is none, the /usepmtimer boot switch is still there. If that fails as well, because the system does not have a proper timer apart from the Pentium TSC, you might in fact consider messing with thread affinity. Even then, the sample provided by others in the "Community Content" area of the page is misleading: it has non-negligible overhead because it sets thread affinity on every start/stop call, which introduces considerable latency and likely diminishes the benefits of using a high-resolution timer in the first place.

Game Timing and Multicore Processors is a recommendation on how to use these timers properly. Keep in mind that it is now five years old, and at that time fewer systems were fully ACPI compliant/supported - that is why the article goes into so much detail about the TSC, bashing it but also explaining how to work around its limitations by keeping an affine thread.

I believe it is a fairly hard task nowadays to find a common PC with no ACPI support and no usable PM timer. The most common case is probably a BIOS setting in which ACPI support is incorrectly configured (sometimes, sadly, by factory defaults).

Anecdotes suggest that eight years ago, in rare cases, the situation was different. (It makes for a fun read: developers working around design "shortcomings" and bashing chip designers. To be fair, it might well be the same the other way around. :-)

How do I measure a time interval in C?

High resolution timers that provide a resolution of 1 microsecond are system-specific, so you will have to use different methods to achieve this on different OS platforms. You may be interested in checking out the following article, which implements a cross-platform C++ timer class based on the functions described below:

  • Song Ho Ahn - High Resolution Timer

Windows

The Windows API provides extremely high resolution timer functions: QueryPerformanceCounter(), which returns the current elapsed ticks, and QueryPerformanceFrequency(), which returns the number of ticks per second.

Example:

#include <stdio.h>
#include <windows.h> // for Windows APIs

int main(void)
{
    LARGE_INTEGER frequency; // ticks per second
    LARGE_INTEGER t1, t2;    // ticks
    double elapsedTime;

    // get ticks per second
    QueryPerformanceFrequency(&frequency);

    // start timer
    QueryPerformanceCounter(&t1);

    // do something
    // ...

    // stop timer
    QueryPerformanceCounter(&t2);

    // compute and print the elapsed time in millisec
    elapsedTime = (t2.QuadPart - t1.QuadPart) * 1000.0 / frequency.QuadPart;
    printf("%f ms.\n", elapsedTime);
}

Linux, Unix, and Mac

For Unix- or Linux-based systems, you can use gettimeofday(). This function is declared in <sys/time.h>.

Example:

#include <stdio.h>
#include <sys/time.h> // for gettimeofday()

int main(void)
{
    struct timeval t1, t2;
    double elapsedTime;

    // start timer
    gettimeofday(&t1, NULL);

    // do something
    // ...

    // stop timer
    gettimeofday(&t2, NULL);

    // compute and print the elapsed time in millisec
    elapsedTime = (t2.tv_sec - t1.tv_sec) * 1000.0;    // sec to ms
    elapsedTime += (t2.tv_usec - t1.tv_usec) / 1000.0; // us to ms
    printf("%f ms.\n", elapsedTime);
}
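
Note that gettimeofday() returns wall-clock time, which can jump if the system clock is adjusted; where available, the POSIX clock_gettime() with CLOCK_MONOTONIC is a safer interval source. A minimal sketch (older Linux systems may need linking with -lrt):

#include <stdio.h>
#include <time.h> // for clock_gettime()

int main(void)
{
    struct timespec t1, t2;
    double elapsedTime;

    // start timer (monotonic: unaffected by wall-clock changes)
    clock_gettime(CLOCK_MONOTONIC, &t1);

    // do something
    // ...

    // stop timer
    clock_gettime(CLOCK_MONOTONIC, &t2);

    // compute and print the elapsed time in millisec
    elapsedTime = (t2.tv_sec - t1.tv_sec) * 1000.0;       // sec to ms
    elapsedTime += (t2.tv_nsec - t1.tv_nsec) / 1000000.0; // ns to ms
    printf("%f ms.\n", elapsedTime);
}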

