Finding Out the CPU Clock Frequency (Per Core, Per Processor)

I'll expand on my comments here. This is too big and in-depth for me to fit in the comments.

What you're trying to do is very difficult - to the point of being impractical for the following reasons:

  • There's no portable way to get the processor frequency. rdtsc does NOT always give the correct frequency due to effects such as SpeedStep and Turbo Boost.
  • All known methods to measure frequency require an accurate measurement of time. However, a determined cheater can tamper with all the clocks and timers in the system.
  • Accurately reading both the processor frequency and the time in a tamper-proof way requires kernel-level access. On Windows, this implies driver signing.

There's no portable way to get the processor frequency:

The "easy" way to get the CPU frequency is to call rdtsc twice with a fixed time-duration in between. Then dividing out the difference will give you the frequency.
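As a rough sketch of that arithmetic (the helper name ticksPerSecond is made up for illustration, and the two counter readings are assumed to come from rdtsc taken a known interval apart), the division looks something like this:

```cpp
#include <cstdint>

// Hypothetical helper: converts two counter readings taken
// `elapsedSeconds` apart into a tick rate in Hz.
double ticksPerSecond(std::uint64_t ticksBefore,
                      std::uint64_t ticksAfter,
                      double elapsedSeconds)
{
    // Guard against a zero or negative interval.
    if (elapsedSeconds <= 0.0) return 0.0;
    // Unsigned subtraction stays correct even if the counter wrapped once.
    return static_cast<double>(ticksAfter - ticksBefore) / elapsedSeconds;
}
```

Readings 3,400,000,000 ticks apart over one second give 3.4e9, i.e. 3.4 GHz. Note that this is the rate of the counter itself, which is not necessarily the core's current clock.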

The problem is that rdtsc does not give the true frequency of the processor. Because real-time applications such as games rely on it, rdtsc needs to be consistent through CPU throttling and Turbo Boost. So once your system boots, rdtsc will always run at the same rate (unless you start messing with the bus speeds with SetFSB or something).

For example, on my Core i7 2600K, rdtsc will always show the frequency at 3.4 GHz. But in reality, it idles at 1.6 GHz and clocks up to 4.6 GHz under load via the overclocked Turbo Boost multiplier at 46x.

But once you find a way to measure the true frequency (or you're happy enough with rdtsc), you can easily get the frequency of each core using thread affinities.
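To illustrate the thread-affinity idea: pinning a thread to one core usually means handing the OS a bitmask with only that core's bit set. The helper below is hypothetical and only builds the mask; on Windows the result would typically be passed to SetThreadAffinityMask. It assumes at most 64 logical processors (a single processor group).

```cpp
#include <cstdint>

// Hypothetical helper: builds a mask that selects exactly one logical core.
// Assumes coreIndex < 64; returns 0 (no core selected) otherwise.
std::uint64_t affinityMaskForCore(unsigned coreIndex)
{
    if (coreIndex >= 64) return 0;
    // Shift a 64-bit one: shifting a plain int would overflow past bit 31.
    return std::uint64_t{1} << coreIndex;
}
```

Running the measurement once per mask, on a thread pinned with that mask, yields one frequency reading per core.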

Getting the True Frequency:

To get the true frequency of the processor, you need to access either the MSRs (model-specific registers) or the hardware performance counters.

These are kernel-level instructions and therefore require the use of a driver. If you're attempting this in Windows for the purpose of distribution, you will therefore need to go through the proper driver signing protocol. Furthermore, the code will differ by processor make and model so you will need different detection code for each processor generation.

Once you get to this stage, there are a variety of ways to read the frequency.

On Intel processors, the hardware counters let you count raw CPU cycles. Combined with a method of precisely measuring real time (next section), you can compute the true frequency. The MSRs give you access to other information such as the CPU frequency multiplier.
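One common way to combine these pieces on Intel hardware, offered here as a sketch rather than the definitive method, is the APERF/MPERF pair of MSRs: MPERF advances at a fixed reference rate and APERF at the actual core clock, both only while the core is not halted. The function name and the nominalHz parameter below are assumptions for illustration; the readings themselves would have to come from a kernel-mode driver.

```cpp
#include <cstdint>

// Hypothetical helper: estimates the average core clock over an interval
// from two APERF/MPERF readings and the nominal (reference) frequency.
double effectiveFrequencyHz(std::uint64_t aperfBefore, std::uint64_t aperfAfter,
                            std::uint64_t mperfBefore, std::uint64_t mperfAfter,
                            double nominalHz)
{
    const std::uint64_t aperfDelta = aperfAfter - aperfBefore;
    const std::uint64_t mperfDelta = mperfAfter - mperfBefore;
    // No reference ticks means the core was halted for the whole interval.
    if (mperfDelta == 0) return 0.0;
    // Scale the nominal frequency by the actual-to-reference tick ratio.
    return nominalHz * static_cast<double>(aperfDelta)
                     / static_cast<double>(mperfDelta);
}
```

For a nominal 3.4 GHz part whose APERF advanced 4600 counts for every 3400 counts of MPERF, this gives roughly 4.6 GHz, in line with the Turbo Boost example above.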


All known methods to measure frequency require an accurate measurement of time:

This is perhaps the bigger problem. You need a timer to be able to measure the frequency. A capable hacker will be able to tamper with all the clocks that you can use in C/C++.
This includes all of the following:

  • clock()
  • gettimeofday()
  • QueryPerformanceCounter()
  • etc...

The list goes on and on. In other words, you cannot trust any of the timers as a capable hacker will be able to spoof all of them. For example clock() and gettimeofday() can be fooled by changing the system clock directly within the OS. Fooling QueryPerformanceCounter() is harder.

Getting a True Measurement of Time:

All the clocks listed above are vulnerable because they are often derived from the same system base clock in some way or another. And that base clock can be changed after the system has already booted up by means of overclocking utilities.

So the only way to get a reliable and tamper-proof measurement of time is to read external clocks such as the HPET or the ACPI timer. Unfortunately, these also seem to require kernel-level access.


To Summarize:

Building any sort of tamper-proof benchmark will almost certainly require writing a kernel-mode driver which requires certificate signing for Windows. This is often too much of a burden for casual benchmark writers.

This has resulted in a shortage of tamper-proof benchmarks which has probably contributed to the overall decline of the competitive overclocking community in recent years.

How can I programmatically find the CPU frequency with C

How you find the CPU frequency is both architecture AND OS dependent, and there is no abstract solution.

If this were 20+ years ago, and you were using an OS with no context switching and a CPU that executed the instructions given to it in order, you could write some C code in a loop and time it, then, based on the assembly it was compiled into, compute the number of instructions executed at runtime. This already assumes that each instruction takes 1 clock cycle, which has been a rather poor assumption ever since pipelined processors.

But any modern OS will switch between multiple processes. Even then you can attempt to time a bunch of identical for loop runs (ignoring time needed for page faults and multiple other reasons why your processor might stall) and get a median value.
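Taking the median of repeated timings could look roughly like the following; the function name is made up, and the samples are assumed to be durations of identical loop runs collected elsewhere.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the median of a set of timing samples.
// The vector is taken by value so the caller's data is not reordered.
double medianOf(std::vector<double> samples)
{
    if (samples.empty()) return 0.0;
    std::sort(samples.begin(), samples.end());
    const std::size_t mid = samples.size() / 2;
    // Odd count: the middle element. Even count: mean of the two middle ones.
    if (samples.size() % 2 == 1) return samples[mid];
    return (samples[mid - 1] + samples[mid]) / 2.0;
}
```

The median is used rather than the mean so that a few runs inflated by a context switch or page fault do not skew the result.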

And even if the previous solution works, you have multi-issue processors. With any modern processor, it's fair game to re-order your instructions or issue a bunch of them in the same clock cycle.

How to calculate the frequency of CPU cores

Full solution follows. I've adapted the IOCTL sample driver on MSDN to do this. The IOCTL sample is the only relevant WDM skeleton driver I could find, and also the closest thing to a WDM template: most kernel-mode templates that ship with the WDK are WDF-based (the WDM driver template is blank, with no source code at all), yet the only sample logic I've seen for this kind of input/output was in a WDM-based driver.

Some things I learned along the way. Kernel drivers don't like floating-point arithmetic, and you can't use "windows.h", which limits you to "ntddk.h", a special kernel-mode header. This also means I can't do all of my computations in kernel mode, because I can't call functions like QueryPerformanceFrequency there. So the driver just returns the raw register values to user mode, where I compute the mean performance ratio between the two timestamps. (Without QueryPerformanceFrequency, the tick values from the kind of registers QueryPerformanceCounter uses are useless because you don't know the step size. Maybe there's a workaround, but I opted to just use the mean since it works pretty well.)

As for the one-second sleep: without it you're almost spin-computing on multiple threads, which really messes up your calculations, because constantly checking results from QueryPerformanceCounter drives your cores' frequencies up (the more computation you do, the higher they clock). Besides, the result is a ratio of cycles per unit time, so the exact delta is not that important: you can always increase it and should still get the same ratio relative to the step size.

This is as minimal as I could get it. Good luck making it much smaller or shorter than this.

Also, if you want to install the driver, you have two options unless you want to buy a code-signing certificate from a third party. Both suck, so pick one and live with it. Let's start with the driver:

driver.c:

//
// Include files.
//

#include <ntddk.h> // various NT definitions
#include <string.h>
#include <intrin.h>

#include "driver.h"

#define NT_DEVICE_NAME L"\\Device\\KernelModeDriver"
#define DOS_DEVICE_NAME L"\\DosDevices\\KernelModeDriver"

#if DBG
#define DRIVER_PRINT(_x_) \
DbgPrint("KernelModeDriver.sys: ");\
DbgPrint _x_;

#else
#define DRIVER_PRINT(_x_)
#endif

//
// Device driver routine declarations.
//

DRIVER_INITIALIZE DriverEntry;

_Dispatch_type_(IRP_MJ_CREATE)
_Dispatch_type_(IRP_MJ_CLOSE)
DRIVER_DISPATCH DriverCreateClose;

_Dispatch_type_(IRP_MJ_DEVICE_CONTROL)
DRIVER_DISPATCH DriverDeviceControl;

DRIVER_UNLOAD DriverUnloadDriver;

VOID
PrintIrpInfo(
PIRP Irp
);
VOID
PrintChars(
_In_reads_(CountChars) PCHAR BufferAddress,
_In_ size_t CountChars
);

#ifdef ALLOC_PRAGMA
#pragma alloc_text( INIT, DriverEntry )
#pragma alloc_text( PAGE, DriverCreateClose)
#pragma alloc_text( PAGE, DriverDeviceControl)
#pragma alloc_text( PAGE, DriverUnloadDriver)
#pragma alloc_text( PAGE, PrintIrpInfo)
#pragma alloc_text( PAGE, PrintChars)
#endif // ALLOC_PRAGMA

NTSTATUS
DriverEntry(
_In_ PDRIVER_OBJECT DriverObject,
_In_ PUNICODE_STRING RegistryPath
)
/*++

Routine Description:
This routine is called by the Operating System to initialize the driver.

It creates the device object, fills in the dispatch entry points and
completes the initialization.

Arguments:
DriverObject - a pointer to the object that represents this device
driver.

RegistryPath - a pointer to our Services key in the registry.

Return Value:
STATUS_SUCCESS if initialized; an error otherwise.

--*/

{
NTSTATUS ntStatus;
UNICODE_STRING ntUnicodeString; // NT Device Name "\Device\KernelModeDriver"
UNICODE_STRING ntWin32NameString; // Win32 Name "\DosDevices\KernelModeDriver"
PDEVICE_OBJECT deviceObject = NULL; // ptr to device object

UNREFERENCED_PARAMETER(RegistryPath);

RtlInitUnicodeString( &ntUnicodeString, NT_DEVICE_NAME );

ntStatus = IoCreateDevice(
DriverObject, // Our Driver Object
0, // We don't use a device extension
&ntUnicodeString, // Device name "\Device\KernelModeDriver"
FILE_DEVICE_UNKNOWN, // Device type
FILE_DEVICE_SECURE_OPEN, // Device characteristics
FALSE, // Not an exclusive device
&deviceObject ); // Returned ptr to Device Object

if ( !NT_SUCCESS( ntStatus ) )
{
DRIVER_PRINT(("Couldn't create the device object\n"));
return ntStatus;
}

//
// Initialize the driver object with this driver's entry points.
//

DriverObject->MajorFunction[IRP_MJ_CREATE] = DriverCreateClose;
DriverObject->MajorFunction[IRP_MJ_CLOSE] = DriverCreateClose;
DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = DriverDeviceControl;
DriverObject->DriverUnload = DriverUnloadDriver;

//
// Initialize a Unicode String containing the Win32 name
// for our device.
//

RtlInitUnicodeString( &ntWin32NameString, DOS_DEVICE_NAME );

//
// Create a symbolic link between our device name and the Win32 name
//

ntStatus = IoCreateSymbolicLink(
&ntWin32NameString, &ntUnicodeString );

if ( !NT_SUCCESS( ntStatus ) )
{
//
// Delete everything that this routine has allocated.
//
DRIVER_PRINT(("Couldn't create symbolic link\n"));
IoDeleteDevice( deviceObject );
}

return ntStatus;
}

NTSTATUS
DriverCreateClose(
PDEVICE_OBJECT DeviceObject,
PIRP Irp
)
/*++

Routine Description:

This routine is called by the I/O system when the KernelModeDriver is opened or
closed.

No action is performed other than completing the request successfully.

Arguments:

DeviceObject - a pointer to the object that represents the device
that I/O is to be done on.

Irp - a pointer to the I/O Request Packet for this request.

Return Value:

NT status code

--*/

{
UNREFERENCED_PARAMETER(DeviceObject);

PAGED_CODE();

Irp->IoStatus.Status = STATUS_SUCCESS;
Irp->IoStatus.Information = 0;

IoCompleteRequest( Irp, IO_NO_INCREMENT );

return STATUS_SUCCESS;
}

VOID
DriverUnloadDriver(
_In_ PDRIVER_OBJECT DriverObject
)
/*++

Routine Description:

This routine is called by the I/O system to unload the driver.

Any resources previously allocated must be freed.

Arguments:

DriverObject - a pointer to the object that represents our driver.

Return Value:

None
--*/

{
PDEVICE_OBJECT deviceObject = DriverObject->DeviceObject;
UNICODE_STRING uniWin32NameString;

PAGED_CODE();

//
// Create counted string version of our Win32 device name.
//

RtlInitUnicodeString( &uniWin32NameString, DOS_DEVICE_NAME );

//
// Delete the link from our device name to a name in the Win32 namespace.
//

IoDeleteSymbolicLink( &uniWin32NameString );

if ( deviceObject != NULL )
{
IoDeleteDevice( deviceObject );
}

}

NTSTATUS
DriverDeviceControl(
PDEVICE_OBJECT DeviceObject,
PIRP Irp
)

/*++

Routine Description:

This routine is called by the I/O system to perform a device I/O
control function.

Arguments:

DeviceObject - a pointer to the object that represents the device
that I/O is to be done on.

Irp - a pointer to the I/O Request Packet for this request.

Return Value:

NT status code

--*/

{
PIO_STACK_LOCATION irpSp;// Pointer to current stack location
NTSTATUS ntStatus = STATUS_SUCCESS;// Assume success
ULONG inBufLength; // Input buffer length
ULONG outBufLength; // Output buffer length
void *inBuf; // pointer to input buffer
unsigned __int64 *outBuf; // pointer to the output buffer

UNREFERENCED_PARAMETER(DeviceObject);

PAGED_CODE();

irpSp = IoGetCurrentIrpStackLocation( Irp );
inBufLength = irpSp->Parameters.DeviceIoControl.InputBufferLength;
outBufLength = irpSp->Parameters.DeviceIoControl.OutputBufferLength;

if (!inBufLength || !outBufLength || outBufLength != sizeof(unsigned __int64)*2)
{
ntStatus = STATUS_INVALID_PARAMETER;
goto End;
}

//
// Determine which I/O control code was specified.
//

switch ( irpSp->Parameters.DeviceIoControl.IoControlCode )
{
case IOCTL_SIOCTL_METHOD_BUFFERED:

//
// In this method the I/O manager allocates a buffer large enough
// to accommodate the larger of the user input buffer and output buffer,
// assigns the address to Irp->AssociatedIrp.SystemBuffer, and
// copies the content of the user input buffer into this SystemBuffer
//

DRIVER_PRINT(("Called IOCTL_SIOCTL_METHOD_BUFFERED\n"));
PrintIrpInfo(Irp);

//
// Input buffer and output buffer are the same in this case, so read the
// content of the buffer before writing to it
//

inBuf = (void *)Irp->AssociatedIrp.SystemBuffer;
outBuf = (unsigned __int64 *)Irp->AssociatedIrp.SystemBuffer;

//
// Read the data from the buffer
//

DRIVER_PRINT(("\tData from User :"));
//
// We are using the following function to print characters instead of
// DebugPrint with %s format because the string we get may or
// may not be null terminated.
//
PrintChars(inBuf, inBufLength);

//
// Write to the buffer
//

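// Two values are returned: data[0] = IA32_APERF (MSR 0xE8 = 232),
// data[1] = IA32_MPERF (MSR 0xE7 = 231).
unsigned __int64 data[2];
data[0] = __readmsr(232);
data[1] = __readmsr(231);

// The values are 64-bit, so print them with %I64u rather than %d.
DRIVER_PRINT(("data[0]: %I64u\n", data[0]));
DRIVER_PRINT(("data[1]: %I64u\n", data[1]));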

RtlCopyBytes(outBuf, data, outBufLength);

//
// Assign the length of the data copied to IoStatus.Information
// of the Irp and complete the Irp.
//

Irp->IoStatus.Information = sizeof(unsigned __int64)*2;

//
// When the Irp is completed the content of the SystemBuffer
// is copied to the User output buffer and the SystemBuffer is
// freed.
//

break;

default:

//
// The specified I/O control code is unrecognized by this driver.
//

ntStatus = STATUS_INVALID_DEVICE_REQUEST;
DRIVER_PRINT(("ERROR: unrecognized IOCTL %x\n",
irpSp->Parameters.DeviceIoControl.IoControlCode));
break;
}

End:
//
// Finish the I/O operation by simply completing the packet and returning
// the same status as in the packet itself.
//

Irp->IoStatus.Status = ntStatus;

IoCompleteRequest( Irp, IO_NO_INCREMENT );

return ntStatus;
}

VOID
PrintIrpInfo(
PIRP Irp)
{
PIO_STACK_LOCATION irpSp;
irpSp = IoGetCurrentIrpStackLocation( Irp );

PAGED_CODE();

DRIVER_PRINT(("\tIrp->AssociatedIrp.SystemBuffer = 0x%p\n",
Irp->AssociatedIrp.SystemBuffer));
DRIVER_PRINT(("\tIrp->UserBuffer = 0x%p\n", Irp->UserBuffer));
DRIVER_PRINT(("\tirpSp->Parameters.DeviceIoControl.Type3InputBuffer = 0x%p\n",
irpSp->Parameters.DeviceIoControl.Type3InputBuffer));
DRIVER_PRINT(("\tirpSp->Parameters.DeviceIoControl.InputBufferLength = %d\n",
irpSp->Parameters.DeviceIoControl.InputBufferLength));
DRIVER_PRINT(("\tirpSp->Parameters.DeviceIoControl.OutputBufferLength = %d\n",
irpSp->Parameters.DeviceIoControl.OutputBufferLength ));
return;
}

VOID
PrintChars(
_In_reads_(CountChars) PCHAR BufferAddress,
_In_ size_t CountChars
)
{
PAGED_CODE();

if (CountChars) {

while (CountChars--) {

if (*BufferAddress > 31
&& *BufferAddress != 127) {

KdPrint (( "%c", *BufferAddress) );

} else {

KdPrint(( ".") );

}
BufferAddress++;
}
KdPrint (("\n"));
}
return;
}

driver.h:

//
// Device type -- in the "User Defined" range.
//
#define SIOCTL_TYPE 40000
//
// The IOCTL function codes from 0x800 to 0xFFF are for customer use.
//
#define IOCTL_SIOCTL_METHOD_IN_DIRECT \
CTL_CODE( SIOCTL_TYPE, 0x900, METHOD_IN_DIRECT, FILE_ANY_ACCESS )

#define IOCTL_SIOCTL_METHOD_OUT_DIRECT \
CTL_CODE( SIOCTL_TYPE, 0x901, METHOD_OUT_DIRECT , FILE_ANY_ACCESS )

#define IOCTL_SIOCTL_METHOD_BUFFERED \
CTL_CODE( SIOCTL_TYPE, 0x902, METHOD_BUFFERED, FILE_ANY_ACCESS )

#define IOCTL_SIOCTL_METHOD_NEITHER \
CTL_CODE( SIOCTL_TYPE, 0x903, METHOD_NEITHER , FILE_ANY_ACCESS )

#define DRIVER_FUNC_INSTALL 0x01
#define DRIVER_FUNC_REMOVE 0x02

#define DRIVER_NAME "ReadMSRDriver"

Now, here is the application that loads up and uses the driver (Win32 Console Application):

FrequencyCalculator.cpp:

#include "stdafx.h"
#include <iostream>
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strsafe.h>
#include <process.h>
#include "..\KernelModeDriver\driver.h"

using namespace std;

BOOLEAN
ManageDriver(
_In_ LPCTSTR DriverName,
_In_ LPCTSTR ServiceName,
_In_ USHORT Function
);

HANDLE hDevice;
TCHAR driverLocation[MAX_PATH];

void InstallDriver()
{
DWORD errNum = 0;
GetCurrentDirectory(MAX_PATH, driverLocation);
_tcscat_s(driverLocation, _T("\\KernelModeDriver.sys"));

std::wcout << "Trying to install driver at " << driverLocation << std::endl;

//
// open the device
//

if ((hDevice = CreateFile(_T("\\\\.\\KernelModeDriver"),
GENERIC_READ | GENERIC_WRITE,
0,
NULL,
CREATE_ALWAYS,
FILE_ATTRIBUTE_NORMAL,
NULL)) == INVALID_HANDLE_VALUE) {

errNum = GetLastError();

if (errNum != ERROR_FILE_NOT_FOUND) {

printf("CreateFile failed! Error = %lu\n", errNum);

return;
}

//
// The driver is not started yet so let us install the driver.
// First setup full path to driver name.
//

if (!ManageDriver(_T(DRIVER_NAME),
driverLocation,
DRIVER_FUNC_INSTALL
)) {

printf("Unable to install driver. \n");

//
// Error - remove driver.
//

ManageDriver(_T(DRIVER_NAME),
driverLocation,
DRIVER_FUNC_REMOVE
);

return;
}

hDevice = CreateFile(_T("\\\\.\\KernelModeDriver"),
GENERIC_READ | GENERIC_WRITE,
0,
NULL,
CREATE_ALWAYS,
FILE_ATTRIBUTE_NORMAL,
NULL);

if (hDevice == INVALID_HANDLE_VALUE){
printf("Error: CreateFile failed: %lu\n", GetLastError());
return;
}
}
}

void UninstallDriver()
{
//
// close the handle to the device.
//

CloseHandle(hDevice);

//
// Unload the driver. Ignore any errors.
//
ManageDriver(_T(DRIVER_NAME),
driverLocation,
DRIVER_FUNC_REMOVE
);
}

double GetPerformanceRatio()
{
BOOL bRc;
ULONG bytesReturned;

int input = 0;
unsigned __int64 output[2];
memset(output, 0, sizeof(unsigned __int64) * 2);

//printf("InputBuffer Pointer = %p, BufLength = %d\n", &input, sizeof(&input));
//printf("OutputBuffer Pointer = %p BufLength = %d\n", &output, sizeof(&output));

//
// Performing METHOD_BUFFERED
//

//printf("\nCalling DeviceIoControl METHOD_BUFFERED:\n");

bRc = DeviceIoControl(hDevice,
(DWORD)IOCTL_SIOCTL_METHOD_BUFFERED,
&input,
sizeof(input), // size of the input value itself, not of a pointer to it
output,
sizeof(unsigned __int64)*2,
&bytesReturned,
NULL
);

if (!bRc)
{
//printf("Error in DeviceIoControl : %d", GetLastError());
return 0;

}
//printf(" OutBuffer (%d): %d\n", bytesReturned, output);
if (output[1] == 0)
{
return 0;
}
else
{
return (double)output[0] / (double)output[1]; // APERF / MPERF, kept in double precision
}
}

struct Core
{
int CoreNumber;
};

int GetNumberOfProcessorCores()
{
SYSTEM_INFO sysinfo;
GetSystemInfo(&sysinfo);
return sysinfo.dwNumberOfProcessors;
}

float GetCoreFrequency()
{
// __rdtsc: Returns the processor time stamp which records the number of clock cycles since the last reset.
// QueryPerformanceCounter: Returns a high resolution time stamp that can be used for time-interval measurements.
// Get the frequency which defines the step size of the QueryPerformanceCounter method.
LARGE_INTEGER frequency;
QueryPerformanceFrequency(&frequency);
// Get the number of cycles before we start.
// __rdtsc returns a 64-bit value; a 32-bit ULONG would truncate it.
unsigned __int64 cyclesBefore = __rdtsc();
// Get the Intel performance ratio at the start.
double ratioBefore = GetPerformanceRatio();
// Get the start time.
LARGE_INTEGER startTime;
QueryPerformanceCounter(&startTime);
// Sleep rather than spin, so that the measurement itself does not load the core and push its clock up.
Sleep(1000);
unsigned __int64 cyclesAfter = __rdtsc();
// Get the Intel performance ratio at the end.
double ratioAfter = GetPerformanceRatio();
// Get the end time.
LARGE_INTEGER endTime;
QueryPerformanceCounter(&endTime);
// Elapsed time in seconds. Divide as doubles: integer division would truncate to whole seconds.
double elapsedSeconds = (double)(endTime.QuadPart - startTime.QuadPart) / (double)frequency.QuadPart;
// Return the number of MHz. Multiply the core's frequency by the mean MSR (model-specific register) ratio (the APERF register's value divided by the MPERF register's value) between the two timestamps.
return (float)(((ratioAfter + ratioBefore) / 2) * (double)(cyclesAfter - cyclesBefore) * 1e-6 / elapsedSeconds);
}

struct CoreResults
{
int CoreNumber;
float CoreFrequency;
};

CRITICAL_SECTION printLock;

static void printResult(void *param)
{
EnterCriticalSection(&printLock);
CoreResults coreResults = *((CoreResults *)param);
std::cout << "Core " << coreResults.CoreNumber << " has a speed of " << coreResults.CoreFrequency << " MHz" << std::endl;
delete (CoreResults *)param; // deleting through a void* is undefined behavior
LeaveCriticalSection(&printLock);
}

volatile bool closed = false; // written by the main thread, polled by the worker threads

static void startMonitoringCoreSpeeds(void *param)
{
Core core = *((Core *)param);
SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)1 << core.CoreNumber);
while (!closed)
{
CoreResults *coreResults = new CoreResults();
coreResults->CoreNumber = core.CoreNumber;
coreResults->CoreFrequency = GetCoreFrequency();
_beginthread(printResult, 0, coreResults);
Sleep(1000);
}
delete (Core *)param; // deleting through a void* is undefined behavior
}

int _tmain(int argc, _TCHAR* argv[])
{
InitializeCriticalSection(&printLock);
InstallDriver();
for (int i = 0; i < GetNumberOfProcessorCores(); i++)
{
Core *core = new Core{ 0 };
core->CoreNumber = i;
_beginthread(startMonitoringCoreSpeeds, 0, core);
}
std::cin.get();
closed = true;
UninstallDriver();
DeleteCriticalSection(&printLock);
}

It uses install.cpp, which you can get from the IOCTL sample. I will post a complete, working solution (with code, obviously) on my blog over the next few days, if not tonight.

Edit: Blogged it at http://www.dima.to/blog/?p=101 (full source code available there)...

Need some help in getting the CPU Frequency

Yes, that code sits and busy-waits for an entire second, which causes that core to be 100% busy for a second. One second is more than enough time for dynamic clocking algorithms to detect load and kick the CPU frequency up out of power-saving states. I wouldn't be surprised if processors with boost actually show you a frequency above the labelled frequency.

The concept isn't bad, however. What you have to do is sleep for an interval of about one second. Then, instead of assuming the RDTSC invocations were exactly one second apart, divide by the actual time indicated by QueryPerformanceCounter.

Also, I recommend checking RDTSC both before and after the QueryPerformanceCounter call, to detect whether there was a context switch between RDTSC and QueryPerformanceCounter which would mess up your results.


Unfortunately, RDTSC on new processors doesn't actually count CPU clock cycles. So this doesn't reflect the dynamically changing CPU clock rate (it does measure the nominal rate without busy-waiting, though, so it is a big improvement over the code provided in the question).

  • Bruce Dawson explained this in a blog post

So it looks like you'd need to access model-specific registers after all, which can't be done from user mode. The OpenHardwareMonitor project has both a driver that can be used and code for the frequency calculations.


float ProcSpeedCalc()
{
/*
RdTSC:
It's the Pentium instruction "ReaD Time Stamp Counter". It measures the
number of clock cycles that have passed since the processor was reset, as a
64-bit number. That's what the _emit lines do.
*/
// Microsoft inline assembler knows the rdtsc instruction. No need for emit.

// variables for the CPU cycle counter (unknown rate):
__int64 tscBefore, tscAfter, tscCheck;
// variables for the Performance Counter (steady known rate):
LARGE_INTEGER hpetFreq, hpetBefore, hpetAfter;

// retrieve performance-counter frequency per second:
if (!QueryPerformanceFrequency(&hpetFreq)) return 0;

int retryLimit = 10;
do {
// read CPU cycle count
_asm
{
rdtsc
mov DWORD PTR tscBefore, eax
mov DWORD PTR [tscBefore + 4], edx
}

// retrieve the current value of the performance counter:
QueryPerformanceCounter(&hpetBefore);

// read CPU cycle count again, to detect context switch
_asm
{
rdtsc
mov DWORD PTR tscCheck, eax
mov DWORD PTR [tscCheck + 4], edx
}
} while ((tscCheck - tscBefore) > 800 && (--retryLimit) > 0);

Sleep(1000);

do {
// read CPU cycle count
_asm
{
rdtsc
mov DWORD PTR tscAfter, eax
mov DWORD PTR [tscAfter + 4], edx
}

// retrieve the current value of the performance counter:
QueryPerformanceCounter(&hpetAfter);

// read CPU cycle count again, to detect context switch
_asm
{
rdtsc
mov DWORD PTR tscCheck, eax
mov DWORD PTR [tscCheck + 4], edx
}
} while ((tscCheck - tscAfter) > 800 && (--retryLimit) > 0);

// (after - before) TSC ticks per elapsed second is the speed in Hz; dividing by 1,000,000 gives MHz
return (double)(tscAfter - tscBefore) / (double)(hpetAfter.QuadPart - hpetBefore.QuadPart) * (double)hpetFreq.QuadPart / 1.0e6;
}

Most compilers provide an __rdtsc() intrinsic, in which case you could use tscBefore = __rdtsc(); instead of the __asm block. Both methods are platform- and compiler-specific, unfortunately.
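A sketch of hiding that compiler dependence behind one function is shown below. The wrapper name and the HAVE_RDTSC macro are made up for illustration; __rdtsc() itself is provided by <intrin.h> on MSVC and <x86intrin.h> on GCC and Clang, for x86 targets only.

```cpp
#include <cstdint>

// Pick the header that declares __rdtsc() for this compiler, if any.
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_RDTSC 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <x86intrin.h>
#define HAVE_RDTSC 1
#else
#define HAVE_RDTSC 0
#endif

// Hypothetical wrapper: returns the time stamp counter on x86,
// or 0 on targets where no rdtsc intrinsic is available.
inline std::uint64_t readTimeStampCounter()
{
#if HAVE_RDTSC
    return __rdtsc();
#else
    return 0;
#endif
}
```

With a wrapper like this, the measurement code can keep the full 64-bit counter value without any inline assembly.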


