Detecting CPU Architecture Compile-Time

Is there any way in C to check at compile time if you are on an architecture where multiplication is fast?


Is there any way for C code to tell whether it is being compiled on an architecture where multiplication is fast? Is there some macro __FAST_MULT__ or something which is defined on those architectures?

No, standard C does not provide any such facility. It is possible that particular compilers provide such a thing as an extension, but I am not specifically aware of any that actually do.

This sort of thing can be tested during build configuration, for example via Autoconf or CMake, in which case you can provide the symbol yourself where appropriate.

Alternatively, some C compilers definitely do provide macros that indicate the architecture for which the code is being compiled. You can use that in conjunction with knowledge of the details of various machine architectures to choose between the two algorithms -- that's what such macros are intended for, after all.

Or you can rely on the person building the program to choose, by configuration option, by defining a macro, or whatever.
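As a rough sketch of how the last two options can be combined: the macro HAVE_FAST_MULT below is hypothetical (no compiler or standard defines it). The person building the program can set it with -DHAVE_FAST_MULT=0 or -DHAVE_FAST_MULT=1; otherwise a default is picked from architecture macros. The list of architectures treated as "fast" is only an illustrative assumption, not measured data, and times_ten is a made-up example function.

```c
#include <stdint.h>

/* HAVE_FAST_MULT is a hypothetical, project-defined macro.
 * A builder may override it on the command line; otherwise we
 * guess from architecture macros (the guess is an assumption). */
#ifndef HAVE_FAST_MULT
#  if defined(__x86_64__) || defined(_M_X64) || defined(__aarch64__)
#    define HAVE_FAST_MULT 1
#  else
#    define HAVE_FAST_MULT 0
#  endif
#endif

/* Multiply x by 10 using whichever algorithm was selected at compile time. */
uint32_t times_ten(uint32_t x) {
#if HAVE_FAST_MULT
    return x * 10u;               /* plain multiplication */
#else
    return (x << 3) + (x << 1);   /* 8x + 2x, shift-and-add */
#endif
}
```

Both branches compute the same value, so the choice only affects speed, never correctness.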

Detect if processor has RDTSCP at compile time

GCC defines many macros to determine at compile-time whether a particular feature is supported by the microarchitecture specified using -march. You can find the full list in the source code here. It's clear that GCC does not define such a macro for RDTSCP (or even RDTSC for that matter). The processors that support RDTSCP are listed in: What is the gcc cpu-type that includes support for RDTSCP?.

So you can make your own (potentially incomplete) list of microarchitectures that support RDTSCP. Then write a build script that checks whether the argument passed to -march is in the list. If it is, define a macro such as __RDTSCP__ and use it in your code. I presume that even if your list is incomplete, this does not compromise the correctness of your code, since an unlisted -march value only means the macro stays undefined.
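A minimal sketch of the code side of this idea, assuming GCC or Clang (for <x86intrin.h>) and a build script that passes a hypothetical -DHAVE_RDTSCP when the -march value is on your list. The name HAVE_RDTSCP is used here instead of __RDTSCP__ only because identifiers containing a double underscore are reserved for the implementation; read_timestamp is likewise a made-up name.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#  include <x86intrin.h>   /* GCC/Clang header providing __rdtsc and __rdtscp */
#  define TARGET_IS_X86 1
#else
#  define TARGET_IS_X86 0
#endif

/* Read the time-stamp counter.
 * HAVE_RDTSCP is a hypothetical macro set by the build script. */
uint64_t read_timestamp(void) {
#if TARGET_IS_X86 && defined(HAVE_RDTSCP)
    unsigned int aux;          /* receives IA32_TSC_AUX; unused here */
    return __rdtscp(&aux);
#elif TARGET_IS_X86
    return __rdtsc();          /* fallback when RDTSCP is not known to exist */
#else
    return 0;                  /* no time-stamp counter on this target */
#endif
}
```

Because the fallback path is always available, a missing entry in the list costs only the use of the weaker instruction.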

Unfortunately, the Intel datasheets do not seem to specify whether a particular processor supports RDTSCP even though they discuss other features such as AVX2.

One potential problem here is that there is no guarantee that every single processor that implements a particular microarchitecture, such as Skylake, supports RDTSCP. I'm not aware of such exceptions though.

Related: What is the gcc cpu-type that includes support for RDTSCP?.


To determine RDTSCP support at run-time, the following code can be used on compilers supporting GNU extensions (GCC, clang, ICC), on any x86 OS. cpuid.h comes with the compiler, not the OS.

#include <cpuid.h>

int rdtscp_supported(void) {
    unsigned a, b, c, d;
    if (__get_cpuid(0x80000001, &a, &b, &c, &d) && (d & (1<<27)))
    {
        // RDTSCP is supported.
        return 1;
    }
    else
    {
        // RDTSCP is not supported.
        return 0;
    }
}

__get_cpuid() runs CPUID twice: once to check the maximum supported level, and once with the specified leaf value. It returns false if the requested level isn't available, which is why it's part of an && expression. Unless this is a simple one-off program, you probably don't want to call it before every rdtscp; use it once to initialize a variable instead. See it on the Godbolt compiler explorer.
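One way to do the "check once, remember the answer" part might look like the sketch below. The function name rdtscp_supported_cached is made up for illustration, and the non-x86 branch (always returning 0) is an added assumption so the file still compiles on other targets. The static variable is not synchronized, so a multithreaded program would want to call this once before starting threads.

```c
#if defined(__x86_64__) || defined(__i386__)
#  include <cpuid.h>   /* ships with GCC/Clang, provides __get_cpuid */
#endif

/* Returns 1 if RDTSCP is available, 0 otherwise.
 * CPUID is executed only on the first call; later calls reuse the result. */
int rdtscp_supported_cached(void) {
    static int cached = -1;   /* -1 means "not checked yet" */
    if (cached < 0) {
#if defined(__x86_64__) || defined(__i386__)
        unsigned a, b, c, d;
        /* Leaf 0x80000001, EDX bit 27 is the RDTSCP feature flag. */
        cached = (__get_cpuid(0x80000001, &a, &b, &c, &d) && (d & (1u << 27))) ? 1 : 0;
#else
        cached = 0;           /* not an x86 target: no RDTSCP */
#endif
    }
    return cached;
}
```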

For MSVC, see How to detect rdtscp support in Visual C++? for code using its intrinsic.


For some CPU features that GCC does know about, you can use __builtin_cpu_supports to check a feature bitmap that's initialized early in startup.

// unfortunately no equivalent for RDTSCP
int sse42_supported(void) {
    return __builtin_cpu_supports("sse4.2");
}

How to tell if program is running on x86/x64 or ARM Linux platforms

This has already been answered on these posts:

GCC predefined macros for architecture X, Detecting CPU architecture compile-time

You can find them here:

http://sourceforge.net/p/predef/wiki/Architectures/

Your approach should only be used for small portions of code or functions but it should work.

Edit:

Basically, because links can become invalid:

__arm__ should work on ARM.

__x86_64__ should work on x64 architecture.

And yes, you can do:

#if defined(__x86_64__)
// do x64 stuff
#elif defined(__arm__)
// do arm stuff
#endif
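A slightly fuller sketch of the same idea, wrapped in a function so the result can be printed or logged. The function name target_arch_name is made up for illustration. Note that with GCC and Clang, 64-bit ARM defines __aarch64__ rather than __arm__, and that the _M_* names are the MSVC equivalents. The value describes the target the code was compiled for, not necessarily the machine it ends up running on (for example, a 32-bit x86 build running on a 64-bit CPU).

```c
/* Returns a short name for the architecture this file was compiled for. */
const char *target_arch_name(void) {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__arm__) || defined(_M_ARM)
    return "arm";
#else
    return "unknown";   /* none of the macros checked above is defined */
#endif
}
```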

Compile time architecture detection in cmake

You are probably better off using try_run, which will compile and run the code for you. Be aware that you're introducing issues when cross-compiling, unless you're happy to fall back to generic x86 code (or force the person building your code to manually set cache variables). This will let you store the name of the subarchitecture in a CMake variable, which you can then substitute into a header with something like configure_file.

Programmatically detect CPU architecture at runtime

If you compile your executable for 64-bit, it can only run on a 64-bit CPU, so there is nothing to detect.

If you compile your executable for 32-bit, the CPU may be 32-bit or 64-bit (if the 64-bit CPU is capable of running 32-bit code), so you MUST query the CPU to differentiate. It is best to get that from the OS when possible, but the CPU may have its own query for that info.

For instance, on an x86 or x86-64 CPU, there is a CPUID instruction available:

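  • CPUID's "Processor Info and Feature Bits" query includes an ia64 feature flag, but that flag identifies an IA-64 (Itanium) processor emulating x86, not an x86-64 CPU.

  • CPUID's "Extended Processor Info and Feature Bits" query includes a long mode feature flag. This is the flag that indicates x86-64 support, and it is reported by both AMD and Intel CPUs.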

CPUID has a "Get vendor ID" query to determine the CPU manufacturer.
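As a sketch of the long mode check, assuming GCC or Clang (for <cpuid.h>): the long mode flag is bit 29 of EDX in extended leaf 0x80000001. The function name cpu_has_long_mode and the -1 return value for non-x86 targets are illustrative choices, not part of any API.

```c
#if defined(__x86_64__) || defined(__i386__)
#  include <cpuid.h>   /* ships with GCC/Clang, provides __get_cpuid */
#endif

/* Returns 1 if the CPU reports long mode (x86-64), 0 if it does not,
 * and -1 when compiled for a non-x86 target where CPUID does not exist. */
int cpu_has_long_mode(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 if the extended leaf is unavailable;
     * a CPU without that leaf has no long mode. */
    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx & (1u << 29)) ? 1 : 0;   /* EDX bit 29: long mode */
#else
    return -1;
#endif
}
```

In a 32-bit x86 build this tells you whether the CPU underneath is 64-bit capable; in a 64-bit build it is necessarily 1.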

What is a good technique for compile-time detection of mismatched preprocessor-definitions between library-code and user-code?

One way of implementing such a check is to provide definition/declaration pairs for global variables that change, according to whether or not particular macros/tokens are defined. Doing so will cause a linker error if a declaration in a header, when included by a client source, does not match that used when building the library.

As a brief illustration, consider the following section, to be added to the "MyLibrary.h" header file (included both when building the library and when using it):

#ifdef FOOFLAG
extern int fooflag;
static inline int foocheck = fooflag; // Forces a reference to the above external
#else
extern int nofooflag;
static inline int foocheck = nofooflag; // <ditto>
#endif

Then, in your library, add the following code, either in a separate ".cpp" module, or in an existing one:

#include "MyLibrary.h"

#ifdef FOOFLAG
int fooflag = 42;
#else
int nofooflag = 42;
#endif

This will (or should) ensure that all component source files for the executable are compiled using the same "state" for the FOOFLAG token. I haven't actually tested this when linking to an object library, but it works when building an EXE file from two separate sources: it will only build if both or neither have the -DFOOFLAG option; if one has but the other doesn't, then the linker fails with (in Visual Studio/MSVC):

error LNK2001: unresolved external symbol "int fooflag"
(?fooflag@@3HA)

The main problem with this is that the error message isn't especially helpful (to a third-party user of your library); that can be ameliorated (perhaps) by appropriate use of names for those check variables (see note 1 below).

An advantage is that the system is easily extensible: as many such check variables as required can be added (one for each critical macro token), and the same idea can also be used to check for actual values of said macros, with code like the following:

#if FOOFLAG == 1
int fooflag1 = 42;
#elif FOOFLAG == 2
int fooflag2 = 42;
#elif FOOFLAG == 3
int fooflag3 = 42;
#else
int fooflagX = 42;
#endif

Note 1: For example, something along these lines (with suitable modifications in the header file):

#ifdef FOOFLAG
int CANT_DEFINE_FOOFLAG = 42;
#else
int MUST_DEFINE_FOOFLAG = 42;
#endif

Important Note: I have just tried this technique using the clang-cl compiler (in Visual Studio 2019) and the linker failed to catch a mismatch, because it is completely optimizing away all references to the foocheck variable (and, thus, to the dependent fooflag). However, there is a fairly trivial workaround, using clang's __attribute__((used)) directive (which also works for the GCC C++ compiler). Here is the header section for the last code snippet shown, with that workaround added:

#if defined(__clang__) || defined(__GNUC__)
#define KEEPIT __attribute__((used))
// Equivalent directives may be available for other compilers ...
#else
#define KEEPIT
#endif

#ifdef FOOFLAG
extern int CANT_DEFINE_FOOFLAG;
KEEPIT static inline int foocheck = CANT_DEFINE_FOOFLAG; // Forces reference to above
#else
extern int MUST_DEFINE_FOOFLAG;
KEEPIT static inline int foocheck = MUST_DEFINE_FOOFLAG; // <ditto>
#endif

Detect which target CPU GCC is configured for?

To detect the architecture at compile time in the source code use a predefined macro.

According to this article, the macro will always have a name of the form _arch_ or __arch__, where arch is the name of the target architecture. To see exactly what is defined, use the following command:

touch foo.h; cpp -dM foo.h; rm foo.h

It will print out all predefined macros.

To print the target on the command line, try:

gcc -dumpmachine

It will show the target that GCC is built for.


