Self-Unrolling Macro Loop in C/C++

You can use templates to unroll the loop.
See the disassembly for the sample live on Godbolt.

Note, however, that compiling with -funroll-loops has the same effect for this sample.



#include <cstdlib>
#include <ctime>     // for time(), which the original sample used without including
#include <iostream>

// Primary template: call f once, then recurse with N-1.
template <unsigned N> struct faux_unroll {
    template <typename F> static void call(F const& f) {
        f();
        faux_unroll<N-1>::call(f);
    }
};

// Base case: N == 0 ends the recursion.
template <> struct faux_unroll<0u> {
    template <typename F> static void call(F const&) {}
};

int main() {
    srand(time(0));

    double r = 0;
    faux_unroll<10>::call([&] { r += 1.0/rand(); });

    std::cout << r;
}

How to write a while loop with the C preprocessor?

Take a look at the Boost preprocessor library, which allows you to write loops in the preprocessor, and much more.
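For a small, fixed repeat count you may not need Boost at all. The following is only a sketch using plain macros; the REPEAT_* names and the eight_xs function are made up for illustration and are not part of Boost.Preprocessor. Each macro level expands its argument twice as often as the level below it, so the repetition happens entirely in the preprocessor.

```cpp
#include <string>

// Hypothetical helper macros (not part of Boost): each level expands
// the statement twice as many times as the level before it.
#define REPEAT_1(stmt) stmt;
#define REPEAT_2(stmt) REPEAT_1(stmt) REPEAT_1(stmt)
#define REPEAT_4(stmt) REPEAT_2(stmt) REPEAT_2(stmt)
#define REPEAT_8(stmt) REPEAT_4(stmt) REPEAT_4(stmt)

// Appends 'x' eight times; after preprocessing the body is eight
// consecutive statements with no loop counter or branch.
std::string eight_xs() {
    std::string s;
    REPEAT_8(s += 'x')
    return s;
}
```

This only covers counts you have written a macro for, and the statement must not contain a top-level comma. A general counted loop in the preprocessor is what the Boost library provides.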

Optimizing a program with loop unrolling

You can do it like this:

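int i = 0;
// Main loop: four calls per iteration while at least four remain.
while (i <= n-4) {
    doSomething();
    doSomething();
    doSomething();
    doSomething();
    i += 4;
}
// Remainder loop: runs at most three times.
while (i < n) {
    doSomething();
    i++;
}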

You may want to replace the second loop with 3 ifs (as the body of the loop will be executed three times at most).
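A sketch of that variant, assuming n is a signed int. The name call_n_times is made up for illustration, and a callable f stands in for doSomething:

```cpp
// Calls f() exactly n times: blocks of four, then at most three leftovers.
template <typename F>
void call_n_times(int n, F f) {
    int i = 0;
    while (i <= n - 4) {
        f(); f(); f(); f();
        i += 4;
    }
    // Between zero and three iterations remain; each test covers one of them.
    if (i < n) { f(); ++i; }
    if (i < n) { f(); ++i; }
    if (i < n) { f(); ++i; }
}
```

For example, call_n_times(10, [&] { ++count; }) runs the four-call block twice and then two of the three if statements.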

Note that optimizing compilers usually perform this kind of transformation automatically, so you rarely have to do it by hand (except when they don't; see "Why is an integer array search loop slower in C++ than Java?").

Is there a way to define a preprocessor macro that includes preprocessor directives?

You cannot define preprocessing directives the way you show in the question: the result of a macro expansion is never processed as a directive, even if it looks like one.

However, you may be able to use the _Pragma operator for your purpose:

#if defined __GNUC__ && __GNUC__ >= 8
#define foo _Pragma("GCC unroll 128") _Pragma("GCC ivdep")
#elif defined __clang__
#define foo _Pragma("clang loop vectorize(enable) interleave(enable)")
#else
#define foo
#endif
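A usage sketch follows. The macro is renamed to LOOP_HINT here purely for readability, and the scale function is made up for illustration. The macro goes directly before the loop, in the place where you would otherwise write the #pragma line:

```cpp
#include <cstddef>

// Same selection logic as above; LOOP_HINT is a hypothetical name for `foo`.
#if defined __GNUC__ && __GNUC__ >= 8
#define LOOP_HINT _Pragma("GCC unroll 128") _Pragma("GCC ivdep")
#elif defined __clang__
#define LOOP_HINT _Pragma("clang loop vectorize(enable) interleave(enable)")
#else
#define LOOP_HINT
#endif

// Scales every element in place. On compilers that match neither branch
// the macro expands to nothing and this is an ordinary loop.
void scale(float* data, std::size_t n, float factor) {
    LOOP_HINT
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

Whether the hint changes the generated code depends on the compiler version and optimization level, so it is worth checking the disassembly.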

Loop unrolling in Metal kernels

The Metal shading language is based on a subset of C++11, so you can try using template metaprogramming to unroll loops. The following compiled in Metal, though I don't have time to test it properly:

template <unsigned N> struct unroll {

    template<class F>
    static void call(F f) {
        f();
        unroll<N-1>::call(f);
    }
};

template <> struct unroll<0u> {

    template<class F>
    static void call(F f) {}
};

kernel void test() {

    unroll<3>::call(do_stuff);

}

Please let me know if it works! You'll probably have to add some arguments to call to pass arguments to do_stuff.
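One way the extra arguments might be threaded through is sketched below in plain C++11. It has not been compiled as Metal, and whether Metal accepts parameter packs here is an assumption. The names unroll_with_args and add_step are made up; add_step is only a stand-in for do_stuff.

```cpp
// Plain C++11 sketch (not compiled as Metal): forwards extra arguments to f.
template <unsigned N> struct unroll_with_args {
    template <class F, class... Args>
    static void call(F f, Args... args) {
        f(args...);
        unroll_with_args<N - 1>::call(f, args...);
    }
};

// Base case: N == 0 ends the recursion and ignores its arguments.
template <> struct unroll_with_args<0u> {
    template <class F, class... Args>
    static void call(F, Args...) {}
};

// Hypothetical stand-in for do_stuff: adds `step` to `*total`.
inline void add_step(int* total, int step) { *total += step; }
```

With these definitions, unroll_with_args<3>::call(add_step, &total, 5) calls add_step three times with the same two arguments.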

See also: Self-unrolling macro loop in C/C++

Unrolling a while loop

In general, the answer is no. It works for 30 and 15 because 30 is even, but it would not work as easily for odd numbers. Duff's device was invented to deal with the general case. It is quite ugly, though.
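For reference, here is a sketch of Duff's device with an unroll factor of 4. The name duff_call is made up, and a callable f stands in for the loop body. The switch jumps into the middle of the do-while, so the first pass handles the remainder and every later pass runs all four calls.

```cpp
// Calls f() exactly n times using Duff's device with an unroll factor of 4.
template <typename F>
void duff_call(int n, F f) {
    if (n <= 0) return;              // the do-while below assumes n > 0
    int passes = (n + 3) / 4;        // number of trips through the do-while
    switch (n % 4) {                 // jump into the body to absorb the remainder
    case 0: do { f();
    case 3:      f();
    case 2:      f();
    case 1:      f();
            } while (--passes > 0);
    }
}
```

With n = 5, the switch enters at case 1 (one call), then the loop runs one full pass of four calls.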


