Why Are Elementwise Additions Much Faster in Separate Loops Than in a Combined Loop

'colon' and 'auto' in for loop C++? Need some help understanding the syntax

As described in the answer by Chad, your for-loop iterates over your vector, using its begin and end iterators. That is the behavior of the colon : syntax.
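
To make the equivalence concrete, here is a minimal sketch of the two loop forms side by side. The `Device` struct, its `id` member, and the two helper function names are hypothetical and exist only for illustration; only `deviceList`, `ioDev`, and the `Device*` element type come from the question.

```cpp
#include <vector>

// Hypothetical element type; the question only tells us that the
// container holds Device* values.
struct Device {
    int id;
};

// Range-based for: the colon form from the question.
int sumIdsRangeFor(const std::vector<Device*>& deviceList) {
    int sum = 0;
    for (const auto& ioDev : deviceList) {
        sum += ioDev->id;
    }
    return sum;
}

// Roughly what the colon form expands to: an explicit walk
// from begin() to end(), dereferencing the iterator each time.
int sumIdsIterators(const std::vector<Device*>& deviceList) {
    int sum = 0;
    for (auto i = deviceList.begin(); i != deviceList.end(); ++i) {
        const auto& ioDev = *i;  // "i" is an iterator
        sum += ioDev->id;
    }
    return sum;
}
```

Both functions visit the same elements in the same order, so they should return the same value for any list of valid pointers.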

Regarding your const auto & syntax: you should imagine what code comes out of it:

// "i" is an iterator
const auto& ioDev = *i;

The expression *i is (a reference to) an element of the container, whose type is Device *. This is the deduced type of auto. Because auto is wrapped in const &, the variable ioDev is a const reference to the deduced type (a pointer). The const applies to the pointer itself, not to the Device it points to, as if it were declared this way:

Device * const & ioDev = *i;
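
One way to check what `const auto&` deduces to here is a compile-time assertion. The sketch below assumes a hypothetical `Device` struct with an `enabled` flag and a hypothetical `enableAll` helper; it compiles only if `ioDev` is a reference to a const pointer (`Device* const&`), which is a different type from a reference to a pointer-to-const (`const Device*&`).

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the question's Device type.
struct Device {
    bool enabled = false;
};

void enableAll(std::vector<Device*>& deviceList) {
    for (const auto& ioDev : deviceList) {
        // auto deduces to Device*, so const applies to the pointer itself.
        static_assert(std::is_same<decltype(ioDev), Device* const&>::value,
                      "ioDev is a reference to a const pointer");

        ioDev->enabled = true;  // allowed: the Device itself is not const
        // ioDev = nullptr;     // would not compile: the pointer is const
    }
}
```

In other words, the loop variable cannot be reseated, but the object it points to can still be modified through it.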

It seems needlessly complicated; if you just need normal iteration (and are not, e.g., working with the address of the pointer itself, which I think is highly unlikely), use a plain, unmodified auto:

for (auto ioDev : deviceList)

or an explicit type:

for (Device* ioDev : deviceList)

Why is 1 for-loop slower than 2 for-loops in a problem related to a prefix sum matrix?

If you look at the assembly, you'll see the source of the difference:

  1. Single loop:
{
    if (a[i][j] < x)
    {
        lower[i][j] = 0;
    }
    else
    {
        lower[i][j] = 1;
    }
    b[i][j] = b[i-1][j]
            + b[i][j-1]
            - b[i-1][j-1]
            + lower[i][j];
}

In this case there is a data dependency: the assignment to b depends on the value just assigned to lower, and b[i][j] also reads b[i][j-1], which the previous iteration wrote. The operations therefore run sequentially within the loop, first the assignment to lower, then the assignment to b, and the compiler can't optimize this code significantly because of the dependency.


  2. Separation of assignments into 2 loops:

The assignment to lower is now independent, so the compiler can use SIMD instructions, which gives a performance boost in the first loop. The assembly for the second loop stays more or less the same as in the original.
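
A minimal sketch of the split version follows. The original post does not show the loop bounds or container types, so several things here are assumptions: `int` elements, 1-based indexing with row 0 and column 0 used as zero padding, and the hypothetical names `Matrix` and `buildPrefix`. Whether the first loop is actually vectorized depends on the compiler and its optimization flags.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // assumed container type

// Assumes `a` is rectangular, with row 0 and column 0 as unused padding.
Matrix buildPrefix(const Matrix& a, int x) {
    const std::size_t rows = a.size();
    const std::size_t cols = rows ? a[0].size() : 0;
    Matrix lower(rows, std::vector<int>(cols, 0));
    Matrix b(rows, std::vector<int>(cols, 0));

    // Loop 1: lower[i][j] depends only on a[i][j], so the iterations are
    // independent and the compiler is free to vectorize the inner loop.
    for (std::size_t i = 1; i < rows; ++i) {
        for (std::size_t j = 1; j < cols; ++j) {
            lower[i][j] = (a[i][j] < x) ? 0 : 1;
        }
    }

    // Loop 2: b[i][j] reads b[i][j-1], written by the previous iteration,
    // so this loop keeps its sequential dependency.
    for (std::size_t i = 1; i < rows; ++i) {
        for (std::size_t j = 1; j < cols; ++j) {
            b[i][j] = b[i-1][j] + b[i][j-1] - b[i-1][j-1] + lower[i][j];
        }
    }
    return b;
}
```

The result is the same as in the single-loop version; only the order in which lower and b are filled changes.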


