What's Faster: Iterating an STL Vector with vector::iterator or with at()?

What's faster, iterating an STL vector with vector::iterator or with at()?

Why not write a test and find out?

Edit: My bad - I thought I was timing the optimised version but wasn't. On my machine, compiled with g++ -O2, the iterator version is slightly slower than the operator[] version, but probably not significantly so.

#include <vector>
#include <iostream>
#include <ctime>
using namespace std;

int main() {
    const int BIG = 20000000;
    vector<int> v;
    for (int i = 0; i < BIG; i++) {
        v.push_back(i);
    }

    time_t now = time(0);
    cout << "start" << endl;
    int n = 0;
    for (vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        n += *it;
    }

    cout << time(0) - now << endl;
    now = time(0);
    for (size_t i = 0; i < v.size(); ++i) {
        n += v[i];
    }
    cout << time(0) - now << endl;

    return n != 0;
}

Speed accessing a std::vector by iterator vs by operator[]/index?

The performance difference is likely negligible or nonexistent (the compiler might optimise them to be identical); you should worry about other things, like whether your program is correct (a slow but correct program is better than a fast but incorrect one). There are other advantages to using iterators, though, such as being able to change the underlying container to one without operator[] without modifying your loops. See this question for more.

const_iterators will most likely have no, or negligible, performance difference compared to ordinary iterators. They are designed to improve the correctness of your program by preventing you from modifying things that shouldn't be modified, not to improve performance. The same goes for the const keyword in general.
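As a minimal sketch (the helper `sum` is my own, not from the question), a const_iterator makes accidental writes a compile-time error:

```cpp
#include <vector>

// Summing through const_iterators: *it has type const int&, so an
// accidental write such as *it = 0 would fail to compile. Performance
// is the same as with plain iterators.
int sum(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        total += *it;  // read-only access
    }
    return total;
}
```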

In short, optimisation should not be a concern of yours until two things have happened: 1) you have noticed it runs too slowly and 2) you have profiled the bottlenecks. For 1), if it ran ten times slower than it could, but is only ever run once and takes 0.1ms, who cares? For 2), make sure it's definitely the bottleneck, otherwise optimising it will have nearly no measurable effect on performance!

Performance - iterating over vector with iterator or pointer?

I'm currently writing an application which needs to perform as well as it possibly can.

Then grab a profiler and look where the real bottlenecks are. In optimized code (Release mode), of course.

-O2 is not everything in VS2012: there are several #defines that manipulate the behaviour of standard container iterators with respect to bounds checking and other security checks. You might want to look them up ("checked iterators" and "secure SCL" should lead you to the right pages) and set them accordingly.
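As a sketch of what that looks like (the macro names are the real Visual Studio ones, but verify against your toolchain's documentation; in practice you set this project-wide rather than per-file to avoid linker mismatches):

```cpp
// _SECURE_SCL and _HAS_ITERATOR_DEBUGGING are the pre-VS2010 switches;
// _ITERATOR_DEBUG_LEVEL supersedes both in later versions.
// 0 disables the iterator checks entirely (release-style behaviour).
// It must be defined before any standard header is included.
#define _ITERATOR_DEBUG_LEVEL 0
#include <vector>  // must come after the #define
```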

But I very much doubt that the iteration over containers will be your bottleneck, there will be other sections of code that are more sensitive to performance issues.

Why is iterating over a std::set so much slower than over a std::vector?

Isn't a set just a structured vector that checks whether each item is unique upon insertion?

No, far from it. These data structures are completely different, and the main distinction here is the memory layout: std::vector puts its elements into a contiguous location in memory, while std::set is a node-based container in which every element is separately allocated, residing at distinct places in memory, possibly far away from each other and definitely in a way that makes pre-fetching data for fast traversal impossible for the processor. This is quite the opposite for std::vector: as the next element is always right "next to" the current one in memory, the CPU loads elements into its cache, and when actually processing the elements it only has to go to the cache to retrieve the values, which is very fast compared to RAM access.

Note that it's a common need to have a sorted, unique collection of data that is laid out contiguously in memory, and C++2a or the version thereafter might actually ship with a flat_set; have a look at P1222.

Matt Austern's "Why you shouldn't use set (and what you should use instead)" is an interesting read, too.
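The contiguity guarantee behind this is easy to observe directly (a small sketch of the layout check itself, not a benchmark):

```cpp
#include <vector>
#include <cstddef>

// The C++ standard guarantees &v[i] == &v[0] + i for std::vector: all
// elements sit in one contiguous block, which is what makes linear
// traversal cache- and prefetch-friendly. std::set offers no comparable
// guarantee; its nodes are individually allocated.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (&v[i] != &v[0] + i) return false;
    }
    return true;
}
```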

Iteration speed of unordered_set vs vector

std::vector is the fastest STL container for linear iterations like the one in your scenario, simply because its memory is contiguously allocated and can therefore benefit from caching mechanisms. A vector iterator just increments a pointer internally.

You might improve performance further by trying to use a smaller type than int when possible.

In your case, I think parallelization/SIMD would give the largest performance boost for the reduction. This might be achieved through auto-vectorization (check your project's compiler settings).

You could also try OpenMP to do this, like this (not tested):

size_t end = VecofElems.size();
#pragma omp parallel for reduction(+: vec_cumulative_sum)
for (size_t i = 0; i < end; ++i) {
    vec_cumulative_sum += VecofElems[i];
}

This would run multiple threads in parallel. The reduction clause gives each thread its own private copy of vec_cumulative_sum and combines the partial sums when the loop finishes (note that a variable in a reduction clause must not also be listed as shared).

std::for_each provides a simple way to make use of parallelism, including vectorization. It works with std::vector (and any other STL container providing iterators) like this:

// requires #include <algorithm> and #include <execution>
std::for_each(std::execution::par, VecofElems.begin(), VecofElems.end(),
    [](int& elem) {
        // have some fun using a lambda, or a function
    });

Quickest way to iterate in a C++ vector

Why is the algorithm slow?

Currently you're iterating over two vectors, but you're doing nested iteration. Let's call the lengths of the vectors m and n (arbitrarily). Your current algorithm is O(n*m), which is quite slow for large vectors.

If m = n = 1000, in the worst case (no two elements sum to v), you're doing a million operations -- whew!

How can I improve the performance?

I can think of one immense speedup: given a vector, a, a vector b, and a desired sum v, you could create a set of numbers from the element-wise difference of v and a and check if any of the values in b are in the set.

The pseudocode is:

if a.empty or b.empty:
    return false

set = emptySet

for e in a:
    set.add(v - e)

for e in b:
    if e in set:
        return true

return false

Lookups in an unordered set are O(1) on average, so this is now an O(max(m, n)) algorithm, where m is the length of a and n is the length of b.

If m = n = 1000, in the worst-case you're now doing thousands of operations -- much better, right?

Implementing this in C++ is left to the reader. :)

Hint: passing by reference is a good idea for vectors.
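One possible C++ implementation of the pseudocode above (the name `sum_exists` and the choice of std::unordered_set are mine, not from the answer):

```cpp
#include <vector>
#include <unordered_set>

// Returns true if some a[i] + b[j] == v. Builds the set of needed
// complements (v - e) from a, then checks each element of b against it:
// O(m + n) on average instead of the O(m * n) nested loop. Both vectors
// are passed by const reference to avoid copies.
bool sum_exists(const std::vector<int>& a, const std::vector<int>& b, int v) {
    if (a.empty() || b.empty()) return false;
    std::unordered_set<int> needed;
    for (int e : a) needed.insert(v - e);
    for (int e : b) {
        if (needed.count(e)) return true;
    }
    return false;
}
```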

Iterator Loop vs index loop

The special thing about iterators is that they provide the glue between algorithms and containers. For generic code, the recommendation would be to use a combination of STL algorithms (e.g. find, sort, remove, copy) etc. that carries out the computation that you have in mind on your data structure (vector, list, map etc.), and to supply that algorithm with iterators into your container.

Your particular example could be written as a combination of the for_each algorithm and the vector container (see option 3 below), but it's only one out of four distinct ways to iterate over a std::vector:

1) index-based iteration

for (std::size_t i = 0; i != v.size(); ++i) {
    // access element as v[i]

    // any code including continue, break, return
}

Advantages: familiar to anyone familiar with C-style code, can loop using different strides (e.g. i += 2).

Disadvantages: only for sequential random access containers (vector, array, deque), doesn't work for list, forward_list or the associative containers. Also the loop control is a little verbose (init, check, increment). People need to be aware of the 0-based indexing in C++.

2) iterator-based iteration

for (auto it = v.begin(); it != v.end(); ++it) {
    // if the current index is needed:
    auto i = std::distance(v.begin(), it);

    // access element as *it

    // any code including continue, break, return
}

Advantages: more generic, works for all containers (even the new unordered associative containers), can also use different strides (e.g. std::advance(it, 2)).

Disadvantages: need extra work to get the index of the current element (could be O(N) for list or forward_list). Again, the loop control is a little verbose (init, check, increment).

3) STL for_each algorithm + lambda

std::for_each(v.begin(), v.end(), [&v](T const& elem) {
    // if the current index is needed:
    auto i = &elem - &v[0];

    // cannot continue, break or return out of the loop
});

Advantages: same as 2) plus small reduction in loop control (no check and increment), this can greatly reduce your bug rate (wrong init, check or increment, off-by-one errors).

Disadvantages: same as explicit iterator-loop plus restricted possibilities for flow control in the loop (cannot use continue, break or return) and no option for different strides (unless you use an iterator adapter that overloads operator++).

4) range-for loop

for (auto& elem : v) {
    // if the current index is needed:
    auto i = &elem - &v[0];

    // any code including continue, break, return
}

Advantages: very compact loop control, direct access to the current element.

Disadvantages: extra statement to get the index. Cannot use different strides.

What to use?

For your particular example of iterating over std::vector: if you really need the index (e.g. to access the previous or next element, or to print/log the index inside the loop), or if you need a stride different from 1, then I would go for the explicit index loop; otherwise I'd go for the range-for loop.

For generic algorithms on generic containers I'd go for the explicit iterator loop unless the code contained no flow control inside the loop and needed stride 1, in which case I'd go for the STL for_each + a lambda.

Efficiency of vector index access vs iterator access

Without seeing the test harnesses, the compiler options, and how you measured the time, it's hard to say anything. Also, a good compiler may be able to eliminate the loop in one case or the other, since the loop has no effect on the value returned. Still, depending on the implementation, it wouldn't surprise me to see iterators significantly faster than indexing (or vice versa).

With regards to your "understanding", there's nothing inherent about the type of iterator and its performance. You can write forward iterators which are very fast, or very slow, just as you can write random access iterators which are very fast or very slow. Globally, the types of data structures which support random access iterators are likely to have better locality than those which don't, which might work in favor of random access iterators; but that's really not enough to be able to make any reasonable generalizations.


