Why Is Std::Vector::Operator[] 5 to 10 Times Faster Than Std::Vector::At()

Why is std::vector::operator[] 5 to 10 times faster than std::vector::at()?

The reason is that an unchecked access can probably be done with a single processor instruction. A checked access will also have to load the size from memory, compare it with the index, and (assuming it's in range) skip over a conditional branch to the error handler. There may be more faffing around to handle the possibility of throwing an exception. This will be many times slower, and this is precisely why you have both options.

If you can prove that the index is within range without a runtime check then use operator[]. Otherwise, use at(), or add your own check before access. operator[] should be more or less as fast as possible, but will explode messily if the index is invalid.

vector::at vs. vector::operator[]

I'd say the exceptions that vector::at() throws aren't really intended to be caught by the immediately surrounding code. They are mainly useful for catching bugs in your code. If you need to bounds-check at runtime because e.g. the index comes from user input, you're indeed best off with an if statement. So in summary, design your code with the intention that vector::at() will never throw an exception, so that if it does, and your program aborts, it's a sign of a bug. (just like an assert())

Why is the vector range constructer 10 times faster than the fill constructor?

Here's the compiled code on godbolt, with gcc -O0.

In the first test:

  • the loop to fill the array (lines 49-57 of the assembly) is compiled as a simple loop, with a store to memory on each iteration. It's not well optimized (the index variable is kept on the stack instead of in a register, and redundantly moved back and forth) but at least it's inline code and doesn't make any function calls.

  • the range constructor is a single call to the precompiled constructor in the library (line 69). We can assume the library function was compiled with aggressive optimizations; it probably calls a highly optimized memcpy in handwritten assembly. So it's probably about as fast as it could be.

In the second test:

  • The fill constructor is a library call (line 113). Again this is presumably about as fast as it can be (probably calling a hand-optimized memset).

  • Your loop to fill the vector (lines 118-130) generates a function call to std::vector<int>::operator[] on every iteration (line 126). Even though operator[] itself is probably pretty fast, having been precompiled, the overhead of a function call every time, including the code to reload registers with its arguments, is a killer. If you were compiling with optimizations, this call could be inlined; all that overhead would go away and you'd again just have a single store to memory per iteration. But you're not optimizing, so guess what? Performance is deeply sub-optimal.

With optimizations the second test is apparently faster. This makes sense as it only has to write the same block of memory twice, never reading; whereas the first one involves writing a block of memory, then reading it back to write the second block.

Moral: unoptimized code can be really bad, and two pieces of code that could be optimized into something very similar may turn out very different if such optimizations are not attempted.

Why is std::vector slower than an array?

Let us observe how GCC optimizes this test program:

#include <vector>

int main()
{
int len = 800000;
int* Data = new int[len];

int arr[3] = { 255, 0, 0 };
std::vector<int> vec = { 255, 0, 0 };

for (int i = 0; i < len; i++) {
Data[i] = vec[0];
}
for (int i = 0; i < len; i++) {
Data[i] = arr[0];
}
delete[] Data;
}

The compiler rightly notices that the vector is constant, and eliminates it. Exactly same code is generated for both loops. Therefore it should be irrelevant whether the first loop uses array or vector.

.L2:
movups XMMWORD PTR [rcx], xmm0
add rcx, 16
cmp rsi, rcx
jne .L2

What makes difference in your test program is the order of loops. The comments point out that when a third loop is added to the beginning, both loops take the same time.

I would expect that with a modern compiler accessing a vector would be approximately as fast as accessing an array, when optimization is enabled and debug is disabled. If there is an observable difference in your actual program, the problem lies somewhere else.

Is std::vector so much slower than plain arrays?

Using the following:

g++ -O3 Time.cpp -I <MyBoost>

./a.out

UseArray completed in 2.196 seconds

UseVector completed in 4.412 seconds

UseVectorPushBack completed in 8.017 seconds

The whole thing completed in 14.626 seconds

So array is twice as quick as vector.

But after looking at the code in more detail this is expected; as you run across the vector twice and the array only once. Note: when you resize() the vector you are not only allocating the memory but also running through the vector and calling the constructor on each member.

Re-Arranging the code slightly so that the vector only initializes each object once:

 std::vector<Pixel>  pixels(dimensions * dimensions, Pixel(255,0,0));

Now doing the same timing again:

g++ -O3 Time.cpp -I <MyBoost>

./a.out

UseVector completed in 2.216 seconds

The vector now performance only slightly worse than the array. IMO this difference is insignificant and could be caused by a whole bunch of things not associated with the test.

I would also take into account that you are not correctly initializing/Destroying the Pixel object in the UseArrray() method as neither constructor/destructor is not called (this may not be an issue for this simple class but anything slightly more complex (ie with pointers or members with pointers) will cause problems.

Why is std::vectorchar faster than std::string?

I used #define N 100000000. Tested 3 times for each scenario and in all scenarios string is faster. Not using Valgrind, it does not make sense.

OS: Ubuntu 14.04. Arch:x86_64 CPU: Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz.

$COMPILER -std=c++11 -O3 -Wall -Wextra -pedantic -pthread -o test x.cc -DVECTOR
$COMPILER -std=c++11 -O3 -Wall -Wextra -pedantic -pthread -o test x.cc -DSTRING

Times:

compiler/variant           | time(1) | time(2) | time(3)
---------------------------+---------+---------+--------
g++ 4.8.2/vector Times: | 1.724s | 1.704s | 1.669s
g++ 4.8.2/string Times: | 1.675s | 1.678s | 1.674s
clang++ 3.5/vector Times: | 1.929s | 1.934s | 1.905s
clang++ 3.5/string Times: | 1.616s | 1.612s | 1.619s

What's faster, iterating an STL vector with vector::iterator or with at()?

Why not write a test and find out?

Edit: My bad - I thought I was timing the optimised version but wasn't. On my machine, compiled with g++ -O2, the iterator version is slightly slower than the operator[] version, but probably not significantly so.

#include <vector>
#include <iostream>
#include <ctime>
using namespace std;

int main() {
const int BIG = 20000000;
vector <int> v;
for ( int i = 0; i < BIG; i++ ) {
v.push_back( i );
}

int now = time(0);
cout << "start" << endl;
int n = 0;
for(vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
n += *it;
}

cout << time(0) - now << endl;
now = time(0);
for(size_t i = 0; i < v.size(); ++i) {
n += v[i];
}
cout << time(0) - now << endl;

return n != 0;
}

What does std::vector's .at() function really do?

.at(...) is doing bound checking, meanwhile the [] operator does not, i.e. for out of range.

See the documentation from here:

http://www.cplusplus.com/reference/vector/vector/at/

"The function automatically checks whether n is within the bounds of valid elements in the vector, throwing an out_of_range exception if it is not (i.e., if n is greater or equal than its size). This is in contrast with member operator[], that does not check against bounds."

or:

http://www.cplusplus.com/reference/vector/vector/operator[]/

"A similar member function, vector::at, has the same behavior as this operator function, except that vector::at is bound-checked and signals if the requested position is out of range by throwing an out_of_range exception."

Slightly off-topic, but you should not use "vectorlist" term for a vector. At first, I thought you would be having a list data for some reason.

So, to give you a real world example: you could use the non-bound-checking variant when you are sure the index inside the range because that will result a slightly faster code.

Does the operator [] of std::vectortype walk through the vector from the beginning on each call

The access time in vector with the [] operator is constant.

That means it is not working its way through any kind of list.

There is no need at all, vector is implemented as a malloced array; all vector adds to it is the automatic resizing, which just means that it makes another malloc and copies the content for you, without introducing errors.
You can see that yourself by open the include <vector>.



Related Topics



Leave a reply



Submit