Is std::vector so much slower than plain arrays?
Using the following:
g++ -O3 Time.cpp -I <MyBoost>
./a.out
UseArray completed in 2.196 seconds
UseVector completed in 4.412 seconds
UseVectorPushBack completed in 8.017 seconds
The whole thing completed in 14.626 seconds
So array is twice as quick as vector.
But after looking at the code in more detail this is expected; as you run across the vector twice and the array only once. Note: when you resize()
the vector you are not only allocating the memory but also running through the vector and calling the constructor on each member.
Re-Arranging the code slightly so that the vector only initializes each object once:
std::vector<Pixel> pixels(dimensions * dimensions, Pixel(255,0,0));
Now doing the same timing again:
g++ -O3 Time.cpp -I <MyBoost>
./a.out
UseVector completed in 2.216 seconds
The vector now performance only slightly worse than the array. IMO this difference is insignificant and could be caused by a whole bunch of things not associated with the test.
I would also take into account that you are not correctly initializing/Destroying the Pixel object in the UseArrray()
method as neither constructor/destructor is not called (this may not be an issue for this simple class but anything slightly more complex (ie with pointers or members with pointers) will cause problems.
Slow std::vector vs [] in C++ - Why?
You are not compiling with optimizations.
Compare:
With
vector
ofvector
With array
To give you a small taste of what the optimizer might be doing for you, consider the following modification to your H()
function for the vector
of vector
case.
int H(std::vector<std::vector<int>> &arg) {
int h = 0;
auto sigma = arg.data();
for (int r = 0; r < grid_e_rows; ++r) {
int r2 = (r + 1) % grid_e_rows;
auto sr = sigma[r].data();
auto sr2 = sigma[r2].data();
for (int c = 0; c < grid_e_cols; ++c) {
int c2 = (c + 1) % grid_e_cols;
h += 1 * sr[c] * sr[c2] + 1 * sr[c] * sr2[c];
}
}
return -h;
}
You will find that without optimizations, this version will run closer to the performance of your array version.
why std::vector::push_back much slower than a manual implementation?
The difference is mainly due to less favorable codegen in the std::vector
version. We can see it if we compare the generated assembly (godbolt link).
Your loop (skipping the reallocation part):
$LL4@test:
xor esi, esi
xor edi, edi
$LL7@test:
mov r15, rbx
cmp rdi, rbx
jne SHORT $LN27@test
<...skip...>
$LN27@test:
mov DWORD PTR [r14+rdi*4], esi
inc rdi
inc esi
cmp esi, 100000 ; 000186a0H
jl $LL7@test
sub r12, r13
jne $LL4@test
std::vector::push_back
loop (again, skipping the reallocation part):
$LL4@test:
xor ebx, ebx
mov DWORD PTR i$1[rsp], ebx
$LL7@test:
cmp rcx, QWORD PTR v$[rsp+16]
je SHORT $LN26@test
mov DWORD PTR [rcx], ebx
mov rcx, QWORD PTR v$[rsp+8]
add rcx, 4
mov QWORD PTR v$[rsp+8], rcx
jmp SHORT $LN5@test
$LN26@test:
<...skip...>
$LN5@test:
inc ebx
mov DWORD PTR i$1[rsp], ebx
cmp ebx, 100000 ; 000186a0H
jl SHORT $LL7@test
mov rcx, QWORD PTR v$[rsp]
mov QWORD PTR v$[rsp+8], rcx
sub rdi, 1
jne SHORT $LL4@test
Clearly we can see more code (11 vs. 8 instructions in the hot path) and more indirection to memory (5 vs. 1). So it's no surprise that it's slower.
More generally, more complex code == slower code.
Could the two versions be optimized identically? I see no reason why not. MSVC 19.28 just can't do it.
Using arrays or std::vectors in C++, what's the performance gap?
Using C++ arrays with new
(that is, using dynamic arrays) should be avoided. There is the problem you have to keep track of the size, and you need to delete them manually and do all sort of housekeeping.
Using arrays on the stack is also discouraged because you don't have range checking, and passing the array around will lose any information about its size (array to pointer conversion). You should use boost::array
in that case, which wraps a C++ array in a small class and provides a size
function and iterators to iterate over it.
Now the std::vector vs. native C++ arrays (taken from the internet):
// Comparison of assembly code generated for basic indexing, dereferencing,
// and increment operations on vectors and arrays/pointers.
// Assembly code was generated by gcc 4.1.0 invoked with g++ -O3 -S on a
// x86_64-suse-linux machine.
#include <vector>
struct S
{
int padding;
std::vector<int> v;
int * p;
std::vector<int>::iterator i;
};
int pointer_index (S & s) { return s.p[3]; }
// movq 32(%rdi), %rax
// movl 12(%rax), %eax
// ret
int vector_index (S & s) { return s.v[3]; }
// movq 8(%rdi), %rax
// movl 12(%rax), %eax
// ret
// Conclusion: Indexing a vector is the same damn thing as indexing a pointer.
int pointer_deref (S & s) { return *s.p; }
// movq 32(%rdi), %rax
// movl (%rax), %eax
// ret
int iterator_deref (S & s) { return *s.i; }
// movq 40(%rdi), %rax
// movl (%rax), %eax
// ret
// Conclusion: Dereferencing a vector iterator is the same damn thing
// as dereferencing a pointer.
void pointer_increment (S & s) { ++s.p; }
// addq $4, 32(%rdi)
// ret
void iterator_increment (S & s) { ++s.i; }
// addq $4, 40(%rdi)
// ret
// Conclusion: Incrementing a vector iterator is the same damn thing as
// incrementing a pointer.
Note: If you allocate arrays with new
and allocate non-class objects (like plain int
) or classes without a user defined constructor and you don't want to have your elements initialized initially, using new
-allocated arrays can have performance advantages because std::vector
initializes all elements to default values (0 for int, for example) on construction (credits to @bernie for reminding me).
Why are vectors significantly slower than arrays in this implementation of a sorting algorithm?
Code that calculates rightPartSize
is not consistent with how you create rightPart
vector. You can easily check it by adding this statement:
if( static_cast<size_t>( rightPartSize ) != rightPart.size() )
cout << rightPartSize << " != " << rightPart.size() << endl;
after rightPart
creation:
1 != 99999
1 != 99997
2 != 99998
1 != 99995
1 != 99994
3 != 99996
1 != 99992
1 != 99991
1 != 99989
1 != 99988
3 != 99990
6 != 99993
1 != 99986
1 != 99985
1 != 99983
...
so amount of work done by vector is significantly bigger than what you do with your dynamic array.
Related Topics
What Are the Complexity Guarantees of the Standard Containers
Should I Use Static_Cast or Reinterpret_Cast When Casting a Void* to Whatever
What Happens to a Detached Thread When Main() Exits
Why Is Transposing a Matrix of 512X512 Much Slower Than Transposing a Matrix of 513X513
Strptime() Equivalent on Windows
Calling Objective-C Method from C++ Member Function
Fastest Method of Screen Capturing on Windows
C++ Callback Using Class Member
Store Derived Class Objects in Base Class Variables
How to Clear a Stringstream Variable
How to Convert an Enum Type Variable to a String
Why Should the Copy Constructor Accept Its Parameter by Reference in C++
Programmatically Find the Number of Cores on a Machine