C++ Std::Vector VS Array in the Real World

C++ std::vector vs array in the real world

A: Almost always [use a vector instead of an array]. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.

That's an over-simplification. It's fairly common to use arrays, and can be attractive when:

  • the elements are specified at compile time, e.g. const char project[] = "Super Server";, const Colours colours[] = { Green, Yellow };

    • with C++11 it will be equally concise to initialise std::vectors with values

  • the number of elements is inherently fixed, e.g. const char* const bool_to_str[] = { "false", "true" };, Piece chess_board[8][8];

  • first-use performance is critical: with arrays of constants the compiler can often write a memory snapshot of the fully pre-initialised objects into the executable image, which is then page-faulted directly into place ready for use, so it's typically much faster that run-time heap allocation (new[]) followed by serialised construction of objects

    • compiler-generated tables of const data can always be safely read by multiple threads, whereas data constructed at run-time must complete construction before other code triggered by constructors for non-function-local static variables attempts to use that data: you end up needing some manner of Singleton (possibly threadsafe which will be even slower)

    • In C++03, vectors created with an initial size would construct one prototypical element object then copy construct each data member. That meant that even for types where construction was deliberately left as a no-operation, there was still a cost to copy the data elements - replicating their whatever-garbage-was-left-in-memory values. Clearly an array of uninitialised elements is faster.

  • One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.

  • There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?

  • classes/structures containing arrays can still be POD types

  • arrays facilitate access in shared memory from multiple processes (by default vector's internal pointers to the actual dynamically allocated data won't be in shared memory or meaningful across processes, and it was famously difficult to force C++03 vectors to use shared memory like this even when specifying a custom allocator template parameter).

  • embedding arrays can localise memory access requirement, improving cache hits and therefore performance

That said, if it's not an active pain to use a vector (in code concision, readability or performance) then you're better off doing so: they've size(), checked random access via at(), iterators, resizing (which often becomes necessary as an application "matures") etc.. It's also often easier to change from vector to some other Standard container should there be a need, and safer/easier to apply Standard algorithms (x.end() is better than x + sizeof x / sizeof x[0] any day).

UPDATE: C++11 introduced a std::array<>, which avoids some of the costs of vectors - internally using a fixed-sized array to avoid an extra heap allocation/deallocation - while offering some of the benefits and API features: http://en.cppreference.com/w/cpp/container/array.

Using arrays or std::vectors in C++, what's the performance gap?

Using C++ arrays with new (that is, using dynamic arrays) should be avoided. There is the problem you have to keep track of the size, and you need to delete them manually and do all sort of housekeeping.

Using arrays on the stack is also discouraged because you don't have range checking, and passing the array around will lose any information about its size (array to pointer conversion). You should use boost::array in that case, which wraps a C++ array in a small class and provides a size function and iterators to iterate over it.

Now the std::vector vs. native C++ arrays (taken from the internet):

// Comparison of assembly code generated for basic indexing, dereferencing, 
// and increment operations on vectors and arrays/pointers.

// Assembly code was generated by gcc 4.1.0 invoked with g++ -O3 -S on a
// x86_64-suse-linux machine.

#include <vector>

struct S
{
int padding;

std::vector<int> v;
int * p;
std::vector<int>::iterator i;
};

int pointer_index (S & s) { return s.p[3]; }
// movq 32(%rdi), %rax
// movl 12(%rax), %eax
// ret

int vector_index (S & s) { return s.v[3]; }
// movq 8(%rdi), %rax
// movl 12(%rax), %eax
// ret

// Conclusion: Indexing a vector is the same damn thing as indexing a pointer.

int pointer_deref (S & s) { return *s.p; }
// movq 32(%rdi), %rax
// movl (%rax), %eax
// ret

int iterator_deref (S & s) { return *s.i; }
// movq 40(%rdi), %rax
// movl (%rax), %eax
// ret

// Conclusion: Dereferencing a vector iterator is the same damn thing
// as dereferencing a pointer.

void pointer_increment (S & s) { ++s.p; }
// addq $4, 32(%rdi)
// ret

void iterator_increment (S & s) { ++s.i; }
// addq $4, 40(%rdi)
// ret

// Conclusion: Incrementing a vector iterator is the same damn thing as
// incrementing a pointer.

Note: If you allocate arrays with new and allocate non-class objects (like plain int) or classes without a user defined constructor and you don't want to have your elements initialized initially, using new-allocated arrays can have performance advantages because std::vector initializes all elements to default values (0 for int, for example) on construction (credits to @bernie for reminding me).

What advantages do arrays hold over vectors?

This is not a full answer, but one thing I can think of is, that the "ability to grow and shrink" is not such a good thing if you know what you want. For example: assume you want to save memory of 1000 objects, but the memory will be filled at a rate that will cause the vector to grow each time. The overhead you'll get from growing will be costly when you can simply define a fixed array

Generally speaking: if you will use an array over a vector - you will have more power at your hands, meaning no "background" function calls you don't actually need (resizing), no extra memory saved for things you don't use (size of vector...).

Additionally, using memory on the stack (array) is faster than heap (vector*) as shown here

*as shown here it's not entirely precise to say vectors reside on the heap, but they sure hold more memory on the heap than the array (that holds none on the heap)

Why should someone prefer vectors over arrays?

Vectors have a dimension that may be provided at run-time, such as a user-provided value or something derived from a file's contents.

The size of an array, by contrast, must be hard-coded into your program. It is fixed.

why doesn't everyone simply use vectors for everything?

There is a run-time performance cost for using vectors over arrays when you don't need to. Personally I've never written code in which this is a bottleneck and therefore rarely use raw arrays; still, such scenarios do exist.


The exception is when you create a block of memory (an "array" in some sense) dynamically with new[]; this is the exact language feature that is encapsulated by std::vector.

What is the difference between std::array and std::vector? When do you use one over other?

std::array is just a class version of the classic C array. That means its size is fixed at compile time and it will be allocated as a single chunk (e.g. taking space on the stack). The advantage it has is slightly better performance because there is no indirection between the object and the arrayed data.

std::vector is a small class containing pointers into the heap. (So when you allocate a std::vector, it always calls new.) They are slightly slower to access because those pointers have to be chased to get to the arrayed data... But in exchange for that, they can be resized and they only take a trivial amount of stack space no matter how large they are.

[edit]

As for when to use one over the other, honestly std::vector is almost always what you want. Creating large objects on the stack is generally frowned upon, and the extra level of indirection is usually irrelevant. (For example, if you iterate through all of the elements, the extra memory access only happens once at the start of the loop.)

The vector's elements are guaranteed to be contiguous, so you can pass &vec[0] to any function expecting a pointer to an array; e.g., C library routines. (As an aside, std::vector<char> buf(8192); is a great way to allocate a local buffer for calls to read/write or similar without directly invoking new.)

That said, the lack of that extra level of indirection, plus the compile-time constant size, can make std::array significantly faster for a very small array that gets created/destroyed/accessed a lot.

So my advice would be: Use std::vector unless (a) your profiler tells you that you have a problem and (b) the array is tiny.

A Real World Example of Vector vs List showing scenario where each is more efficient than the other

He is asking for real examples about that, so I assume you are asking about real applications of these data sctructures.

Vector better than list: any aplication that requires a grid representation for example (any computer vision algorithm, for instance) in which the size of the grid is not known in advance (otherwise you would use array). In this case, you would need random access to container elements and vector is O(1) (constant time). Another example is for example when you have to build a tree (for a tree-based path planning algorithm, for instance). You do not know the number of nodes in advance, so you just push_back nodes all the time and store in their nodes the indices of the parents.

List better than vector: when you are going to do a lot of insertions/deletions in the middle of the container (note middle I mean not first/last elements). Imagine you are implementing a videogame in which many enemies are appearing and you have to kill them. Everytime you kill one, the proper thing to do is to detele that enemy so that it does not consume memory. However, it is very unlikely that you kill enemy 1. Therefore, you would need to remove that enemy from a list of enemies. If you are implementing insertion-sort-like algorithms a list can be also very helpful since you are all the time moving elements within the list.

std::vector versus std::array in C++

std::vector is a template class that encapsulate a dynamic array1, stored in the heap, that grows and shrinks automatically if elements are added or removed. It provides all the hooks (begin(), end(), iterators, etc) that make it work fine with the rest of the STL. It also has several useful methods that let you perform operations that on a normal array would be cumbersome, like e.g. inserting elements in the middle of a vector (it handles all the work of moving the following elements behind the scenes).

Since it stores the elements in memory allocated on the heap, it has some overhead in respect to static arrays.

std::array is a template class that encapsulate a statically-sized array, stored inside the object itself, which means that, if you instantiate the class on the stack, the array itself will be on the stack. Its size has to be known at compile time (it's passed as a template parameter), and it cannot grow or shrink.

It's more limited than std::vector, but it's often more efficient, especially for small sizes, because in practice it's mostly a lightweight wrapper around a C-style array. However, it's more secure, since the implicit conversion to pointer is disabled, and it provides much of the STL-related functionality of std::vector and of the other containers, so you can use it easily with STL algorithms & co. Anyhow, for the very limitation of fixed size it's much less flexible than std::vector.

For an introduction to std::array, have a look at this article; for a quick introduction to std::vector and to the the operations that are possible on it, you may want to look at its documentation.



  1. Actually, I think that in the standard they are described in terms of maximum complexity of the different operations (e.g. random access in constant time, iteration over all the elements in linear time, add and removal of elements at the end in constant amortized time, etc), but AFAIK there's no other method of fulfilling such requirements other than using a dynamic array. As stated by @Lucretiel, the standard actually requires that the elements are stored contiguously, so it is a dynamic array, stored where the associated allocator puts it.

Vectors vs Array in C++

In programming as in life there is no free meal... Keep that in mind all the time. If you want nice and convenient features you have to pay a price.

std::vector will add some complexity to your code you don't see. Adding items to your std::vector does more, then just writing a value. It may allocate new memory and copy the old values to it. And several more things you won't really see.

Switching to std::array won't give you the boost you might looking for. It is a bit simpler then std::vector. It is the way to got, when you are looking for an supplement of an plain c array. But still, it will add complexity too.

So my advice is and you will find similar once in good books. Try to optimize your code on algorithm base and not on the implementation. There is much more potential in possible flawed algorithms or there may be much better once. The implementation won't give you the ground braking boost.

When do we use arrays over vectors in C++ and vice versa?

If vectors can do so much, under what circumstances do we still prefer arrays?

A good design is not when there is nothing left to add, rather when there is nothing left to remove. Or introducing extra complexity only when it is needed.

Arrays vs Vectors: Introductory Similarities and Differences

arrays:

  • are a builtin language construct;
  • come almost unmodified from C89;
  • provide just a contiguous, indexable sequence of elements; no bells and whistles;
  • are of fixed size; you can't resize an array in C++ (unless it's an array of POD and it's allocated with malloc);
  • their size must be a compile-time constant unless they are allocated dynamically;
  • they take their storage space depending from the scope where you declare them;
  • if dynamically allocated, you must explicitly deallocate them;
  • if they are dynamically allocated, you just get a pointer, and you can't determine their size; otherwise, you can use sizeof (hence the common idiom sizeof(arr)/sizeof(*arr), that however fails silently when used inadvertently on a pointer);
  • automatically decay to a pointers in most situations; in particular, this happens when passing them to a function, which usually requires passing a separate parameter for their size;
  • can't be returned from a function; (Unless it is std::array)
  • can't be copied/assigned directly;
  • dynamical arrays of objects require a default constructor, since all their elements must be constructed first;

std::vector:

  • is a template class;
  • is a C++ only construct;
  • is implemented as a dynamic array;
  • grows and shrinks dynamically;
  • automatically manage their memory, which is freed on destruction;
  • can be passed to/returned from functions (by value);
  • can be copied/assigned (this performs a deep copy of all the stored elements);
  • doesn't decay to pointers, but you can explicitly get a pointer to their data (&vec[0] is guaranteed to work as expected);
  • always brings along with the internal dynamic array its size (how many elements are currently stored) and capacity (how many elements can be stored in the currently allocated block);
  • the internal dynamic array is not allocated inside the object itself (which just contains a few "bookkeeping" fields), but is allocated dynamically by the allocator specified in the relevant template parameter; the default one gets the memory from the freestore (the so-called heap), independently from how where the actual object is allocated;
  • for this reason, they may be less efficient than "regular" arrays for small, short-lived, local arrays;
  • when reallocating, the objects are copied (moved, in C++11);
  • does not require a default constructor for the objects being stored;
  • is better integrated with the rest of the so-called STL (it provides the begin()/end() methods, the usual STL typedefs, ...)

Also consider the "modern alternative" to arrays - std::array; I already described in another answer the difference between std::vector and std::array, you may want to have a look at it.



Related Topics



Leave a reply



Submit