When Would You Use an Array Rather Than a Vector/String

When would you use an array rather than a vector/string?

When writing code that should used in other projects, in particular if you target special platforms (embedded, game consoles, etc.) where STL might not be present.

Old projects or projects with special requirements might not want to introduce dependencies on STL libraries. An interface depending on arrays, char* or whatever will be compatible with anything since it's part of the language. STL however is not guaranteed to be present in all build environments.

C++ std::vector vs array in the real world

A: Almost always [use a vector instead of an array]. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.

That's an over-simplification. It's fairly common to use arrays, and can be attractive when:

  • the elements are specified at compile time, e.g. const char project[] = "Super Server";, const Colours colours[] = { Green, Yellow };

    • with C++11 it will be equally concise to initialise std::vectors with values

  • the number of elements is inherently fixed, e.g. const char* const bool_to_str[] = { "false", "true" };, Piece chess_board[8][8];

  • first-use performance is critical: with arrays of constants the compiler can often write a memory snapshot of the fully pre-initialised objects into the executable image, which is then page-faulted directly into place ready for use, so it's typically much faster that run-time heap allocation (new[]) followed by serialised construction of objects

    • compiler-generated tables of const data can always be safely read by multiple threads, whereas data constructed at run-time must complete construction before other code triggered by constructors for non-function-local static variables attempts to use that data: you end up needing some manner of Singleton (possibly threadsafe which will be even slower)

    • In C++03, vectors created with an initial size would construct one prototypical element object then copy construct each data member. That meant that even for types where construction was deliberately left as a no-operation, there was still a cost to copy the data elements - replicating their whatever-garbage-was-left-in-memory values. Clearly an array of uninitialised elements is faster.

  • One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.

  • There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?

  • classes/structures containing arrays can still be POD types

  • arrays facilitate access in shared memory from multiple processes (by default vector's internal pointers to the actual dynamically allocated data won't be in shared memory or meaningful across processes, and it was famously difficult to force C++03 vectors to use shared memory like this even when specifying a custom allocator template parameter).

  • embedding arrays can localise memory access requirement, improving cache hits and therefore performance

That said, if it's not an active pain to use a vector (in code concision, readability or performance) then you're better off doing so: they've size(), checked random access via at(), iterators, resizing (which often becomes necessary as an application "matures") etc.. It's also often easier to change from vector to some other Standard container should there be a need, and safer/easier to apply Standard algorithms (x.end() is better than x + sizeof x / sizeof x[0] any day).

UPDATE: C++11 introduced a std::array<>, which avoids some of the costs of vectors - internally using a fixed-sized array to avoid an extra heap allocation/deallocation - while offering some of the benefits and API features: http://en.cppreference.com/w/cpp/container/array.

Using arrays or std::vectors in C++, what's the performance gap?

Using C++ arrays with new (that is, using dynamic arrays) should be avoided. There is the problem that you have to keep track of the size, and you need to delete them manually and do all sorts of housekeeping.

Using arrays on the stack is also discouraged because you don't have range checking, and passing the array around will lose any information about its size (array to pointer conversion). You should use std::array in that case, which wraps a C++ array in a small class and provides a size function and iterators to iterate over it.

Now, std::vector vs. native C++ arrays (taken from the internet):

// Comparison of assembly code generated for basic indexing, dereferencing, 
// and increment operations on vectors and arrays/pointers.

// Assembly code was generated by gcc 4.1.0 invoked with g++ -O3 -S on a
// x86_64-suse-linux machine.

#include <vector>

struct S
{
int padding;

std::vector<int> v;
int * p;
std::vector<int>::iterator i;
};

int pointer_index (S & s) { return s.p[3]; }
// movq 32(%rdi), %rax
// movl 12(%rax), %eax
// ret

int vector_index (S & s) { return s.v[3]; }
// movq 8(%rdi), %rax
// movl 12(%rax), %eax
// ret

// Conclusion: Indexing a vector is the same damn thing as indexing a pointer.

int pointer_deref (S & s) { return *s.p; }
// movq 32(%rdi), %rax
// movl (%rax), %eax
// ret

int iterator_deref (S & s) { return *s.i; }
// movq 40(%rdi), %rax
// movl (%rax), %eax
// ret

// Conclusion: Dereferencing a vector iterator is the same damn thing
// as dereferencing a pointer.

void pointer_increment (S & s) { ++s.p; }
// addq $4, 32(%rdi)
// ret

void iterator_increment (S & s) { ++s.i; }
// addq $4, 40(%rdi)
// ret

// Conclusion: Incrementing a vector iterator is the same damn thing as
// incrementing a pointer.

Note: If you allocate arrays with new and allocate non-class objects (like plain int) or classes without a user defined constructor and you don't want to have your elements initialized initially, using new-allocated arrays can have performance advantages because std::vector initializes all elements to default values (0 for int, for example) on construction (credits to @bernie for reminding me).

Vector, string, or array?

What represents your view on the actual data the best.

Is the actual data a string? Use a std::string. Is it a fixed-size array of fixed-size elements? Use std::array. Is it a variable-size array of fixed-size elements? Use std::vector.

I would choose std::string (or a unicode-supporting variant of it) as you're dealing with text, and vector/array operations typically assume uniform elements.

What is the difference between std::array and std::vector? When do you use one over other?

std::array is just a class version of the classic C array. That means its size is fixed at compile time and it will be allocated as a single chunk (e.g. taking space on the stack). The advantage it has is slightly better performance because there is no indirection between the object and the arrayed data.

std::vector is a small class containing pointers into the heap. (So when you allocate a std::vector, it always calls new.) They are slightly slower to access because those pointers have to be chased to get to the arrayed data... But in exchange for that, they can be resized and they only take a trivial amount of stack space no matter how large they are.

[edit]

As for when to use one over the other, honestly std::vector is almost always what you want. Creating large objects on the stack is generally frowned upon, and the extra level of indirection is usually irrelevant. (For example, if you iterate through all of the elements, the extra memory access only happens once at the start of the loop.)

The vector's elements are guaranteed to be contiguous, so you can pass &vec[0] to any function expecting a pointer to an array; e.g., C library routines. (As an aside, std::vector<char> buf(8192); is a great way to allocate a local buffer for calls to read/write or similar without directly invoking new.)

That said, the lack of that extra level of indirection, plus the compile-time constant size, can make std::array significantly faster for a very small array that gets created/destroyed/accessed a lot.

So my advice would be: Use std::vector unless (a) your profiler tells you that you have a problem and (b) the array is tiny.

Arrays Or Vectors?

According to Bjarne Stroustrup, you should use vector over Array unless you have a really good reason to use an array.

Arrays vs Vectors: Introductory Similarities and Differences

arrays:

  • are a builtin language construct;
  • come almost unmodified from C89;
  • provide just a contiguous, indexable sequence of elements; no bells and whistles;
  • are of fixed size; you can't resize an array in C++ (unless it's an array of POD and it's allocated with malloc);
  • their size must be a compile-time constant unless they are allocated dynamically;
  • they take their storage space depending from the scope where you declare them;
  • if dynamically allocated, you must explicitly deallocate them;
  • if they are dynamically allocated, you just get a pointer, and you can't determine their size; otherwise, you can use sizeof (hence the common idiom sizeof(arr)/sizeof(*arr), that however fails silently when used inadvertently on a pointer);
  • automatically decay to a pointers in most situations; in particular, this happens when passing them to a function, which usually requires passing a separate parameter for their size;
  • can't be returned from a function; (Unless it is std::array)
  • can't be copied/assigned directly;
  • dynamical arrays of objects require a default constructor, since all their elements must be constructed first;

std::vector:

  • is a template class;
  • is a C++ only construct;
  • is implemented as a dynamic array;
  • grows and shrinks dynamically;
  • automatically manage their memory, which is freed on destruction;
  • can be passed to/returned from functions (by value);
  • can be copied/assigned (this performs a deep copy of all the stored elements);
  • doesn't decay to pointers, but you can explicitly get a pointer to their data (&vec[0] is guaranteed to work as expected);
  • always brings along with the internal dynamic array its size (how many elements are currently stored) and capacity (how many elements can be stored in the currently allocated block);
  • the internal dynamic array is not allocated inside the object itself (which just contains a few "bookkeeping" fields), but is allocated dynamically by the allocator specified in the relevant template parameter; the default one gets the memory from the freestore (the so-called heap), independently from how where the actual object is allocated;
  • for this reason, they may be less efficient than "regular" arrays for small, short-lived, local arrays;
  • when reallocating, the objects are copied (moved, in C++11);
  • does not require a default constructor for the objects being stored;
  • is better integrated with the rest of the so-called STL (it provides the begin()/end() methods, the usual STL typedefs, ...)

Also consider the "modern alternative" to arrays - std::array; I already described in another answer the difference between std::vector and std::array, you may want to have a look at it.

C++ function return a vector / string but not an array

It is making a copy of the std::vector object. The memory the std::vector uses for storing its data is allocated on the heap (and that is also copied).

(Some compiler optimizations mean the copies don't always happen, behind the scenes; e.g. in your sample code, I think most compilers will copy from a to b, inside foo(), but b will become the c in main() rather than be copied again.)

Further Reading: http://en.wikipedia.org/wiki/Copy_elision and http://en.wikipedia.org/wiki/Return_value_optimization (thanks to millsj for the suggestion)
Item 20 of More Effective C++, by Scott Meyers, also covers this.

Why does vector of same size takes more memory than array in leetcode

An array (I assume you used plain C arrays) uses only as much memory as its elements. A vector uses some memory to store some housekeeping information like the length and location of the data.

Because you made a vector of vector of vectors, this housekeeping information is created for all of the nested vectors, which occupies a lot of space. This gets worse and worse if you increase the "dimension" of your "multidimensional" vector.



Related Topics



Leave a reply



Submit