What Is the Most Efficient Way to Concatenate N Arrays

If you're concatenating more than two arrays, concat() is the way to go, both for convenience and, most likely, for performance.

var a = [1, 2], b = ["x", "y"], c = [true, false];
var d = a.concat(b, c);
console.log(d); // [1, 2, "x", "y", true, false];

For concatenating just two arrays, you can instead exploit the fact that push() accepts any number of arguments: it can append the elements of one array to the end of another without producing a new array. Combined with slice() it can also be used instead of concat(), but there appears to be no performance advantage in doing so.

var a = [1, 2], b = ["x", "y"];
a.push.apply(a, b);
console.log(a); // [1, 2, "x", "y"];

In ECMAScript 2015 and later, this can be reduced even further using the spread operator:

a.push(...b)

However, for large arrays (on the order of 100,000 elements or more), the technique of passing an array of elements to push() (whether via apply() or the ECMAScript 2015 spread operator) can fail, because engines limit the number of arguments a function call may receive. For such arrays, using a loop is a better approach. See https://stackoverflow.com/a/17368101/96100 for details.
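A loop-based version of the same in-place append, as a sketch (the helper name appendAll is mine, not from the linked answer):

```javascript
// Appends every element of source to target, one at a time.
// Safe for arbitrarily large source arrays, since no elements
// are ever passed as function arguments.
function appendAll(target, source) {
  for (var i = 0; i < source.length; i++) {
    target.push(source[i]);
  }
  return target;
}

var big = [1, 2, 3];
appendAll(big, [4, 5, 6]);
console.log(big); // [1, 2, 3, 4, 5, 6]
```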

Efficient way to concatenate multiple numpy arrays

Use np.tile

Code

np.tile(np.array([1, 1]), (10, 1))
# Output: Same as OP
array([[1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1],
       [1, 1]])

Explanation

Function np.tile

Usage: np.tile(A, reps)

Parameter A

The object that acts as a "tile" to be copied and repeated. It can be:

  • a NumPy array
  • a Python list
  • a Python tuple

Parameter reps

Indicates how to repeat the input array. If reps is an integer n, then A is repeated n times horizontally. If reps is a tuple (e.g. (r, c)), then

  • r is the number of repeats downwards and
  • c is the number of repeats across.
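A small sketch of both forms of reps:

```python
import numpy as np

a = np.array([1, 1])

# Integer reps: repeat the tile n times horizontally.
print(np.tile(a, 3))        # [1 1 1 1 1 1]

# Tuple reps (r, c): r repeats downwards, c repeats across.
print(np.tile(a, (2, 3)))
# [[1 1 1 1 1 1]
#  [1 1 1 1 1 1]]
```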

Adding Different Arrays Vertically

Using vstack

arr1 = np.array([1, 1])
arr2 = np.array([2, 2])
arr3 = np.array([3, 3])
arr = [arr1, arr2, arr3]
np.vstack(arr)

# Output
array([[1, 1],
       [2, 2],
       [3, 3]])

Fastest way to concatenate large numpy arrays

Every time you append a new array, new memory is allocated for a bigger one and the data is copied into it. This is very expensive. A better solution is to allocate the required memory once and record your data with np.concatenate only once:

np.concatenate([np.zeros(arraySize) for i in range(100)])
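If the final size is known in advance, you can also preallocate with np.empty and fill slices in place, which avoids even the single concatenate copy. A sketch, where arraySize and the chunk count are assumed values for illustration:

```python
import numpy as np

arraySize = 1000    # assumed chunk length
nChunks = 100       # assumed number of chunks

out = np.empty(arraySize * nChunks)  # allocate once, up front
for i in range(nChunks):
    # Write each chunk into its slot instead of reallocating.
    out[i * arraySize:(i + 1) * arraySize] = np.zeros(arraySize)

print(out.shape)  # (100000,)
```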

How to efficiently concatenate multiple arrays in Ruby?

If you've already determined that multiple concatenation is the fastest method, you can write it more cleanly using reduce:

[foo, bar, baz].reduce([], :concat)
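For example, with some sample arrays (the names foo, bar, baz come from the snippet above):

```ruby
foo = [1, 2]
bar = [3, 4]
baz = [5, 6]

# reduce starts from a fresh empty array and concat-s each input
# onto it in turn, mutating only the accumulator, never the inputs.
result = [foo, bar, baz].reduce([], :concat)
puts result.inspect  # [1, 2, 3, 4, 5, 6]
```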

Most efficient way to append arrays in C#?

You can't append to an actual array - the size of an array is fixed at creation time. Instead, use a List<T> which can grow as it needs to.

Alternatively, keep a list of arrays, and concatenate them all only when you've grabbed everything.

See Eric Lippert's blog post on arrays for more detail and insight than I could realistically provide :)

Most computationally efficient way to get average of particular pairs of rows, and concatenate all of the results with a particular row

The problem is not that concatenate is slow; in fact, it is not so slow. The problem is using it in a loop to produce a growing array. That pattern is very inefficient because it produces many temporary arrays and copies. In your case, however, you do not use such a pattern, so this is fine: concatenate is used properly here and matches your intent. You could create an array and fill the left and right parts separately, but that is what concatenate does in the end anyway.

That being said, concatenate has a fairly large overhead, mainly for small arrays (like most NumPy functions), because of its many internal checks (needed to adapt its behaviour to the shapes of the input arrays). Moreover, the implicit cast from np.int_ to np.float64 in np.tile(a[0], (3, 1)) introduces another overhead. Note also that mean is not very optimized for such a case: it is faster to use (a[b[:,0]] + a[b[:,1]]) * 0.5, although the intent is less clear.

n, m = a.shape[1], b.shape[0]
res = np.empty((n, m*2), dtype=np.float64)
res[:,:m] = a[0] # Note: implicit conversion done here
res[:,m:] = (a[b[:,0]] + a[b[:,1]]) * 0.5 # Also here

The resulting operation is about 3 times faster on my machine with your example. This may not hold for big input arrays (although I expect a speedup there too).

For big arrays, the best solution is to use a Numba (or Cython) code with loops so to avoid the creation/filling of big expensive temporary arrays. Numba should also speed up the computation of small arrays because it mostly removes the overhead of Numpy functions (I expect a speed up of about 5x-10x here).


