Why Is Array.Push Sometimes Faster Than Array[N] = Value

Why is array.push sometimes faster than array[n] = value?

All sorts of factors come into play, most JS implementations use a flat array that converts to sparse storage if it becomes necessary later on.

Basically the decision to become sparse is a heuristic based on what elements are being set, and how much space would be wasted in order to remain flat.

In your case you are setting the last element first, which means the JS engine will see an array that needs to have a length of n but only a single element. If n is large enough this will immediately make the array a sparse array -- in most engines this means that all subsequent insertions will take the slow sparse array case.

You should add an additional test in which you fill the array from index 0 to index n-1 -- it should be much, much faster.

In response to @Christoph and out of a desire to procrastinate, here's a description of how arrays are (generally) implemented in JS -- specifics vary from JS engine to JS engine but the general principle is the same.

All JS Objects (so not strings, numbers, true, false, undefined, or null) inherit from a base object type -- the exact implementation varies, it could be C++ inheritance, or manually in C (there are benefits to doing it in either way) -- the base Object type defines the default property access methods, eg.

interface Object {
put(propertyName, value)
get(propertyName)
private:
map properties; // a map (tree, hash table, whatever) from propertyName to value
}

This Object type handles all the standard property access logic, the prototype chain, etc.
Then the Array implementation becomes

interface Array : Object {
override put(propertyName, value)
override get(propertyName)
private:
map sparseStorage; // a map between integer indices and values
value[] flatStorage; // basically a native array of values with a 1:1
// correspondance between JS index and storage index
value length; // The `length` of the js array
}

Now when you create an Array in JS the engine creates something akin to the above data structure. When you insert an object into the Array instance the Array's put method checks to see if the property name is an integer (or can be converted into an integer, e.g. "121", "2341", etc.) between 0 and 2^32-1 (or possibly 2^31-1, i forget exactly). If it is not, then the put method is forwarded to the base Object implementation, and the standard [[Put]] logic is done. Otherwise the value is placed into the Array's own storage, if the data is sufficiently compact then the engine will use the flat array storage, in which case insertion (and retrieval) is just a standard array indexing operation, otherwise the engine will convert the array to sparse storage, and put/get use a map to get from propertyName to value location.

I'm honestly not sure if any JS engine currently converts from sparse to flat storage after that conversion occurs.

Anyhoo, that's a fairly high level overview of what happens and leaves out a number of the more icky details, but that's the general implementation pattern. The specifics of how the additional storage, and how put/get are dispatched differs from engine to engine -- but this is the clearest i can really describe the design/implementation.

A minor addition point, while the ES spec refers to propertyName as a string JS engines tend to specialise on integer lookups as well, so someObject[someInteger] will not convert the integer to a string if you're looking at an object that has integer properties eg. Array, String, and DOM types (NodeLists, etc).

Why push method is significantly slower than putting values via array indices in Javascript

Could you explain why this is the case?

Because your test is flawed. The push does always append to the existing a array making it much larger, while the second test does only use the first 1000 indices.
Using the setup is not enough here, you would have to reset the a array before every for-loop: http://jsperf.com/push-method-vs-setting-via-key/3.

Apart from that, the method call to push might have a little overhead, and determining the current array length might need additional time compared to using the index of the for-loop.

Usually there is no reason not to use push - the method is there for exactly that operation and makes some code easier to read. While a few people think one version is faster than the other, both are equally optimized in browsers. See Why is array.push sometimes faster than array[n] = value? and Using the push method or .length when adding to array? - results vary so wide that it's actually irrelevant. Use what is better to understand.

Why is array.push sometimes faster than array[n] = value?

All sorts of factors come into play, most JS implementations use a flat array that converts to sparse storage if it becomes necessary later on.

Basically the decision to become sparse is a heuristic based on what elements are being set, and how much space would be wasted in order to remain flat.

In your case you are setting the last element first, which means the JS engine will see an array that needs to have a length of n but only a single element. If n is large enough this will immediately make the array a sparse array -- in most engines this means that all subsequent insertions will take the slow sparse array case.

You should add an additional test in which you fill the array from index 0 to index n-1 -- it should be much, much faster.

In response to @Christoph and out of a desire to procrastinate, here's a description of how arrays are (generally) implemented in JS -- specifics vary from JS engine to JS engine but the general principle is the same.

All JS Objects (so not strings, numbers, true, false, undefined, or null) inherit from a base object type -- the exact implementation varies, it could be C++ inheritance, or manually in C (there are benefits to doing it in either way) -- the base Object type defines the default property access methods, eg.

interface Object {
put(propertyName, value)
get(propertyName)
private:
map properties; // a map (tree, hash table, whatever) from propertyName to value
}

This Object type handles all the standard property access logic, the prototype chain, etc.
Then the Array implementation becomes

interface Array : Object {
override put(propertyName, value)
override get(propertyName)
private:
map sparseStorage; // a map between integer indices and values
value[] flatStorage; // basically a native array of values with a 1:1
// correspondance between JS index and storage index
value length; // The `length` of the js array
}

Now when you create an Array in JS the engine creates something akin to the above data structure. When you insert an object into the Array instance the Array's put method checks to see if the property name is an integer (or can be converted into an integer, e.g. "121", "2341", etc.) between 0 and 2^32-1 (or possibly 2^31-1, i forget exactly). If it is not, then the put method is forwarded to the base Object implementation, and the standard [[Put]] logic is done. Otherwise the value is placed into the Array's own storage, if the data is sufficiently compact then the engine will use the flat array storage, in which case insertion (and retrieval) is just a standard array indexing operation, otherwise the engine will convert the array to sparse storage, and put/get use a map to get from propertyName to value location.

I'm honestly not sure if any JS engine currently converts from sparse to flat storage after that conversion occurs.

Anyhoo, that's a fairly high level overview of what happens and leaves out a number of the more icky details, but that's the general implementation pattern. The specifics of how the additional storage, and how put/get are dispatched differs from engine to engine -- but this is the clearest i can really describe the design/implementation.

A minor addition point, while the ES spec refers to propertyName as a string JS engines tend to specialise on integer lookups as well, so someObject[someInteger] will not convert the integer to a string if you're looking at an object that has integer properties eg. Array, String, and DOM types (NodeLists, etc).

Is there a reason JavaScript developers don't use Array.push()?

I actually asked myself the same question at the start of this year. UPDATED with new test cases http://jsperf.com/array-push-vs-unshift-vs-direct-assignment/2

It appears that push is much faster in chrome, and about equal in FF. Also direct is faster in IE9, but I would be interested to see how it performs in IE10.

I would say that most developers would assume setting the length of the array, and then using direct assignment is faster, as is the case with most programming languages. But JavaScript is different. Javascript arrays aren't really arrays, they're just key/value maps just like all other JavaScript objects. So the pre-allocation is essentially falling on deaf ears.

Personally I prefer push (:

why is javascript array index set by array length

In the example you have posted, the two uses are the same. Both append a new element to the end of the array.

However, the push() method can take multiple arguments and so you can append multiple elements to the end of the array in one statement. push() then returns the new length of the array. push() is also shorter and arguably easier to read.

Another thing to consider is that if thisarray has been incorrectly defined (ie. it is not an Array object) then thisarray[thisarray.length] = int; is likely to fail silently since .length will simply be undefined. Whereas thisarray.push(int) will fail with a trappable TypeError exception.

Regarding PHP's thisarray[] (square bracket) syntax. There is nothing quite the same in JavaScript, as far as syntax goes. However, thisarray[thisarray.length] = int; performs the same action.

JavaScript runtime complexity of Array functions

The ECMA specification does not specify a bounding complexity, however, you can derive one from the specification's algorithms.

push is O(1), however, in practice it will encounter an O(N) copy costs at engine defined boundaries as the slot array needs to be reallocated. These boundaries are typically logarithmic.

pop is O(1) with a similar caveat to push but the O(N) copy is rarely encountered as it is often folded into garbage collection (e.g. a copying collector could only copy the used part of an array).

shift is at worst O(N) however it can, in specially cases, be implemented as O(1) at the cost of slowing down indexing so your mileage may vary.

slice is O(N) where N is end - start. Not a tremendous amount of optimization opportunity here without significantly slowing down writes to both arrays.

splice is, worst case, O(N). There are array storage techniques that divide N by a constant but they significantly slow down indexing. If an engine uses such techniques you might notice unusually slow operations as it switches between storage techniques triggered by access pattern changes.

One you didn't mention, is sort. It is, in the average case, O(N log N). However, depending on the algorithm chosen by the engine, you could get O(N^2) in some cases. For example, if the engine uses QuickSort (even with an late out to InsertionSort), it has well-known N^2 cases. This could be a source of DoS for your application. If this is a concern either limit the size of the arrays you sort (maybe merging the sub-arrays) or bail-out to HeapSort.

In JavaScript, which is the more efficient way of adding a new element to an array?

The tests yielded the following results in the following browsers:

Array.push

  • Google Chrome 6.0.472.63: 0.4 ms
  • Mozilla Firefox 3.6.10: 5 ms
  • Apple Safari 5.0 (7533.16): 2 ms
  • Internet Explorer 8: 21.7 ms
  • Internet Explorer 7: 66.7 ms
  • Opera 10.62: 2.7 ms

array[array.length]

  • Google Chrome 6.0.472.63: 1.2 ms
  • Mozilla Firefox 3.6.10: 0.9 ms
  • Apple Safari 5.0 (7533.16): 0.9 ms
  • Internet Explorer 8: 10.9 ms
  • Internet Explorer 7: 32.6 ms
  • Opera 10.62: 1 ms

The results speak for themselves: using an index outperforms using Push in every browser with the exception of Google's own. If cross-compatibility is a big concern for you, the utilitarian approach would be to use an index wherever possible.

Source: http://www.scottlogic.com/blog/2010/10/15/javascript-array-performance.html



Related Topics



Leave a reply



Submit