Are JavaScript Arrays Sparse

Are Javascript arrays sparse?

How exactly JavaScript arrays are implemented differs from browser to browser, but they generally fall back to a sparse implementation - most likely the same one used for property access of regular objects - if using an actual array would be inefficient.

You'll have to ask someone with more knowledge about specific implementations to answer what exactly triggers the shift from dense to sparse, but your example should be perfectly safe. If you want a dense array, you should call the constructor with an explicit length argument and hope the engine actually gives you one.

See this answer for a more detailed description by olliej.
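For illustration, a small sketch of constructions that typically end up dense versus sparse (how the engine actually stores these is implementation-specific, so treat this as a heuristic rather than a guarantee):

var hinted = new Array(1000);  // explicit length argument as a size hint
for (var i = 0; i < hinted.length; i++) {
  hinted[i] = 0;               // filling every slot leaves no holes
}

var sparse = [];
sparse[1000000] = 'x';         // one element far past the end; engines typically
                               // switch such an array to a dictionary-like store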

How does JavaScript create sparse arrays?

  1. Is Javascript allocating memory for the first 57 "empty" entries in the array?

1. No

JS arrays may or may not be dense, depending on the engine, how you construct them, and how you use them afterwards.

Popular engines will create dense arrays whenever you construct them with all of their elements defined and no holes.

Dense array definition:

[0,1,2,3,4];

Array.apply(null, Array(4)); // Expands the sparse Array(4) into explicit arguments, producing a dense array of undefined values.

Sparse Array Definition:

[0,,2,3,4];

new Array();

new Array(4);
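If you want to check for holes yourself, the in operator and Object.keys only report indices that actually exist; a small sketch:

var sparse = [0, , 2, 3, 4];
1 in sparse;          // false – index 1 is a hole
Object.keys(sparse);  // ["0", "2", "3", "4"]

var dense = [0, 1, 2, 3, 4];
1 in dense;           // true
Object.keys(dense);   // ["0", "1", "2", "3", "4"]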

  2. If not, how does Javascript keep O(1) for lookups using fragmented memory addresses?

2. Depends on the Array

For dense arrays, the engine stores all elements in a single contiguous block of memory.

For sparse arrays, it creates a hash-table-like structure and stores each element independently.


  3. Is a Javascript Array really an Array or just an Object with integers as keys?

3. They are a type of Object

Arrays are regular Objects, with some properties that let them behave as you would expect.

Accessing elements with integer keys such as 1 (or strings that represent integers, such as "1") will retrieve whatever is stored under that key. Using keys that are not integers (or string representations of integers) will still set/retrieve ordinary object properties.

The array's length is just a property of the object, and its value is not affected by adding or modifying non-index object properties.
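A short sketch of that behavior (the values are just an example):

var arr = [10, 20, 30];
arr["1"];        // 20 – the string "1" acts as the integer index 1
arr.foo = "bar"; // a non-integer key becomes a plain object property
arr.length;      // 3 – unaffected by the "foo" property
arr[5] = 60;     // writing past the end...
arr.length;      // 6 – ...does update length (indices 3 and 4 are now holes)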


  4. What is this design pattern called so that I can research it further?

4. Not sure

How to work around the inefficiencies of JavaScript engines' Array implementations for large sparse arrays?

The answer is: yes, find, findIndex, findLast, findLastIndex are indeed not efficient on large sparse arrays, but no, that isn't true for all the iterating array functions.

In fact, forEach, filter, map, and reduce, and even some and every, do not call the callback function on the holes in sparse arrays, which I tested in Chromium, Firefox, and Node.js. Therefore, if the intention of using findLastIndex is to replace naive use of the array's length property (which is unreliable for sparse arrays, even if often useful in practice), the following can be done:

(function(arr) {
  try {
    return arr.reduceRight(
      function(r, x, i, a) {
        console.log(r, x, i, a);
        throw i;
      }, -1);
  } catch(r) {
    if (typeof r == 'number')
      return r;
    else
      throw r;
  }
})([,,,3,,,6,,,9,,,])

I don't know if there is a gentler way of breaking out of such an iterator function loop other than throwing the result value, but it works and it cuts short what would otherwise continue after the result has already been found.

In the case of find/findIndex from the beginning (instead of the end), where a sparse array could likewise cause millions of iterations before even the first element is found, the same can be achieved with forEach:

let testArray = [];
testArray[20000000] = "first";
testArray[40000000] = "last";
(function(arr) {
  try {
    arr.forEach(
      function(x, i, a) {
        console.log(x, i, a);
        throw i;
      });
    return -1;
  } catch(r) {
    if (typeof r == "number")
      return r;
    else
      throw r;
  }
})(testArray)

As you run this with different index values (creating initial sparse regions of different sizes), you will notice that, again on all three of Chromium, Firefox, and Node.js, the implementation is inefficient: it appears to iterate internally instead of maintaining direct pointers to the first and last actual indices, or next-pointers to skip over large holes. But this is acceptable; those are optimizations that can be added later. At least the specification does not require that the callback be invoked on the holes of a sparse array, as it does with find (which I think is a bad idea).

The following goes into my own general toolbox from now on:

Array.prototype.findActualLastIndex = function(f) {
  try {
    return this.reduceRight(
      function(r, x, i, a) {
        if (f(x, i, a))
          throw i;
        else
          return r;
      }, -1);
  } catch(r) {
    if (typeof r == 'number')
      return r;
    else
      throw r;
  }
}

Array.prototype.findActualIndex = function(f) {
  try {
    return this.reduce(
      function(r, x, i, a) {
        if (f(x, i, a))
          throw i;
        else
          return r;
      }, -1);
  } catch(r) {
    if (typeof r == 'number')
      return r;
    else
      throw r;
  }
}
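A quick usage sketch of these helpers (the indices are just an example):

let big = [];
big[20000000] = "first";
big[40000000] = "last";

big.findActualIndex(x => x !== undefined);      // 20000000
big.findActualLastIndex(x => x !== undefined);  // 40000000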

Hope this can help others. It does not run optimally with huge sparse regions at the ends, but it is the best we can get out of the present versions of JavaScript engines.

Does 'array[i]=undefined' produce a sparse array?

No. Setting array[i] = undefined does not do the same thing as deleting the index; it just sets that index to the value undefined, so the index still exists.

delete will delete the index, but will not update the length, so you end up with an index that is truly not defined, and doesn't just contain the value undefined.

In other words, a "sparse" array where the length is greater than the number of items.

You can test this easily with most array methods, as they will iterate over the value undefined but not over indexes that aren't defined (there's a difference):

var arr = [0, 1, 2, 3];
delete arr[1]; // delete the second index
arr.forEach(function(item) { console.log(item); }); // 0, 2, 3

console.log('--------');

var arr2 = [0, 1, 2, 3];
arr2[1] = undefined; // set the second index to "undefined"
arr2.forEach(function(item) { console.log(item); }); // 0, undefined, 2, 3

Sparse arrays and reduce in JavaScript

array1 has three numerically-named properties: 0, 1, and 3.

array2 has four numerically-named properties: 0, 1, 2, and 3. The value of the property named 2 happens to be undefined.

When you ask an object for the value of a property it doesn't have, the result is undefined.

In the for loop, you ask each array for the values of its properties named 0, 1, 2, and 3. For array1, the property named 2 does not exist, so the property access produces undefined. For array2, the property does exist, but its value actually is undefined, so you get the same result.

On the other hand, reduce only operates on properties that actually exist. From the ECMAScript specification, this is how reduce loops over arrays, using a counter k:


  1. Repeat, while k < len
    • Let Pk be ToString(k).
    • Let kPresent be the result of calling the [[HasProperty]] internal method of O with argument Pk.
    • If kPresent is true, then... [use the value at index k for the reduce call]

So, we can see that an index is only used if it passes a [[HasProperty]] check. array1 does not have a property named 2, so that index is skipped.
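To make this concrete, here is a small sketch consistent with the description above (the element values are assumptions for the example):

const array1 = [1, 2, , 4];          // hole at index 2 – only properties 0, 1 and 3 exist
const array2 = [1, 2, undefined, 4]; // property 2 exists and its value is undefined

array1[2]; // undefined – the property doesn't exist, so the access yields undefined
array2[2]; // undefined – the property exists, and its value is undefined

// reduce skips the hole in array1 but visits the explicit undefined in array2
array1.reduce((count) => count + 1, 0); // 3
array2.reduce((count) => count + 1, 0); // 4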

If I set only a high index in an array, does it waste memory?

See this topic:
are-javascript-arrays-sparse

In most implementations of Javascript (probably all modern ones) arrays are sparse. That means no, it's not going to allocate memory up to the maximum index.

If it's anything like a Lua implementation there is actually an internal array and dictionary. Densely populated parts from the starting index will be stored in the array, sparse portions in the dictionary.
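A quick way to convince yourself that only the assigned index actually exists (the index is arbitrary):

var a = [];
a[20000000] = "only element";
a.length;              // 20000001 – length reflects the highest index plus one
Object.keys(a).length; // 1 – but only a single own property was created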

Javascript Array.find() has problems with sparse arrays

The behavior is as per specification. Looking at the current specification draft, comparing, e.g.:

  • The specification for Array.prototype.filter() tests for the presence of the value before processing (step 8c):


    1. Repeat, while k < len
      a. Let Pk be ! ToString(k).

      b. Let kPresent be ? HasProperty(O, Pk).

      c. If kPresent is true, then [...]
  • The specification for Array.prototype.find() does not test for the presence of the value before processing:


    1. Repeat, while k < len
      a. Let Pk be ! ToString(k).

      b. Let kValue be ? Get(O, Pk).

      c. [...]

So, as you state, the behavior is indeed different, and it's different by specification.
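A quick demonstration of that difference (the array literal is just an example):

const holes = [, , 3];

// filter checks HasProperty first, so its callback never sees the holes
holes.filter(() => true); // [3]

// find uses Get, so its callback is invoked for the holes as undefined
let visits = 0;
holes.find(() => { visits++; return false; });
visits;                   // 3 – indices 0, 1 and 2 were all visited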

As to the design considerations that went into this, I can only guess at this point, but if you feel that this is a bug, you can file a bug report here.


Update

It seems that the original specification draft of the find() and findIndex() methods did indeed have array hole checking/skipping implemented. These drafts were, according to the meeting minutes, proposed at the March 14th 2013 ECMA Technical Committee 39 meeting.

In August 2014, a bug was raised, arguing that array holes should be treated as undefined rather than skipping them.

Array#find and Array#findIndex should treat holes as undefined
rather than skipping them, for consistency with the recent trend in
TC39 to treat holes as undefined.

The issue seems to have cropped up in a discussion about the proposed behavior for Array.prototype.includes(), revolving around the question of whether

[,,,].includes(undefined)

should return true or false.

Also in this discussion, it is stated that [...] The recent trend in TC39 has been to treat holes as undefined.

Hole checking for the find() and findIndex() methods was subsequently removed in revision 27 of the ECMAScript 6 Draft.

Reading through the linked resources, you can get a more complete view on what led to removing the hole checking/skipping from the find() and findIndex() specification, but the general consensus at the time seemed to be "holes: nobody wants them".
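For reference, how the behavior ended up being specified (a quick check you can run in any modern engine):

[,,,].includes(undefined); // true – includes uses Get, so holes compare as undefined
[,,,].indexOf(undefined);  // -1  – indexOf still uses HasProperty and skips holes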


