How Are JavaScript Arrays Represented in Physical Memory


Normally, arrays allocate a contiguous block of memory of fixed length. In JavaScript, however, arrays are Object types with special constructors and accessor methods.

This means a statement like:

var arr = new Array(100000);

does not allocate memory for 100,000 elements; it simply sets the value of the array's length property. When you construct an array, you don't need to declare a size, because arrays grow automatically. So you should use this instead:

var arr = [];

Arrays in JavaScript are sparse, which means not all elements in the array need to contain data. In other words, only the elements that actually hold data exist in the array, which reduces the amount of memory the array uses. Values are located by key rather than by offset. JavaScript arrays are a convenience; they're not intended for complex numerical analysis.

Arrays in JavaScript are not typed, so the value of an element can be an object, string, number, boolean, function, or another array. The main difference between an array and a plain object is the length property, whose value is always greater than the largest integer index in the array.
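Both points can be seen in a couple of lines (the variable name here is just illustrative):

```javascript
// A single JavaScript array can hold values of several types at once.
var mixed = [42, "text", true, { id: 1 }, [1, 2]];

console.log(typeof mixed[0]); // "number"
console.log(typeof mixed[1]); // "string"

// length is always greater than the largest integer index.
mixed[9] = "last";
console.log(mixed.length); // 10
```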

For example:

You could create an empty array and add two elements, at index 0 and index 99. The length would be 100, but the number of elements actually stored in the array would be 2.

var arr = [];
arr[0] = 0;
arr[99] = {name: "John"};
console.log(arr.length); // prints 100
arr; // prints something like [0, undefined × 98, Object { name: "John"}]

To answer your questions directly:

Q. It is my understanding that I can store mixed data in a JavaScript array, as well as change any element in the array to some other type. How does the interpreter keep track of what place in physical memory any element is at? Also, how is the overwriting of the data in the next element prevented if I change an element to a larger data type?

A. You probably know this by now if you've read the explanation above. In JavaScript, an array is a hash-table-like Object type, so the interpreter doesn't need to keep track of physical memory, and changing the value of one element doesn't affect the others because they aren't stored in a contiguous block of memory.

--

Q. I assume that arrays only store references to actual objects, and primitives are wrapped behind the scenes when placed in arrays. Assuming this is the case, if I have a different handle on the primitive variable and change the value stored in the array is synchronicity maintained?

A. No, primitives are not wrapped. Changing a primitive variable that was assigned to an array will not change the value in the array, because primitives are stored by value. Objects, on the other hand, are stored by reference, so changing the object's contents will be reflected in the array.

Here's an example you can try:

var arr = [];
var obj = { name: "John" };
var isBool = true;

arr.push(obj);
arr[1] = isBool;

console.log(arr[0]); // prints { name: "John" }
console.log(arr[1]); // prints true

obj.age = 40; // add age to obj
isBool = false; // change value for isBool

console.log(arr[0]); // prints { name: "John", age: 40 } - the change is visible through the reference
console.log(arr[1]); // still prints true - the stored copy is unaffected

Also, note that when you initialize an array in the following two ways, it has a different behavior:

var arr = new Array(100);
console.log(arr.length); // prints 100
console.log(arr); // prints an array with no actual elements (e.g. [] or [ <100 empty items> ], depending on the console)

var arr2 = new Array(100, 200);
console.log(arr2.length); // prints 2
console.log(arr2); // prints [100, 200]

If you want to use JavaScript arrays as contiguous blocks of memory, you should look into using TypedArray. Typed arrays let you allocate a block of memory as a byte array and access the raw binary data more efficiently.
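For instance, here is a minimal sketch of one contiguous buffer viewed through two typed-array views (variable names are just illustrative):

```javascript
// A typed array really is a contiguous, fixed-size block of memory.
var buffer = new ArrayBuffer(16);   // 16 raw bytes
var ints = new Int32Array(buffer);  // view them as four 32-bit integers

console.log(ints.length); // 4
ints[0] = 42;
ints[1] = 7;

// Views over the same buffer share the same underlying bytes.
var bytes = new Uint8Array(buffer);
console.log(bytes[0]); // 42 on little-endian systems (the low byte of ints[0])
```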

You can learn more about the intricacies of Javascript by reading the ECMA-262 spec (ver 5.1).

How are JavaScript arrays stored in memory

One thing I would recommend to everyone: the V8 team now treats Node.js as a first-class citizen, so I would recommend studying V8 to see not only how it handles these implementation details but also why.

First, this article should prove beneficial to readers because of its focus on writing optimized isomorphic JavaScript:

https://blog.sessionstack.com/how-javascript-works-inside-the-v8-engine-5-tips-on-how-to-write-optimized-code-ac089e62b12e

The above article goes into detail about how the JIT (Just In Time) compiler works, so you should be able to answer the exact questions you have after reading it.

Here is an excerpt:

Arrays: avoid sparse arrays where keys are not incremental numbers. Sparse arrays which don’t have every element inside them are a hash table. Elements in such arrays are more expensive to access. Also, try to avoid pre-allocating large arrays. It’s better to grow as you go. Finally, don’t delete elements in arrays. It makes the keys sparse.
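To illustrate the excerpt's last point, a small sketch: `delete` leaves a hole rather than shrinking the array, while `splice` reindexes the remaining elements:

```javascript
var nums = [10, 20, 30, 40];

delete nums[1];           // leaves a hole; length is unchanged
console.log(nums.length); // 4
console.log(1 in nums);   // false - index 1 is now a hole

// Prefer splice, which removes the element and reindexes the rest.
nums = [10, 20, 30, 40];
nums.splice(1, 1);
console.log(nums.length); // 3
console.log(nums);        // [10, 30, 40]
```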

Second, I would also recommend reading this and then working outward with respect to V8:
http://www.jayconrod.com/posts/52/a-tour-of-v8-object-representation

Third, as a bonus, I read this answer a while ago and I mentally revisit it from time to time. I am extremely surprised I just found it again: I literally Googled "stack overflow optimize train tracks" and it came up. Thanks, Google: Why is it faster to process a sorted array than an unsorted array?

Yes, that answer does have 27,000 positive votes.

That answer talks about branch prediction, and I would like you to be aware of it because it can have implications for how you work with data in general, not just arrays. Again, note the first article I linked, and pay attention to its description of the ordering of keys on an Object.

Performance can be optimized by understanding the implementation details and understanding why the problems were solved that way.

Finally, everything in JavaScript is an Object unless it is a scalar value, which we call a primitive: String, Number, Boolean, and so on.

Here is an example for thought provoking purposes:

const arr = ['one', 'two', 'three']

const sameArr = {
  0: 'one',
  1: 'two',
  2: 'three',
}

We could then destructure our Array as if it were an Object:

const yolo = ['one', 'two', 'three']
const { 0: one, 1: two, 2: three } = yolo
console.log('Pretty cool:', one, two, three)

Memory allocation of a Javascript array?

Because JavaScript arrays are actually objects, their memory is not contiguous. So, in your example, assigning to array[1000] without ever storing anything elsewhere takes only the memory of the value you store (not 1000 × the size of a value).

In fact, doing var arr = new Array(1000); per Danny's answer does not create an array of 1000 empty slots. It creates an array whose length property has the value 1000.

Think about it this way: How is JavaScript to know how much memory to set aside if it doesn't know the type and size of what it's storing?

For example:

var arr = [];
arr[1000] = 'String value';

What's to stop me from coming along and storing an integer at another index?

arr[0] = 2;

Source:
https://stackoverflow.com/a/20323491/2506594

How are untyped javascript arrays laid out in memory considering they're not homogeneous?

It depends on the JavaScript engine implementation.

But in general, in JavaScript arrays, integers and floats are stored by value and all other objects by reference.

In V8 the array type will be either PACKED_ELEMENTS or HOLEY_ELEMENTS (depending on how the array was created/populated) and each string will additionally be stored separately on the heap.

To verify, use the %DebugPrint function in a debug build of the V8 engine (you can get one using the jsvu tool):

d8> var a = [1, 2, 'aaa']; %DebugPrint(a);
DebugPrint: 000003B13FECFC89: [JSArray]
 - elements: 0x03b13fecfc31 <FixedArray[3]> {
         0: 1
         1: 2
         2: 0x00c73b3e0fe1 <String[#3]: aaa>
   }

How are the JavaScript Arrays internally resizing?

In JavaScript, an Array is an abstraction. How it is implemented (and when allocation and resizing are performed) is left up to the JavaScript engine; the ECMAScript specification does not dictate how it is done. So there is basically no precise, engine-independent answer.

In practice, JavaScript engines are very clever about how they allocate memory, and they make sure not to allocate too much. In my opinion, they are far more sophisticated than C#'s List, because JavaScript engines can dynamically change the underlying data structure depending on the situation. The algorithms vary, but most will consider whether there are any "holes" in your array:

var array = [];
array[0] = "foo" // Is a resizable array
array[1] = "bar" // Is a resizable array
array[2] = "baz" // Is a resizable array
array[1000000] = "hello"; // Is now a hash table
console.log(array[1000000]) // "hello"

If you use arrays normally, with contiguous keys starting at zero, there are no "holes", and most JavaScript engines will represent the JavaScript array using a resizable-array data structure. Now consider the fourth assignment above: it creates a so-called "hole" roughly a million slots wide (spanning indices 3-999999). It turns out JavaScript engines are clever enough not to allocate ~1 million slots in memory for this massive hole. On detecting the hole, the engine switches to a dictionary/hash-table-like data structure to save space. It won't reserve space for the hole, just four mappings: (0, "foo"), (1, "bar"), (2, "baz"), (1000000, "hello").

Unfortunately, accessing the array is now slower for the engine, because it has to compute a hash and look the key up. When there are no holes, the engine uses a resizable array and access is faster; when there is a hole, access is slower. The common terminology is that an array without any holes is a dense array (resizable array underneath = better performance), and an array with one or more holes is a sparse array (hash table underneath = slower performance). For best performance in general, try to use dense arrays.
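You can observe the sparse behavior from ordinary code, without any engine internals, by checking which keys actually exist (a small sketch):

```javascript
// A sparse array with a huge gap between indices.
var sparse = [];
sparse[0] = "foo";
sparse[1000000] = "hello";

// Only the assigned elements exist; nothing is stored for the hole.
console.log(Object.keys(sparse));        // ["0", "1000000"]
console.log(Object.keys(sparse).length); // 2
console.log(sparse.length);              // 1000001
```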

Now to finish off, let me tell you that the following is a bad idea:

var array = new Array(1000000);
array[0] = "foo"; // Is a hash table

The array above has a hole of size ~1 million (it's like this: ["foo", undefined, undefined, ... undefined]) and therefore uses a hash table as its underlying data structure. So implementing the resizing yourself is a bad idea: it creates a hole and makes performance worse, not better. You're only confusing the JavaScript engine.

This is what your code was doing: your array always had a hole in it and therefore used a hash table as its underlying data structure, giving slower performance than an array without any holes (i.e. the first version of your code).

Am I correct to assume that not much can be done to speed this process up?

Yes, there is little to be done on the user's side regarding pre-allocation of space. To speed up JavaScript arrays in general, you want to avoid creating sparse arrays (avoid creating holes):

  1. Don't pre-allocate using new Array(size). Instead "grow as you go". The engine will work out the size of the underlying resizable array itself.
  2. Use contiguous integer keys starting at 0. Don't start from a big integer. Don't add keys that are not integers (e.g. don't use strings as keys).
  3. Try not to delete keys in the middle of arrays (don't delete the element at index 5 from an array with indices 0-9 filled in).
  4. Don't convert to and from dense and sparse arrays (i.e. don't repeatedly add and remove holes). There's an overhead for the engine to convert to and from the resizable array vs hash-table representations.
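Tip 1 in code form, as a sketch: growing an array keeps it dense, while pre-sizing it starts it full of holes:

```javascript
// Preferred: grow as you go - every slot actually exists.
var grown = [];
for (var i = 0; i < 5; i++) {
  grown.push(i * i);
}
console.log(grown); // [0, 1, 4, 9, 16]

// Avoid: new Array(size) creates only holes.
var prealloc = new Array(5);
console.log(0 in prealloc);         // false - no slot exists yet
prealloc[0] = 0;
console.log(Object.keys(prealloc)); // ["0"] - the other slots are still holes
```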

The disadvantage of [JavaScript Arrays over C# Lists is that they] dynamically allocate more memory each time a new item is added

No, not necessarily. C# Lists and JavaScript Arrays are basically the same when the JavaScript array has no holes. Both are resizable arrays. The difference is that:

  1. C# Lists give the user more control over the behaviour of the resizable array. In JavaScript, you have no control over it; it's inside the engine.
  2. C# Lists allow the user to preallocate memory for better performance, whereas in JavaScript you should let the engine automatically work out how to preallocate memory in the underlying resizable array for better performance.

