Es6 Map and Set Complexity, V8 Implementation

es6 Map and Set complexity, v8 implementation

Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?

Yes. V8 uses a variant of hash tables that generally have O(1) complexity for these operations.

For details, you might want to have a look at https://codereview.chromium.org/220293002/ where OrderedHashTable is implemented based on https://wiki.mozilla.org/User:Jorend/Deterministic_hash_tables.

Javascript ES6 computational/time complexity of collections

Right from that very paragraph your linked to:

Set objects must be implemented using [mechanisms] that, on average, provide access times that are sublinear on the number of elements in the collection.

You will find the same sentence for Maps, WeakMaps and WeakSets.

It looks the ECMA spec mandates that the implementations (e.g. Set.prototype.has) are to use a linear time (O(n)) algorithm.

No:

The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects. It is not intended to be a viable implementation model.

The observable semantics are mostly related to the predictable iteration order (which still can be implemented efficient and fast). It is indeed expected by the specification that an implementation uses a hash table or something similar with constant access, though trees (with logarithmic access complexity) are allowed as well.

ES6 Maps and Sets: how are object keys indexed efficiently?

Yes, the implementation is based on hashing, and has (amortized) constant access times.

"they use object identity" is a simplification; the full story is that ES Maps and Sets use the SameValueZero algorithm for determining equality.

In line with this specification, V8's implementation computes "real" hashes for strings and numbers, and chooses a random number as "hash" for objects, which it stores as a private (hidden) property on these objects for later accesses. (That's not quite ideal and might change in the future, but for now that's what it is.)

Using memoryAddress % keySpace cannot work because the garbage collector moves objects around, and rehashing all Maps and Sets every time any object might have moved would be prohibitively complicated and expensive.

Hashmaps and Time Complexity

My understanding is ES6 Map object can implement a Hashmap in Javascript. Is that correct?

No, it's not correct the way it is phrased. When you say implement you remind me of interfaces and languages like java. In Java, there is the interface map and then there are different implementations like hashmap, treemap, etc.

In the case of javascript, you could use map (or you could even use an array), to implement a hashmap (or hashtable) algorithm.

Do note that in some browsers like v8, map is already built using hashtables so it is already a hashmap. Unfortunately, the ECMAScript 2015 specification (see https://262.ecma-international.org/6.0/#sec-map-objects ) does not specify or dictate the exact implementation. Instead, it is quite open ended allowing each browser or javascript engine to have a compatible implementation.

Map object must be implemented using either hash tables or other
mechanisms
that, on average, provide access times that are sublinear
on the number of elements in the collection. The data structures used
in this Map objects specification is only intended to describe the
required observable semantics of Map objects. It is not intended to be
a viable implementation model.

If a hashtable is used, then it is a hashmap (not the same exactly as java), but the underlying implementation depends on the actual browser.

Note/FYI: Maps in google's V8 engine are built on top of hash tables.

indexOf method in arrays has an O(n) time complexity.

Yes. You cannot do better than O(n) in a random unsorted array.

Does has method in Maps have O(1) time complexity?

Usually yes. Considering that the map uses hashtables (or somethling like hashtables).
Additionally, there has to be:

  • descent hash function that is constant time and doesn't take long to compute
  • not too many collisions because we would then have to iterate through multiple items that had the same hash code.

O(1) isn't always guaranteed, and the worst case scenario of O(N) is quite unlikely.

Have a look @ the theory behind hashmaps and hashtables :

  • https://en.wikipedia.org/wiki/Hash_table
  • https://www.tutorialspoint.com/data_structures_algorithms/hash_data_structure.htm

How JS can find an element in a Map object in one step?

Map uses a hashtable (or a similar mechanism) as specified in the spec.
Therefore, when an object is requested a hash is calculated.
Then the specific location in an internal table is located.
This is the basic theory about hash tables and why they are usualy O(1).
Do note, that if there are many collisions performance can move towards O(N).

See: https://stackoverflow.com/a/9214594/1688441 from Hash table runtime complexity (insert, search and delete)

Hash tables are O(1) average and amortized case complexity, however it suffers from O(n) > worst case time complexity. [And I think this is where your confusion is]

That excellent answer lists the situations of worst cases.

How it works differently than indexOf

Hashtables use a hash and do a lookup inside a table/array.
Instead, indexOf searches step by step within an array/collection/string. See: https://262.ecma-international.org/5.1/#sec-15.4.4.14

If not having an O(1) then Es6 Map is not a real Hashmap

This is not correct. A hashmap can have a worst-case performance of O(N).
There is an excellent wikipedia article about hashmap/hashtable with an excellent table: (from: https://en.wikipedia.org/wiki/Hash_table )

Sample Image

  1. https://en.wikipedia.org/wiki/Hash_table
  2. Hash table runtime complexity (insert, search and delete)

Are JavaScript Map Objects Indexed to Optimize map.get?

Yes, as you would expect from such a data type, a Map does utilize a hash-table under the hood.

Proof by source:

As always, the proof is in the source:

Excerpt from V8 source file src/objects.h class JSMap:

// The JSMap describes EcmaScript Harmony maps
class JSMap : public JSCollection {
public:
DECLARE_CAST(JSMap)

static void Initialize(Handle<JSMap> map, Isolate* isolate);
static void Clear(Handle<JSMap> map);

// Dispatched behavior.
DECLARE_PRINTER(JSMap)
DECLARE_VERIFIER(JSMap)

private:
DISALLOW_IMPLICIT_CONSTRUCTORS(JSMap);
};

As we can see, JSMap extends JSCollection.

Now if we take a look at the declaration for JSCollection:

Excerpt from V8 source file src/objects.h class JSCollection:

class JSCollection : public JSObject {
public:
// [table]: the backing hash table
DECL_ACCESSORS(table, Object)

static const int kTableOffset = JSObject::kHeaderSize;
static const int kSize = kTableOffset + kPointerSize;

private:
DISALLOW_IMPLICIT_CONSTRUCTORS(JSCollection);
};

Here we can see that it uses a hash table, with a nice comment to clarify it.


There was some question as to if that hash table refers only to object properties, and not to the get method. As we can from the source code to Map.prototype.get, a hash map is being used.

Excerpt from V8 source file src/js/collection.js MapGet:

function MapGet(key) {
if (!IS_MAP(this)) {
throw MakeTypeError(kIncompatibleMethodReceiver,
'Map.prototype.get', this);
}
var table = %_JSCollectionGetTable(this);
var numBuckets = ORDERED_HASH_TABLE_BUCKET_COUNT(table);
var hash = GetExistingHash(key);
if (IS_UNDEFINED(hash)) return UNDEFINED;
var entry = MapFindEntry(table, numBuckets, key, hash);
if (entry === NOT_FOUND) return UNDEFINED;
return ORDERED_HASH_MAP_VALUE_AT(table, entry, numBuckets);
}

Excerpt from V8 source file src/js/collection.js MapFindEntry:

function MapFindEntry(table, numBuckets, key, hash) {
var entry = HashToEntry(table, hash, numBuckets);
if (entry === NOT_FOUND) return entry;
var candidate = ORDERED_HASH_MAP_KEY_AT(table, entry, numBuckets);
if (key === candidate) return entry;
var keyIsNaN = NumberIsNaN(key);
while (true) {
if (keyIsNaN && NumberIsNaN(candidate)) {
return entry;
}
entry = ORDERED_HASH_MAP_CHAIN_AT(table, entry, numBuckets);
if (entry === NOT_FOUND) return entry;
candidate = ORDERED_HASH_MAP_KEY_AT(table, entry, numBuckets);
if (key === candidate) return entry;
}
return NOT_FOUND;
}

Proof by benchmarking:

There is another way to test if it uses a hash map. Make many entries, and test what the longest and shortest lookup times are. Something like this:

'use strict';

let m = new Map();
let a = [];

for (let i = 0; i < 10000000; i++) {
let o = {};
m.set(o, i);
a.push(o);
}

let lookupLongest = null;
let lookupShortest = null;

a.forEach(function(item) {
let dummy;
let before = Date.now();
dummy = m.get(item);
let after = Date.now();
let diff = after - before;
if (diff > lookupLongest || lookupLongest === null) {
lookupLongest = diff;
}
if (diff < lookupShortest || lookupShortest === null) {
lookupShortest = diff;
}
});
console.log('Longest Lookup Time:', lookupLongest);
console.log('Shortest Lookup Time:', lookupShortest);

After a few seconds, I get the following output:

$ node test.js
Longest Lookup Time: 1
Shortest Lookup Time: 0

Such close lookup times would certainly not be possible if it where looping though every entry.

How is a JavaScript hash map implemented?

every javascript object is a simple hashmap which accepts a string or a Symbol as its key, so you could write your code as:

var map = {};
// add a item
map[key1] = value1;
// or remove it
delete map[key1];
// or determine whether a key exists
key1 in map;

javascript object is a real hashmap on its implementation, so the complexity on search is O(1), but there is no dedicated hashcode() function for javascript strings, it is implemented internally by javascript engine (V8, SpiderMonkey, JScript.dll, etc...)

2020 Update:

javascript today supports other datatypes as well: Map and WeakMap. They behave more closely as hash maps than traditional objects.



Related Topics



Leave a reply



Submit