How Is a JavaScript Hash Map Implemented

How is a JavaScript hash map implemented?

every javascript object is a simple hashmap which accepts a string or a Symbol as its key, so you could write your code as:

var map = {};
// add a item
map[key1] = value1;
// or remove it
delete map[key1];
// or determine whether a key exists
key1 in map;

javascript object is a real hashmap on its implementation, so the complexity on search is O(1), but there is no dedicated hashcode() function for javascript strings, it is implemented internally by javascript engine (V8, SpiderMonkey, JScript.dll, etc...)

2020 Update:

javascript today supports other datatypes as well: Map and WeakMap. They behave more closely as hash maps than traditional objects.

JavaScript hashmap equivalent

Hash your objects yourself manually, and use the resulting strings as keys for a regular JavaScript dictionary. After all, you are in the best position to know what makes your objects unique. That's what I do.

Example:

var key = function(obj){
// Some unique object-dependent key
return obj.totallyUniqueEmployeeIdKey; // Just an example
};

var dict = {};

dict[key(obj1)] = obj1;
dict[key(obj2)] = obj2;

This way you can control indexing done by JavaScript without heavy lifting of memory allocation, and overflow handling.

Of course, if you truly want the "industrial-grade solution", you can build a class parameterized by the key function, and with all the necessary API of the container, but … we use JavaScript, and trying to be simple and lightweight, so this functional solution is simple and fast.

The key function can be as simple as selecting right attributes of the object, e.g., a key, or a set of keys, which are already unique, a combination of keys, which are unique together, or as complex as using some cryptographic hashes like in DojoX encoding, or DojoX UUID. While the latter solutions may produce unique keys, personally I try to avoid them at all costs, especially, if I know what makes my objects unique.

Update in 2014: Answered back in 2008 this simple solution still requires more explanations. Let me clarify the idea in a Q&A form.

Your solution doesn't have a real hash. Where is it???

JavaScript is a high-level language. Its basic primitive (Object) includes a hash table to keep properties. This hash table is usually written in a low-level language for efficiency. Using a simple object with string keys we use an efficiently implemented hash table without any efforts on our part.

How do you know they use a hash?

There are three major ways to keep a collection of objects addressable by a key:

  • Unordered. In this case to retrieve an object by its key we have to go over all keys stopping when we find it. On average it will take n/2 comparisons.
  • Ordered.
    • Example #1: a sorted array — doing a binary search we will find our key after ~log2(n) comparisons on average. Much better.
    • Example #2: a tree. Again it'll be ~log(n) attempts.
  • Hash table. On average, it requires a constant time. Compare: O(n) vs. O(log n) vs. O(1). Boom.

Obviously JavaScript objects use hash tables in some form to handle general cases.

Do browser vendors really use hash tables???

Really.

  • Chrome/node.js/V8:
    JSObject. Look for
    NameDictionary and
    NameDictionaryShape with
    pertinent details in objects.cc
    and objects-inl.h.
  • Firefox/Gecko:
    JSObject,
    NativeObject, and
    PlainObject with pertinent details in
    jsobj.cpp and
    vm/NativeObject.cpp.

Do they handle collisions?

Yes. See above. If you found a collision on unequal strings, please do not hesitate to file a bug with a vendor.

So what is your idea?

If you want to hash an object, find what makes it unique and use it as a key. Do not try to calculate a real hash or emulate hash tables — it is already efficiently handled by the underlying JavaScript object.

Use this key with JavaScript's Object to leverage its built-in hash table while steering clear of possible clashes with default properties.

Examples to get you started:

  • If your objects include a unique user name — use it as a key.
  • If it includes a unique customer number — use it as a key.
    • If it includes unique government-issued numbers like US SSNs, or a passport number, and your system doesn't allow duplicates — use it as a key.
  • If a combination of fields is unique — use it as a key.
    • US state abbreviation + driver license number makes an excellent key.
    • Country abbreviation + passport number is an excellent key too.
  • Some function on fields, or a whole object, can return a unique value — use it as a key.

I used your suggestion and cached all objects using a user name. But some wise guy is named "toString", which is a built-in property! What should I do now?

Obviously, if it is even remotely possible that the resulting key will exclusively consists of Latin characters, you should do something about it. For example, add any non-Latin Unicode character you like at the beginning or at the end to un-clash with default properties: "#toString", "#MarySmith". If a composite key is used, separate key components using some kind of non-Latin delimiter: "name,city,state".

In general, this is the place where we have to be creative and select the easiest keys with given limitations (uniqueness, potential clashes with default properties).

Note: unique keys do not clash by definition, while potential hash clashes will be handled by the underlying Object.

Why don't you like industrial solutions?

IMHO, the best code is no code at all: it has no errors, requires no maintenance, easy to understand, and executes instantaneously. All "hash tables in JavaScript" I saw were >100 lines of code, and involved multiple objects. Compare it with: dict[key] = value.

Another point: is it even possible to beat a performance of a primordial object written in a low-level language, using JavaScript and the very same primordial objects to implement what is already implemented?

I still want to hash my objects without any keys!

We are in luck: ECMAScript 6 (released in June 2015) defines map and set.

Judging by the definition, they can use an object's address as a key, which makes objects instantly distinct without artificial keys. OTOH, two different, yet identical objects, will be mapped as distinct.

Comparison breakdown from MDN:

Objects are similar to Maps in that both let you set keys to values,
retrieve those values, delete keys, and detect whether something is
stored at a key. Because of this (and because there were no built-in
alternatives), Objects have been used as Maps historically; however,
there are important differences that make using a Map preferable in
certain cases:

  • The keys of an Object are Strings and Symbols, whereas they can be any value for a Map, including functions, objects, and any primitive.
  • The keys in Map are ordered while keys added to object are not. Thus, when iterating over it, a Map object returns keys in order of
    insertion.
  • You can get the size of a Map easily with the size property, while the number of properties in an Object must be determined manually.
  • A Map is an iterable and can thus be directly iterated, whereas iterating over an Object requires obtaining its keys in some fashion
    and iterating over them.
  • An Object has a prototype, so there are default keys in the map that could collide with your keys if you're not careful. As of ES5 this can
    be bypassed by using map = Object.create(null), but this is seldom
    done.
  • A Map may perform better in scenarios involving frequent addition and removal of key pairs.

How to implement Java's HashMap.equals(HashMap) in JavaScript

Here is the code that I created that works:

const addToMap = (map, s) => {
if(map.has(s)){
map.set(s, map.get(s)+1);
} else{ map.set(s, 1);}
}

const perm = (a, b) => {
if(a.length != b.length){ return false; }
let aMap = new Map();
let bMap = new Map();
for(let i = 0; i < a.length; i++){
addToMap(aMap, a.substring(i, i+1));
addToMap(bMap, b.substring(i, i+1));
}
return aMap.toString() == bMap.toString();
}
const s1 = "abcda";
const s2 = "cdbaa";
console.log(perm(s1, s1));

What happened is I was using the wrong syntax for Map(). I was using bracket notation to create my key value pairs instead or Map.prototype methods.

See this code sample:

let m = new Map();
m.set(1, "a");
m[2] = "b";
console.log(m);
//Output: Map { 1 => 'a', 2: 'b' }

Using the Map.get and Map.set methods works for me.

What are the advantages of implementing a hash map over using a Map in JavaScript?

Rolling your own Map implementation does allow you to control its internals. This does allow you to use a custom equality comparison between keys, and use a hash function that is optimised for the keys in your application.

Hashmaps and Time Complexity

My understanding is ES6 Map object can implement a Hashmap in Javascript. Is that correct?

No, it's not correct the way it is phrased. When you say implement you remind me of interfaces and languages like java. In Java, there is the interface map and then there are different implementations like hashmap, treemap, etc.

In the case of javascript, you could use map (or you could even use an array), to implement a hashmap (or hashtable) algorithm.

Do note that in some browsers like v8, map is already built using hashtables so it is already a hashmap. Unfortunately, the ECMAScript 2015 specification (see https://262.ecma-international.org/6.0/#sec-map-objects ) does not specify or dictate the exact implementation. Instead, it is quite open ended allowing each browser or javascript engine to have a compatible implementation.

Map object must be implemented using either hash tables or other
mechanisms
that, on average, provide access times that are sublinear
on the number of elements in the collection. The data structures used
in this Map objects specification is only intended to describe the
required observable semantics of Map objects. It is not intended to be
a viable implementation model.

If a hashtable is used, then it is a hashmap (not the same exactly as java), but the underlying implementation depends on the actual browser.

Note/FYI: Maps in google's V8 engine are built on top of hash tables.

indexOf method in arrays has an O(n) time complexity.

Yes. You cannot do better than O(n) in a random unsorted array.

Does has method in Maps have O(1) time complexity?

Usually yes. Considering that the map uses hashtables (or somethling like hashtables).
Additionally, there has to be:

  • descent hash function that is constant time and doesn't take long to compute
  • not too many collisions because we would then have to iterate through multiple items that had the same hash code.

O(1) isn't always guaranteed, and the worst case scenario of O(N) is quite unlikely.

Have a look @ the theory behind hashmaps and hashtables :

  • https://en.wikipedia.org/wiki/Hash_table
  • https://www.tutorialspoint.com/data_structures_algorithms/hash_data_structure.htm

How JS can find an element in a Map object in one step?

Map uses a hashtable (or a similar mechanism) as specified in the spec.
Therefore, when an object is requested a hash is calculated.
Then the specific location in an internal table is located.
This is the basic theory about hash tables and why they are usualy O(1).
Do note, that if there are many collisions performance can move towards O(N).

See: https://stackoverflow.com/a/9214594/1688441 from Hash table runtime complexity (insert, search and delete)

Hash tables are O(1) average and amortized case complexity, however it suffers from O(n) > worst case time complexity. [And I think this is where your confusion is]

That excellent answer lists the situations of worst cases.

How it works differently than indexOf

Hashtables use a hash and do a lookup inside a table/array.
Instead, indexOf searches step by step within an array/collection/string. See: https://262.ecma-international.org/5.1/#sec-15.4.4.14

If not having an O(1) then Es6 Map is not a real Hashmap

This is not correct. A hashmap can have a worst-case performance of O(N).
There is an excellent wikipedia article about hashmap/hashtable with an excellent table: (from: https://en.wikipedia.org/wiki/Hash_table )

Sample Image

  1. https://en.wikipedia.org/wiki/Hash_table
  2. Hash table runtime complexity (insert, search and delete)

Hash table and Hash map in javascript

Your picture is one way to implement a hash table data structure - there are others. Your image is an example of "separate chaining", "open addressing" is another common strategy. As of Java 7 both Hashtable and HashMap used a form of chaining (look at their Entry classes), however Java 8 introduced a massive rewrite of HashMap (but not Hashtable, as it's a legacy class) to use a tree structure rather than a linked list for its chains. The point being the exact algorithm used by both classes is an implementation detail and is subject to change between releases.

"HashMap" and "Hashtable" are just class names defined by the JDK, they do not necessarily correspond to particular hashing algorithms. JavaScript does not have separate "HashMap" and "Hashtable" concepts, because they don't need them. Java needed to create a separate HashMap class because the Hashtable contract was problematic and couldn't be safely corrected.

So, to answer your questions:

1) Sort-of; it does not properly represent Java 8's HashMap, but it's conceptually not far off. Hashtable does still use this algorithm, but you should almost never use Hashtable.

2) Like you say, JavaScript objects use a hash table data structure under the covers. This does (generally) provide O(1) field access, but like any hash table implementation it must deal with the possibility of conflicts. As discussed in the question @overexchange linked to the exact implementation is browser-dependent.

es6 Map and Set complexity, v8 implementation

Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?

Yes. V8 uses a variant of hash tables that generally have O(1) complexity for these operations.

For details, you might want to have a look at https://codereview.chromium.org/220293002/ where OrderedHashTable is implemented based on https://wiki.mozilla.org/User:Jorend/Deterministic_hash_tables.

How is object lookup implemented by Map.get(obj) in the javascript v8 engine?

Short answer: ES6 Maps and Sets: how are object keys indexed efficiently? (with a minor update: four years later, the precise way how the random "hash" is stored on objects has indeed been improved; the principle remains the same).

Long answer: as Bergi suggested, read the source. Note that the details could change at any time (which is one reason why there isn't a whole lot of documentation about what the current implementation does).



Related Topics



Leave a reply



Submit