JavaScript ES6 Computational/Time Complexity of Collections

Javascript ES6 computational/time complexity of collections

Right from that very paragraph you linked to:

Set objects must be implemented using [mechanisms] that, on average, provide access times that are sublinear on the number of elements in the collection.

You will find the same sentence for Maps, WeakMaps and WeakSets.

It looks like the ECMA spec mandates that the implementations (e.g. Set.prototype.has) use a linear-time (O(n)) algorithm.

No:

The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects. It is not intended to be a viable implementation model.

The observable semantics are mostly related to the predictable iteration order (which can still be implemented efficiently). The specification does expect an implementation to use a hash table or something similar with constant access, though trees (with logarithmic access complexity) are allowed as well.
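For instance, insertion-order iteration is the kind of observable behaviour the spec pins down, while the lookup strategy underneath is left to the engine. A minimal sketch of that observable contract:

// Iteration order is observable and must follow insertion order,
// regardless of how the engine stores the entries internally.
const s = new Set();
s.add("b");
s.add("a");
s.add("c");
console.log([...s]); // ["b", "a", "c"] -- insertion order, not sorted order

// Lookup cost, by contrast, is only constrained to be sublinear on average.
console.log(s.has("a")); // true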

Is the Set.has() method O(1) and Array.indexOf O(n)?

If one reads the specification of has(), there is an algorithm describing it:

Algorithm for Set.prototype.has(value):

The following steps are taken:

  • Let S be the this value.
  • If Type(S) is not Object, throw a TypeError exception.
  • If S does not have a [[SetData]] internal slot, throw a TypeError exception.
  • Let entries be the List that is the value of S’s [[SetData]] internal slot.
  • Repeat for each e that is an element of entries,

    • If e is not empty and SameValueZero(e, value) is true, return true.
  • Return false.
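For illustration only, these steps can be transcribed almost literally into a linear scan over a list; this mirrors the spec's descriptive model, not what an engine actually ships:

// Illustrative transcription of the spec steps; real engines do NOT
// implement has() as a linear scan over a List.
const EMPTY = Symbol("empty"); // stand-in for the spec's "empty" marker

function specLikeHas(setData, value) {
  // setData plays the role of the [[SetData]] internal slot (a List).
  for (const e of setData) {
    // SameValueZero: like ===, except that NaN equals NaN.
    const sameValueZero =
      e === value || (Number.isNaN(e) && Number.isNaN(value));
    if (e !== EMPTY && sameValueZero) {
      return true;
    }
  }
  return false;
}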

And apparently, based on that algorithm and the presence of the word Repeat, one might be confused into thinking it is O(n) rather than O(1). However, elsewhere in the specification we can read that:

Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection.

Thanks to @CertainPerformance for pointing this out.

So, we can create a test to compare Array.indexOf() and Set.has() in the worst case, i.e. look for an item that isn't in the array at all (thanks to @aquinas for pointing this test):

// Initialize array.
let a = [];
for (let i = 1; i < 500; i++) { a.push(i); }

// Initialize set.
let s = new Set(a);

// Initialize object.
let o = {};
a.forEach(x => o[x] = true);

// Test Array.indexOf().
console.time("Test_Array.indexOf()");
for (let i = 0; i <= 10000000; i++) { a.indexOf(1000); }
console.timeEnd("Test_Array.indexOf()");

// Test Set.has().
console.time("Test_Set.has()");
for (let i = 0; i <= 10000000; i++) { s.has(1000); }
console.timeEnd("Test_Set.has()");

// Test Object.hasOwnProperty().
console.time("Test_Object.hasOwnProperty()");
for (let i = 0; i <= 10000000; i++) { o.hasOwnProperty(1000); }
console.timeEnd("Test_Object.hasOwnProperty()");

Time complexity of Set and Map in javascript

According to the ECMA documentation for Set and Maps (http://www.ecma-international.org/ecma-262/6.0/index.html#sec-set-objects):

Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection. The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects. It is not intended to be a viable implementation model.

You will find similar sentences for Maps, WeakMaps and WeakSets. So, you should expect the time complexity to be sublinear. You can also check out the answer to Javascript ES6 computational/time complexity of collections above.
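If you want to convince yourself empirically, here is a rough scaling check (sizes and iteration counts are arbitrary): the per-lookup cost of Map.has should stay roughly flat as the collection grows, while a linear Array.prototype.includes scan should not:

// Rough, unscientific scaling check: Map.has on a miss (worst case)
// versus a linear array scan, at a few different collection sizes.
function timeLookups(size, lookups = 1e6) {
  const arr = Array.from({ length: size }, (_, i) => i);
  const map = new Map(arr.map(i => [i, true]));

  console.time(`Map.has  (size ${size})`);
  for (let i = 0; i < lookups; i++) map.has(-1);            // always a miss
  console.timeEnd(`Map.has  (size ${size})`);

  console.time(`includes (size ${size})`);
  for (let i = 0; i < lookups / 100; i++) arr.includes(-1); // fewer iterations, scan still grows with size
  console.timeEnd(`includes (size ${size})`);
}

[1e3, 1e4, 1e5].forEach(n => timeLookups(n));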

es6 Map and Set complexity, v8 implementation

Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?

Yes. V8 uses a variant of hash tables that generally have O(1) complexity for these operations.

For details, you might want to have a look at https://codereview.chromium.org/220293002/ where OrderedHashTable is implemented based on https://wiki.mozilla.org/User:Jorend/Deterministic_hash_tables.
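The core idea of that design (deterministic iteration order on top of hashed lookup) can be sketched in a few lines. This is a simplified model for intuition only, not V8's actual OrderedHashTable; names and sizes are made up:

// Simplified model of an insertion-ordered hash map: a dense array of
// entries (deterministic iteration) plus hash buckets that store indexes
// into that array (average O(1) lookup). Deletion/resizing omitted.
class TinyOrderedMap {
  constructor(bucketCount = 8) {
    this.entries = [];                                   // [key, value] in insertion order
    this.buckets = Array.from({ length: bucketCount }, () => []);
  }
  _bucket(key) {
    const h = String(key).split("").reduce((a, c) => (a * 31 + c.charCodeAt(0)) | 0, 0);
    return this.buckets[Math.abs(h) % this.buckets.length];
  }
  set(key, value) {
    const bucket = this._bucket(key);
    for (const idx of bucket) {
      if (this.entries[idx][0] === key) { this.entries[idx][1] = value; return this; }
    }
    bucket.push(this.entries.push([key, value]) - 1);    // remember the new entry's index
    return this;
  }
  get(key) {
    for (const idx of this._bucket(key)) {
      if (this.entries[idx][0] === key) return this.entries[idx][1];
    }
    return undefined;
  }
  *[Symbol.iterator]() { yield* this.entries; }          // iterate in insertion order
}

const m = new TinyOrderedMap().set("b", 2).set("a", 1);
console.log(m.get("a"), [...m]); // 1 [ [ 'b', 2 ], [ 'a', 1 ] ]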

Time complexity between two iterate methods of immutable collection

As you pointed out, the semantics are slightly different. In the example case they both provide an intersection of the ids, so


> JSON.stringify(s1())
'{"id1":{"id":"id1"},"id100":{"id":"id100"}}'
> JSON.stringify(s2())
'{"id1":{"id":"id1"},"id100":{"id":"id100"}}'

However, there are edge cases with the data structure which do not produce like for like results, such as the example you gave in the comment:

const data = Map({
  id1: Map({id: 'id1'}),
  id2: Map({id: 'id1'}),
  id100: Map({id: 'id100'}),
});

const ids = List(['id1', 'id100']);

...

> JSON.stringify(s1())
'{"id1":{"id":"id1"},"id2":{"id":"id1"},"id100":{"id":"id100"}}'
> JSON.stringify(s2())
'{"id1":{"id":"id1"},"id100":{"id":"id100"}}'

Note. The above case looks like a bug as the id in the (value) map doesn't match the id of the key; but who knows?

In general

  • approach 1 produces one item for each entry in the top-level map (data) whose value has an id that is contained in the list.
  • approach 2 produces one item for each value in the list that has an entry (by key) in the top-level map (data).

As the two approaches differ in how they look into the map (data) -- one goes by the key, the other by the value of id in the value map -- if there is an inconsistency between these two values, as in the second example, you will get a difference.

In general you may be better off with the second approach, as the lookup into the Map will likely be cheaper than the lookup into the list if both collections are of similar size. If the collections are of largely different sizes, you would need to take that into account.

Lookup into the list is O(L) whereas lookup into the Map is documented as O(log32 M) (so some kind of wide-tree implementation). So for a map of size M and a list of size L, the cost of approach 1 would be O(M * L), whereas the cost of approach 2 would be O(L * log32 M); if M == L (or is close), then of course approach 2 wins, on paper.
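Since the original s1/s2 are not shown here, the following is a hypothetical reconstruction of the two approaches with Immutable.js (using the first, consistent data set) just to make the cost difference concrete:

const { Map, List } = require('immutable');

const data = Map({
  id1: Map({ id: 'id1' }),
  id100: Map({ id: 'id100' }),
});
const ids = List(['id1', 'id100']);

// Approach 1: walk the map, test each value's id against the list.
// List.includes is a linear scan, so this is O(M * L).
const s1 = () => data.filter(v => ids.includes(v.get('id')));

// Approach 2: walk the list, look each id up in the map.
// Map.has / Map.get are O(log32 M), so this is O(L * log32 M).
const s2 = () =>
  ids.reduce(
    (acc, id) => (data.has(id) ? acc.set(id, data.get(id)) : acc),
    Map()
  );

console.log(JSON.stringify(s1())); // '{"id1":{"id":"id1"},"id100":{"id":"id100"}}'
console.log(JSON.stringify(s2())); // '{"id1":{"id":"id1"},"id100":{"id":"id100"}}'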

It's almost always best to profile these things, rather than worry about the theoretical time complexity.

Practically, there may be another nice approach that relies on the already-sorted nature of the map. If you sort the list first (O(L log L)), you can then use a sliding window over the elements of both to find the intersection...
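A sketch of that idea, assuming you can obtain the map's keys as a sorted sequence (plain sorted arrays stand in for the Immutable collections to keep it self-contained):

// Two-pointer intersection of two sorted key sequences: O(M + L) after
// the O(L log L) sort of the list.
function sortedIntersection(sortedMapKeys, ids) {
  const sortedIds = [...ids].sort();
  const result = [];
  let i = 0, j = 0;
  while (i < sortedMapKeys.length && j < sortedIds.length) {
    if (sortedMapKeys[i] === sortedIds[j]) { result.push(sortedMapKeys[i]); i++; j++; }
    else if (sortedMapKeys[i] < sortedIds[j]) { i++; }
    else { j++; }
  }
  return result;
}

console.log(sortedIntersection(['id1', 'id100', 'id2'], ['id100', 'id1']));
// -> [ 'id1', 'id100' ] (lexicographic order, so 'id100' sorts before 'id2')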

Time Space Complexity of k smallest of unsorted array

Your solution uses a characteristic of JavaScript objects: keys that are decimal representations of indexes will be iterated in sorted order when calling functions like Object.entries.

From the specification we can only learn that setting and getting object properties must have sub-linear time complexity (see Javascript ES6 computational/time complexity of collections), so it is not an absolute requirement of the language that these operations run in constant time.

If these operations were constant-time and iterating over the properties took linear time, then we would have found a method to sort numbers in linear time, which is not possible unless restrictions apply that allow for a non-comparison sorting algorithm, such as radix sort.

And there are restrictions here: object keys are only iterated in their numerical order when these numbers are integers in the range 0 to 2^31-1. So this does not apply to:

  • negative values
  • fractional numbers
  • numbers greater than 2^31-1 (See also Object.keys order for large numerical indexes?)

Such keys will be iterated after other numbers, in the order they were inserted (which is what also happens with keys that are not numeric representations at all). So your solution can produce wrong results when such cases occur.
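A quick demonstration of that ordering behaviour in current engines (the exact cutoff for "too large" is discussed in the linked question):

// Integer-like keys are iterated first, in ascending numeric order;
// everything else (negative, fractional, or too-large "numbers") becomes
// a plain string key and is iterated in insertion order afterwards.
const o = {};
o[10] = 'a';
o[2] = 'b';
o[-3] = 'c';       // negative -> string key "-3"
o[3.1] = 'd';      // fractional -> string key "3.1"
o[2 ** 32] = 'e';  // too large -> string key "4294967296"
console.log(Object.keys(o));
// -> [ '2', '10', '-3', '3.1', '4294967296' ]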

Here is a run of your code on slightly adapted inputs that violate one of the above conditions:

let input, k;

input = [7, 10, 4, -3, 20, 15]; // Notice -3
console.log(kSmallest(input, 3)); // 10 (should be 7)

input = [7, 10, 4, 3.1, 20, 15]; // Notice 3.1
console.log(kSmallest(input, 4)); // 15 (should be 10)

input = [12000000000, 3000000000, 5000000000, 7000000000, 19000000000]; // Big numbers
console.log(kSmallest(input, 2)); // 12000000000 (should be 5000000000)

// Your functions (unchanged)
function orderedMapFrequency(array) {
  const map = {};
  for (let i = 0; i < array.length; i++) {
    if (!map[array[i]]) {
      map[array[i]] = 1;
    } else {
      map[array[i]]++;
    }
  }
  return map;
}

function kSmallest(arr, k) {
  let map = orderedMapFrequency(arr);
  let frequencies = 0;
  for (const [key, val] of Object.entries(map)) {
    frequencies = frequencies + val;
    if (frequencies >= k) {
      return key;
    }
  }
}

Computational complexity for DFS

The algorithmic complexity of both methods is the same: O(m * n), where m is the total number of elements to visit and n is the number of neighbors (assuming it's the same for every element).

In the first case, the "n" part of the complexity is attributed to adding and removing elements to/from the queue plus checking the visited status of the element. Each element may be added to the queue up to n+1 times.

In the second case, the "n" part of the complexity comes from the visited/queued check, which is performed n times per element (once per neighbor).

The difference between the two is only a constant factor, and big-O notation omits constant factors.
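For concreteness, here is a hedged sketch of the two bookkeeping styles being compared, on a generic adjacency-list graph (a hypothetical input shape, not the code from the original question):

// Variant 1: enqueue every neighbor, check `visited` when dequeuing.
// Each node can be pushed once per in-neighbor, so up to n times.
function traverseCheckOnPop(graph, start) {
  const visited = new Set();
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    if (visited.has(node)) continue;        // duplicates are skipped here
    visited.add(node);
    for (const next of graph[node]) queue.push(next);
  }
  return visited;
}

// Variant 2: check visited/queued status before enqueuing.
// Each node is pushed at most once; the n work per node is the checks.
function traverseCheckOnPush(graph, start) {
  const visited = new Set([start]);         // marks "visited or already queued"
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of graph[node]) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return visited;
}

const graph = { a: ['b', 'c'], b: ['a', 'c'], c: ['a', 'b'] };
console.log([...traverseCheckOnPop(graph, 'a')]);  // [ 'a', 'b', 'c' ]
console.log([...traverseCheckOnPush(graph, 'a')]); // [ 'a', 'b', 'c' ]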


