Set vs Array: Difference

The difference between Arrays and Sets in Kotlin

Short answer: an Array is an indexed data structure holding a fixed number of items; a Set is an unordered data structure holding items with no duplicates. Both are typed.

Long answer:

In basic terms, an array in Kotlin is like most other languages: a data structure holding values of the same type. Arrays have a fixed length, are ordered, and accessed randomly by index (starting from 0).

An Array maps directly to a Java array. This is the only data structure that the JVM provides, other than Object. (In fact, array types are treated as special subtypes of Object.) These are typed: you can have an array of a primitive type (e.g. an array of ints), or of a reference type (e.g. an array of Numbers). The type is available at run-time, and enforced. Arrays are treated as being covariant (which can cause run-time errors).

In Kotlin/JVM, Array inherits almost all those features (except for covariance). It's used only for arrays of reference types; Kotlin provides separate classes for arrays of primitives (IntArray etc.).

And in basic terms, a Set is like several other languages: an unordered data structure holding items of the same type, none of which are equal.

Set is an interface that's part of the Java Collections framework (which also includes Collection, List, and Map). A Set value can point to any object implementing that interface. It's generic, with a single type parameter specifying what values can be stored in the set; this happens at compile time (only).

In Kotlin, Set is covariant, and read-only; there's a MutableSet subinterface which adds mutator methods. MutableSets are variable-size, and grow and shrink as needed.

There are many different implementations of the Set (and MutableSet) interface, with different performance characteristics: insertion, removal, and testing for presence may be O(1) or O(n) or something in-between, and the memory usage and concurrency differ too.

Sets can be iterated, but the order is not specified in general. (Some implementations may make guarantees about whether the order is consistent, and if so, whether/how it relates to the values and/or how they were added.)

So, which should you use? If you need to prevent duplicates, then a Set is the obvious choice. Whereas if you need the values to be ordered, then an Array would be more suitable -- though because it's not part of the Collections framework, it doesn't always play well with other collection types. (Kotlin provides many extension methods to try to smooth over the gap, but there are still many corner cases.) So in general, it's usually better to use a List instead: that gives you finer control over mutability, avoids some awkward situations (especially regarding type parameters), doesn't fix the length, gives you many more extension methods, and is one character shorter!

Javascript Set vs. Array performance

Ok, I have tested adding, iterating and removing elements from both an array and a set. I ran a "small" test, using 10 000 elements and a "big" test, using 100 000 elements. Here are the results.

Adding elements to a collection

It would seem that the .push array method is about 4 times faster than the .add set method, no matter the number of elements being added.

Iterating over and modifying elements in a collection

For this part of the test I used a for loop to iterate over the array and a for...of loop to iterate over the set. Again, iterating over the array was faster, and the gap widened with size: the set took about twice as long as the array during the "small" tests and almost four times as long during the "big" tests.

Removing elements from a collection

Now this is where it gets interesting. I used a combination of a for loop and .splice to remove some elements from the array and I used for of and .delete to remove some elements from the set. For the "small" tests, it was about three times faster to remove items from the set (2.6 ms vs 7.1 ms) but things changed drastically for the "big" test where it took 1955.1 ms to remove items from the array while it only took 83.6 ms to remove them from the set, 23 times faster.

Conclusions

At 10k elements, both tests ran comparable times (array: 16.6 ms, set: 20.7 ms) but when dealing with 100k elements, the set was the clear winner (array: 1974.8 ms, set: 83.6 ms) but only because of the removing operation. Otherwise the array was faster. I couldn't say exactly why that is.
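A plausible explanation, offered as an assumption rather than something measured in the test: each .splice(i, 1) has to shift every later element down by one slot, so removing many items from a large array does work that grows roughly with the square of its length, while .delete on a set shifts nothing. The sketch below puts the splice loop next to a single-pass alternative using .filter; removeWithSplice and removeWithFilter are hypothetical names, not part of the original test.

```javascript
// Remove matching items in place, the way the test's array code does it.
// Every splice shifts all later elements, so many removals get expensive.
function removeWithSplice(items, shouldRemove) {
  for (let i = 0; i < items.length; i++) {
    if (shouldRemove(items[i])) {
      items.splice(i, 1);
      i--; // the next item has moved into index i, so look at it again
    }
  }
  return items;
}

// Build a new array in a single pass; nothing is shifted repeatedly.
function removeWithFilter(items, shouldRemove) {
  return items.filter(function (item) {
    return !shouldRemove(item);
  });
}

const isEven = function (n) { return n % 2 === 0; };
const viaSplice = removeWithSplice([1, 2, 3, 4, 5, 6], isEven);
const viaFilter = removeWithFilter([1, 2, 3, 4, 5, 6], isEven);
console.log(viaSplice, viaFilter); // [ 1, 3, 5 ] [ 1, 3, 5 ]
```

If the array does not have to be modified in place, the filter version avoids the repeated shifting altogether.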

I played around with some hybrid scenarios where an array was created and populated, then converted into a set where some elements would be removed, and the set was then converted back into an array. Although doing this gives much better performance than removing elements in the array, the additional processing time needed to transfer to and from a set outweighs the gains of populating an array instead of a set. In the end, it is faster to only deal with a set. Still, it is an interesting idea: if you use an array for some big data that doesn't have duplicates and ever need to remove many elements in one operation, it could be advantageous performance-wise to convert the array to a set, perform the removal, and convert the set back to an array.
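The hybrid idea can be sketched as follows. This is a minimal illustration, not the code used for the measurements; removeViaSet is a hypothetical helper, and it assumes the array holds no duplicates, since converting to a Set would silently drop them.

```javascript
// Convert to a Set, delete there, then convert back to an array.
function removeViaSet(items, shouldRemove) {
  const asSet = new Set(items); // array -> set
  for (const item of asSet) {
    if (shouldRemove(item)) {
      asSet.delete(item); // deleting the current item while iterating a Set is allowed
    }
  }
  return Array.from(asSet); // set -> array, insertion order is kept
}

const people = [
  { name: 'Al', sex: 'Male' },
  { name: 'Bea', sex: 'Female' },
  { name: 'Cy', sex: 'Male' },
  { name: 'Dee', sex: 'Female' }
];
const kept = removeViaSet(people, function (p) { return p.sex === 'Male'; });
console.log(kept.map(function (p) { return p.name; })); // [ 'Bea', 'Dee' ]
```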

Array code:

// Simple wall-clock timer; stop() logs the elapsed milliseconds.
var timer = function(name) {
  var start = new Date();
  return {
    stop: function() {
      var end = new Date();
      var time = end.getTime() - start.getTime();
      console.log('Timer:', name, 'finished in', time, 'ms');
    }
  };
};

// Random float in the range [min, max).
var getRandom = function(min, max) {
  return Math.random() * (max - min) + min;
};

var lastNames = ['SMITH', 'JOHNSON', 'WILLIAMS', 'JONES', 'BROWN', 'DAVIS', 'MILLER', 'WILSON', 'MOORE', 'TAYLOR', 'ANDERSON', 'THOMAS'];

var genLastName = function() {
  var index = Math.round(getRandom(0, lastNames.length - 1));
  return lastNames[index];
};

var sex = ["Male", "Female"];

var genSex = function() {
  var index = Math.round(getRandom(0, sex.length - 1));
  return sex[index];
};

var Person = function() {
  this.name = genLastName();
  this.age = Math.round(getRandom(0, 100));
  this.sex = "Male"; // everyone starts as "Male"; changeSex() reassigns it at random
};

// Adding: push 100 000 persons onto the array.
var genPersons = function() {
  for (var i = 0; i < 100000; i++)
    personArray.push(new Person());
};

// Iterating and modifying: give every person a random sex.
var changeSex = function() {
  for (var i = 0; i < personArray.length; i++) {
    personArray[i].sex = genSex();
  }
};

// Removing: splice out every "Male" person in place.
var deleteMale = function() {
  for (var i = 0; i < personArray.length; i++) {
    if (personArray[i].sex === "Male") {
      personArray.splice(i, 1);
      i--; // re-check this index, since the next element moved into it
    }
  }
};

var t = timer("Array");
var personArray = [];
genPersons();
changeSex();
deleteMale();
t.stop();

console.log("Done! There are " + personArray.length + " persons.");
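The set version of the test is not included here. A minimal sketch of what it might look like, assuming it mirrors the array code using .add, for...of and .delete as described in the text; runSetTest and its count parameter are hypothetical, and a small object literal stands in for Person so the snippet runs on its own.

```javascript
function runSetTest(count) {
  const sexes = ['Male', 'Female'];
  const personSet = new Set();

  // Adding: one object per person, like genPersons().
  for (let i = 0; i < count; i++) {
    personSet.add({ age: Math.round(Math.random() * 100), sex: 'Male' });
  }

  // Iterating and modifying: like changeSex().
  for (const person of personSet) {
    person.sex = sexes[Math.floor(Math.random() * sexes.length)];
  }

  // Removing: like deleteMale(), but with .delete instead of .splice.
  for (const person of personSet) {
    if (person.sex === 'Male') {
      personSet.delete(person);
    }
  }
  return personSet;
}

const start = Date.now();
const remaining = runSetTest(10000);
console.log('Set finished in', Date.now() - start, 'ms;', remaining.size, 'persons left');
```

Timings from a sketch like this will vary by engine and machine, so they should not be compared directly with the figures quoted above.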

Javascript set vs array vs object definition

Every value in a Set has to be unique, whereas you can push the same value into an Array as many times as you like.

"Set objects are collections of values. You can iterate through the elements of a set in insertion order. A value in the Set may only occur once; it is unique in the Set's collection."
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set
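A short illustration of the two behaviors described in that quote, uniqueness and insertion-order iteration, next to a plain array (the variable names are just for this example):

```javascript
const letters = new Set();
letters.add('b');
letters.add('a');
letters.add('b'); // ignored: 'b' is already in the set

const list = [];
list.push('b');
list.push('a');
list.push('b'); // arrays keep the duplicate

const fromSet = Array.from(letters);
console.log(fromSet); // [ 'b', 'a' ]  (insertion order, no duplicate)
console.log(list);    // [ 'b', 'a', 'b' ]
```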

Removing values: interestingly enough, with small amounts of data there isn't much performance difference between Array and Set, but once you start to deal with big data, removal is a lot faster in a Set than in an Array.

Adding values to an Array was about 4 times faster than adding them to a Set in the test linked below.

Iterating through the values: Array is the better performer here too, and the gap grows with the amount of data.

You can read more about the performance differences at:
Javascript Set vs. Array performance

"At 10k elements, both tests ran comparable times (array: 16.6 ms, set: 20.7 ms) but when dealing with 100k elements, the set was the clear winner (array: 1974.8 ms, set: 83.6 ms) but only because of the removing operation. Otherwise the array was faster. I couldn't say exactly why that is."

Counting unique items, Set vs Object write-up: https://github.com/anvaka/set-vs-object

TL;DR: Set is almost two times faster than Object.

Why was Set introduced (my guess)?

Sometimes you need an easy way to build up a collection of values while making sure each value is unique. Set removes a lot of the overhead of the old-fashioned approach: check whether the array already has the value, then push or skip.

This scenario was not covered in the performance post I linked earlier, so I ran a quick experiment: starting from a collection of 10,000 items, I added 15 items, 5 of which were duplicates. Adding to an Array while checking for and skipping duplicates took 74 ms; with a Set, just calling set.add(value) took 9 ms.
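The two approaches being compared look roughly like this. It is only a sketch, since the code of that quick experiment is not shown; addUniqueToArray and addUniqueToSet are hypothetical names, and the sample data is tiny on purpose.

```javascript
// Array: check for the value first, then push or skip.
function addUniqueToArray(target, values) {
  for (const value of values) {
    if (!target.includes(value)) { // scans the array on every insertion
      target.push(value);
    }
  }
  return target;
}

// Set: just add; duplicates are ignored by the Set itself.
function addUniqueToSet(target, values) {
  for (const value of values) {
    target.add(value);
  }
  return target;
}

const existing = [1, 2, 3];
const incoming = [3, 4, 4, 5];
const asArray = addUniqueToArray(existing.slice(), incoming);
const asSet = addUniqueToSet(new Set(existing), incoming);
console.log(asArray);           // [ 1, 2, 3, 4, 5 ]
console.log(Array.from(asSet)); // [ 1, 2, 3, 4, 5 ]
```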

When would you need this?

One real-life example that comes to mind: suppose you have an email marketing newsletter with these options:
- TV
- Music
- Movies
- Sports

One day you or your client would like to have a single email option called "Entertainment" covering TV, Music and Movies. You would need to build a new collection from these 3 lists and make sure each email address is added to the new Entertainment mailing list only once, since users could have opted in for both Movies and TV if they are interested in both.

Here you could just create a Set called Entertainment, iterate through the TV, Music and Movies lists, and call Entertainment.add(email); the Set takes care of skipping values that already exist, which avoids duplicate emails in the mailing list.

A Set would also be a good format for storing your subscriber addresses, as nobody could accidentally opt in twice for your emails.
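The Entertainment example could be sketched like this. The addresses are made up for illustration, and note that a Set compares strings exactly, so addresses differing only in letter case would count as different.

```javascript
// Hypothetical opt-in lists; some users appear in more than one.
const tv = ['ann@example.com', 'bob@example.com'];
const music = ['bob@example.com', 'cat@example.com'];
const movies = ['ann@example.com', 'dan@example.com'];

const entertainment = new Set();
for (const list of [tv, music, movies]) {
  for (const email of list) {
    entertainment.add(email); // already-present addresses are skipped
  }
}

console.log(Array.from(entertainment));
// [ 'ann@example.com', 'bob@example.com', 'cat@example.com', 'dan@example.com' ]
```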

Using a set vs using an array in C++

It doesn't make sense to talk about which is better, array or set, without understanding what you are trying to accomplish.

One thing to consider: what kind of container are the iterators pointing at, and is that container expected to be updated?

For example, if you are storing iterators to a vector (it doesn't matter where you put these iterators) and you update the vector, the previously stored iterators may be invalid. Be very careful about storing iterators.

It sounds like you are caching results for speed. If you need to cache results, you might be better off using one of the unordered containers. Use the same key for your search results as you do for your cache. Don't store the iterators, just the query key and the actual result. The unordered_set has O(1) average lookup time, and it doesn't take up very much space, as it hashes the key and stores the results.

Difference between Array, Set and Dictionary in Swift

Here are the practical differences between the different types:

Arrays are effectively ordered lists and are used to store lists of information in cases where order is important.

For example, posts in a social network app being displayed in a tableView may be stored in an array.

Sets differ in that order does not matter; use them in cases where you do not need the items in any particular order.

Sets are especially useful when you need to ensure that an item only appears once in the set.

Dictionaries are used to store key, value pairs and are used when you want to easily find a value using a key, just like in a dictionary.
For example, you could store a list of items and links to more information about these items in a dictionary.

Hope this helps :)

(For more information and to find Apple's own definitions, check out Apple's guides at https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/CollectionTypes.html)

difference between javascript array and jquery set

As the person who was responsible for this jQuery feature, I thought I would share some historical notes.

If you study the jQuery API, you may notice something odd: the object returned by $()/jQuery() is not only an "array-like" object with a .length property and [i] access to its elements, but it also has a couple of fairly redundant methods: .get(i) and .size().

.get(i) is very similar to an array's [i]: it returns one of the elements of the jQuery array/object. And .size() is the same as .length.

In fact, if you look at the implementation of .size() you will see that it simply returns the .length property:

// The number of elements contained in the matched element set
size: function() {
    return this.length;
},

There is a little more to .get(). If you don't pass an argument, it is the same as .toArray(), and if you do pass an argument it allows both positive and negative indexes. Negative indexes count backwards from the end of the array similar to Python or Ruby.

But for the simple case of a non-negative index, .get(i) boils down to:

// Get the Nth element in the matched element set
get: function( num ) {
    return this[ num ];
},

Why all this redundancy? .size() and .get(i) just do the same thing as the usual .length or [i], so why have both?

In the very first jQuery release in January 2006 (long before 1.0), the object returned by $() was not an "array-like" object. It was just a JavaScript object with .get(i) and .size() methods. The actual list of DOM elements was a separate "private" property of the jQuery object, and you were supposed to use those methods to access its elements and length.

As I worked with that initial jQuery release, it seemed a bit clumsy to have to call .get(i) and .size() to access the elements of the returned jQuery object. This object seemed to be a lot like an array, but you couldn't access its elements the same way as a normal array. So I thought, why not make it act more like a real array with [i] and .length?

It was a fairly simple change to do that, so we ran with it, but we kept the old .get(i) and .size() methods for compatibility with code that already used those.

Now a confession: I don't remember the reason for making the return object an array-like object instead of directly inheriting from Array. (It was 11 years ago after all!)

We would have preferred making this object an actual array, but there was a good reason why that wouldn't work. Perhaps I will remember after sleeping on it, but in the meantime, this is how we got to the place where the jQuery return object is "array-like" but not a true array.
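To make "array-like but not a true array" concrete without depending on jQuery, here is a minimal sketch. makeArrayLike is a hypothetical helper, not jQuery's code; it only mimics the behavior described above: .length and [i] access, a .size() that returns .length, and a .get() that returns a plain array when called with no argument and accepts negative indexes.

```javascript
function makeArrayLike(elements) {
  const obj = {
    length: elements.length,
    size: function () {
      return this.length;
    },
    get: function (num) {
      if (num === undefined) {
        // No argument: return a real array copy of the elements.
        return Array.prototype.slice.call(this);
      }
      // Negative indexes count backwards from the end.
      return num < 0 ? this[num + this.length] : this[num];
    }
  };
  // Copy the elements onto numeric properties so obj[i] works.
  for (let i = 0; i < elements.length; i++) {
    obj[i] = elements[i];
  }
  return obj;
}

const matched = makeArrayLike(['a', 'b', 'c']);
console.log(matched[0], matched.get(-1), matched.size()); // a c 3
console.log(Array.isArray(matched), Array.isArray(matched.get())); // false true
```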