HashSet That Preserves Ordering

Does HashSet preserve insertion order?

This HashSet MSDN page specifically says:

A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

How can I preserve insertion order in a HashSet?

Quoting the HashSet documentation:

It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.

If you want a Set that keeps its elements in their natural order, use a SortedSet instead. From its documentation:

A Set that further provides a total ordering on its elements. The elements are ordered using their natural ordering, or by a Comparator typically provided at sorted set creation time.
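As a rough illustration of the two options the SortedSet documentation mentions, here is a minimal sketch using TreeSet. The class name SortedSetDemo, the helper method names, and the sample words are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; not part of any answer above.
public class SortedSetDemo {

    // Elements come back in their natural (Comparable) order; duplicates are dropped.
    static List<String> naturalOrder(List<String> input) {
        SortedSet<String> set = new TreeSet<>(input);
        return new ArrayList<>(set);
    }

    // Elements come back in the order defined by a Comparator given at creation time:
    // shorter strings first, ties broken alphabetically.
    static List<String> byLengthThenAlphabetical(List<String> input) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        Comparator<String> order = byLength.thenComparing(Comparator.<String>naturalOrder());
        SortedSet<String> set = new TreeSet<>(order);
        set.addAll(input);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana", "fig", "apple");
        System.out.println(naturalOrder(words));             // [apple, banana, fig, pear]
        System.out.println(byLengthThenAlphabetical(words)); // [fig, pear, apple, banana]
    }
}
```

Note that neither variant remembers insertion order; the order is derived purely from the elements themselves.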

Edit:
A Set by definition says nothing about the order of its elements. For example, two Sets are equal if they have the same size and contain the same elements, regardless of order. The iteration order of a Set depends on the implementation and may change between versions, so you should not make any assumptions about it. It may depend on the value of hashCode(), but it could just as well depend on insertion order or something else in the future. If the order matters to you, use a List or a SortedSet instead.
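To make the point about equality concrete, here is a small sketch. The class and method names are invented for illustration. It builds two sets and two lists from the same elements in different orders and compares them.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class illustrating that Set equality ignores order.
public class SetEqualityDemo {

    // Two sets with the same elements are equal even though they iterate differently.
    static boolean setsEqual() {
        Set<String> a = new LinkedHashSet<>(Arrays.asList("x", "y", "z"));
        Set<String> b = new LinkedHashSet<>(Arrays.asList("z", "y", "x"));
        return a.equals(b);
    }

    // Two lists with the same elements in a different order are not equal.
    static boolean listsEqual() {
        List<String> a = Arrays.asList("x", "y", "z");
        List<String> b = Arrays.asList("z", "y", "x");
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(setsEqual());  // true
        System.out.println(listsEqual()); // false
    }
}
```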

Java Set retain order?

The Set interface does not provide any ordering guarantees.

Its sub-interface SortedSet represents a set that is sorted according to some criterion. In Java 6, there are two standard containers that implement SortedSet. They are TreeSet and ConcurrentSkipListSet.

In addition to the SortedSet interface, there is also the LinkedHashSet class. It remembers the order in which the elements were inserted into the set, and returns its elements in that order.
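A minimal sketch of that behaviour, with a made-up class name and sample values:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class showing LinkedHashSet's insertion-order iteration.
public class InsertionOrderDemo {

    // Adds the values one by one and returns them in iteration order.
    static List<Integer> insertionOrder(int... values) {
        Set<Integer> set = new LinkedHashSet<>();
        for (int v : values) {
            set.add(v); // duplicates are ignored, as in any Set
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder(30, 10, 20, 10)); // [30, 10, 20]
    }
}
```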

Hashset ordering issue

Coincidence. It just happens that the hashCode of a Character is its numeric value, so a handful of small values end up in consecutive hash buckets. Once the values span more than the number of hash buckets in the HashSet, you'll see that they come out of order.
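The following sketch shows why the order looks sorted only by accident. It assumes the bucket computation used by OpenJDK's HashMap (spread the hash code, then mask it with the table length minus one) and a table of 16 buckets; both are implementation details, not guarantees. The class and method names are made up.

```java
// Hypothetical demo class; bucketIndex imitates an OpenJDK implementation detail.
public class BucketOrderDemo {

    // Assumed bucket mapping: XOR the high bits into the low bits, then mask
    // with (tableLength - 1). tableLength must be a power of two.
    static int bucketIndex(char c, int tableLength) {
        int h = Character.hashCode(c); // equal to the char's numeric value
        int spread = h ^ (h >>> 16);
        return spread & (tableLength - 1);
    }

    public static void main(String[] args) {
        for (char c : new char[] {'a', 'b', 'c', 'p', 'q'}) {
            System.out.println(c + " -> bucket " + bucketIndex(c, 16));
        }
    }
}
```

Under these assumptions 'a', 'b' and 'c' map to buckets 1, 2 and 3, which is why they appear sorted, while 'p' maps to bucket 0 and 'q' shares bucket 1 with 'a'. A walk over the buckets would therefore visit 'p' before 'a'.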

Does HashSet preserve order between enumerations?

Practically speaking, it might always be the same between enumerations, but that is not guaranteed by the contract of IEnumerable, and the implementor could decide to return them in whichever order it wants.

Who knows what it is doing under the hood, and whether it will keep doing it the same way in the future. For example, a future implementation of HashSet might be optimized to detect low memory conditions and rearrange its contents in memory, thereby affecting the order in which they are returned. So 99.9% of the time they would come back the same order, but if you started exhausting memory resources, it would suddenly return things in a different order.

Bottom line is I would not rely on the order of enumeration to be consistent over time. If the order is important to you then do your foreach over set.OrderBy(x => x) so that you can make sure it is in the order you want.
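The answer above is about .NET, but the same advice carries over to Java: sort at the point of iteration instead of trusting the set's own order. A minimal Java sketch follows; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class: impose an order when iterating, not when storing.
public class SortedIterationDemo {

    // Returns the set's elements sorted by natural order, whatever the set's
    // internal iteration order happens to be.
    static List<String> sortedView(Set<String> set) {
        return set.stream().sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Arrays.asList("b", "c", "a"));
        for (String s : sortedView(set)) {
            System.out.println(s); // a, b, c on separate lines
        }
    }
}
```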

Does LinkedHashSet constructor preserve order

Looking at the Java 8 implementation of java.util.LinkedHashSet you have this constructor:

    public LinkedHashSet(Collection<? extends E> c) {
        super(Math.max(2*c.size(), 11), .75f, true);
        addAll(c);
    }

So what is the content of addAll?

    public boolean addAll(Collection<? extends E> c) {
        boolean modified = false;
        for (E e : c)
            if (add(e))
                modified = true;
        return modified;
    }

addAll uses a loop through the collection used in the constructor:

for (E e : c)

This means the elements are added in the iteration order of the collection passed to the constructor. If that collection has a defined order (e.g. java.util.TreeSet), the new LinkedHashSet instance will iterate in the same order.

The implementation in Java 9 is very much the same.

Yes, the order is preserved in case the incoming collection is ordered.

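You do not have to rely on the implementation alone here: the class-level Javadoc of LinkedHashSet notes that it can be used to produce a copy of a set that has the same order as the original, regardless of the original set's implementation.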
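A quick way to check this yourself is a sketch like the following, where the class name and the sample values are invented for the example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: copying a collection into a LinkedHashSet.
public class CopyOrderDemo {

    // Copies the source via the LinkedHashSet(Collection) constructor and
    // returns the copy's iteration order as a list.
    static List<Integer> copyOrder(Collection<Integer> source) {
        Set<Integer> copy = new LinkedHashSet<>(source);
        return new ArrayList<>(copy);
    }

    public static void main(String[] args) {
        // A TreeSet iterates in sorted order, and the copy keeps that order.
        System.out.println(copyOrder(new TreeSet<>(Arrays.asList(3, 1, 2)))); // [1, 2, 3]
        // A List iterates in positional order; the duplicate 3 is dropped.
        System.out.println(copyOrder(Arrays.asList(3, 1, 2, 3)));             // [3, 1, 2]
    }
}
```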

Ordering of elements in Java HashSet

The second one (just using HashSet) is only a coincidence. From the JavaDocs:

This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.

The third one (LinkedHashSet) is designed to be like that:

Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)
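The note about re-insertion is easy to miss, so here is a small sketch of it (class and method names are made up). Adding an element that is already present leaves its position alone; removing it first and adding it again moves it to the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class for LinkedHashSet re-insertion behaviour.
public class ReinsertDemo {

    // add() on an existing element returns false and does not change the order.
    static List<String> afterReinsert() {
        Set<String> set = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        set.add("a");
        return new ArrayList<>(set);
    }

    // remove() followed by add() is a fresh insertion, so the element goes last.
    static List<String> afterRemoveThenAdd() {
        Set<String> set = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        set.remove("a");
        set.add("a");
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(afterReinsert());      // [a, b, c]
        System.out.println(afterRemoveThenAdd()); // [b, c, a]
    }
}
```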

Why do we not get the ordered sequence in HashSet

This is just the contract for a Set in Java. From the Javadoc of Set.iterator():

Returns an iterator over the elements in this set. The elements are returned in no particular order (unless this set is an instance of some class that provides a guarantee).

So an implementation of Set isn't required to maintain any order in the values.

To return values in a particular order, the Set has to maintain that order, which costs both time and space.

A LinkedHashSet maintains insertion order.


