Does HashSet preserve insertion order?
This HashSet MSDN page specifically says:
A set is a collection that contains no duplicate elements, and whose elements are in no particular order.
How can I preserve insertion order in a HashSet?
Quoting the HashSet documentation:
It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
If you want a Set that keeps its elements in their natural order, use a SortedSet instead:
A Set that further provides a total ordering on its elements. The elements are ordered using their natural ordering, or by a Comparator typically provided at sorted set creation time.
Edit:
A Set by definition says nothing about the order of its elements. For example, two Sets are equal if they have the same size and contain the same elements. The iteration order of a Set depends on the implementation and may change between versions, so you should not make any assumptions about it. It may depend on the value of hashCode(), but it may just as well depend on the insertion order or on something else in the future. You should not care about it; if you do, use a List or a SortedSet instead.
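The equality point can be checked with a small sketch. The class name SetEqualityDemo and its helper method are hypothetical and the sample values are arbitrary; the sketch only illustrates that Set.equals ignores order while List.equals does not.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of any library.
final class SetEqualityDemo {
    // Builds two different Set implementations and compares them.
    // Equality depends only on size and membership, not on order or type.
    static boolean sameAsSets(List<Integer> a, List<Integer> b) {
        Set<Integer> left = new HashSet<>(a);
        Set<Integer> right = new LinkedHashSet<>(b);
        return left.equals(right);
    }

    public static void main(String[] args) {
        List<Integer> forward = Arrays.asList(1, 2, 3);
        List<Integer> backward = Arrays.asList(3, 2, 1);
        System.out.println(sameAsSets(forward, backward)); // true
        System.out.println(forward.equals(backward));      // false: lists are ordered
    }
}
```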
Java Set retain order?
The Set interface does not provide any ordering guarantees.
Its sub-interface SortedSet represents a set that is sorted according to some criterion. In Java 6, there are two standard containers that implement SortedSet: TreeSet and ConcurrentSkipListSet.
In addition to the SortedSet interface, there is also the LinkedHashSet class. It remembers the order in which the elements were inserted into the set, and returns its elements in that order.
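As a minimal sketch of the difference between the two, the following feeds the same input to a TreeSet and to a LinkedHashSet. The class OrderedSetsDemo and its method names are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class, not part of any library.
final class OrderedSetsDemo {
    // TreeSet (a SortedSet) iterates in the natural order of its elements.
    static List<Integer> sortedOrder(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    // LinkedHashSet iterates in the order elements were first inserted.
    static List<Integer> insertionOrder(List<Integer> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2, 3);
        System.out.println(sortedOrder(input));    // [1, 2, 3]
        System.out.println(insertionOrder(input)); // [3, 1, 2]
    }
}
```

Both sets drop the duplicate 3; they differ only in the order in which the remaining elements come back.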
Hashset ordering issue
Coincidence. It just happens that the hashCode of a Character is its numeric value. If you keep adding more characters than there are hash buckets in the HashSet, you'll see that they get out of order.
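The hash code claim is easy to verify. The sketch below uses a hypothetical class CharHashDemo; it checks that a Character's hash code equals its numeric value and prints a small HashSet. The printed order is deliberately left unannotated, since it depends on the HashSet implementation and is not something to rely on.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of any library.
final class CharHashDemo {
    // Character.hashCode() is documented to equal the char's numeric value.
    static int hashOf(char c) {
        return Character.valueOf(c).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hashOf('a') == (int) 'a'); // true

        Set<Character> set = new HashSet<>();
        for (char c : new char[] {'d', 'b', 'a', 'c'}) {
            set.add(c);
        }
        // May look sorted for a few small values, but that order is unspecified.
        System.out.println(set);
    }
}
```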
Does HashSet preserve order between enumerations?
Practically speaking, it might always be the same between enumerations, but that assumption is not provided for in the description of IEnumerable, and the implementor could decide to return them in whichever order it wants.
Who knows what it is doing under the hood, and whether it will keep doing it the same way in the future. For example, a future implementation of HashSet might be optimized to detect low memory conditions and rearrange its contents in memory, thereby affecting the order in which they are returned. So 99.9% of the time they would come back the same order, but if you started exhausting memory resources, it would suddenly return things in a different order.
Bottom line: I would not rely on the order of enumeration being consistent over time. If the order is important to you, then do your foreach over set.OrderBy(x => x) so that you can make sure it is in the order you want.
Does LinkedHashSet constructor preserve order
Looking at the Java 8 implementation of java.util.LinkedHashSet you have this constructor:
public LinkedHashSet(Collection<? extends E> c) {
    super(Math.max(2*c.size(), 11), .75f, true);
    addAll(c);
}
So what is the content of addAll?
public boolean addAll(Collection<? extends E> c) {
    boolean modified = false;
    for (E e : c)
        if (add(e))
            modified = true;
    return modified;
}
addAll loops over the collection passed to the constructor:
for (E e : c)
This means that if the collection implementation used in the constructor is ordered (e.g. java.util.TreeSet), then the content of the new LinkedHashSet instance will also be ordered.
The implementation in Java 9 is very much the same.
Yes, the order is preserved if the incoming collection is ordered.
You can only be sure about this by checking the implementation in this specific case.
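A short sketch of that behaviour, assuming a TreeSet as the ordered source. The class CopyOrderDemo and its helper are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of any library.
final class CopyOrderDemo {
    // Copies the source through the LinkedHashSet(Collection) constructor
    // and returns the resulting iteration order as a list.
    static List<String> copyViaLinkedHashSet(Collection<String> source) {
        return new ArrayList<>(new LinkedHashSet<>(source));
    }

    public static void main(String[] args) {
        Set<String> sorted = new TreeSet<>(Arrays.asList("pear", "apple", "fig"));
        // The copy iterates in the same order as the TreeSet it was built from.
        System.out.println(copyViaLinkedHashSet(sorted)); // [apple, fig, pear]
    }
}
```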
Ordering of elements in Java HashSet
The second one (just using HashSet) is only a coincidence. From the JavaDocs:
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
The third one (LinkedHashSet) is designed to be like that:
Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)
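The note about re-insertion in the quoted documentation can be illustrated with a minimal sketch. ReinsertDemo and insertInOrder are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of any library.
final class ReinsertDemo {
    // Adds the items one by one and returns the set's iteration order.
    static List<String> insertInOrder(String... items) {
        Set<String> set = new LinkedHashSet<>();
        for (String item : items) {
            set.add(item); // returns false and changes nothing if already present
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // "a" is added again at the end, but keeps its original first position.
        System.out.println(insertInOrder("a", "b", "c", "a")); // [a, b, c]
    }
}
```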
Why do we not get the ordered sequence in HashSet
This is just the contract for a Set in Java. From the Javadoc:
Returns an iterator over the elements in this set.
The elements are returned in no particular order (unless this set is an instance of some class that provides a guarantee).
So an implementation of Set isn't required to maintain any order in the values.
To return values in order, the Set needs to maintain that order, which costs speed and space.
A LinkedHashSet maintains insertion order.