What Is the Most Efficient Java Collections Library

What is the most efficient Java Collections library?

From inspection, it looks like Trove is just a library of collections for primitive types - it's not like it's meant to be adding a lot of functionality over the normal collections in the JDK.

Personally (and I'm biased) I love Guava (including the former Google Java Collections project). It makes various tasks (including collections) a lot easier, in a way which is at least reasonably efficient. Given that collection operations rarely form a bottleneck in my code (in my experience) this is "better" than a collections API which may be more efficient but doesn't make my code as readable.

Given that the overlap between Trove and Guava is pretty much nil, perhaps you could clarify what you're actually looking for from a collections library.

Which Java Collection should I use?

Since I couldn't find a similar flowchart I decided to make one myself.

This flowchart does not try to cover things like synchronized access, thread safety, etc., or the legacy collections, but it does cover the 3 standard Sets, 3 standard Maps and 2 standard Lists.

[Image: flowchart for choosing a Java collection]

This image was created for this answer and is licensed under a Creative Commons Attribution 4.0 International License. The simplest attribution is by linking to either this question or this answer.
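As a rough companion to the flowchart, here is a minimal sketch of how the three standard Sets differ in iteration order, which is usually the deciding factor between them. The class and method names (`SetOrderDemo`, `fill`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetOrderDemo {
    // Adds the same values to the given set and returns them in the set's iteration order.
    static List<String> fill(Set<String> set) {
        set.add("pear");
        set.add("apple");
        set.add("fig");
        set.add("apple"); // duplicate, ignored by every Set implementation
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(fill(new HashSet<>()));       // no guaranteed order
        System.out.println(fill(new LinkedHashSet<>())); // insertion order: [pear, apple, fig]
        System.out.println(fill(new TreeSet<>()));       // sorted order: [apple, fig, pear]
    }
}
```

The same split applies to the Maps: HashMap has no guaranteed order, LinkedHashMap keeps insertion order, and TreeMap keeps keys sorted.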

Other resources

Probably the most useful other reference is the following page from the Oracle documentation, which describes each Collection.

HashSet vs TreeSet

There is a detailed discussion of when to use HashSet or TreeSet here: HashSet vs TreeSet

ArrayList vs LinkedList

Detailed discussion: When to use LinkedList over ArrayList?

Fastest Java HashSet<Integer> library

Have you tried working with the initial capacity and load factor parameters while creating your HashSet?

HashSet doc

Initial capacity, as you might expect, is the size of the hash table when the empty HashSet is created, and the load factor is the threshold that determines when the table grows. (HashSet defaults to an initial capacity of 16 and a load factor of 0.75.) Normally you want to keep the ratio of used buckets to total buckets below about two thirds, which is commonly regarded as a good ratio for stable hash table performance.

Dynamic resizing of a hash table

So basically, try to set an initial capacity that fits your needs (to avoid rebuilding the hash table and rehashing its entries when it grows), and experiment with the load factor until you find a sweet spot.

It might be that, for your particular data distribution and access pattern, a lower load factor helps (a higher one is unlikely to, but your mileage may vary).
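A minimal sketch of that advice, assuming you know roughly how many elements you will store. The helper names (`capacityFor`, `build`) are hypothetical; only the two-argument `HashSet(int, float)` constructor comes from the JDK.

```java
import java.util.HashSet;
import java.util.Set;

public class PresizedHashSet {
    // Smallest initial capacity that holds expectedSize elements without a resize:
    // the table grows once the size exceeds capacity * loadFactor.
    static int capacityFor(int expectedSize, float loadFactor) {
        return (int) Math.ceil(expectedSize / (double) loadFactor);
    }

    // Builds a set sized up front so that filling it never triggers a rehash.
    static Set<Integer> build(int expectedSize, float loadFactor) {
        Set<Integer> set = new HashSet<>(capacityFor(expectedSize, loadFactor), loadFactor);
        for (int i = 0; i < expectedSize; i++) {
            set.add(i);
        }
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> set = build(1_000_000, 0.75f);
        System.out.println(set.size());
    }
}
```

Whether a different load factor actually helps depends on your data, so measure with a realistic workload before settling on a value.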

What is the fastest java collection for retrieving large numbers of DTOs?

As irreputable says: if you need a simple collection, then ArrayList should perform well, because it is backed by a plain array, which gives constant-time indexed access and is copied with System.arraycopy when it has to grow.

If you set the initial capacity to a higher value (I don't know what you call a large number), then it will be even faster, because that reduces the amount of incremental reallocation.

Most other collections carry some kind of extra overhead, such as computing hash codes or being synchronized.
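A minimal sketch of the presizing suggestion, assuming the number of DTOs is known (or can be estimated) before loading. `UserDto` and `load` are hypothetical names standing in for your own DTO type and retrieval code.

```java
import java.util.ArrayList;
import java.util.List;

public class PresizedList {
    // Hypothetical DTO used only for this example.
    static final class UserDto {
        final int id;
        final String name;

        UserDto(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Allocates the backing array once, so adding expectedCount elements never reallocates.
    static List<UserDto> load(int expectedCount) {
        List<UserDto> result = new ArrayList<>(expectedCount);
        for (int i = 0; i < expectedCount; i++) {
            result.add(new UserDto(i, "user" + i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<UserDto> users = load(100_000);
        System.out.println(users.get(42).name); // indexed access is constant time
    }
}
```

If the list already exists and you are about to add many elements, `ArrayList.ensureCapacity(int)` has the same effect.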

Java library with more data structures

Guava has a number of additional data structures, as does the Apache Commons Collections library.


