Maximum size of HashSet, Vector, LinkedList
There is no specified maximum size of these structures.
The actual practical size limit is probably somewhere in the region of Integer.MAX_VALUE (i.e. 2147483647, roughly 2 billion elements), as that's the maximum size of an array in Java.
- A HashSet uses a HashMap internally, so it has the same maximum size as that.
  - A HashMap uses an array which always has a size that is a power of two, so it can be at most 2^30 = 1073741824 elements big (since the next power of two is bigger than Integer.MAX_VALUE).
  - Normally the number of elements is at most the number of buckets multiplied by the load factor (0.75 by default). However, when the HashMap stops resizing, it will still allow you to add elements, exploiting the fact that each bucket is managed via a linked list. Therefore the only limit for elements in a HashMap/HashSet is memory.
- A Vector uses an array internally which has a maximum size of exactly Integer.MAX_VALUE, so it can't support more than that many elements.
- A LinkedList doesn't use an array as the underlying storage, so that doesn't limit the size. It uses a classical doubly linked list structure with no inherent limit, so its size is only bounded by the available memory. Note that a LinkedList will report the size wrongly if it is bigger than Integer.MAX_VALUE, because it uses an int field to store the size and the return type of size() is int as well.
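The 2^30 = 1073741824 figure for a power-of-two array length can be checked with plain int arithmetic. The following is only a sketch of that arithmetic, not HashMap's actual code; the class name PowerOfTwoLimit and its method are made up for illustration.

```java
// Sketch: why a power-of-two int array length tops out at 2^30.
// PowerOfTwoLimit is a hypothetical name, not part of the JDK.
final class PowerOfTwoLimit {

    // Largest power of two that still fits in a positive int.
    static int largestPowerOfTwo() {
        return Integer.highestOneBit(Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int max = largestPowerOfTwo();
        System.out.println(max);              // 1073741824
        System.out.println(max == (1 << 30)); // true

        // Doubling once more overflows int and turns negative,
        // so it cannot be used as an array length.
        System.out.println(max * 2);          // -2147483648

        // In long arithmetic the next power of two exceeds Integer.MAX_VALUE.
        System.out.println(2L * max > Integer.MAX_VALUE); // true
    }
}
```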
Note that the Collection API does define how a Collection with more than Integer.MAX_VALUE elements should behave. Most importantly, the size() documentation states:
"If this collection contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE."
Note that while HashMap, HashSet and LinkedList seem to support more than Integer.MAX_VALUE elements, none of those implement the size() method in this way (i.e. they simply let the internal size field overflow).
This leads me to believe that other operations also aren't well-defined in this condition.
So I'd say it's safe to use those general-purpose collections with up to Integer.MAX_VALUE elements. If you know that you'll need to store more than that, then you should switch to dedicated collection implementations that actually support this.
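The gap between the documented size() behaviour and an overflowing int field can be illustrated without allocating two billion elements. This is a sketch under the assumption that the element count is tracked in a long; SizeSaturation and both method names are hypothetical and not part of any JDK collection.

```java
// Sketch: saturating vs. overflowing size reporting.
// SizeSaturation is a hypothetical helper, not a JDK class.
final class SizeSaturation {

    // What the Collection.size() contract asks for: clamp to Integer.MAX_VALUE.
    static int saturatedSize(long elementCount) {
        return (int) Math.min(elementCount, Integer.MAX_VALUE);
    }

    // What an int counter ends up holding once it is incremented past
    // Integer.MAX_VALUE: only the low 32 bits survive.
    static int overflowingSize(long elementCount) {
        return (int) elementCount;
    }

    public static void main(String[] args) {
        long count = Integer.MAX_VALUE + 1L; // one element too many for an int
        System.out.println(saturatedSize(count));   // 2147483647
        System.out.println(overflowingSize(count)); // -2147483648
    }
}
```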
Excessive LinkedList using
For buffer overflows in languages like Java, this thread might be interesting to you: Does Java have buffer overflows?
Amount of elements in ArrayList
ArrayList can't hold more than Integer.MAX_VALUE elements.
So 2147483647 is the max.
How to fix size of LinkedHashSet?
LinkedHashSet does not support this by default. You can read about the reasons here: https://stackoverflow.com/a/7632240/9337345
What you can do is create a subclass of LinkedHashSet that enforces the fixed size you want.
For example:
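import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;

// A LinkedHashSet that evicts its oldest element once maxSize is reached.
public class MyLinkedHashSet<T> extends LinkedHashSet<T> {
    private final long maxSize;

    public MyLinkedHashSet(long maxSize) {
        this.maxSize = maxSize;
    }

    @Override
    public boolean add(T item) {
        // Evict only when a genuinely new element would exceed the limit;
        // re-adding an existing element must not shrink the set.
        if (!contains(item) && size() >= maxSize) {
            removeEldest();
        }
        return super.add(item);
    }

    // Not named removeFirst(): since Java 21, LinkedHashSet already has a
    // public removeFirst(), and a private method of that name would not compile.
    private void removeEldest() {
        Iterator<T> iterator = iterator();
        if (iterator.hasNext()) {
            iterator.next();
            iterator.remove();
        }
    }

    public static void main(String[] args) {
        LinkedHashSet<Integer> set = new MyLinkedHashSet<>(3);
        set.add(1);
        set.add(2);
        set.add(3);
        set.add(4);
        System.out.println(set); // [2, 3, 4]
        set.clear();
        System.out.println(set); // []
        // addAll() calls add() for each element, so the limit still applies.
        set.addAll(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(set); // [3, 4, 5]
    }
}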
I hope this is what you meant.
What is the default capacity of collection framework classes?
There is no one correct answer here, as it depends on the Java version. For example, RFR JDK-7143928: (coll) Optimize for Empty ArrayList and HashMap made ArrayList and HashMap empty by default in Java 8.
You would have to check the default constructor of each of the mentioned classes in your JDK. In theory this could also vary between JDK builds (e.g. Oracle, IBM, Azul...), as the default ArrayList capacity is not part of the Java Language Specification.
When to use LinkedList over ArrayList in Java?
Summary: ArrayList and ArrayDeque are preferable in many more use-cases than LinkedList. If you're not sure, just start with ArrayList.
TL;DR: in an ArrayList, accessing an element takes constant time [O(1)] and adding an element takes O(n) time [worst case]. In a LinkedList, inserting an element at an arbitrary index takes O(n) time and accessing also takes O(n) time, and a LinkedList uses more memory than an ArrayList.
LinkedList and ArrayList are two different implementations of the List interface. LinkedList implements it with a doubly-linked list. ArrayList implements it with a dynamically re-sizing array.
As with standard linked list and array operations, the various methods will have different algorithmic runtimes.
For LinkedList<E>
- get(int index) is O(n) (with n/4 steps on average), but O(1) when index = 0 or index = list.size() - 1 (in this case, you can also use getFirst() and getLast()). One of the main benefits of LinkedList<E>
- add(int index, E element) is O(n) (with n/4 steps on average), but O(1) when index = 0 or index = list.size() - 1 (in this case, you can also use addFirst() and addLast()/add()). One of the main benefits of LinkedList<E>
- remove(int index) is O(n) (with n/4 steps on average), but O(1) when index = 0 or index = list.size() - 1 (in this case, you can also use removeFirst() and removeLast()). One of the main benefits of LinkedList<E>
- Iterator.remove() is O(1). One of the main benefits of LinkedList<E>
- ListIterator.add(E element) is O(1). One of the main benefits of LinkedList<E>
Note: Many of the operations need n/4 steps on average, a constant number of steps in the best case (e.g. index = 0), and n/2 steps in the worst case (middle of list).
For ArrayList<E>
- get(int index) is O(1). Main benefit of ArrayList<E>
- add(E element) is O(1) amortized, but O(n) worst-case since the array must be resized and copied
- add(int index, E element) is O(n) (with n/2 steps on average)
- remove(int index) is O(n) (with n/2 steps on average)
- Iterator.remove() is O(n) (with n/2 steps on average)
- ListIterator.add(E element) is O(n) (with n/2 steps on average)
Note: Many of the operations need n/2 steps on average, a constant number of steps in the best case (end of list), and n steps in the worst case (start of list).
LinkedList<E> allows for constant-time insertions or removals using iterators, but only sequential access of elements. In other words, you can walk the list forwards or backwards, but finding a position in the list takes time proportional to the size of the list. Javadoc says "operations that index into the list will traverse the list from the beginning or the end, whichever is closer", so those methods are O(n) (n/4 steps) on average, though O(1) for index = 0.
ArrayList<E>, on the other hand, allows fast random read access, so you can grab any element in constant time. But adding or removing from anywhere but the end requires shifting all the latter elements over, either to make an opening or fill the gap. Also, if you add more elements than the capacity of the underlying array, a new array (1.5 times the size) is allocated, and the old array is copied to the new one, so adding to an ArrayList is O(n) in the worst case but constant on average.
So depending on the operations you intend to do, you should choose the implementation accordingly. Iterating over either kind of List is practically equally cheap. (Iterating over an ArrayList is technically faster, but unless you're doing something really performance-sensitive, you shouldn't worry about this: both take constant time per element.)
The main benefits of using a LinkedList arise when you re-use existing iterators to insert and remove elements. These operations can then be done in O(1) by changing the list locally only. In an array list, the remainder of the array needs to be moved (i.e. copied). On the other side, seeking in a LinkedList means following the links in O(n) (n/2 steps) for the worst case, whereas in an ArrayList the desired position can be computed mathematically and accessed in O(1).
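Re-using one iterator for local edits might look like the sketch below. It only uses the standard ListIterator methods next(), remove() and add(); the class name IteratorEdits and the editing rule (drop negatives, insert a 0 after each positive value) are invented for the example.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Sketch: inserting and removing through a single ListIterator.
// IteratorEdits is a hypothetical name used only for this illustration.
final class IteratorEdits {

    // Removes every negative value and inserts a 0 after every positive value,
    // touching the list only through one ListIterator.
    static List<Integer> edit(List<Integer> values) {
        ListIterator<Integer> it = values.listIterator();
        while (it.hasNext()) {
            int v = it.next();
            if (v < 0) {
                it.remove(); // removes the element just returned by next()
            } else if (v > 0) {
                it.add(0);   // inserts right after it; the cursor moves past the new element
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(Arrays.asList(1, -2, 3));
        System.out.println(edit(list)); // [1, 0, 3, 0]
    }
}
```

On a LinkedList each remove() and add() here only relinks neighbouring nodes; the same code works on an ArrayList, but every edit then shifts the tail of the array.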
Another benefit of using a LinkedList arises when you add or remove from the head of the list, since those operations are O(1), while they are O(n) for ArrayList. Note that ArrayDeque may be a good alternative to LinkedList for adding and removing from the head, but it is not a List.
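As a small illustration of head operations on an ArrayDeque, consider the following sketch. HeadOperations is a made-up class name; addFirst(), addLast() and removeFirst() are the standard Deque methods.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: adding and removing at the head of an ArrayDeque.
// HeadOperations is a hypothetical name used only for this illustration.
final class HeadOperations {

    static Deque<String> demo() {
        Deque<String> deque = new ArrayDeque<>();
        deque.addLast("b");
        deque.addFirst("a");  // insert at the head
        deque.addLast("c");
        deque.removeFirst();  // remove from the head: drops "a"
        return deque;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [b, c]
        // Deque has no get(int index), which is the price of not being a List.
    }
}
```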
Also, if you have large lists, keep in mind that memory usage is also different. Each element of a LinkedList has more overhead since pointers to the next and previous elements are also stored. ArrayLists don't have this overhead. However, ArrayLists take up as much memory as is allocated for the capacity, regardless of whether elements have actually been added.
The default initial capacity of an ArrayList is pretty small (10 from Java 1.4 - 1.8). But since the underlying implementation is an array, the array must be resized if you add a lot of elements. To avoid the high cost of resizing when you know you're going to add a lot of elements, construct the ArrayList with a higher initial capacity.
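Pre-sizing could look like the following sketch. The class name PresizedList and the element count are arbitrary choices for the example; the ArrayList(int) constructor and ensureCapacity(int) are the standard API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: giving an ArrayList a capacity hint before a bulk fill.
// PresizedList is a hypothetical name used only for this illustration.
final class PresizedList {

    // Fills a list whose backing array is sized once, up front.
    static List<Integer> fill(int expectedSize) {
        List<Integer> list = new ArrayList<>(expectedSize); // capacity hint, size is still 0
        for (int i = 0; i < expectedSize; i++) {
            list.add(i); // no resize needed along the way
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(fill(1_000_000).size()); // 1000000

        // For a list that already exists, ensureCapacity() serves the same purpose.
        ArrayList<Integer> existing = new ArrayList<>();
        existing.ensureCapacity(1_000_000);
        System.out.println(existing.size()); // 0
    }
}
```

The capacity hint changes only how much memory is reserved; size() still reports the number of elements actually added.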
From a data-structures perspective, a LinkedList is a sequential data structure that holds a reference to a head Node. A Node is a wrapper for a value of type T [accepted through generics] and for references to the Nodes linked to it (next and previous, since the list is doubly linked). So we can say it is a recursive data structure (a Node refers to another Node, which refers to another Node, and so on...). As stated above, adding an element at an arbitrary index takes linear time in a LinkedList.
An ArrayList is a growable array. It is just like a regular array. Under the hood, when an element is added, and the ArrayList is already full to capacity, it creates another array with a size which is greater than previous size. The elements are then copied from previous array to new one and the elements that are to be added are also placed at the specified indices.
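The grow-and-copy step described above can be sketched with a tiny int-only container. This is not ArrayList's source code: GrowableIntArray, its deliberately small initial capacity of 2, and the exact growth rule (roughly 1.5 times, at least one extra slot) are assumptions made for the illustration.

```java
import java.util.Arrays;

// Sketch of a growable array; a simplified stand-in, not java.util.ArrayList.
final class GrowableIntArray {
    private int[] data = new int[2]; // assumed tiny initial capacity to force growth
    private int size;

    void add(int value) {
        if (size == data.length) {
            // Full: allocate a larger array (about 1.5x) and copy the old elements over.
            int newCapacity = Math.max(data.length + 1, data.length + (data.length >> 1));
            data = Arrays.copyOf(data, newCapacity);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index]; // direct array access: O(1)
    }

    int size() { return size; }

    int capacity() { return data.length; }

    public static void main(String[] args) {
        GrowableIntArray array = new GrowableIntArray();
        for (int i = 0; i < 5; i++) {
            array.add(i * 10);
        }
        System.out.println(array.get(4));     // 40
        System.out.println(array.size());     // 5
        System.out.println(array.capacity()); // 6 (grew 2 -> 3 -> 4 -> 6)
    }
}
```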
Performance differences between ArrayList and LinkedList
ArrayList is faster than LinkedList if I randomly access its elements. I think random access means "give me the nth element". Why ArrayList is faster?
ArrayList has direct references to every element in the list, so it can get the n-th element in constant time. LinkedList has to traverse the list from the beginning to get to the n-th element.
LinkedList is faster than ArrayList for deletion. I understand this one. ArrayList's slower since the internal backing-up array needs to be reallocated.
ArrayList is slower because it needs to copy part of the array in order to remove the slot that has become free. If the deletion is done using the ListIterator.remove() API, LinkedList just has to manipulate a couple of references; if the deletion is done by value or by index, LinkedList has to potentially scan the entire list first to find the element(s) to be deleted.
If it means move some elements back and then put the element in the middle empty spot, ArrayList should be slower.
Yes, this is what it means. ArrayList is indeed slower than LinkedList because it has to free up a slot in the middle of the array. This involves moving some references around and in the worst case reallocating the entire array. LinkedList just has to manipulate some references.