Performance Differences Between ArrayList and LinkedList

Performance differences between ArrayList and LinkedList

ArrayList is faster than LinkedList if I randomly access its elements. I think random access means "give me the nth element". Why is ArrayList faster?

ArrayList has direct references to every element in the list, so it can get the n-th element in constant time. LinkedList has to traverse the list from the beginning to get to the n-th element.
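As a rough illustration of random access (the class and method names below are made up for this example, not taken from the question), the same get(n) call works on both lists; only the amount of work behind it differs:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch; RandomAccessDemo and nthElement are hypothetical names.
final class RandomAccessDemo {

    // Same call for both lists: an ArrayList indexes into its backing array,
    // a LinkedList walks node by node until it reaches the index.
    static int nthElement(List<Integer> list, int n) {
        return list.get(n);
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < 1_000; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        System.out.println(nthElement(arrayList, 500));  // direct array lookup
        System.out.println(nthElement(linkedList, 500)); // traverses the nodes
    }
}
```

Both calls return the same value; the difference only shows up in running time as the lists grow.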

LinkedList is faster than ArrayList for deletion. I understand this one. ArrayList is slower since the internal backing array needs to be reallocated.

ArrayList is slower because it needs to copy part of the array in order to remove the slot that has become free. If the deletion is done using the ListIterator.remove() API, LinkedList just has to manipulate a couple of references; if the deletion is done by value or by index, LinkedList has to potentially scan the entire list first to find the element(s) to be deleted.
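A minimal sketch of iterator-based deletion, assuming a list of integers from which the even values should be dropped (the class and method names are hypothetical). The list is walked once and elements are removed through the iterator, so no position is ever looked up by index or value:

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch; IteratorRemovalDemo and removeEvens are hypothetical names.
final class IteratorRemovalDemo {

    // Removes every even number while walking the list once.
    // With a LinkedList each it.remove() only unlinks the current node;
    // with an ArrayList each call shifts the trailing elements down.
    static List<Integer> removeEvens(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new LinkedList<>();
        for (int i = 1; i <= 6; i++) {
            numbers.add(i);
        }
        System.out.println(removeEvens(numbers)); // [1, 3, 5]
    }
}
```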

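LinkedList is faster than ArrayList for insertion. What does insertion mean here? If it means moving some elements back and then putting the new element into the freed slot in the middle, ArrayList should be slower.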

Yes, this is what it means. ArrayList is indeed slower than LinkedList because it has to free up a slot in the middle of the array. This involves moving some references around and in the worst case reallocating the entire array. LinkedList just has to manipulate some references.

When to use LinkedList over ArrayList in Java?

Summary: ArrayList and ArrayDeque are preferable in many more use cases than LinkedList. If you're not sure, just start with ArrayList.


TL;DR: in an ArrayList, accessing an element takes constant time [O(1)] and adding an element takes O(n) time [worst case]. In a LinkedList, inserting an element at an arbitrary index takes O(n) time and accessing an element also takes O(n) time; in addition, a LinkedList uses more memory than an ArrayList.

LinkedList and ArrayList are two different implementations of the List interface. LinkedList implements it with a doubly-linked list. ArrayList implements it with a dynamically re-sizing array.

As with standard linked list and array operations, the various methods will have different algorithmic runtimes.

For LinkedList<E>

  • get(int index) is O(n) (with n/4 steps on average), but O(1) when index = 0 or index = list.size() - 1 (in this case, you can also use getFirst() and getLast()). One of the main benefits of LinkedList<E>
  • add(int index, E element) is O(n) (with n/4 steps on average), but O(1) when index = 0 or index = list.size() - 1 (in this case, you can also use addFirst() and addLast()/add()). One of the main benefits of LinkedList<E>
  • remove(int index) is O(n) (with n/4 steps on average), but O(1) when index = 0 or index = list.size() - 1 (in this case, you can also use removeFirst() and removeLast()). One of the main benefits of LinkedList<E>
  • Iterator.remove() is O(1). One of the main benefits of LinkedList<E>
  • ListIterator.add(E element) is O(1). One of the main benefits of LinkedList<E>

Note: Many of the operations need n/4 steps on average, constant number of steps in the best case (e.g. index = 0), and n/2 steps in worst case (middle of list)

For ArrayList<E>

  • get(int index) is O(1). Main benefit of ArrayList<E>
  • add(E element) is O(1) amortized, but O(n) worst-case since the array must be resized and copied
  • add(int index, E element) is O(n) (with n/2 steps on average)
  • remove(int index) is O(n) (with n/2 steps on average)
  • Iterator.remove() is O(n) (with n/2 steps on average)
  • ListIterator.add(E element) is O(n) (with n/2 steps on average)

Note: Many of the operations need n/2 steps on average, constant number of steps in the best case (end of list), n steps in the worst case (start of list)

LinkedList<E> allows for constant-time insertions or removals using iterators, but only sequential access of elements. In other words, you can walk the list forwards or backwards, but finding a position in the list takes time proportional to the size of the list. Javadoc says "operations that index into the list will traverse the list from the beginning or the end, whichever is closer", so those methods are O(n) (n/4 steps) on average, though O(1) for index = 0.

ArrayList<E>, on the other hand, allows fast random read access, so you can grab any element in constant time. But adding or removing from anywhere but the end requires shifting all the latter elements over, either to make an opening or fill the gap. Also, if you add more elements than the capacity of the underlying array, a new array (1.5 times the size) is allocated, and the old array is copied to the new one, so adding to an ArrayList is O(n) in the worst case but constant on average.
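To see why appending is constant on average, here is a small arithmetic sketch. It assumes the growth rule "old capacity plus half of it", which matches the 1.5x figure above; the class and method names are hypothetical and this is not the JDK's actual code:

```java
// Illustrative sketch; GrowthDemo, nextCapacity and resizeCount are hypothetical names.
final class GrowthDemo {

    // Assumed growth rule: old capacity plus half of it (integer arithmetic).
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    // Counts how many times the backing array would be reallocated while
    // appending `elements` items one at a time, starting at `initialCapacity`.
    static int resizeCount(int initialCapacity, int elements) {
        int capacity = initialCapacity;
        int resizes = 0;
        while (capacity < elements) {
            // Always grow by at least one slot so tiny capacities still make progress.
            capacity = Math.max(nextCapacity(capacity), capacity + 1);
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(nextCapacity(10));       // 15
        System.out.println(resizeCount(10, 1_000)); // 12
    }
}
```

Under this assumption, a thousand appends trigger only a dozen copies, which is why the occasional O(n) resize averages out to constant time per add.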

So depending on the operations you intend to do, you should choose the implementations accordingly. Iterating over either kind of List is practically equally cheap. (Iterating over an ArrayList is technically faster, but unless you're doing something really performance-sensitive, you shouldn't worry about this -- they're both constants.)

The main benefits of using a LinkedList arise when you re-use existing iterators to insert and remove elements. These operations can then be done in O(1) by changing the list locally only. In an array list, the remainder of the array needs to be moved (i.e. copied). On the other side, seeking in a LinkedList means following the links in O(n) (n/2 steps) for worst case, whereas in an ArrayList the desired position can be computed mathematically and accessed in O(1).
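A minimal sketch of re-using an existing iterator for insertion, assuming a list of strings where a marker should be placed after every occurrence of some target value (class and method names are hypothetical):

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Illustrative sketch; ListIteratorInsertDemo and insertAfterEach are hypothetical names.
final class ListIteratorInsertDemo {

    // Inserts `marker` after every element equal to `target`, reusing one
    // ListIterator so no position ever has to be looked up by index.
    static List<String> insertAfterEach(List<String> list, String target, String marker) {
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (target.equals(it.next())) {
                // add() places the marker right after the element just returned;
                // the following next() call skips over it.
                it.add(marker);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> letters = new LinkedList<>();
        letters.add("a");
        letters.add("b");
        letters.add("a");
        letters.add("c");
        System.out.println(insertAfterEach(letters, "a", "x")); // [a, x, b, a, x, c]
    }
}
```

The same method also works on an ArrayList, but there each it.add() has to shift the rest of the array.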

Another benefit of using a LinkedList arises when you add or remove from the head of the list, since those operations are O(1), while they are O(n) for ArrayList. Note that ArrayDeque may be a good alternative to LinkedList for adding and removing from the head, but it is not a List.
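As a rough sketch of head operations, the code below is written against the Deque interface, which both LinkedList and ArrayDeque implement, so either can be passed in. The class and method names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;

// Illustrative sketch; HeadOperationsDemo and pushThenDrain are hypothetical names.
final class HeadOperationsDemo {

    // Pushes 1..count onto the head, then removes everything from the head
    // again and returns the sum of the removed values.
    static int pushThenDrain(Deque<Integer> deque, int count) {
        for (int i = 1; i <= count; i++) {
            deque.addFirst(i);
        }
        int sum = 0;
        while (!deque.isEmpty()) {
            sum += deque.removeFirst();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(pushThenDrain(new LinkedList<>(), 4)); // 10
        System.out.println(pushThenDrain(new ArrayDeque<>(), 4)); // 10
    }
}
```

Coding against Deque keeps the choice of implementation open when index-based List methods are not needed.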

Also, if you have large lists, keep in mind that memory usage is also different. Each element of a LinkedList has more overhead since pointers to the next and previous elements are also stored. ArrayLists don't have this overhead. However, ArrayLists take up as much memory as is allocated for the capacity, regardless of whether elements have actually been added.

The default initial capacity of an ArrayList is pretty small (10 from Java 1.4 - 1.8). But since the underlying implementation is an array, the array must be resized if you add a lot of elements. To avoid the high cost of resizing when you know you're going to add a lot of elements, construct the ArrayList with a higher initial capacity.
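A minimal sketch of pre-sizing, assuming the number of elements is known up front (the class and method names are hypothetical). ArrayList also offers ensureCapacity(int) for a list that already exists:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; PresizedListDemo and squares are hypothetical names.
final class PresizedListDemo {

    // Builds a list of `count` squares. Passing the expected size to the
    // constructor lets the backing array be allocated once up front.
    static List<Integer> squares(int count) {
        List<Integer> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(5)); // [0, 1, 4, 9, 16]
    }
}
```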

From a data structures perspective, a LinkedList is basically a sequential data structure which contains a head Node. The Node is a wrapper for two components: a value of type T [accepted through generics] and a reference to the next Node linked to it (Java's LinkedList, being doubly linked, also keeps a reference to the previous Node). So, we can assert it is a recursive data structure (a Node contains another Node which has another Node and so on...). Adding an element at an arbitrary position takes linear time in a LinkedList, as stated above, because the list has to be traversed to that position first.

An ArrayList is a growable array. It is just like a regular array. Under the hood, when an element is added, and the ArrayList is already full to capacity, it creates another array with a size which is greater than previous size. The elements are then copied from previous array to new one and the elements that are to be added are also placed at the specified indices.

Why does ArrayList seriously outperform LinkedList?

From LinkedList source code:

/**
 * Appends the specified element to the end of this list.
 *
 * <p>This method is equivalent to {@link #addLast}.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    linkLast(e);
    return true;
}

/**
 * Links e as last element.
 */
void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}

From ArrayList source code:

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1); // Increments modCount!!
    elementData[size++] = e;
    return true;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

So a linked list has to create a new node for each element added, while an array list does not. ArrayList does not reallocate or resize for each new element, so most of the time it simply stores the object in the array and increments the size, while the linked list does much more work.

You also commented:

When I wrote a linked list in college, I allocated blocks at a time and then farmed them out.

I do not think this would work in Java. You cannot do pointer tricks in Java, so you would have to allocate a lot of small arrays, or create empty nodes ahead. In both cases overhead would probably be a bit higher.

How to prove the difference between arraylist and linkedlist performance

An important note before you read this answer: These are only potential implementations and not the actual ones, the principles behind it should still hold however.

An ArrayList uses an array as its inner structure, consider this array:

int[] array = new int[1];

This array will hold one element. Let's pretend we can add to it using the method defined in ArrayList, it would look like this:

array.add(0);

What happens when we have to add another element to this array? It clearly only holds one item. Well, it will have to be resized; a common way of resizing is to double its capacity. So let's do that:

int[] temp = new int[2];
for (int i = 0; i < array.length; i++) {
    temp[i] = array[i];
}
array = temp;

What would happen if we were to add two more elements? We would have to create yet another temporary array, copy the values over, and insert them again. This is one of the reasons adding to a LinkedList is faster. A LinkedList is not built with the help of an underlying array.
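Putting the resizing steps above together, a growable array might look like the sketch below. Like the snippets in this answer it is only a potential implementation: the class name and the doubling strategy are assumptions for illustration, not how java.util.ArrayList is actually written.

```java
import java.util.Arrays;

// Illustrative sketch of a growable int array; not the real ArrayList.
final class GrowableIntArray {

    private int[] data = new int[1]; // starts with room for one element
    private int size = 0;

    // Convenience factory (hypothetical) that appends the given values in order.
    static GrowableIntArray of(int... values) {
        GrowableIntArray result = new GrowableIntArray();
        for (int value : values) {
            result.add(value);
        }
        return result;
    }

    // Appends a value, doubling the backing array when it is full.
    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2); // allocate and copy
        }
        data[size++] = value;
    }

    // Constant-time lookup by index.
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    int capacity() {
        return data.length;
    }

    public static void main(String[] args) {
        GrowableIntArray numbers = GrowableIntArray.of(0, 10, 20, 30, 40);
        System.out.println(numbers.get(4));     // 40
        System.out.println(numbers.capacity()); // 8 (grew 1 -> 2 -> 4 -> 8)
    }
}
```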

A LinkedList consists of nodes that each hold references either to both the previous and the next node or to just the next node, so inserting at the end of it is very cheap if you keep a reference to the last element of the LinkedList.
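A minimal singly linked sketch of that idea, with a saved reference to the last node so that appending never has to walk the list. The class and member names are hypothetical, and java.util.LinkedList itself is doubly linked:

```java
// Illustrative sketch of a singly linked list with a tail reference.
final class SimpleLinkedList<T> {

    private static final class Node<T> {
        final T value;
        Node<T> next;

        Node(T value) {
            this.value = value;
        }
    }

    private Node<T> head;
    private Node<T> tail;
    private int size;

    // O(1): link the new node behind the current tail.
    SimpleLinkedList<T> append(T value) {
        Node<T> node = new Node<>(value);
        if (tail == null) {
            head = node;      // first element
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
        return this;
    }

    // O(n): follow the links from the head until the index is reached.
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        Node<T> current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> list = new SimpleLinkedList<String>().append("a").append("b").append("c");
        System.out.println(list.get(2)); // c
    }
}
```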

Depending on how far you've come in learning Java I'd highly recommend that you implement these data structures from scratch, it's very helpful to understand how they work!

@Makoto and @CommuSoft mention very important aspects of the differences that I did not cover, so read those as well.

Difference between ArrayList and LinkedList in Java - the whys for performance

The explanation for your first two (weird) test numbers is:

Inserting into ArrayList is generally slower because it has to grow once you hit its boundaries. It will have to create a new bigger array, and copy data from the original one.

But when you create an ArrayList that is already huge enough to fit all your inserts (which is your case since you're doing new ArrayList(n+10)) - it will obviously not involve any array copying operations. Adding to it will be even faster than with LinkedList because LinkedList will have to deal with its "links" (pointers), while huge ArrayList just sets value at given (last) index.

Also your tests are not good because each iteration involves autoboxing (conversion from int to Integer) - it will both take additional time to do that and will also screw up the results because of the Integers cache that will get filled on the first pass.
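To make the autoboxing point concrete, here is a small sketch with hypothetical names. Adding an int to a List<Integer> boxes it on every call, and the language specification guarantees that boxed values from -128 to 127 are shared cached objects, while larger values may be allocated fresh each time:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; AutoboxingDemo, fill and sameBoxedInstance are hypothetical names.
final class AutoboxingDemo {

    // list.add(i) with an int argument boxes i into an Integer on each call.
    static List<Integer> fill(int count) {
        List<Integer> list = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            list.add(i); // implicit int -> Integer conversion
        }
        return list;
    }

    // True when boxing the same value twice yields the same object.
    static boolean sameBoxedInstance(int value) {
        Integer first = value;
        Integer second = value;
        return first == second; // deliberate reference comparison
    }

    public static void main(String[] args) {
        System.out.println(fill(3));                // [0, 1, 2]
        System.out.println(sameBoxedInstance(127)); // true, inside the guaranteed cache range
        System.out.println(sameBoxedInstance(100_000)); // depends on the JVM and its settings
    }
}
```

A benchmark that wants to measure only the list operation can box its test values once, outside the timed loop.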

Performance of LinkedList vs ArrayList in maintaining an ordered list

You could use Collections#binarySearch on the sorted list to find the correct insertion point. ArrayList would probably perform better than a LinkedList, especially for larg-ish sizes, but that is easy to test.
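A minimal sketch of that idea with hypothetical names. Collections.binarySearch returns the index of a match, or -(insertion point) - 1 when the value is absent, so the insertion point can be recovered either way:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch; SortedInsertDemo and insertSorted are hypothetical names.
final class SortedInsertDemo {

    // Inserts `value` into an already sorted list and keeps it sorted.
    static List<Integer> insertSorted(List<Integer> sorted, int value) {
        int index = Collections.binarySearch(sorted, value);
        if (index < 0) {
            index = -index - 1; // decode the insertion point
        }
        sorted.add(index, value);
        return sorted;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int value : new int[] {5, 1, 3, 4}) {
            insertSorted(list, value);
        }
        System.out.println(list); // [1, 3, 4, 5]
    }
}
```

On an ArrayList the search costs O(log n) comparisons and the insert shifts part of the array; on a LinkedList every probe of the binary search has to walk the nodes.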

I ran a micro benchmark of various methods: using a sort after each insertion or a binarySearch to insert in the right place, both with ArrayList (AL) and LinkedList (LL). I also added Commons TreeList and guava's TreeMultiset.

Conclusions

  • the best algo among those tested is using TreeMultiset, but it is not a list strictly speaking - the next best option is to use an ArrayList + binarySearch
  • ArrayList performs better than LinkedList in all situations and the latter took several minutes to complete with 100,000 elements (ArrayList took less than one second).

Code of the best performer, for reference:

@Benchmark
public ArrayList<Integer> binarySearchAL() {
    ArrayList<Integer> list = new ArrayList<>();

    Random r = new Random();
    for (int i = 0; i < n; i++) {
        int num = r.nextInt();
        int index = Collections.binarySearch(list, num);
        if (index >= 0) list.add(index, num);
        else list.add(-index - 1, num);
        current = list.get(0); // O(1), to make sure the sort is not optimised away
    }
    return list;
}

Full code on bitbucket.

Full results

The "Benchmark" column contains the name of the method under test (baseLine just fills a list without sorting it, the other methods have explicit names: AL=ArrayList, LL=LinkedList,TL=Commons TreeList,treeMultiSet=guava), (n) is the size of the list, Score is the time taken in milliseconds.

Benchmark                           (n)  Mode  Samples     Score     Error  Units
c.a.p.SO28164665.baseLine           100  avgt       10     0.002 ±   0.000  ms/op
c.a.p.SO28164665.baseLine          1000  avgt       10     0.017 ±   0.001  ms/op
c.a.p.SO28164665.baseLine          5000  avgt       10     0.086 ±   0.002  ms/op
c.a.p.SO28164665.baseLine         10000  avgt       10     0.175 ±   0.007  ms/op
c.a.p.SO28164665.binarySearchAL     100  avgt       10     0.014 ±   0.001  ms/op
c.a.p.SO28164665.binarySearchAL    1000  avgt       10     0.226 ±   0.006  ms/op
c.a.p.SO28164665.binarySearchAL    5000  avgt       10     2.413 ±   0.125  ms/op
c.a.p.SO28164665.binarySearchAL   10000  avgt       10     8.478 ±   0.523  ms/op
c.a.p.SO28164665.binarySearchLL     100  avgt       10     0.031 ±   0.000  ms/op
c.a.p.SO28164665.binarySearchLL    1000  avgt       10     3.876 ±   0.100  ms/op
c.a.p.SO28164665.binarySearchLL    5000  avgt       10   263.717 ±   6.852  ms/op
c.a.p.SO28164665.binarySearchLL   10000  avgt       10   843.436 ±  33.265  ms/op
c.a.p.SO28164665.sortAL             100  avgt       10     0.051 ±   0.002  ms/op
c.a.p.SO28164665.sortAL            1000  avgt       10     3.381 ±   0.189  ms/op
c.a.p.SO28164665.sortAL            5000  avgt       10   118.882 ±  22.030  ms/op
c.a.p.SO28164665.sortAL           10000  avgt       10   511.668 ± 171.453  ms/op
c.a.p.SO28164665.sortLL             100  avgt       10     0.082 ±   0.002  ms/op
c.a.p.SO28164665.sortLL            1000  avgt       10    13.045 ±   0.460  ms/op
c.a.p.SO28164665.sortLL            5000  avgt       10   642.593 ± 188.044  ms/op
c.a.p.SO28164665.sortLL           10000  avgt       10  1182.698 ± 159.468  ms/op
c.a.p.SO28164665.binarySearchTL     100  avgt       10     0.056 ±   0.002  ms/op
c.a.p.SO28164665.binarySearchTL    1000  avgt       10     1.083 ±   0.052  ms/op
c.a.p.SO28164665.binarySearchTL    5000  avgt       10     8.246 ±   0.329  ms/op
c.a.p.SO28164665.binarySearchTL   10000  avgt       10   735.192 ±  56.071  ms/op
c.a.p.SO28164665.treeMultiSet       100  avgt       10     0.021 ±   0.001  ms/op
c.a.p.SO28164665.treeMultiSet      1000  avgt       10     0.288 ±   0.008  ms/op
c.a.p.SO28164665.treeMultiSet      5000  avgt       10     1.809 ±   0.061  ms/op
c.a.p.SO28164665.treeMultiSet     10000  avgt       10     4.283 ±   0.214  ms/op

For 100k items:

c.a.p.SO28164665.binarySearchAL  100000  avgt        6   890.585 ±  68.730  ms/op
c.a.p.SO28164665.treeMultiSet    100000  avgt        6   105.273 ±   9.309  ms/op

Java: Does it make sense to create LinkedList and convert it to an ArrayList for sorting?

This is related to two questions:

1. What's the difference between ArrayList and LinkedList, which one is faster for insertion?

2. Which one is faster in sorting?

For question 1, the essential difference between ArrayList and LinkedList is the underlying data structure. ArrayList uses an array internally and is good at random access (O(1)). LinkedList, on the other hand, is good at inserting and deleting items (O(1) once the position is known). You can find more here

Back to the question: we don't need to insert by index here, so adding an element is an O(1) operation for both ArrayList and LinkedList (amortized, in the case of ArrayList). But LinkedList uses more memory because of its node structure, and ArrayList takes more time whenever it needs to grow its capacity (setting a large enough initial capacity will help speed up the insertion).

For question 2, you can find the answer here: ArrayList is better for sorting.

In conclusion, I think you should stick with ArrayList, no need to import LinkedList here.
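As a small sketch of that conclusion (class and method names are hypothetical), the elements can be collected straight into an ArrayList and sorted there. If a LinkedList or another collection already exists, the ArrayList copy constructor converts it in a single pass:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch; SortDemo and sortedCopy are hypothetical names.
final class SortDemo {

    // Copies any collection into an ArrayList and sorts the copy,
    // leaving the original collection untouched.
    static List<Integer> sortedCopy(Collection<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>();
        linked.add(3);
        linked.add(1);
        linked.add(2);
        System.out.println(sortedCopy(linked)); // [1, 2, 3]
    }
}
```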

Performance of ArrayList vs. LinkedList varies for different machine?

1. Why does the performance vary between machines?

This can be caused by many factors. For this example, the speed of the RAM is likely the largest contributor; however, CPU speed, system load, etc. can all affect performance. Differences of this type are fine and expected.

2. Why are ArrayList add and remove slower than LinkedList?

Over a data set this large, the array list will intermittently run out of space in the array that holds its data internally. When this occurs, the array needs to be resized, which means creating a new, larger array and copying all of the data over (a non-trivial task). Removing an element has a similar cost, because it can require shifting all of the following elements in the array.

3. Why is ArrayList get faster?

ArrayList get can be done in O(1) time (constant time) because it's just an offset memory lookup in an array, internally. A linked list, however, must traverse the list to find that element. This takes O(n) time (linear time).

4. In which case should LinkedList be preferred?

If you do more insert/remove operations than lookup, linked lists can be better in performance than an arraylist. Conversely, if you are doing more lookup operations, the arraylist will probably give you better performance.


