Why Start an Arraylist with an Initial Capacity

What's meant by parameter (int initial capacity) in an arraylist

It's the initial capacity, i.e. the number of items that ArrayList will allocate to begin with as the internal storage of items.

ArrayList can contain "any number of items" (as long you have the memory for it) and when doing large initial insertions you can tell ArrayList to allocate a larger storage to begin with as to not waste CPU cycles when it tries to allocate more space for the next item.

Example:

ArrayList list = new ArrayList<Integer>(2);
list.add(1); // size() == 1
list.add(2); // size() == 2, list is "filled"
list.add(3); // size() == 3, list is expanded to make room for the third element

Advantages of creating an ArrayList with initial capacity of 0?

It keeps the size (in memory) of the ArrayList very small, and is a tactic for when you want the variable to be non-null and ready to use, but don't expect for the List to be populated immediately. If you expect it to be populated immediately, it's best to give it a larger initial value - any "growing" of the ArrayList is internally creating a new primitive array, and copying items over. Growth of an ArrayList is expensive, and should be minimized.

Or, if you're creating lots of instances of a class that each contain one of these List properties. If you don't immediately plan on filling them, you can save a bit of memory by not allocating the room just yet.

However: There is a better way: Collections.emptyList(). Normally you'll want to protect access to that list directly, and (as an example) in your class provide domain-specific method calls that operate on the internal List. For example, let's say you have a School class that contains a List of student names. (Keeping it simple, note this class is not thread safe.)

public class School {
private List<String> studentNames = Collections.emptyList();

public void addStudentName(String name) {
if (studentNames.isEmpty()) {
studentNames = new ArrayList<String>();
}
studentNames.add(name);
}

public void removeStudentName(String name) {
studentNames.remove(name);
if (studentNames.isEmpty()) {
studentNames = Collections.emptyList(); // GC will deallocate the old List
}
}
}

If you're willing to make the isEmpty() checks and perform the initialization/assignment, this is a better alternative to creating lots of empty ArrayList instances, as Collections.emptyList() is a static instance (only one exists) and is not modifiable.

Why start an ArrayList with an initial capacity?

If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.

The larger the final list, the more time you save by avoiding the reallocations.

That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).

Initial size for the ArrayList

You're confusing the size of the array list with its capacity:

  • the size is the number of elements in the list;
  • the capacity is how many elements the list can potentially accommodate without reallocating its internal structures.

When you call new ArrayList<Integer>(10), you are setting the list's initial capacity, not its size. In other words, when constructed in this manner, the array list starts its life empty.

One way to add ten elements to the array list is by using a loop:

for (int i = 0; i < 10; i++) {
arr.add(0);
}

Having done this, you can now modify elements at indices 0..9.

In Java 8, why is the default capacity of ArrayList now zero?

Technically, it's 10, not zero, if you admit for a lazy initialisation of the backing array. See:

public boolean add(E e) {
ensureCapacityInternal(size + 1);
elementData[size++] = e;
return true;
}

private void ensureCapacityInternal(int minCapacity) {
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}

ensureExplicitCapacity(minCapacity);
}

where

/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;

What you're referring to is just the zero-sized initial array object that is shared among all initially empty ArrayList objects. I.e. the capacity of 10 is guaranteed lazily, an optimisation that is present also in Java 7.

Admittedly, the constructor contract is not entirely accurate. Perhaps this is the source of confusion here.

Background

Here's an E-Mail by Mike Duigou

I have posted an updated version of the empty ArrayList and HashMap patch.

http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/

This revised implementation introduces no new fields to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases.

For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity.

From: http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-April/015585.html

What is the benefit of setting the capacity of an ArrayList explicitly

ArrayList as an implementation of a Dynamic array data structure.

It resizes when its underlying array gets full (i.e. the current list index exceeds the last valid index of the underlying array).

When it happens, method add() (or addAll) internally will invoke the method grow(). Which will double the capacity. I.e. it will create a new array with the length two times bigger than the previous length plus a number of new elements that don't fit into the current size.

The growth has a cost of O(n) because all previously added elements need to be copied into the new array.

Reminder: when resizing isn't required, a new element will be added in constant time O(1).

No-argument constructor creates an ArrayList with capacity of 10.

If you expect that a newly created ArrayList would eventually contain let's say 50,000 elements, it makes sense to use an overloaded constructor to provide the initial capacity of 50,000 in order to improve performance by avoiding unnecessary resizing.

Also, for that you can use method ensureCapacity() which accessible in the ArrayList class (not in the List interface, because the notion of capacity isn't applicable to LinkedList which isn't backed by an array).

is there any method to view the current capacity

No, there isn't. That's called encapsulation. ArrayList, StringBuilder, HashMap, etc. are backed by a plain array, but they will not allow interacting with their underlying array directly.

But if you have a case when array initially increases size and then a lot of elements are being removed, and you want to release unoccupied heap space, you can use method trimToSize():

Trims the capacity of this ArrayList instance to be the list's current size. An application can use this operation to minimize the storage of an ArrayList instance.

But it has to be used with caution, because it can lead to cyclic growth and trimming, which will cause performance degradation.

Note that there's no need to worry about the amount of unoccupied space if the list is moderate in size, or if you are not expecting let's say 80% of the data to be removed in one go. I.e. even if the list is huge but 50% of its elements gets removed, and you apply trimToSize() on it, it'll restore its previous capacity with the next added element - that's the scenario of continuously growing and shrinking list which will perform badly.

As a possible option, if you heave a case when most of the data can be removed from a list, instead of using trimToSize() you can filter out the elements that have to be preserved, place them into a new list and dereference the previous one.

Reinstantiating ArrayList with initial capacity

Capacity does not equal size.

A newly created ArrayList with a capacity of 10, will have a backing array that allows for the allocation of 10 elements, but its size will still be zero. Addressing any element in that ArrayList will result in an IndexOutOfBoundsException, although the add() method will let you add an element at index 0.

As the Javadoc for the ArrayList.add() method states:

Throws:

IndexOutOfBoundsException - if the index is out of range (index < 0 || index > size())

This answer shows how an ArrayList can be initialized with a specific number of values.

What is the best practice for setting the initial capacity of an ArrayList that is not always used?

Premature optimisation is the root of all evil. - D. Knuth.

This seems like the kind of "performance issue" which actually never has any effect on performance. For one thing, how sure are you that these empty lists are actually initialised? I suspect that most modern compilers delay initialisation of objects until they know for sure that there will be a call on them. So if you pass the no arg constructor it will most likely never be used unless something is added to the list. On the other hand, if you use the 0 argument constructor, it guarantees that it has to resize every one that it uses.

These are the three laws of performance optimisation

  1. Never assume that you know what the compiled code is actually doing, or that you can sport small optimisations better than the compiler can.
  2. Never optimise without using a profiler to work out where the bottleneck is. If you think that you know, refer to rule number (1).
  3. Don't bother unless your application has a performance issue. Then refer to rule (2).

On the off chance that you somehow still believe that you understand compilers, check out this question: Why is it faster to process a sorted array than an unsorted array?



Related Topics



Leave a reply



Submit