In Java 8, Why Is the Default Capacity of Arraylist Now Zero

In Java 8, why is the default capacity of ArrayList now zero?

Technically, it's 10, not zero, if you admit for a lazy initialisation of the backing array. See:

public boolean add(E e) {
ensureCapacityInternal(size + 1);
elementData[size++] = e;
return true;
}

private void ensureCapacityInternal(int minCapacity) {
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}

ensureExplicitCapacity(minCapacity);
}

where

/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;

What you're referring to is just the zero-sized initial array object that is shared among all initially empty ArrayList objects. I.e. the capacity of 10 is guaranteed lazily, an optimisation that is present also in Java 7.

Admittedly, the constructor contract is not entirely accurate. Perhaps this is the source of confusion here.

Background

Here's an E-Mail by Mike Duigou

I have posted an updated version of the empty ArrayList and HashMap patch.

http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/

This revised implementation introduces no new fields to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases.

For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity.

From: http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-April/015585.html

Default size of ArrayList

An ArrayList has an internal array to store the list elements.

There is a difference between the two constructor calls in Java 7 and 8:

If you do new ArrayList<>(0) the ArrayList creates a new Object array of size 0.

If you do new ArrayList<>() the ArrayList uses a static empty Object array of size 0 and switches to an own private array once you add items to the list.

EDIT:

The Javadoc of the default ArrayList constructor seems to contradict this.

/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
super();
this.elementData = EMPTY_DEFAULTCAPACITY_EMPTY_ELEMENTDATA; // = static, empty
}

But it does not create an element array of length 10 immediately, but instead when you add elements or ensure the capacity:

public void ensureCapacity(int minCapacity) {
int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
// any size if not default element table
? 0
// larger than default for default empty table. It's already
// supposed to be at default size.
: DEFAULT_CAPACITY; // = 10

if (minCapacity > minExpand) {
ensureExplicitCapacity(minCapacity);
}
}

java.util.ArrayList default capacity?

The initial capacity of a newly instantiated ArrayList is 0 - for performance reasons, an empty ArrayList uses a shared default array of size 0, so that if no new items are added to the list, you don't pay for allocating a new array that is never used.

Once the first item is added, the DEFAULT_CAPACITY (which is 10) is then used, i.e. the ArrayList creates a new array with size 10, which will then hold the first item (and has 9 additional, empty slots).

So I think you indeed used a slightly mistaken interpretation of the documentation.

The initial capacity is 0, and is then increased to 10 upon adding the first item, in order to avoid immediate reallocation for adding the next subsequent 9 items.

Why does using different ArrayList constructors cause a different growth rate of the internal array?

You get precisely what you asked for, respective what has been specified, even in older versions, where the implementation was different:

ArrayList()

Constructs an empty list with an initial capacity of ten.

ArrayList(int)

Constructs an empty list with the specified initial capacity.

So, constructing the ArrayList with the default constructor will give you an ArrayList with an initial capacity of ten, so as long as the list size is ten or smaller, no resize operation will ever be needed.

In contrast, the constructor with the int argument will precisely use the specified capacity, subject to the growing policy which is specified as

The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.

which applies even when you specify an initial capacity of zero.

Java 8 added the optimization that the creation of the ten elements array is postponed until the first element is added. This is specifically addressing the common case that ArrayList instances (created with the default capacity) stay empty for a long time or even their entire lifetime. Further, when the first actual operation is addAll, it might skip the first array resize operation. This does not affect lists with an explicit initial capacity, as those are usually chosen carefully.

As stated in this answer:

According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases.

The motivation was to optimize precisely these scenarios, not to touch the specified default capacity, which was defined back when ArrayList was created. (Though JDK 1.4 is the first one specifying it explicitly)

Initial size for the ArrayList

You're confusing the size of the array list with its capacity:

  • the size is the number of elements in the list;
  • the capacity is how many elements the list can potentially accommodate without reallocating its internal structures.

When you call new ArrayList<Integer>(10), you are setting the list's initial capacity, not its size. In other words, when constructed in this manner, the array list starts its life empty.

One way to add ten elements to the array list is by using a loop:

for (int i = 0; i < 10; i++) {
arr.add(0);
}

Having done this, you can now modify elements at indices 0..9.

ArrayList: how does the size increase?

A new array is created and the contents of the old one are copied over. That's all you know at the API level. Quoting from the docs (my emphasis):

Each ArrayList instance has a capacity. The capacity is the size of the array used to store the elements in the list. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically. The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.

In terms of how it actually happens with a specific implementation of ArrayList (such as Sun's), in their case you can see the gory details in the source. But of course, relying on the details of a specific implementation isn't usually a good idea...

ArrayList initial size doesn't work

ArrayList is backed by, as its name indicates, an array. Arrays are fixed-size; when the ArrayList gets near the array's capacity, it resizes it - it creates a new array, copies everything over, and ditches the old array.

When you initialize an ArrayList with a "size", that's actually its initial default capacity. It does not initialize anything into the array. The ArrayList is still empty after you create it; you still need to actually add values to the ArrayList.

Advantages of creating an ArrayList with initial capacity of 0?

It keeps the size (in memory) of the ArrayList very small, and is a tactic for when you want the variable to be non-null and ready to use, but don't expect for the List to be populated immediately. If you expect it to be populated immediately, it's best to give it a larger initial value - any "growing" of the ArrayList is internally creating a new primitive array, and copying items over. Growth of an ArrayList is expensive, and should be minimized.

Or, if you're creating lots of instances of a class that each contain one of these List properties. If you don't immediately plan on filling them, you can save a bit of memory by not allocating the room just yet.

However: There is a better way: Collections.emptyList(). Normally you'll want to protect access to that list directly, and (as an example) in your class provide domain-specific method calls that operate on the internal List. For example, let's say you have a School class that contains a List of student names. (Keeping it simple, note this class is not thread safe.)

public class School {
private List<String> studentNames = Collections.emptyList();

public void addStudentName(String name) {
if (studentNames.isEmpty()) {
studentNames = new ArrayList<String>();
}
studentNames.add(name);
}

public void removeStudentName(String name) {
studentNames.remove(name);
if (studentNames.isEmpty()) {
studentNames = Collections.emptyList(); // GC will deallocate the old List
}
}
}

If you're willing to make the isEmpty() checks and perform the initialization/assignment, this is a better alternative to creating lots of empty ArrayList instances, as Collections.emptyList() is a static instance (only one exists) and is not modifiable.

ArrayList public constructor - Constructs an empty list with an initial capacity of ten - where?

This is an optimization. The developers decided to initialize the ArrayList with an empty backing array, and lazily create a non-empty backing array only when you start adding elements to the List.

When you add the first element (by calling add), it calls

ensureCapacityInternal(size + 1);

which checks if elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA, and if so, sets the capacity to

minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);

DEFAULT_CAPACITY is 10.



Related Topics



Leave a reply



Submit