Introduction
We use ArrayList almost every day, yet in real interviews many candidates turn out not to know the details of its source code, which leaves a bad impression on the interviewer. In this section, we look at the parts of the ArrayList source code that come up most often in interviews.
1. Overall architecture
The overall architecture of ArrayList is relatively simple: it is backed by an array, as shown in the following figure:
The figure shows an array of length 10. Positions are counted from 1, index is the array subscript, counted from 0, and elementData is the array itself. Besides these two concepts, the source code relies on the following three basic concepts:
- DEFAULT_CAPACITY is the default initial capacity of the array. Its value is 10. Remember this number;
- size is the number of elements currently stored in the array. Its type is int, it is not declared volatile, and it is not thread safe;
- modCount counts the number of structural modifications of the current array. Each time the array structure changes, it is incremented by 1.
Class comment
When reading source code, start with the class comment. In summary, the class comment of ArrayList says:
- null values are allowed, and the capacity is expanded automatically;
- size, isEmpty, get and set run in O(1) time, and add runs in amortized O(1) time;
- The class is not thread safe. For multithreaded use, the class comment recommends Collections#synchronizedList;
- When iterating with an enhanced for loop or an iterator, if the array size is changed during the iteration, the iteration fails fast and throws an exception.
In addition to the points mentioned in the class comment, interviewers often ask about initialization, the essence of capacity expansion, iterators and similar topics. Next, we analyze them one by one from the source code.
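The null-value and fail-fast behaviour described in the class comment can be observed with a small experiment. The sketch below is only an illustration: the class name FailFastDemo and the method failsFast are made up for this example, and the first element is removed on purpose so that the iterator is guaranteed to ask for another element after the change.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class FailFastDemo {

    // Returns true if changing the list inside an enhanced for loop
    // triggers the fail-fast check.
    static boolean failsFast() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add(null);   // null is accepted as an element
        list.add("c");
        try {
            for (String s : list) {
                if ("a".equals(s)) {
                    list.remove(s);   // structural change made outside the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fail-fast triggered: " + failsFast());
    }
}
```

The check is best effort: it relies on the iterator comparing version numbers on its next call, which is why the example changes the list early in the loop.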
2. Source code analysis
2.1 Initialization
ArrayList can be initialized in three ways: with no arguments, with a specified size, or with specified initial data. The source code is as follows:
Initialization with no arguments:
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

// Directly initialize without parameters; the array starts out empty
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
Initialization with a specified size:
transient Object[] elementData;

// Initialize with a specified array length
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
    }
}
Initialization with specified initial data:
// Initialize with specified initial data
public ArrayList(Collection<? extends E> c) {
    Object[] a = c.toArray();
    if ((size = a.length) != 0) {
        if (c.getClass() == ArrayList.class) {
            elementData = a;
        } else {
            elementData = Arrays.copyOf(a, size, Object[].class);
        }
    } else {
        // The given collection is empty, so fall back to the shared empty array
        elementData = EMPTY_ELEMENTDATA;
    }
}
In addition to the comments in the source code, we add two points:
- When ArrayList is created with the no-argument constructor, the backing array is empty, not of size 10 as is often said. 10 is the capacity the array is expanded to on the first add.
- In the constructor that takes initial data, some JDK versions carry a comment of the form "see 6260652". This refers to a bug in Java: toArray on the given collection may return an array whose runtime type is not Object[], so the constructor converts it to Object[].
Generally, this bug is not triggered. It only shows up in the following scenario: a list whose backing array is not of type Object[] (for example, one created by Arrays.asList from String values) is asked for an Object array through toArray, and a value of another type is then assigned into that array. The code and reason are shown in the figure:
According to the official bug report, the problem is solved in Java 9.
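The scenario can be probed with a few lines of code. This is a hedged sketch rather than a reproduction of the missing figure: the class name ToArrayTypeDemo and the method probe are invented for illustration, and the result depends on the JDK in use. On versions affected by bug 6260652 one would expect the ArrayStoreException branch, while on Java 9 and later the store should succeed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ToArrayTypeDemo {

    // Tries to store a plain Object into the array returned by toArray()
    // and reports what happened on the running JDK.
    static String probe() {
        List<String> list = Arrays.asList("hello", "world");
        Object[] array = list.toArray();
        String runtimeType = array.getClass().getSimpleName();
        try {
            array[0] = new Object();
            return "stored OK, runtime type " + runtimeType;
        } catch (ArrayStoreException e) {
            return "ArrayStoreException, runtime type " + runtimeType;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```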
2.2 Adding elements and capacity expansion
Adding appends an element to the array and consists of two main steps:
- Check whether capacity expansion is required and, if so, expand the capacity;
- Assign the element directly.
The source code for the two steps is as follows:
public boolean add(E e) {
    // Make sure the array is large enough, expanding it if necessary.
    // size is the number of elements currently in the array
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Direct assignment, not thread safe
    elementData[size++] = e;
    return true;
}
Let's take a look at the source code of ensureCapacityInternal:
private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

// Calculate the required capacity
private static int calculateCapacity(Object[] elementData, int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}

// Ensure sufficient capacity
private void ensureExplicitCapacity(int minCapacity) {
    // Record the number of array modifications
    modCount++;
    // If the expected capacity is greater than the length of the current array, expand the capacity
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

// Expand the capacity and copy the existing data into the new array
private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    // oldCapacity >> 1 divides oldCapacity by 2, so the new capacity is
    // the original capacity plus half of it, i.e. 1.5 times
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // If the expanded value is less than the expected value, use the expected value
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    // If the expanded value exceeds MAX_ARRAY_SIZE, hugeCapacity caps it (at most Integer.MAX_VALUE)
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // Copy array
    elementData = Arrays.copyOf(elementData, newCapacity);
}
The comments are fairly detailed. We should pay attention to the following points:
- The rule of capacity expansion is not doubling, but the original capacity plus half of it. Put plainly, the capacity after expansion is 1.5 times the original capacity;
- The maximum length of the array in ArrayList is Integer.MAX_VALUE. Beyond that, the JVM will not allocate memory space for the array;
- When adding, the value is not strictly validated, so ArrayList allows null values.
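To make the 1.5 times rule concrete, the following sketch simulates the capacities a list created with the no-argument constructor would pass through as elements are added one by one. It is a simplified model, not the JDK code: the class GrowthSequence and the method capacities are hypothetical, and overflow handling is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the growth rule, not the JDK implementation.
class GrowthSequence {

    static final int DEFAULT_CAPACITY = 10;

    // Capacities the backing array would take on while `elements` items
    // are added one at a time, starting from an empty backing array.
    static List<Integer> capacities(int elements) {
        List<Integer> steps = new ArrayList<>();
        int capacity = 0;
        for (int size = 0; size < elements; size++) {
            int minCapacity = size + 1;
            if (minCapacity > capacity) {
                capacity = (capacity == 0)
                        ? Math.max(DEFAULT_CAPACITY, minCapacity)               // first add
                        : Math.max(capacity + (capacity >> 1), minCapacity);    // 1.5 times
                steps.add(capacity);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(capacities(100)); // [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Adding 100 elements therefore costs seven allocations in this model. When the final size is known in advance, passing it to the constructor avoids the intermediate copies.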
From the source code for adding and expansion, the following point is worth learning from:
- The expansion code is aware of array size overflow: the array size after expansion must not fall below 0, and must not exceed the maximum value of Integer. This awareness is worth learning.
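The overflow awareness can be seen by running the capacity arithmetic on a very large array length. The sketch below copies the comparison style of grow into a standalone class. The names OverflowAwareGrowth and newCapacity are hypothetical, and the value Integer.MAX_VALUE - 8 for MAX_ARRAY_SIZE and the body of hugeCapacity are assumptions based on the JDK 8 source, since the excerpt above does not show them.

```java
// Hypothetical standalone copy of the capacity arithmetic, for illustration only.
class OverflowAwareGrowth {

    // Assumed value, as in the JDK 8 source; not shown in the excerpt above.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int newCapacity(int oldCapacity, int minCapacity) {
        // For large arrays this sum can exceed Integer.MAX_VALUE and wrap to a negative value
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = hugeCapacity(minCapacity);
        }
        return newCapacity;
    }

    // Assumed to mirror the JDK 8 helper of the same name.
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) { // the requested capacity itself overflowed
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(10, 11));                        // 15
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001));  // 2147483639
    }
}
```

In the second call the intermediate sum wraps to a negative int, yet the result is still a valid, non-negative capacity within the limit.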
After the expansion, the assignment itself is very simple: the element is written directly into the array with elementData[size++] = e. Precisely because this simple assignment has no lock control, the operation is not thread safe.
2.3 Essence of capacity expansion
Capacity expansion is realized through this line of code:
Arrays.copyOf(elementData, newCapacity);
This line of code captures the essence of expansion: copying between arrays. To expand, we first create a new array with the expected capacity and then copy the data of the old array into it. The copy is done by the System.arraycopy method, which is a native method. The source code is as follows:
/**
 * @param src     the source array to copy from
 * @param srcPos  starting position in the source array
 * @param dest    the target array
 * @param destPos starting position in the target array
 * @param length  number of elements to copy
 * This method has no return value; the result is passed back through the dest reference
 */
public static native void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);
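As a small illustration of the copy step, the sketch below expands an array by hand with System.arraycopy and compares the result with Arrays.copyOf. The class CopyDemo and the method expand are hypothetical names used only for this example.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class CopyDemo {

    // Returns a new array of length newCapacity that holds the leading
    // elements of `old`, built by hand with System.arraycopy.
    static Object[] expand(Object[] old, int newCapacity) {
        Object[] bigger = new Object[newCapacity];
        System.arraycopy(old, 0, bigger, 0, Math.min(old.length, newCapacity));
        return bigger;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c"};
        Object[] byHand = expand(data, 5);
        Object[] byCopyOf = Arrays.copyOf(data, 5);
        System.out.println(Arrays.toString(byHand));          // [a, b, c, null, null]
        System.out.println(Arrays.equals(byHand, byCopyOf));  // true
    }
}
```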
2.4 Deletion
ArrayList offers several ways to delete elements, such as deleting by array index, deleting by value, or batch deleting. The principle and idea are the same, so we take deletion by value as the example to explain the source code:
public boolean remove(Object o) {
    // If the value to be deleted is null, find the first null in the array and delete it
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        // If the value to be deleted is not null, find the first element equal to it and delete it
        for (int index = 0; index < size; index++)
            // Equality is determined by equals, and the element is then deleted by its index position
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
We need to pay attention to two points:
- null is not rejected when adding, so null values can also be deleted;
- The index position of the value in the array is found through equals. If the elements are not of a basic type such as Integer or String but of a custom class, we need to pay attention to how that class implements equals.
The code above finds the index position of the element to be deleted. The following code deletes the element at that index position:
private void fastRemove(int index) {
    // The array structure is about to change, so the modification count is incremented
    modCount++;
    // After deletion, the following elements must move forward; compute how many to move.
    // 1 is subtracted because size counts from 1 while index counts from 0
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // Copy numMoved elements starting at index + 1 to the positions starting at index
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    // Assign null to the last position of the array to help GC
    elementData[--size] = null; // clear to let GC do its work
}
From the source code, we can see that after an element is deleted, the elements behind it are moved forward to keep the array contiguous.
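The shifting step can be tried on a plain array. The sketch below follows the same idea as fastRemove, but as a free-standing static method on a caller-supplied array. The names ShiftDemo and removeAt are hypothetical, and bounds checking is left out for brevity.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class ShiftDemo {

    // Removes the element at `index` from the first `size` slots of `data`
    // and returns the new size. No bounds checking in this sketch.
    static int removeAt(Object[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift everything behind `index` one slot forward
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // drop the stale reference left in the last used slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null};
        int size = removeAt(data, 4, 1);
        System.out.println(size + " " + Arrays.toString(data)); // 3 [a, c, d, null, null]
    }
}
```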
2.5 Iterators
If you want to implement an iterator yourself, just implement the java.util.Iterator interface. ArrayList does the same. Let's take a look at the main fields of its iterator:
int cursor;       // Position of the next element during iteration; starts from 0 by default
int lastRet = -1; // After next(): index of the element returned by the last iteration; after a deletion: -1
int expectedModCount = modCount; // expectedModCount is the version number expected during iteration; modCount is the actual version number of the array
Iterators generally have three methods:
- hasNext: whether there is another value to iterate over;
- next: if there is another value, returns that value;
- remove: deletes the value returned by the current iteration.
Let's look at the source code of these three methods:
2.5.1 hasNext
public boolean hasNext() {
    // cursor indicates the position of the next element, and size indicates the actual size.
    // If they are equal, there are no elements left to iterate. If they are not equal, iteration can continue
    return cursor != size;
}
2.5.2 next
public E next() {
    // During the iteration, check whether the version number has changed; if so, throw ConcurrentModificationException
    checkForComodification();
    // Index position of the element for this iteration
    int i = cursor;
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    // Position of the element for the next iteration, prepared in advance
    cursor = i + 1;
    // Return element value
    return (E) elementData[lastRet = i];
}

// Version number comparison
final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}
As can be seen from the source code, the next method does two things. First, it checks whether the iteration can continue. Second, it finds the value for this iteration and prepares for the next one (cursor + 1).
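The same pattern can be reproduced in a few dozen lines. The sketch below is a hypothetical TinyList, written only to show how cursor, modCount and expectedModCount cooperate. It is not the ArrayList implementation, and it omits remove, bounds checks and most of the List API.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical minimal list, written only to illustrate the iterator fields.
class TinyList<E> implements Iterable<E> {
    private Object[] elementData = new Object[10];
    private int size;
    private int modCount;

    void add(E e) {
        if (size == elementData.length) {
            // Grow to 1.5 times, as described in section 2.2
            elementData = Arrays.copyOf(elementData, size + (size >> 1));
        }
        modCount++;
        elementData[size++] = e;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            int cursor;                       // index of the next element to return
            int expectedModCount = modCount;  // version seen when iteration began

            @Override
            public boolean hasNext() {
                return cursor != size;
            }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return (E) elementData[cursor++];
            }
        };
    }

    public static void main(String[] args) {
        TinyList<String> list = new TinyList<>();
        list.add("a");
        list.add("b");
        for (String s : list) {
            System.out.println(s);
        }
    }
}
```

Because the class implements Iterable, it also works with the enhanced for loop, as the main method shows.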
2.5.3 remove
public void remove() {
    // If lastRet is less than 0, next() has not been called yet or the element has already been deleted
    if (lastRet < 0)
        throw new IllegalStateException();
    // During the iteration, check whether the version number has changed; if so, throw ConcurrentModificationException
    checkForComodification();
    try {
        ArrayList.this.remove(lastRet);
        cursor = lastRet;
        // -1 indicates that the element has been deleted, and also prevents duplicate deletion
        lastRet = -1;
        // Deleting an element changes modCount, so assign it to expectedModCount here.
        // In this way, the two values are consistent in the next iteration
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}
Here we need to pay attention to two points:
- The purpose of lastRet = -1 is to prevent duplicate deletion;
- If the element is deleted successfully, the modCount of the array changes. expectedModCount is therefore reassigned, so that the two values are the same in the next iteration.
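Both points can be observed from the caller's side. The following sketch uses only the public Iterator API. The class IteratorRemoveDemo and its two methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class IteratorRemoveDemo {

    // Removes every even number through the iterator and returns the same list.
    static List<Integer> dropEvens(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // keeps expectedModCount in step with modCount
            }
        }
        return numbers;
    }

    // Returns true if a second remove() without a new next() is rejected.
    static boolean secondRemoveRejected() {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = numbers.iterator();
        it.next();
        it.remove();
        try {
            it.remove(); // lastRet is -1 at this point
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(dropEvens(new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6)))); // [1, 3, 5]
        System.out.println(secondRemoveRejected()); // true
    }
}
```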
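2.6 Time complexity
From the source code analysis above, reading or writing an element by array index and adding an element at the end operate directly on the array, so their time complexity is O(1) (amortized O(1) for add, because of the occasional expansion). Deleting an element also has to move the following elements forward, which takes O(n) in the worst case.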
2.7 Thread safety
Note that ArrayList has a thread safety problem only when it is shared between threads. When an ArrayList is a local variable in a method, there is no thread safety problem.
The essence of the thread safety problem of ArrayList is that elementData, size and modCount are not protected by locks during the various operations, and these variables are not volatile. Therefore, if multiple threads operate on them, values may be overwritten.
The class comment recommends Collections#synchronizedList to ensure thread safety. SynchronizedList achieves this by locking in every method. Thread safety is achieved, but performance is greatly reduced. The relevant source code is as follows:
public boolean add(E e) {
    synchronized (mutex) { // Every method locks on mutex, which by default is the current SynchronizedList itself
        return c.add(e);
    }
}
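A short experiment shows the effect of the wrapper. In the sketch below, several threads add to one shared list created with Collections.synchronizedList, and the final size matches the number of adds. The class SynchronizedListDemo and the method concurrentAdds are hypothetical, and the thread and element counts are arbitrary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class SynchronizedListDemo {

    // Adds `perThread` elements from each of `threads` threads to the
    // shared list and returns the final size.
    static int concurrentAdds(List<Integer> shared, int threads, int perThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.add(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        List<Integer> safe = Collections.synchronizedList(new ArrayList<>());
        System.out.println(concurrentAdds(safe, 4, 10_000)); // 40000
    }
}
```

Passing a plain ArrayList instead gives unpredictable results: elements may be lost or an exception may be thrown, which is the overwriting problem described above. Note also that iterating over a synchronized list still requires the caller to synchronize on the list manually.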
Summary
Starting from the overall architecture of ArrayList, this article walks through the core source code for initialization, adding, capacity expansion, deletion and iteration. We find that ArrayList is built around the underlying array, and each API encapsulates operations on that array, so that users do not need to be aware of the underlying implementation and only need to know how to use it.