Java Common Collection Classes: ArrayList and CopyOnWriteArrayList

Keywords: Attribute Java

ArrayList is not thread-safe: it is designed for use by a single thread, and modifying it from several threads at once can cause unpredictable errors. For multithreaded use, the Java class library provides CopyOnWriteArrayList.
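
As a rough illustration of the thread-safe side of this claim, the sketch below lets several threads append to a CopyOnWriteArrayList and checks that no element is lost. The class name ConcurrentAddDemo and the helper fillConcurrently are made up for this example. The same call with a plain ArrayList may lose elements or throw inside a worker thread, so no result is shown for it.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
public class ConcurrentAddDemo {
    // Starts `threads` workers that each append `perThread` integers,
    // waits for all of them, and returns the final size of the list.
    static int fillConcurrently(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Writers are serialized by the list's internal lock, so every add is kept.
        List<Integer> safe = new CopyOnWriteArrayList<>();
        System.out.println(fillConcurrently(safe, 4, 1000)); // 4000
    }
}
```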

Let's first look at ArrayList, which has the following fields:

    private static final int DEFAULT_CAPACITY = 10;

    private static final Object[] EMPTY_ELEMENTDATA = {};

    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    transient Object[] elementData; // non-private to simplify nested class access

    private int size;
    
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
  • elementData: Declared as transient Object[], which shows that ArrayList stores its data in a plain array underneath. The difference is that ArrayList can change its capacity dynamically. (The length of an array cannot change once it has been created, so ArrayList changes capacity by allocating a new array and copying the elements over.) As for why transient is used, you can refer to this blog post:
    Why is elementData in ArrayList modified by transient?
  • EMPTY_ELEMENTDATA: When an empty ArrayList is created with new ArrayList(0), elementData = EMPTY_ELEMENTDATA. All such empty lists in a program point to the same EMPTY_ELEMENTDATA array, which avoids allocating a separate empty array for each one.
  • DEFAULTCAPACITY_EMPTY_ELEMENTDATA: When an empty ArrayList is created with new ArrayList(), elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA. The array is then expanded to DEFAULT_CAPACITY when the first element is added.

The distinction between the two was only introduced in Java 1.8. The ArrayList constructors make the difference clear:

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

  • DEFAULT_CAPACITY: The default capacity of an ArrayList.
  • size: The number of elements in the ArrayList. It defaults to 0 because the field is never explicitly initialized. Note that size is different from elementData.length: size is the number of elements actually stored, while elementData.length is the length of the backing array. The tail of the array may be unused, so elementData.length >= size.
  • MAX_ARRAY_SIZE: The upper limit that grow() normally applies to the array size, Integer.MAX_VALUE - 8.
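
The difference between size and the length of the backing array can be observed through the public API alone. The following is a small sketch (the class name SizeVsCapacityDemo and its helper are invented for this example): a list created with room for 100 elements still reports size 0 and rejects index 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
public class SizeVsCapacityDemo {
    // Returns true if reading index 0 fails with IndexOutOfBoundsException.
    static boolean firstIndexIsOutOfBounds(List<String> list) {
        try {
            list.get(0);
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(100);          // backing array has 100 slots
        System.out.println(list.size());                   // 0
        System.out.println(firstIndexIsOutOfBounds(list)); // true
        list.add("a");
        System.out.println(list.size());                   // 1
        System.out.println(firstIndexIsOutOfBounds(list)); // false
    }
}
```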

How does the add() method bring an empty array up to DEFAULT_CAPACITY? Let's first look at the add() method:

public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

It calls ensureCapacityInternal(), which relies on a chain of helper methods:

// Entry point of the chain: ensures elementData.length >= minCapacity, otherwise calls grow() to expand
private void ensureCapacityInternal(int minCapacity) {
        ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
    }
    
/**
* modCount is a field inherited from AbstractList that counts how many times the ArrayList has been
* structurally modified. Iterators compare it with the value they recorded at creation and throw
* ConcurrentModificationException if it has changed (fail-fast behavior).
*/
private void ensureExplicitCapacity(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }
    
private static int calculateCapacity(Object[] elementData, int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        return minCapacity;
    }

// Grows the capacity to 1.5 times the old capacity, or to minCapacity if that is larger
private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
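
To see how the two empty-array constants and grow() interact, here is a minimal sketch that copies only the capacity logic from the listings above. MiniList, capacity() and trace() are hypothetical names introduced for illustration. This is not the JDK source, and the MAX_ARRAY_SIZE overflow handling is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical MiniList: a stripped-down illustration, not the JDK class.
public class MiniList {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] EMPTY_ELEMENTDATA = {};
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    private Object[] elementData;
    private int size;

    MiniList() {
        elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    MiniList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
        elementData = initialCapacity == 0 ? EMPTY_ELEMENTDATA : new Object[initialCapacity];
    }

    void add(Object e) {
        int minCapacity = size + 1;
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            // Only the no-argument constructor takes this branch: first add jumps to 10.
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity > elementData.length) {
            int oldCapacity = elementData.length;
            int newCapacity = Math.max(oldCapacity + (oldCapacity >> 1), minCapacity);
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = e;
    }

    int capacity() {
        return elementData.length;
    }

    // Adds `adds` elements and records the capacity every time it changes.
    static List<Integer> trace(MiniList list, int adds) {
        List<Integer> capacities = new ArrayList<>();
        for (int i = 0; i < adds; i++) {
            int before = list.capacity();
            list.add(i);
            if (list.capacity() != before) {
                capacities.add(list.capacity());
            }
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(trace(new MiniList(), 16)); // [10, 15, 22]
        System.out.println(trace(new MiniList(0), 7)); // [1, 2, 3, 4, 6, 9]
    }
}
```

The two traces show why the constants are kept separate: a list built with the no-argument constructor allocates ten slots on its first add, while a list built with an explicit capacity of 0 grows one small step at a time.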

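The modCount++ in ensureExplicitCapacity() is what allows ArrayList iterators to notice a modification made in the middle of an iteration. The sketch below (FailFastDemo and modificationDetected are invented names) triggers this check in a single thread by adding an element inside a for-each loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
public class FailFastDemo {
    // Adds an element while iterating and reports whether the iterator detected it.
    static boolean modificationDetected(List<Integer> list) {
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.add(99); // structural change: modCount is incremented
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(modificationDetected(list)); // true
    }
}
```
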
There's nothing special about the other methods, so let's look at CopyOnWriteArrayList and see what mechanism it uses to work in a multithreaded environment.

CopyOnWriteArrayList has the following fields:

final transient ReentrantLock lock = new ReentrantLock();

private transient volatile Object[] array;

private static final sun.misc.Unsafe UNSAFE;

private static final long lockOffset;

  • lock: Used to synchronize writers.
  • array: Where the underlying data is stored.
  • UNSAFE and lockOffset: Used to reset the lock during deserialization. Because lock is transient, it is not serialized, and because it is final, it cannot simply be reassigned afterwards.

Why does CopyOnWriteArrayList have no size field? Let's look at its add() method:

public boolean add(E e) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len + 1);
            newElements[len] = e;
            setArray(newElements);
            return true;
        } finally {
            lock.unlock();
        }
    }

1. Acquire the lock
2. Get the original array and its length, copy it into a new array that is one element longer, and store the element to be added in the last slot of the new array. Then replace the original array with the new array
3. Release the lock

As you can see, the array grows by exactly one slot on each add(), rather than by several slots at a time as in ArrayList. The number of elements therefore always equals array.length, which is why no separate size field is needed.
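
One visible consequence of replacing the whole array on every add() is that an iterator keeps reading the array that was current when the iterator was created. The following sketch shows this with the real CopyOnWriteArrayList. The class name SnapshotIteratorDemo and the helper remaining() are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
public class SnapshotIteratorDemo {
    // Counts the elements the iterator still has to deliver.
    static int remaining(Iterator<?> it) {
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<Integer> list =
                new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = list.iterator(); // refers to the current 3-element array
        list.add(4);                            // installs a new 4-element array
        System.out.println(remaining(it));      // 3
        System.out.println(list.size());        // 4
    }
}
```

No ConcurrentModificationException is thrown here, because the iterator never looks at the new array.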

Let's look at the modification operation again:

public E set(int index, E element) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            E oldValue = get(elements, index);

            if (oldValue != element) {
                int len = elements.length;
                Object[] newElements = Arrays.copyOf(elements, len);
                newElements[index] = element;
                setArray(newElements);
            } else {
                // Not quite a no-op; ensures volatile write semantics
                setArray(elements);
            }
            return oldValue;
        } finally {
            lock.unlock();
        }
    }

1. Acquire the lock
2. Get the original array and the element currently stored at index
3. Compare the current element with the element passed in (by reference). If they are the same object, the original array is set again as the current array, which changes nothing visible but preserves the volatile write
4. If they are not the same, copy the original array into a new array, set the element at index in the new array, and replace the original array with the new array
5. Release the lock
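
The steps of add() and set() can be condensed into a small copy-on-write sketch. MiniCowList and its snapshot() method are hypothetical and exist only for illustration. The sketch follows the listings above but is not the JDK implementation.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical MiniCowList: illustration only, not the JDK class.
public class MiniCowList<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile Object[] array = new Object[0];

    // Returns the array that is current right now; writers never modify it in place.
    Object[] snapshot() {
        return array;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        return (E) array[index]; // reads take no lock
    }

    boolean add(E e) {
        lock.lock();
        try {
            Object[] elements = array;
            Object[] newElements = Arrays.copyOf(elements, elements.length + 1);
            newElements[elements.length] = e;
            array = newElements; // publish the new array
            return true;
        } finally {
            lock.unlock();
        }
    }

    E set(int index, E element) {
        lock.lock();
        try {
            Object[] elements = array;
            @SuppressWarnings("unchecked")
            E oldValue = (E) elements[index];
            if (oldValue != element) {
                Object[] newElements = Arrays.copyOf(elements, elements.length);
                newElements[index] = element;
                array = newElements;
            } else {
                array = elements; // same reference, kept for the volatile write
            }
            return oldValue;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        MiniCowList<String> list = new MiniCowList<>();
        list.add("a");
        list.add("b");
        Object[] before = list.snapshot();
        System.out.println(list.set(1, "c")); // b
        System.out.println(before[1]);        // b (the old array is untouched)
        System.out.println(list.get(1));      // c
    }
}
```

Because writers never change an array after publishing it, a reader that obtained the old array can keep using it safely without taking the lock.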

Some readers may ask: why are getArray() and setArray() used to read and write array? Can't the array be accessed directly? Let's look at the source code of these two methods first:

/**
     * Gets the array.  Non-private so as to also be accessible
     * from CopyOnWriteArraySet class.
     */
    final Object[] getArray() {
        return array;
    }

    /**
     * Sets the array.
     */
    final void setArray(Object[] a) {
        array = a;
    }

The reason is simple: array is a private field. To make it accessible to other classes in the concurrent package (such as CopyOnWriteArraySet), getArray() and setArray() are provided. To keep the coding style consistent, the class uses these two methods internally as well.

Posted by MorganM on Mon, 05 Aug 2019 23:54:50 -0700