Competition between brothers -- ArrayList and LinkedList

Keywords: Java

In day-to-day development we use list structures all the time, and many of us write List<Xxx> x = new ArrayList<>(); without a second thought. But List has several implementations, and LinkedList is another common one that, like ArrayList, is not thread-safe. Today we will go through the source code to understand how each one works and compare their similarities and differences along several dimensions.

I hold to one principle: a claim that is not backed by the source code has no foundation. If a conclusion in this article differs from what other articles say, go by the source code shown here. Of course, I also welcome your suggestions and corrections.

I have added explanatory comments to the source code quoted in this article to make it easier to follow.

Comparison between ArrayList and LinkedList

ArrayList

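ArrayList is a List implementation backed by a dynamic array, which occupies a contiguous block of memory.

When constructing an ArrayList, you can optionally pass in the initial capacity of the backing array. If you do not, the default capacity of 10 applies; in the source below, the array starts out empty and only grows to 10 when the first element is added: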

    /**
     * Default initial capacity is 10
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Shared empty array instance used for empty instances
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    /**
     * Shared empty array instance used for default-sized empty instances. It is kept separate from EMPTY_ELEMENTDATA so the list knows how much to grow the array when the first element is added
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    /**
     * This is the underlying container of ArrayList
     */
    transient Object[] elementData;

    /**
     * The number of elements currently stored in the list (not the length of the array)
     */
    private int size;

Because the backing array supports random access by index, reading or replacing an element goes straight to the right slot with no searching, so it is extremely fast:

    /**
     * get returns the element at the given index: it first checks that the index is within range, then reads the array slot directly
     */
    public E get(int index) {
        rangeCheck(index);

        return elementData(index);
    }
       
    /**
     * set replaces the element at the given index: it first checks that the index is within range, then overwrites the array slot and returns the old value
     */
    public E set(int index, E element) {
        rangeCheck(index);

        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }
    
    private void rangeCheck(int index) {
        if (index >= size)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    E elementData(int index) {
        return (E) elementData[index];
    }
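
As a small usage sketch of the index-based access described above (the class name RandomAccessDemo, its helper methods, and the sample values are made up for illustration and are not part of the JDK source):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RandomAccessDemo {
    // Replaces the element at index and returns the previous value (the contract of set).
    static String replace(List<String> list, int index, String value) {
        return list.set(index, value);
    }

    // Returns true if get(index) rejects the index with an IndexOutOfBoundsException
    // or one of its subclasses.
    static boolean isRejected(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(replace(list, 1, "B")); // prints b
        System.out.println(list.get(1));           // prints B
        System.out.println(isRejected(list, 3));   // prints true (index == size)
        System.out.println(isRejected(list, -1));  // prints true
    }
}
```

Note that in the source shown, rangeCheck only tests index >= size. A negative index is caught by the array access itself, which throws ArrayIndexOutOfBoundsException, a subclass of IndexOutOfBoundsException.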

Adding and deleting elements, by contrast, can be much less efficient because elements may need to be moved.

  • When an element is removed, every element after it is shifted one position toward the front of the same array, overwriting the removed element; no new array is allocated:
    /**
     * remove deletes the element at the given index from the array
     */
    public E remove(int index) {
        rangeCheck(index);

        modCount++; // Modification count of this list + 1
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            // Copy every element after the removed one a single slot to the left, overwriting the removed element
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // Clear the last slot so the GC can reclaim the reference

        return oldValue;
    }

A note on the arraycopy(Object src, int srcPos, Object dest, int destPos, int length) method: it is a static method of the java.lang.System class. The first parameter is the source array; the second is the starting position in the source array; the third is the destination array; the fourth is the starting position in the destination array; the last is the number of elements to copy.
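
To make the parameter order concrete, here is a minimal sketch that applies the same shifting step to a plain Object[] array. The class ArrayCopyDemo and the method removeAt are hypothetical names used only for this illustration:

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Removes the element at index by shifting the tail one slot to the left,
    // the same way the remove method above does. Returns the new logical size.
    static int removeAt(Object[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // src and dest are the same array; arraycopy handles the overlap correctly
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // clear the now-unused last slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null, null};
        int size = removeAt(data, 4, 1);
        System.out.println(size + " " + Arrays.toString(data));
        // prints 3 [a, c, d, null, null, null]
    }
}
```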

  • When an element is added and the new size would exceed the current capacity of the backing array, the array is expanded. Each expansion normally raises the capacity to 1.5 times the current value:
    /**
     * Add element to end of array
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Judge whether the capacity should be expanded, and if necessary, expand the capacity
        elementData[size++] = e;
        return true;
    }
    
    /**
     * Expansion method, the input parameter is the minimum capacity to be loaded
     */
    private void grow(int minCapacity) {
        // Current array capacity
        int oldCapacity = elementData.length;
        // New capacity = old capacity + half of old capacity (1.5x)
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        // If 1.5x is still smaller than the required minimum, use the minimum instead
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        // If the new capacity exceeds MAX_ARRAY_SIZE, let hugeCapacity(minCapacity) decide
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // Copy the elements into a new array of the new capacity
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
    
    /**
     * MAX_ARRAY_SIZE is Integer.MAX_VALUE - 8. If the required minimum capacity is larger than MAX_ARRAY_SIZE, the result is Integer.MAX_VALUE; otherwise it is MAX_ARRAY_SIZE. A negative minCapacity means the int overflowed
     */
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
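
The 1.5x rule is easier to see with numbers. The sketch below is a simplified model, not the JDK code: it applies only the growth step from grow() and leaves out the MAX_ARRAY_SIZE handling, so it is meant for small values. GrowthDemo and capacities are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthDemo {
    // Returns the capacities the backing array would pass through, starting from
    // initialCapacity, until it can hold targetSize elements.
    static List<Integer> capacities(int initialCapacity, int targetSize) {
        List<Integer> result = new ArrayList<>();
        int capacity = initialCapacity;
        result.add(capacity);
        while (capacity < targetSize) {
            int minCapacity = capacity + 1;             // room for one more element
            int next = capacity + (capacity >> 1);      // old capacity + half of it
            if (next < minCapacity) {
                next = minCapacity;                     // 1.5x was not enough (tiny arrays)
            }
            capacity = next;
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 100));
        // prints [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Reaching 100 elements from the default capacity therefore takes six reallocations in this model, each of which copies every element stored so far.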

We often hear that "if you know roughly how large a list will end up, setting a reasonable initial capacity is good for performance." Here is the reasoning. Each time the array grows to 1.5 times its previous capacity, some slots are left unused, so the list usually occupies more memory than its elements need; for an ArrayList with a very large number of elements, the wasted space can be considerable. The trimToSize() method can release the unused space at the end of the array once the list has been filled, but it does not address the other cost: every expansion allocates a new array and copies all existing elements into it, which hurts performance. Choosing a good initial capacity avoids most of these reallocations.
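
A minimal sketch of the capacity-related calls mentioned above, assuming the final size is known in advance. The class PresizeDemo, the method fillPresized, and the sizes are arbitrary choices for illustration; ArrayList(int), ensureCapacity and trimToSize are the standard ArrayList methods.

```java
import java.util.ArrayList;

class PresizeDemo {
    // Fills a list whose final size is known up front, so the backing array
    // is allocated once instead of being regrown along the way.
    static ArrayList<Integer> fillPresized(int n) {
        ArrayList<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fillPresized(1_000);

        // Reserve room ahead of a known bulk append to an existing list.
        list.ensureCapacity(list.size() + 500);
        for (int i = 0; i < 500; i++) {
            list.add(i);
        }

        // Release unused capacity once the list will not grow any further.
        list.trimToSize();
        System.out.println(list.size()); // prints 1500
    }
}
```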

LinkedList

LinkedList is a List implementation based on a doubly linked list. In contrast to an array, inserting or removing an element is fast once its position is known, while looking up an element by index is slow.

    transient int size = 0;

    /**
     * Points to the head node of the linked list
     */
    transient Node<E> first;

    /**
     * Points to the tail node of the linked list
     */
    transient Node<E> last;

    /**
     * The constructor body is empty. After construction, size is 0 and both the head and tail nodes are null
     */
    public LinkedList() {
    }

The LinkedList class defines a private static nested class, Node, and each Node object represents one element of the LinkedList. Node has three fields: item, of the generic type E, holds a reference to the element's value, while prev and next hold the previous and next Node respectively. LinkedList itself keeps references only to the first node (transient Node<E> first;) and the last node (transient Node<E> last;). Storing elements this way therefore adds extra overhead for every element.

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }
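
To show how the prev and next references tie nodes together, here is a hand-built chain of three nodes. The Node class below is a simplified stand-in written for this example (the real one is private to LinkedList), and NodeChainDemo, buildAbc, forward and backward are hypothetical names:

```java
class NodeChainDemo {
    // Simplified stand-in for LinkedList's private Node class.
    static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

    // Builds a <-> b <-> c by hand and returns the head node.
    static Node<String> buildAbc() {
        Node<String> a = new Node<>(null, "a", null);
        Node<String> b = new Node<>(a, "b", null);
        a.next = b;
        Node<String> c = new Node<>(b, "c", null);
        b.next = c;
        return a;
    }

    // Walks the chain from the head via next and joins the items.
    static String forward(Node<String> head) {
        StringBuilder sb = new StringBuilder();
        for (Node<String> x = head; x != null; x = x.next) {
            sb.append(x.item);
        }
        return sb.toString();
    }

    // Walks the chain from the tail via prev and joins the items.
    static String backward(Node<String> tail) {
        StringBuilder sb = new StringBuilder();
        for (Node<String> x = tail; x != null; x = x.prev) {
            sb.append(x.item);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node<String> head = buildAbc();
        Node<String> tail = head.next.next;
        System.out.println(forward(head));  // prints abc
        System.out.println(backward(tail)); // prints cba
    }
}
```

Each element costs one Node object holding three references, which is the per-element overhead mentioned above.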

Because finding or modifying an element by index requires walking the linked list node by node, these operations are slow:

    public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }

    public E set(int index, E element) {
        checkElementIndex(index);
        Node<E> x = node(index);
        E oldVal = x.item;
        x.item = element;
        return oldVal;
    }

    /**
     * Internal lookup that returns the node at the given index by walking the list
     */
    Node<E> node(int index) {
        // assert isElementIndex(index);

        // Optimization: if the index is in the first half of the list, walk forward from the head; otherwise walk backward from the tail
        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }
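
The cost of node(index) adds up quickly in an index-based loop. The sketch below counts the pointer hops implied by the two branches above; it is a model of the traversal rather than a measurement, and TraversalCostDemo, hops, hopsForIndexedLoop and sum are names made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class TraversalCostDemo {
    // Number of pointer hops node(index) performs for a list of the given size:
    // index hops from the head, or size - 1 - index hops from the tail.
    static int hops(int index, int size) {
        return index < (size >> 1) ? index : size - 1 - index;
    }

    // Total hops for a loop that calls get(i) for every i in [0, size).
    static long hopsForIndexedLoop(int size) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += hops(i, size);
        }
        return total;
    }

    // Sums the list with a for-each loop, which uses an iterator that
    // advances one node per step instead of calling node(index).
    static long sum(List<Integer> list) {
        long s = 0;
        for (int v : list) {
            s += v;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(hopsForIndexedLoop(10)); // prints 20
        List<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(sum(list));              // prints 6
    }
}
```

In this model the total grows roughly with size squared divided by four, so when walking a LinkedList a for-each loop or an explicit iterator is usually the safer choice than get(i).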

When adding or deleting an element there is no need to move other elements; only the pointers of the neighboring nodes change, which is very efficient. (Locating a node by index still goes through node(index).)

  • add(E e): adds an element to the end of the list
    public void addLast(E e) {
        linkLast(e);
    }

    /**
     * Appends to the end of the list by default. It returns true as long as the node can be allocated; otherwise an OutOfMemoryError is thrown
     */
    public boolean add(E e) {
        linkLast(e);
        return true;
    }
    
    /**
     * Chain new elements to the end of the list
     */
    void linkLast(E e) {
        // Old tail node
        final Node<E> l = last;
        // Construct the new tail node, with the old tail as its predecessor
        final Node<E> newNode = new Node<>(l, e, null);
        // Point the last field of the LinkedList at the new tail node
        last = newNode;
        // If the list was empty (the old tail is null), the new node is also the head, so assign it to the first field; otherwise set the old tail's next to the new node
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        // List size + 1
        size++;
        // Modification count of the linked list + 1
        modCount++;
    }

  • addFirst(E e): adds an element to the head of the list (the implementation mirrors adding to the tail)
    public void addFirst(E e) {
        linkFirst(e);
    }
    
    /**
     * Chain new elements to the head of the list
     */
    private void linkFirst(E e) {
        // Old head node
        final Node<E> f = first;
        // Construct the new head node, with the old head as its successor
        final Node<E> newNode = new Node<>(null, e, f);
        // Point the first field of the LinkedList at the new head node
        first = newNode;
        // If the list was empty (the old head is null), the new node is also the tail, so assign it to the last field; otherwise set the old head's prev to the new node
        if (f == null)
            last = newNode;
        else
            f.prev = newNode;
        // List size + 1
        size++;
        // Modification count of the linked list + 1
        modCount++;
    }

  • add(int index, E element): inserts an element at the given position
    public void add(int index, E element) {
        checkPositionIndex(index); // Check if index is out of range

        if (index == size)
            linkLast(element); // Same as add(E e): append to the end of the list
        else
            linkBefore(element, node(index)); // node(index) is the traversal-based lookup shown earlier
    }

    private void checkPositionIndex(int index) {
        if (!isPositionIndex(index))
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    /**
     * Chain the new element before the specified node
     */
    void linkBefore(E e, Node<E> succ) {
        // assert succ != null;
        // Get the node before succ (the node currently at index)
        final Node<E> pred = succ.prev;
        // Construct the new node: its predecessor is the old predecessor of succ, and its successor is succ
        final Node<E> newNode = new Node<>(pred, e, succ);
        // The new node becomes the predecessor of succ
        succ.prev = newNode;
        // If succ had no predecessor, it was the head, so the first field must now point to the new node; otherwise the old predecessor's next must point to the new node
        if (pred == null)
            first = newNode;
        else
            pred.next = newNode;
        size++;
        modCount++;
    }
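
The three ways of adding shown above can be tried out with a short usage sketch. LinkedListAddDemo and build are hypothetical names; the comments note which internal method each call ends up in according to the source quoted above.

```java
import java.util.LinkedList;

class LinkedListAddDemo {
    static LinkedList<String> build() {
        LinkedList<String> list = new LinkedList<>();
        list.add("b");               // linkLast:  [b]
        list.addLast("d");           // linkLast:  [b, d]
        list.addFirst("a");          // linkFirst: [a, b, d]
        list.add(2, "c");            // node(2), then linkBefore: [a, b, c, d]
        list.add(list.size(), "e");  // index == size, so linkLast: [a, b, c, d, e]
        return list;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [a, b, c, d, e]
    }
}
```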

Deleting an element means linking the prev/next fields of the nodes before and after the current node to each other and, where necessary, updating the first/last fields of the LinkedList.

    public E remove(int index) {
        checkElementIndex(index);
        return unlink(node(index));
    }

    E unlink(Node<E> x) {
        // assert x != null;
        final E element = x.item;
        final Node<E> next = x.next;
        final Node<E> prev = x.prev;

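        // No predecessor: x was the head, so its successor becomes the new head
        if (prev == null) {
            first = next;
        } else {
            prev.next = next;
            x.prev = null;
        }

        // No successor: x was the tail, so its predecessor becomes the new tail
        if (next == null) {
            last = prev;
        } else {
            next.prev = prev;
            x.next = null;
        }

        // Clear the item so the removed element can be garbage collected
        x.item = null;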
        size--;
        modCount++;
        return element;
    }
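
Since remove(int index) has to call node(index) first, the constant-cost unlink step pays off most when you are already positioned at the node. One way to get there is an iterator, as in this sketch (IteratorRemoveDemo and removeEvens are names invented for the example):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;

class IteratorRemoveDemo {
    // Removes every even number in a single pass. Iterator.remove() unlinks the
    // node the iterator just returned, so no index lookup is needed per removal.
    static void removeEvens(LinkedList<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        LinkedList<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        removeEvens(list);
        System.out.println(list); // prints [1, 3, 5]
    }
}
```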

Summary

ArrayList and LinkedList each have their own strengths. In detail:
1. Adding an element at the end of either list has a roughly fixed cost. For ArrayList, it occasionally triggers an array reallocation; for LinkedList, the overhead is always the allocation of one internal Node object.
2. Adding or deleting an element in the middle of an ArrayList means shifting every element after it, while linking or unlinking a node in the middle of a LinkedList has a fixed cost once the node has been located.
3. LinkedList does not support efficient random access to elements.
4. ArrayList wastes space mainly through the spare capacity it may reserve at the end of the array, while LinkedList pays its space cost on every element, since each one needs its own Node.

To sum up, if you frequently add or delete elements in the middle of a sequence, LinkedList is the better choice; if you mostly query and your additions and deletions almost always happen at the end, ArrayList is the better choice.
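
If you want to check these trade-offs on your own machine, a rough timing sketch along the following lines may help. It is not a rigorous benchmark (there is no JIT warm-up and only a single run), the size n is arbitrary, and FrontInsertTiming and timeFrontInserts are hypothetical names. Index 0 is used because it is the case where shifting and relinking differ most clearly; for an index in the middle, LinkedList first has to walk to the node via node(index).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class FrontInsertTiming {
    // Inserts n elements at index 0 and returns the elapsed time in nanoseconds.
    static long timeFrontInserts(List<Integer> list, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            list.add(0, i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 50_000; // arbitrary size chosen for illustration
        long arrayNanos = timeFrontInserts(new ArrayList<>(), n);
        long linkedNanos = timeFrontInserts(new LinkedList<>(), n);
        System.out.println("ArrayList:  " + arrayNanos / 1_000_000 + " ms");
        System.out.println("LinkedList: " + linkedNanos / 1_000_000 + " ms");
    }
}
```

The absolute numbers depend on the JVM and hardware, so treat them only as a rough indication.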

Posted by adrianTNT on Mon, 15 Jun 2020 00:19:33 -0700