Java Collection Series (5) -- List Summary

Keywords: Java JDK

Earlier in this series we covered ArrayList and LinkedList, the two main List implementations. This article summarizes them to consolidate that understanding.

I. List Basic Description

Let's review the framework of List first.

1 > List is an interface that extends Collection. It represents an ordered sequence of elements.
2 > AbstractList is an abstract class that extends AbstractCollection. It implements the methods of the List interface other than size() and get(int location).
3 > AbstractSequentialList is an abstract class that extends AbstractList. It implements all of the index-based operations of a linked list.
4 > ArrayList, LinkedList, Vector and Stack are four implementation classes of List.
ArrayList is an array-backed list, equivalent to a dynamic array. Random access is efficient, but random insertion and deletion are inefficient.
LinkedList is a doubly linked list. It can also be used as a stack, a queue, or a double-ended queue. Random access is inefficient, but random insertion and deletion are efficient.
Vector is, like ArrayList, a dynamic array implemented with an array. The difference is that ArrayList is not thread-safe, while Vector is thread-safe.
Stack is a stack that inherits from Vector. Its characteristic is: first in, last out.
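
To make the stack, queue and double-ended queue roles of LinkedList concrete, here is a minimal sketch. The class name LinkedListRoles and its helper methods are made up for illustration; only the java.util calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; not part of the original article's test program.
class LinkedListRoles {

    // Stack role: push() and pop() both work on the head, so the last element in is the first out.
    static List<Integer> asStack() {
        LinkedList<Integer> stack = new LinkedList<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        List<Integer> popped = new ArrayList<>();
        while (!stack.isEmpty()) {
            popped.add(stack.pop());
        }
        return popped;
    }

    // Queue role: offer() appends at the tail, poll() removes from the head (first in, first out).
    static List<Integer> asQueue() {
        LinkedList<Integer> queue = new LinkedList<>();
        queue.offer(1);
        queue.offer(2);
        queue.offer(3);
        List<Integer> polled = new ArrayList<>();
        while (!queue.isEmpty()) {
            polled.add(queue.poll());
        }
        return polled;
    }

    // Double-ended queue role: elements can be added and removed at both ends.
    static List<Integer> asDeque() {
        LinkedList<Integer> deque = new LinkedList<>();
        deque.addFirst(2);
        deque.addFirst(1);
        deque.addLast(3);
        List<Integer> removed = new ArrayList<>();
        removed.add(deque.removeLast());
        removed.add(deque.removeFirst());
        removed.add(deque.removeFirst());
        return removed;
    }

    public static void main(String[] args) {
        System.out.println("stack order: " + asStack()); // [3, 2, 1]
        System.out.println("queue order: " + asQueue()); // [1, 2, 3]
        System.out.println("deque order: " + asDeque()); // [3, 1, 2]
    }
}
```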

II. List usage scenarios

First, the usage scenarios of each List are outlined, and then the reasons are analyzed.
If the data structure needed is a stack, a queue or a linked list, we should first consider using a List, and choose the specific List according to the following criteria.
1 > LinkedList should be used when elements need to be inserted and deleted quickly.
2 > ArrayList should be used when elements need fast random access.
3 > In a single-threaded environment, or in a multi-threaded environment where the list is only ever operated on by one thread, an unsynchronized class (such as ArrayList) should be used.
In a multi-threaded environment where the list may be operated on by several threads at the same time, a synchronized class (such as Vector) should be used.
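
As a rough sketch of criterion 3, the code below lets several threads append to one shared list. SharedListDemo and fillConcurrently are hypothetical names. Collections.synchronizedList is a JDK wrapper that this article does not otherwise discuss; it is shown only as one assumed alternative to Vector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class; not part of the original article's test program.
class SharedListDemo {

    // Starts `threads` threads that each append `perThread` elements to the same list,
    // waits for all of them, and returns the final size of the list.
    static int fillConcurrently(List<Integer> list, int threads, int perThread)
            throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Vector synchronizes its own methods.
        System.out.println(fillConcurrently(new Vector<Integer>(), 4, 10000));
        // An ArrayList wrapped so that every call is synchronized.
        List<Integer> wrapped = Collections.synchronizedList(new ArrayList<Integer>());
        System.out.println(fillConcurrently(wrapped, 4, 10000));
    }
}
```

Both calls should report 40000 elements, because every add() is synchronized. Passing a plain ArrayList instead is deliberately not shown: without synchronization, elements may be lost or an exception may be thrown.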
The following test program verifies conclusions 1 and 2 above. The reference code is as follows:

package Test;
import java.util.*;

/**
 * Created by LKL on 2017/2/17.
 */
public class TestTime {
    /*
     * @desc Comparing the efficiency of insertion, random reading and deletion of ArrayList and LinkedList
     *
     */
        private static final int COUNT = 100000;
        private static LinkedList<Integer> linkedList = new LinkedList<Integer>();
        private static ArrayList<Integer> arrayList = new ArrayList<Integer>();
        private static Vector<Integer> vector = new Vector<Integer>();
        private static Stack<Integer> stack = new Stack<Integer>();

        public static void main(String[] args) {
            // Newline character
            System.out.println();
            // insert
            insertByPosition(stack) ;
            insertByPosition(vector) ;
            insertByPosition(linkedList) ;
            insertByPosition(arrayList) ;

            // Newline character
            System.out.println();
            // Random read
            readByPosition(stack);
            readByPosition(vector);
            readByPosition(linkedList);
            readByPosition(arrayList);

            // Newline character
            System.out.println();
            // delete
            deleteByPosition(stack);
            deleteByPosition(vector);
            deleteByPosition(linkedList);
            deleteByPosition(arrayList);
        }

        // Get the name of the list's implementation class
        private static String getListName(List<?> list) {
            if (list instanceof LinkedList) {
                return "LinkedList";
            } else if (list instanceof ArrayList) {
                return "ArrayList";
            } else if (list instanceof Stack) {
                return "Stack";
            } else if (list instanceof Vector) {
                return "Vector";
            } else {
                return "List";
            }
        }

        // Insert COUNT elements at the head of the list and measure the time taken
        private static void insertByPosition(List<Integer> list) {
            long startTime = System.currentTimeMillis();

            // Insert COUNT elements at position 0 of the list
            for (int i=0; i<COUNT; i++)
                list.add(0, i);
            long endTime = System.currentTimeMillis();
            long interval = endTime - startTime;
            System.out.println(getListName(list) + " : insert "+COUNT+" elements into the 1st position use time: " + interval+" ms");
        }

        // Delete COUNT elements from the head of the list and measure the time taken
        private static void deleteByPosition(List<Integer> list) {
            long startTime = System.currentTimeMillis();
            // Delete the first element of the list
            for (int i=0; i<COUNT; i++)
                list.remove(0);
            long endTime = System.currentTimeMillis();
            long interval = endTime - startTime;
            System.out.println(getListName(list) + " : delete "+COUNT+" elements from the 1st position use time: " + interval+" ms");
        }

        // Read COUNT elements from the list by position and measure the time taken
        private static void readByPosition(List<Integer> list) {
            long startTime = System.currentTimeMillis();
            // Read list elements
            for (int i=0; i<COUNT; i++)
                list.get(i);
            long endTime = System.currentTimeMillis();
            long interval = endTime - startTime;
            System.out.println(getListName(list) + " : read "+COUNT+" elements by position use time: " + interval+" ms");
        }
    }

The results are as follows:

Stack : insert 100000 elements into the 1st position use time: 3279 ms
Vector : insert 100000 elements into the 1st position use time: 1731 ms
LinkedList : insert 100000 elements into the 1st position use time: 26 ms
ArrayList : insert 100000 elements into the 1st position use time: 1719 ms

Stack : read 100000 elements by position use time: 9 ms
Vector : read 100000 elements by position use time: 6 ms
LinkedList : read 100000 elements by position use time: 9296 ms
ArrayList : read 100000 elements by position use time: 4 ms

Stack : delete 100000 elements from the 1st position use time: 1680 ms
Vector : delete 100000 elements from the 1st position use time: 1571 ms
LinkedList : delete 100000 elements from the 1st position use time: 15 ms
ArrayList : delete 100000 elements from the 1st position use time: 1614 ms

LinkedList takes the shortest time to insert 100,000 elements at the head: 26 ms.
LinkedList takes the shortest time to delete 100,000 elements from the head: 15 ms.
LinkedList takes the longest time to read 100,000 elements by position: 9296 ms, while ArrayList, Stack and Vector each take only a few milliseconds.

Considering also that Vector supports synchronization and that Stack inherits from Vector, we conclude:
1 > LinkedList should be used when elements need to be inserted and deleted quickly.
2 > ArrayList should be used when elements need fast random access.
3 > In a single-threaded environment, or in a multi-threaded environment where the list is only operated on by a single thread, an unsynchronized class (ArrayList or LinkedList) should be used.

III. Analysis of performance differences between LinkedList and ArrayList

Let's see why LinkedList inserts elements so quickly in the test above, while ArrayList inserts them so slowly.
The code for inserting an element at a specified position in LinkedList.java is as follows:

// Add a node before index, and the value of the node is element
public void add(int index, E element) {
    addBefore(element, (index==size ? header : entry(index)));
}

// Gets the node at the specified location in the bidirectional list
private Entry<E> entry(int index) {
    if (index < 0 || index >= size)
        throw new IndexOutOfBoundsException("Index: "+index+
                                            ", Size: "+size);
    Entry<E> e = header;
    // Gets the node at index.
    // If index is less than half the length of the doubly linked list, search from front to back.
    // Otherwise, search from back to front.
    if (index < (size >> 1)) {
        for (int i = 0; i <= index; i++)
            e = e.next;
    } else {
        for (int i = size; i > index; i--)
            e = e.previous;
    }
    return e;
}

// Insert a new node (with data e) before the entry node.
private Entry<E> addBefore(E e, Entry<E> entry) {
    // Create newEntry with data e; its next is entry and its previous is entry.previous
    Entry<E> newEntry = new Entry<E>(e, entry, entry.previous);
    // Link newEntry into the list
    newEntry.previous.next = newEntry;
    newEntry.next.previous = newEntry;
    size++;
    modCount++;
    return newEntry;
}

As we can see from the above, when an element is inserted into a LinkedList through add(int index, E element), the node at position index is first located in the doubly linked list, and the new node is then inserted before it.
Locating the node at position index is sped up: if index is less than half the length of the list, the search runs from front to back; otherwise it runs from back to front. Once the node is found, the insertion itself only relinks a few references. In the test above the index is always 0, so the search takes a single step.
Next, let's look at the code in ArrayList.java that inserts an element at a specified position:

// Insert element at the specified index of the ArrayList
public void add(int index, E element) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(
        "Index: "+index+", Size: "+size);

    ensureCapacity(size+1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
         size - index);
    elementData[index] = element;
    size++;
}

ensureCapacity(size+1) checks the capacity of the ArrayList and increases it if it is insufficient.
The really time-consuming operation is System.arraycopy(elementData, index, elementData, index + 1, size - index);

The arraycopy() declaration in java/lang/System.java of the Sun JDK is as follows:

public static native void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);

All we really need to know is that System.arraycopy(elementData, index, elementData, index + 1, size - index) shifts every element from index onward back by one position. In other words, the add(int index, E element) method of ArrayList moves all elements at and after index, so inserting at position 0 moves the entire contents of the list.
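
The shift can be reproduced on a plain array. This is only a sketch of the idea: ArrayShiftDemo and insertAt are hypothetical names, and the array is assumed to already have spare room, so the capacity check is left out.

```java
import java.util.Arrays;

// Hypothetical demo class; it mimics only the shifting step of ArrayList.add(int, E).
class ArrayShiftDemo {

    // Inserts value at index in an array whose first `size` slots are in use.
    // Assumes data.length > size, so no growth is needed.
    static void insertAt(int[] data, int size, int index, int value) {
        // Move the (size - index) elements starting at index one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 0}; // four elements in use, one spare slot
        insertAt(data, 4, 0, 5);          // inserting at the head moves all four elements
        System.out.println(Arrays.toString(data)); // [5, 10, 20, 30, 40]
    }
}
```

The closer index is to 0, the more elements have to be copied, which is why the head insertions in the test program are the worst case for the array-backed lists.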

From the above analysis, we can understand why inserting at the head of a LinkedList is very fast, while inserting at the head of an ArrayList is very slow.
Deleting elements works in the same way as inserting them, so it is not explained separately.
Next, let's look at why random access is slow in LinkedList and fast in ArrayList.
The LinkedList random access source code is as follows:

// Returns the element at the specified location in LinkedList
public E get(int index) {
    return entry(index).element;
}

// Gets the node at the specified location in the bidirectional list
private Entry<E> entry(int index) {
    if (index < 0 || index >= size)
        throw new IndexOutOfBoundsException("Index: "+index+
                                            ", Size: "+size);
    Entry<E> e = header;
    // Gets the node at index.
    // If index is less than half the length of the doubly linked list, search from front to back.
    // Otherwise, search from back to front.
    if (index < (size >> 1)) {
        for (int i = 0; i <= index; i++)
            e = e.next;
    } else {
        for (int i = size; i > index; i--)
            e = e.previous;
    }
    return e;
}

From this, we can see that get(int index) on a LinkedList first locates the node at position index in the doubly linked list and then returns its element.
Locating the node is sped up in the same way as before: if index is less than half the length of the list, the search runs from front to back; otherwise it runs from back to front. Even so, a single lookup may have to walk through up to half of the list.
Here is the ArrayList random access code:

// Get the element value of the index position
public E get(int index) {
    RangeCheck(index);

    return (E) elementData[index];
}

private void RangeCheck(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(
        "Index: "+index+", Size: "+size);
}

From this, we can see that get(int index) on an ArrayList directly returns the element at position index in the array, without having to search for it as LinkedList does.
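
The cost of positional access in a linked list can be sketched as follows. TraversalDemo and its methods are hypothetical. The hops() formula is read off the loops of the entry(int index) source quoted above and may not match other JDK versions exactly. The iterator-based loop is shown as an assumed alternative that avoids calling get(i) repeatedly.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; not part of the JDK or of the test program above.
class TraversalDemo {

    // Node hops performed by the quoted entry(index) for a list of the given size:
    // index + 1 when walking forward from the header, size - index when walking backward.
    static int hops(int index, int size) {
        return (index < (size >> 1)) ? index + 1 : size - index;
    }

    // Sums the list by position; on a LinkedList every get(i) walks from one end.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Sums the list with its iterator (for-each), which moves from node to node.
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(hops(0, 100000));     // 1
        System.out.println(hops(50000, 100000)); // 50000

        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 10000; i++) {
            list.add(i);
        }
        System.out.println(sumByIndex(list) == sumByIterator(list)); // true
    }
}
```

Both loops compute the same sum; they differ only in how many nodes are visited along the way.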

IV. Comparison of Vector and ArrayList

Similarities:
1 > Both implement the List interface and inherit from AbstractList.
The ArrayList and Vector classes are defined as follows:

// Definition of ArrayList
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {}

// Definition of Vector
public class Vector<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {}

2 > Both implement the RandomAccess and Cloneable interfaces.
Implementing RandomAccess means that both support fast random access.
Implementing Cloneable means that both can produce copies of themselves.
3 > Both are implemented with arrays and are essentially dynamic arrays.
ArrayList.java defines an array elementData for storing elements:

// Array to save data in ArrayList
private transient Object[] elementData;

Vector.java also defines an array elementData for storing elements:

// Array to save data in Vector
protected Object[] elementData;

4 > Both have a default array capacity of 10.
In the JDK source quoted here, if no capacity is specified when creating an ArrayList or Vector, the default capacity of 10 is used.
The default constructor for ArrayList is as follows:

// ArrayList constructor. The default capacity is 10.
public ArrayList() {
    this(10);
}

The default constructor for Vector is as follows:

// Vector constructor. The default capacity is 10.
public Vector() {
    this(10);
} 

5 > Both support traversal with Iterator and ListIterator.
Both inherit from AbstractList, which implements iterator() (returning an Iterator) and listIterator() (returning a ListIterator).
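
A few of these shared traits can be checked at run time. The sketch below uses a hypothetical class SharedTraitsDemo: it tests for the RandomAccess marker interface with instanceof and walks a list backward with a ListIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;
import java.util.RandomAccess;
import java.util.Vector;

// Hypothetical demo class; not part of the JDK.
class SharedTraitsDemo {

    // RandomAccess is a marker interface, so instanceof is enough to detect it.
    static boolean supportsFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Collects the elements in reverse order by starting a ListIterator at the end.
    static List<String> reversed(List<String> list) {
        List<String> result = new ArrayList<>();
        ListIterator<String> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(supportsFastRandomAccess(new ArrayList<String>()));  // true
        System.out.println(supportsFastRandomAccess(new Vector<String>()));     // true
        System.out.println(supportsFastRandomAccess(new LinkedList<String>())); // false

        List<String> letters = Arrays.asList("a", "b", "c");
        System.out.println(reversed(new ArrayList<String>(letters))); // [c, b, a]
        System.out.println(reversed(new Vector<String>(letters)));    // [c, b, a]
    }
}
```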

Differences:
1 > Thread safety is different.
ArrayList is not thread-safe; Vector is thread-safe, because its methods are synchronized.
ArrayList is suited to use by a single thread, and Vector to use by multiple threads.
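2 > Serialization is handled differently.
Both classes implement the Serializable interface, as the class definitions above show. The difference lies in the backing array: ArrayList declares elementData as transient, so default serialization does not write it, whereas the elementData of Vector is not transient.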
3 > The number of constructors is different.
ArrayList has three constructors, while Vector has four. In addition to three constructors similar to those of ArrayList, Vector has one more that specifies the capacity increment.
The ArrayList constructors are as follows:

// Default constructor
ArrayList()

// capacity is the initial capacity of the ArrayList. When it becomes insufficient as elements are added, the capacity grows by roughly half of the previous capacity.
ArrayList(int capacity)

// Create an ArrayList containing the elements of collection
ArrayList(Collection<? extends E> collection)

The constructors of Vector are as follows:

// Default constructor
Vector()

// capacity is the initial capacity of the Vector. When it becomes insufficient as elements are added, the capacity doubles each time.
Vector(int capacity)

// Create a Vector containing the elements of collection
Vector(Collection<? extends E> collection)

// capacity is the initial capacity of the Vector, and capacityIncrement is the amount by which the capacity grows each time it is increased.
Vector(int capacity, int capacityIncrement)

4 > Capacity grows in different ways.
When elements are added one by one and the capacity of an ArrayList is insufficient, new capacity = (original capacity * 3) / 2 + 1.
Vector's capacity growth depends on the capacity increment. If an increment is specified and valid (greater than 0), then new capacity = original capacity + capacity increment when the capacity is insufficient. If the increment is invalid (<= 0), then new capacity = original capacity * 2.
The capacity growth function in ArrayList is as follows:

public void ensureCapacity(int minCapacity) {
    // Increment the modification count
    modCount++;
    int oldCapacity = elementData.length;
    // If the current capacity cannot hold minCapacity elements, set new capacity = (original capacity * 3) / 2 + 1
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}

The main capacity growth function in Vector is as follows:

private void ensureCapacityHelper(int minCapacity) {
    int oldCapacity = elementData.length;
    // When Vector's capacity is not enough to hold all the elements, increase the capacity.
    // If the capacity increment is greater than 0 (capacityIncrement > 0), the capacity grows by capacityIncrement.
    // Otherwise, the capacity is doubled.
    if (minCapacity > oldCapacity) {
        Object[] oldData = elementData;
        int newCapacity = (capacityIncrement > 0) ?
            (oldCapacity + capacityIncrement) : (oldCapacity * 2);
        if (newCapacity < minCapacity) {
            newCapacity = minCapacity;
        }
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}

5 > Support for Enumeration is different. Vector supports traversal through Enumeration, while ArrayList does not.
The code that implements Enumeration in Vector is as follows:

public Enumeration<E> elements() {
    // Implement Enumeration with an anonymous class
    return new Enumeration<E>() {
        int count = 0;

        // Is there the next element?
        public boolean hasMoreElements() {
            return count < elementCount;
        }

        // Get the next element
        public E nextElement() {
            synchronized (Vector.this) {
                if (count < elementCount) {
                    return (E)elementData[count++];
                }
            }
            throw new NoSuchElementException("Vector Enumeration");
        }
    };
}
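
Differences 4 and 5 can be tried out with a short sketch. DifferencesDemo and its helper methods are hypothetical. The two growth formulas are copied from the source quoted above, and the ArrayList formula in particular may differ in other JDK versions. Collections.enumeration is a JDK adapter not mentioned in the article; it is assumed here as one way to obtain an Enumeration over an ArrayList, which has no elements() method of its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class; not part of the JDK.
class DifferencesDemo {

    // ArrayList growth rule as quoted above: (original capacity * 3) / 2 + 1.
    static int nextArrayListCapacity(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Vector growth rule as quoted above: add the increment if it is positive, otherwise double.
    static int nextVectorCapacity(int oldCapacity, int capacityIncrement) {
        return (capacityIncrement > 0) ? oldCapacity + capacityIncrement : oldCapacity * 2;
    }

    // Reads every element of an Enumeration into a new list.
    static <E> List<E> drain(Enumeration<E> enumeration) {
        List<E> result = new ArrayList<>();
        while (enumeration.hasMoreElements()) {
            result.add(enumeration.nextElement());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(nextArrayListCapacity(10)); // 16
        System.out.println(nextVectorCapacity(10, 0)); // 20
        System.out.println(nextVectorCapacity(10, 5)); // 15

        // Vector exposes its real capacity, so the rule can be observed directly.
        Vector<Integer> doubling = new Vector<Integer>(10);
        Vector<Integer> stepped = new Vector<Integer>(10, 5);
        for (int i = 0; i < 11; i++) {
            doubling.add(i);
            stepped.add(i);
        }
        System.out.println(doubling.capacity()); // 20
        System.out.println(stepped.capacity());  // 15

        List<String> names = Arrays.asList("a", "b", "c");
        // Vector provides elements() itself.
        System.out.println(drain(new Vector<String>(names).elements()));
        // ArrayList needs an adapter to be traversed as an Enumeration.
        System.out.println(drain(Collections.enumeration(new ArrayList<String>(names))));
    }
}
```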

Reference: List Summary of Java Collection Series 08 (LinkedList, ArrayList and other usage scenarios and performance analysis)

Posted by AJW on Fri, 29 Mar 2019 15:54:30 -0700