Data structure: analyzing the read-write performance of ArrayList and LinkedList with examples

Keywords: Java Programming

Background

ArrayList and LinkedList are two basic data structures that are used often in Java programming. Books usually summarize the difference between them in two rules:

  • Use ArrayList when you need fast random access to elements.
  • Use LinkedList when you need fast insertion and deletion of elements.

This article checks these rules by measuring the read and write performance of both data structures with practical examples.

ArrayList

ArrayList is backed by a dynamic array: the elements live in an Object[] that is replaced by a larger array when it fills up.

private static final int DEFAULT_CAPACITY = 10;
...
transient Object[] elementData;
...
public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }
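To make the idea of a dynamic array concrete, here is a minimal sketch of a growable array. It is not the JDK implementation: the class name GrowableIntArray and the 1.5x growth policy are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical, simplified growable array; NOT the JDK's ArrayList.
final class GrowableIntArray {
    private int[] data = new int[10]; // initial capacity, mirroring DEFAULT_CAPACITY
    private int size = 0;

    // Appends a value, replacing the backing array with a larger one when it is full.
    void add(int value) {
        if (size == data.length) {
            // Assumed growth policy: old capacity plus half of it.
            data = Arrays.copyOf(data, data.length + (data.length >> 1));
        }
        data[size++] = value;
    }

    // Random access is a bounds check plus a single array read.
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    int capacity() {
        return data.length;
    }

    public static void main(String[] args) {
        GrowableIntArray a = new GrowableIntArray();
        for (int i = 0; i < 25; i++) {
            a.add(i);
        }
        System.out.println(a.size() + " elements, capacity " + a.capacity());
    }
}
```

With this growth policy the capacity goes 10, 15, 22, 33 while the 25 elements are appended, so the program prints "25 elements, capacity 33". Most appends only write one array slot; the occasional copy is what keeps appending cheap on average.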

LinkedList

LinkedList is backed by a doubly linked list: each node holds one element plus references to the previous and the next node.

private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }
...    
transient Node<E> first;
transient Node<E> last;
...
private void linkFirst(E e) {
        final Node<E> f = first;
        final Node<E> newNode = new Node<>(null, e, f);
        first = newNode;
        if (f == null)
            last = newNode;
        else
            f.prev = newNode;
        size++;
        modCount++;
    }

Case study

  • The read and write performance of the two data structures is compared by appending, inserting, and traversing elements.

1. Add data

import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.LinkedList;

public class ArrayListAndLinkList {
    public final static int COUNT=100000;
    public static void main(String[] args) {

        // ArrayList insert
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss:SSS");
        Long start = System.currentTimeMillis();
        System.out.println("ArrayList Insert start time:" + sdf.format(start));

        ArrayList<Integer> arrayList = new ArrayList<>();
        for (int i = 0; i < COUNT; i++) {
            arrayList.add(i);
        }

        Long end = System.currentTimeMillis();
        System.out.println("ArrayList Insert end time:" + sdf.format(end));
        System.out.println("ArrayList insert took " + (end - start) + " ms");


        // LinkedList insert
        start = System.currentTimeMillis();
        System.out.println("LinkedList Insert start time:" + sdf.format(start));
        LinkedList<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < COUNT; i++) {
            linkedList.add(i);
        }
        end = System.currentTimeMillis();
        System.out.println("LinkedList Insert end time:" + sdf.format(end));
        System.out.println("LinkedList insert took " + (end - start) + " ms");
     }
}

The output shows that the performance difference between the two is small.
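A single run timed with System.currentTimeMillis() is a rough measurement. As a hedged alternative, the sketch below times the same append loop with System.nanoTime() after a few untimed warm-up rounds. The class name AppendTimingSketch, the helper timeAppend, and the choice of five warm-up rounds are assumptions for illustration, and the numbers it prints depend on the machine and JVM.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical timing helper; a sketch, not a rigorous benchmark.
final class AppendTimingSketch {
    // Appends `count` integers to the list and returns the elapsed nanoseconds.
    static long timeAppend(List<Integer> list, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            list.add(i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        final int count = 100_000;
        // Untimed rounds so the JIT compiler has a chance to optimize the loop.
        for (int round = 0; round < 5; round++) {
            timeAppend(new ArrayList<>(), count);
            timeAppend(new LinkedList<>(), count);
        }
        long arrayNanos = timeAppend(new ArrayList<>(), count);
        long linkedNanos = timeAppend(new LinkedList<>(), count);
        System.out.println("ArrayList append:  " + arrayNanos / 1_000_000.0 + " ms");
        System.out.println("LinkedList append: " + linkedNanos / 1_000_000.0 + " ms");
    }
}
```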

2. Insert data

On top of the data already added, another 100000 elements are inserted at index 100.

import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.LinkedList;

public class ArrayListAndLinkList {
    public final static int COUNT=100000;
    public static void main(String[] args) {

        // ArrayList insert
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss:SSS");
        Long start = System.currentTimeMillis();
        System.out.println("ArrayList Insert start time:" + sdf.format(start));

        ArrayList<Integer> arrayList = new ArrayList<>();
        for (int i = 0; i < COUNT; i++) {
            arrayList.add(i);
        }
        for (int i = 0; i < COUNT; i++) {
            arrayList.add(100,i);
        }

        Long end = System.currentTimeMillis();
        System.out.println("ArrayList Insert end time:" + sdf.format(end));
        System.out.println("ArrayList insert took " + (end - start) + " ms");


        // LinkedList insert
        start = System.currentTimeMillis();
        System.out.println("LinkedList Insert start time:" + sdf.format(start));
        LinkedList<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < COUNT; i++) {
            linkedList.add(i);
        }
        for (int i = 0; i < COUNT; i++) {
            linkedList.add(100,i);
        }
        end = System.currentTimeMillis();
        System.out.println("LinkedList Insert end time:" + sdf.format(end));
        System.out.println("LinkedList insert took " + (end - start) + " ms");
     }
}

The output shows that ArrayList performs much worse than LinkedList in this test.

The source code explains why.
ArrayList's insert method:

  public void add(int index, E element) {
        rangeCheckForAdd(index);

        ensureCapacityInternal(size + 1);  // Increments modCount!!
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }

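How ArrayList inserts: after the capacity check, System.arraycopy shifts every element from index to the end of the list one position to the right, and only then is the new element written at index. Inserting near the front of a large list therefore copies almost the whole array on every call.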
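The shift can be reproduced on a plain array. The following is a minimal sketch, assuming the backing array already has a spare slot so no growth is needed; the class name ShiftInsertSketch and the method insertAt are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration of the shift that ArrayList.add(index, element) performs.
final class ShiftInsertSketch {
    // Inserts `value` at `index` in an array holding `size` live elements.
    // Assumes backing.length > size, so no growth is needed.
    static void insertAt(int[] backing, int size, int index, int value) {
        // Move the elements in [index, size) one slot to the right.
        System.arraycopy(backing, index, backing, index + 1, size - index);
        backing[index] = value;
    }

    public static void main(String[] args) {
        int[] backing = {10, 20, 30, 40, 0}; // 4 live elements, 1 spare slot
        insertAt(backing, 4, 1, 99);
        System.out.println(Arrays.toString(backing)); // [10, 99, 20, 30, 40]
    }
}
```

Here three elements move to make room for one. In the test above, each insertion at index 100 moves every element behind that index, which is almost the whole list.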

Source code of LinkedList:

public void add(int index, E element) {
        checkPositionIndex(index);

        if (index == size)
            linkLast(element);
        else
            linkBefore(element, node(index));
 }
 ...
  void linkBefore(E e, Node<E> succ) {
        // assert succ != null;
        final Node<E> pred = succ.prev;
        final Node<E> newNode = new Node<>(pred, e, succ);
        succ.prev = newNode;
        if (pred == null)
            first = newNode;
        else
            pred.next = newNode;
        size++;
        modCount++;
    }

How LinkedList inserts: the link between two neighbouring nodes is broken and the new node is linked in between them, so no elements are copied. The node(index) call still has to walk to the insertion point, but at index 100 that walk is short.
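When many elements go into the same position, even the short walk done by node(index) can be avoided by keeping a ListIterator at that position. This is a sketch of that idea rather than what the test program does; the class name ListIteratorInsertSketch and the method insertAllAt are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical helper: repeated insertion at one position with a single walk.
final class ListIteratorInsertSketch {
    // Inserts each value at `index`, producing the same order as repeated list.add(index, v).
    static void insertAllAt(List<Integer> list, int index, int[] values) {
        ListIterator<Integer> it = list.listIterator(index); // walks to the position once
        for (int v : values) {
            it.add(v);     // links the new element in just before the cursor
            it.previous(); // step back so the next value lands in front of this one
        }
    }

    public static void main(String[] args) {
        LinkedList<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3));
        insertAllAt(list, 1, new int[] {7, 8});
        System.out.println(list); // [1, 8, 7, 2, 3]
    }
}
```

The result matches calling list.add(1, 7) followed by list.add(1, 8), but the list is walked only once, when the iterator is created.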

3. Traversal data

Building on the appended and inserted data, we now traverse both lists with the get method.

import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.LinkedList;

public class ArrayListAndLinkList {
    public final static int COUNT=100000;
    public static void main(String[] args) {

        // ArrayList insert
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss:SSS");
        Long start = System.currentTimeMillis();
        System.out.println("ArrayList Insert start time:" + sdf.format(start));

        ArrayList<Integer> arrayList = new ArrayList<>();
        for (int i = 0; i < COUNT; i++) {
            arrayList.add(i);
        }
        for (int i = 0; i < COUNT; i++) {
            arrayList.add(100,i);
        }

        Long end = System.currentTimeMillis();
        System.out.println("ArrayList Insert end time:" + sdf.format(end));
        System.out.println("ArrayList insert took " + (end - start) + " ms");


        // LinkedList insert
        start = System.currentTimeMillis();
        System.out.println("LinkedList Insert start time:" + sdf.format(start));
        LinkedList<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < COUNT; i++) {
            linkedList.add(i);
        }
        for (int i = 0; i < COUNT; i++) {
            linkedList.add(100,i);
        }
        end = System.currentTimeMillis();
        System.out.println("LinkedList Insert end time:" + sdf.format(end));
        System.out.println("LinkedList insert took " + (end - start) + " ms");

        // ArrayList traversal
        start = System.currentTimeMillis();
        System.out.println("ArrayList Traversal start time:" + sdf.format(start));
        for (int i = 0; i < 2*COUNT; i++) {
            arrayList.get(i);
        }
        end = System.currentTimeMillis();
        System.out.println("ArrayList Traversal end time:" + sdf.format(end));
        System.out.println("ArrayList traversal took " + (end - start) + " ms");

        // LinkedList traversal
        start = System.currentTimeMillis();
        System.out.println("LinkedList Traversal start time:" + sdf.format(start));
        for (int i = 0; i < 2*COUNT; i++) {
            linkedList.get(i);
        }
        end = System.currentTimeMillis();
        System.out.println("LinkedList Traversal end time:" + sdf.format(end));
        System.out.println("LinkedList traversal took " + (end - start) + " ms");

    }
}

The output shows a huge difference between the two: traversing the LinkedList with get is far slower.
The reason is in LinkedList's get method, which walks the nodes from the head or from the tail, whichever is closer to the index:

public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }
 ...
 Node<E> node(int index) {
        // assert isElementIndex(index);

        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }
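The cost of this walk adds up quickly. The sketch below counts how many link hops node(index) performs when get is called for every index of a list, using the same head-or-tail rule as the source above. The class name GetTraversalCost and its methods are hypothetical.

```java
// Hypothetical cost model for traversing a LinkedList with get(i).
final class GetTraversalCost {
    // Number of link hops node(index) performs for a list of the given size.
    static long hops(int index, int size) {
        return index < (size >> 1) ? index : (long) size - 1 - index;
    }

    // Total hops when get(i) is called for every index from 0 to size - 1.
    static long totalHops(int size) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += hops(i, size);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalHops(200_000)); // 9999900000
    }
}
```

For the 200000 elements in the test this is roughly 10 billion hops, which grows with the square of the list size. An ArrayList answers each get with a single array read.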

3.1. LinkedList traversal improvement

We use an iterator to improve the traversal of the LinkedList:

		...
		// LinkedList traversal
        start = System.currentTimeMillis();
        System.out.println("LinkedList Traverse start time:" + sdf.format(start));
        Iterator<Integer> iterator = linkedList.iterator();
        while(iterator.hasNext()){
            iterator.next();
        }
        end = System.currentTimeMillis();
        System.out.println("LinkedList Traversal end time:" + sdf.format(end));
        System.out.println("LinkedList traversal took " + (end - start) + " ms");

The results show that the traversal performance of the two is now close.
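The iterator loop above can also be written as an enhanced for loop, which uses the list's Iterator behind the scenes and so moves from node to node in the same way. A minimal sketch, with the class name ForEachTraversalSketch and the method sum chosen for illustration:

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical example: the enhanced for loop uses the list's Iterator.
final class ForEachTraversalSketch {
    // Sums all elements by following the iterator, never calling get(i).
    static long sum(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 10; i++) {
            list.add(i);
        }
        System.out.println(sum(list)); // 45
    }
}
```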

Summary

  • Prefer ArrayList as the default List implementation. LinkedList is worth considering when there are many individual insertions and deletions.
  • Traverse a LinkedList with an Iterator rather than with get, especially when the list is large.
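When a method receives a List without knowing its concrete type, one way to follow the second recommendation is to check the java.util.RandomAccess marker interface, which ArrayList implements and LinkedList does not. The sketch below is one possible approach, with the hypothetical class name TraversalChooser.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical helper that picks a traversal style based on the list type.
final class TraversalChooser {
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Indexed access is cheap here (ArrayList and similar lists).
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // Sequential lists such as LinkedList: follow the iterator.
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 1; i <= 100; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        System.out.println(sum(arrayList) + " " + sum(linkedList)); // 5050 5050
    }
}
```

In most code a plain enhanced for loop is enough, since it is efficient for both list types. The explicit check only matters when indexed access is needed for some other reason.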

Posted by will on Wed, 03 Jun 2020 21:46:19 -0700