Deep Dive into Java Atomic Classes: LongAdder Source Code Analysis


(Tip: on a phone, the source code is easier to read in landscape mode.)


Questions

(1) Why was LongAdder added in Java 8?

(2) How is LongAdder implemented?

(3) How do LongAdder and AtomicLong compare?

Brief introduction

LongAdder is an atomic class added in Java 8. Under multithreaded contention, LongAdder performs much better than AtomicLong, especially when writes dominate.

How does it achieve this? Let's study it together.


Principle

The idea behind LongAdder is to update only the value of base at first, while there is no contention. Once threads start to contend, it switches to a segmented approach: different threads update different segments, and the segments are added together at the end to obtain the complete value stored in the LongAdder.
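As a minimal usage sketch of this idea from the caller's side (the class name LongAdderUsageSketch, the helper countWith, and the thread and iteration counts are illustrative assumptions, not part of the original article), several threads increment one LongAdder and the total is read with sum() once they have all finished:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

class LongAdderUsageSketch {
    // Starts `threads` workers that each call increment() `perThread` times,
    // waits for all of them, and returns the final sum.
    static long countWith(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    adder.increment();   // equivalent to adder.add(1)
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();                // after join(), the worker's updates are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return adder.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 100_000)); // prints 400000
    }
}
```

Whether the updates landed in base or in individual segments is invisible to the caller; sum() adds them all up.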

Source code analysis

LongAdder extends the abstract class Striped64, which defines the inner class Cell and the important fields.

Main inner class

// Inner class of Striped64; the @sun.misc.Contended annotation pads it so that Cell values avoid false sharing
@sun.misc.Contended static final class Cell {
    // Stores the element value; volatile guarantees visibility
    volatile long value;
    Cell(long x) { value = x; }
    // Updates value with CAS
    final boolean cas(long cmp, long val) {
        return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
    }

    // Unsafe instance
    private static final sun.misc.Unsafe UNSAFE;
    // Offset of the value field
    private static final long valueOffset;
    static {
        try {
            UNSAFE = sun.misc.Unsafe.getUnsafe();
            Class<?> ak = Cell.class;
            valueOffset = UNSAFE.objectFieldOffset
                (ak.getDeclaredField("value"));
        } catch (Exception e) {
            throw new Error(e);
        }
    }
}

The Cell class uses the @sun.misc.Contended annotation to avoid false sharing.

Its value is updated with Unsafe CAS operations, and the value field is declared volatile to guarantee visibility.

For an introduction to Unsafe, see [Deep Dive into Java Magic Classes: Unsafe Analysis].

For an introduction to false sharing, see [What is false sharing?].

Main fields

// All three fields are declared in Striped64
// The cells array, which stores the value of each segment
transient volatile Cell[] cells;
// Used at first, while there is no contention; effectively a special segment
transient volatile long base;
// Flags whether a thread is currently creating or expanding cells, or creating a Cell
// It is updated through CAS, so it works like a lock
transient volatile int cellsBusy;

base is used for updates while there has been no contention yet, or while another thread is creating the cells array. Once contention has occurred, cells is used for updates.

"No contention yet" does not mean single-threaded. There may be several threads, as long as they do not update base at the same moment.

"Contention has occurred" means that once any contention has been seen, cells is used from then on, whether or not contention continues. The rule is that different threads hash to different cells, which reduces contention.
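The lock-like role of cellsBusy mentioned in the field comments can be sketched in isolation. This is a simplified illustration and not the Striped64 code: the class CellsBusySketch, the AtomicInteger standing in for the Unsafe CAS, and the plain long[] table are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

class CellsBusySketch {
    // 0 = free, 1 = held; plays the role of cellsBusy (hypothetical stand-in)
    private final AtomicInteger busy = new AtomicInteger();
    // Plays the role of the cells array; starts at length 2
    private volatile long[] table = new long[2];

    // Tries once to double the table. Returns false without blocking
    // if another thread holds the flag or has already replaced the table.
    boolean tryGrow() {
        long[] seen = table;
        // Cheap volatile read first, then the CAS that actually takes the lock
        if (busy.get() == 0 && busy.compareAndSet(0, 1)) {
            try {
                if (table == seen) {                 // recheck under the lock
                    table = Arrays.copyOf(seen, seen.length << 1);
                    return true;
                }
            } finally {
                busy.set(0);                         // release the lock
            }
        }
        return false;
    }

    int length() {
        return table.length;
    }

    public static void main(String[] args) {
        CellsBusySketch sketch = new CellsBusySketch();
        System.out.println(sketch.tryGrow() + " " + sketch.length()); // prints true 4
    }
}
```

A thread that fails to take the flag does not wait. It simply moves on and tries something else, which is the same non-blocking style the field comments describe.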

add(x) method

add(x) is the main method of LongAdder. It adds x to the value stored in the LongAdder; x may be positive or negative.

public void add(long x) {
    // as is the cells field in Striped64
    // b is the base field in Striped64
    // v is the value stored in the Cell that the current thread hashes to
    // m is the length of cells minus 1, used as the hash mask
    // a is the Cell that the current thread hashes to
    Cell[] as; long b, v; int m; Cell a;
    // Condition 1: cells is not null, meaning contention has occurred before and cells has been created
    // Condition 2: the CAS on base failed, meaning another thread modified base first and there is contention now
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        // true means contention is not intense
        // false means contention is intense: several threads hash to the same Cell, and expansion may be needed
        boolean uncontended = true;
        // Condition 1: cells is null, meaning we arrived here through condition 2 above (contention on base)
        // Condition 2: should not occur
        // Condition 3: the Cell the current thread hashes to is null, meaning this thread has not updated a Cell yet, so one should be initialized
        // Condition 4: the CAS on the current thread's Cell failed, meaning contention is intense; several threads hash to the same Cell, and expansion may be needed
        if (as == null || (m = as.length - 1) < 0 ||
            // getProbe() returns the threadLocalRandomProbe field of the thread
            // It is generated from a random number and stays fixed for a given thread
            // unless it is deliberately modified
            (a = as[getProbe() & m]) == null ||
            !(uncontended = a.cas(v = a.value, v + x)))
            // Delegate to the method in Striped64
            longAccumulate(x, null, uncontended);
    }
}

(1) At first, while there is no contention, only base is updated;

(2) The cells array is created only after an update to base fails;

(3) When several threads contend heavily for the same Cell, the array may need to be expanded.
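The base-first, cells-on-contention pattern can be illustrated with a much simplified sketch. It is not the LongAdder implementation: the class StripedCounterSketch is hypothetical, the stripe array has a fixed length of 8 instead of being created lazily and expanded, AtomicLongArray elements are not padded against false sharing, and the per-thread probe is an assumed ThreadLocal rather than the threadLocalRandomProbe field.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounterSketch {
    // Per-thread hash value, fixed once it has been generated (assumption for the sketch)
    private static final ThreadLocal<Integer> PROBE =
        ThreadLocal.withInitial(() -> ThreadLocalRandom.current().nextInt());

    private final AtomicLong base = new AtomicLong();
    // Fixed power-of-two number of stripes so that (length - 1) works as a mask
    private final AtomicLongArray cells = new AtomicLongArray(8);

    void add(long x) {
        long b = base.get();
        // Fast path: one CAS on base, which succeeds when no other thread interferes
        if (!base.compareAndSet(b, b + x)) {
            // Slow path: base is contended, so update this thread's stripe instead
            int index = PROBE.get() & (cells.length() - 1);
            cells.addAndGet(index, x);
        }
    }

    long sum() {
        long sum = base.get();
        for (int i = 0; i < cells.length(); i++) {
            sum += cells.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        StripedCounterSketch counter = new StripedCounterSketch();
        counter.add(5);
        counter.add(-2);
        System.out.println(counter.sum()); // prints 3
    }
}
```

One difference worth noting: the real add(x) goes straight to the cells once they exist, whereas this sketch retries base first on every call.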

longAccumulate() method

final void longAccumulate(long x, LongBinaryOperator fn,
                          boolean wasUncontended) {
    // Holds the thread's probe value
    int h;
    // If getProbe() returns 0, the thread's random number has not been initialized
    if ((h = getProbe()) == 0) {
        // Force initialization
        ThreadLocalRandom.current(); // force initialization
        // Read the probe value again
        h = getProbe();
        // It was not initialized, so this thread cannot have contended yet
        wasUncontended = true;
    }
    // Whether a collision has occurred
    boolean collide = false;                // True if last slot nonempty
    for (;;) {
        Cell[] as; Cell a; int n; long v;
        // cells has been initialized
        if ((as = cells) != null && (n = as.length) > 0) {
            // The Cell the current thread hashes to has not been initialized
            if ((a = as[(n - 1) & h]) == null) {
                // No thread is currently creating or expanding cells, or creating a Cell
                if (cellsBusy == 0) {       // Try to attach new Cell
                    // Create a new Cell holding the value to be added
                    Cell r = new Cell(x);   // Optimistically create
                    // Check cellsBusy again and try to set it to 1
                    // This is equivalent to the current thread taking a lock
                    if (cellsBusy == 0 && casCellsBusy()) {
                        // Whether the Cell was installed successfully
                        boolean created = false;
                        try {               // Recheck under lock
                            Cell[] rs; int m, j;
                            // Re-read cells and locate the slot the current thread hashes to
                            // Re-reading is important because as was read without the lock
                            // and the array may have been expanded in the meantime
                            if ((rs = cells) != null &&
                                (m = rs.length) > 0 &&
                                rs[j = (m - 1) & h] == null) {
                                // Put the new Cell at index j of cells
                                rs[j] = r;
                                // Installed successfully
                                created = true;
                            }
                        } finally {
                            // Equivalent to releasing the lock
                            cellsBusy = 0;
                        }
                        // Installed successfully, so return
                        // The value is already stored in the new Cell
                        if (created)
                            break;
                        continue;           // Slot is now non-empty
                    }
                }
                // Mark that there is currently no collision
                collide = false;
            }
            // The current thread's Cell is not null and the earlier update failed
            // Simply set the flag to true, which amounts to spinning once
            // The statement at the end of the loop changes the probe before retrying
            else if (!wasUncontended)       // CAS already known to fail
                wasUncontended = true;      // Continue after rehash
            // Try the CAS on the current thread's Cell again, and return if it succeeds
            else if (a.cas(v = a.value, ((fn == null) ? v + x :
                                         fn.applyAsLong(v, x))))
                break;
            // If the cells length has reached the number of CPU cores, or cells has been expanded by another thread,
            // set collide to false; the statement at the end of the loop changes the probe before retrying
            else if (n >= NCPU || cells != as)
                collide = false;            // At max size or stale
            // The CAS above failed and the previous condition did not hold, so there is a collision
            else if (!collide)
                collide = true;
            // The collision is confirmed: try to take the lock and expand
            else if (cellsBusy == 0 && casCellsBusy()) {
                try {
                    // Check whether another thread has already expanded the array
                    if (cells == as) {      // Expand table unless stale
                        // The new array is twice as long as the old one
                        Cell[] rs = new Cell[n << 1];
                        // Copy the old elements into the new array
                        for (int i = 0; i < n; ++i)
                            rs[i] = as[i];
                        // Point cells at the new array
                        cells = rs;
                    }
                } finally {
                    // Release the lock
                    cellsBusy = 0;
                }
                // The collision has been resolved
                collide = false;
                // Retry with the expanded array
                continue;                   // Retry with expanded table
            }
            // The update failed or the CPU core limit was reached: regenerate the probe and retry
            h = advanceProbe(h);
        }
        // cells is not initialized: try to take the lock and initialize the array
        else if (cellsBusy == 0 && cells == as && casCellsBusy()) {
            // Whether initialization succeeded
            boolean init = false;
            try {                           // Initialize table
                // Check whether another thread has already initialized it
                if (cells == as) {
                    // Create a new Cell array of length 2
                    Cell[] rs = new Cell[2];
                    // Find the slot the current thread hashes to and create its Cell
                    rs[h & 1] = new Cell(x);
                    // Assign the array to cells
                    cells = rs;
                    // Initialization succeeded
                    init = true;
                }
            } finally {
                // Release the lock
                cellsBusy = 0;
            }
            // Initialization succeeded, so return directly
            // The value was stored in the Cell when it was created
            if (init)
                break;
        }
        // Another thread is initializing the cells array, so try to update base instead
        // Return if that succeeds
        else if (casBase(v = base, ((fn == null) ? v + x :
                                    fn.applyAsLong(v, x))))
            break;                          // Fall back on using base
    }
}

(1) If the cells array is not initialized, the current thread tries to take the cellsBusy lock and create the array;

(2) If the current thread tries to create the cells array and finds that another thread is already creating it, it tries to update base instead and returns if that succeeds;

(3) The thread's probe value determines which Cell in the cells array the thread should update;

(4) If the Cell the current thread hashes to is not initialized, the thread takes the cellsBusy lock and creates a Cell at that position;

(5) The thread tries a CAS on its Cell and returns if it succeeds. A failure indicates a collision;

(6) When the CAS on the Cell fails, the array is not expanded immediately. The thread first changes its probe value and retries;

(7) If the update fails again on the retry, the array is expanded;

(8) To expand, the current thread takes the cellsBusy lock, doubles the array length, and copies the elements of the old cells array into the new one;

(9) cellsBusy is used in three places: creating the cells array, creating a Cell, and expanding the cells array.
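The probe handling in these steps can be shown on plain ints. The sketch below copies the three xorshift steps that advanceProbe applies in the JDK 8 Striped64 source, but leaves out writing the result back to the Thread object through Unsafe; the class ProbeSketch and the helper slot are hypothetical names for the example.

```java
class ProbeSketch {
    // The xorshift steps used by Striped64.advanceProbe (JDK 8),
    // without storing the result back into the current Thread.
    static int advanceProbe(int probe) {
        probe ^= probe << 13;
        probe ^= probe >>> 17;
        probe ^= probe << 5;
        return probe;
    }

    // Index of the Cell for a probe; tableLength must be a power of two
    // so that (tableLength - 1) works as a bit mask.
    static int slot(int probe, int tableLength) {
        return probe & (tableLength - 1);
    }

    public static void main(String[] args) {
        int probe = 1;
        System.out.println(slot(probe, 8));   // prints 1
        probe = advanceProbe(probe);
        System.out.println(probe);            // prints 270369
        System.out.println(slot(probe, 8));   // prints 1
    }
}
```

Two details follow from the arithmetic. A probe of 0 stays 0 under xorshift, which is why 0 can serve as the "not initialized" marker in longAccumulate. And a rehash does not guarantee a different slot, as the last line of the example shows, so the thread may need more than one retry before it lands on an uncontended Cell.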

sum() method

The sum() method returns the value actually stored in the LongAdder by adding base and all the segments together.

public long sum() {
    Cell[] as = cells; Cell a;
    // sum starts out equal to base
    long sum = base;
    // If cells is not null
    if (as != null) {
        // Iterate over all Cells
        for (int i = 0; i < as.length; ++i) {
            // If the Cell is not null, add its value to sum
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    // Return sum
    return sum;
}

As you can see, sum() adds base and the values of all the segments. This raises a question: if a Cell is modified after its value has already been added into sum, is that update missed?

Yes, it is. LongAdder is therefore not strongly consistent; it is eventually consistent.
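In practice this means sum() is exact only when no updates are running concurrently, for example after all writer threads have been joined. The short single-threaded sketch below (the class SumSnapshotSketch and the helper demo are illustrative names) shows sum() together with sumThenReset(), which reads the total and resets the adder and is likewise only reliable when no other thread is updating:

```java
import java.util.Arrays;
import java.util.concurrent.atomic.LongAdder;

class SumSnapshotSketch {
    // Returns { sum after two adds, value returned by sumThenReset, sum after the reset }
    static long[] demo() {
        LongAdder adder = new LongAdder();
        adder.add(5);
        adder.add(-2);
        long first = adder.sum();              // no concurrent writers, so this is exact
        long drained = adder.sumThenReset();   // returns the total and resets to zero
        long after = adder.sum();
        return new long[] { first, drained, after };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [3, 3, 0]
    }
}
```

If an exact value is needed while writers are still active, AtomicLong is the safer choice, since a single get() returns one atomically updated value.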

LongAdder VS AtomicLong

Straight to the code:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class LongAdderVSAtomicLongTest {
    public static void main(String[] args){
        testAtomicLongVSLongAdder(1, 10000000);
        testAtomicLongVSLongAdder(10, 10000000);
        testAtomicLongVSLongAdder(20, 10000000);
        testAtomicLongVSLongAdder(40, 10000000);
        testAtomicLongVSLongAdder(80, 10000000);
    }

    // Times LongAdder and then AtomicLong with the same thread count and iterations per thread
    static void testAtomicLongVSLongAdder(final int threadCount, final int times){
        try {
            System.out.println("threadCount: " + threadCount + ", times: " + times);
            long start = System.currentTimeMillis();
            testLongAdder(threadCount, times);
            System.out.println("LongAdder elapse: " + (System.currentTimeMillis() - start) + "ms");

            long start2 = System.currentTimeMillis();
            testAtomicLong(threadCount, times);
            System.out.println("AtomicLong elapse: " + (System.currentTimeMillis() - start2) + "ms");
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    // Each of threadCount threads increments a shared AtomicLong `times` times
    static void testAtomicLong(final int threadCount, final int times) throws InterruptedException {
        AtomicLong atomicLong = new AtomicLong();
        List<Thread> list = new ArrayList<>();
        for (int i=0;i<threadCount;i++){
            list.add(new Thread(() -> {
                for (int j = 0; j<times; j++){
                    atomicLong.incrementAndGet();
                }
            }));
        }

        for (Thread thread : list){
            thread.start();
        }

        for (Thread thread : list){
            thread.join();
        }
    }

    // Each of threadCount threads adds 1 to a shared LongAdder `times` times
    static void testLongAdder(final int threadCount, final int times) throws InterruptedException {
        LongAdder longAdder = new LongAdder();
        List<Thread> list = new ArrayList<>();
        for (int i=0;i<threadCount;i++){
            list.add(new Thread(() -> {
                for (int j = 0; j<times; j++){
                    longAdder.add(1);
                }
            }));
        }

        for (Thread thread : list){
            thread.start();
        }

        for (Thread thread : list){
            thread.join();
        }
    }
}

The results are as follows:

threadCount: 1, times: 10000000
LongAdder elapse: 158ms
AtomicLong elapse: 64ms
threadCount: 10, times: 10000000
LongAdder elapse: 206ms
AtomicLong elapse: 2449ms
threadCount: 20, times: 10000000
LongAdder elapse: 429ms
AtomicLong elapse: 5142ms
threadCount: 40, times: 10000000
LongAdder elapse: 840ms
AtomicLong elapse: 10506ms
threadCount: 80, times: 10000000
LongAdder elapse: 1369ms
AtomicLong elapse: 20482ms

You can see that with only one thread, AtomicLong is faster. As the number of threads grows, AtomicLong's elapsed time rises sharply, while LongAdder is affected far less: at 80 threads, AtomicLong takes roughly 15 times as long as LongAdder (20482 ms versus 1369 ms).


Summary

(1) LongAdder stores its value in base and the cells array;

(2) Different threads hash to different cells to perform their updates, which reduces contention;

(3) LongAdder performs very well, and it eventually settles into a nearly contention-free state.


Easter egg

In the longAccumulate() method there is a condition, n >= NCPU, under which the expansion logic is skipped, and n is always a power of 2. Does this mean that the cells array can grow at most to the smallest power of 2 that is greater than or equal to NCPU?

The answer is yes. A single CPU core runs only one thread at a time, so a failed update means that two different cores tried to update the same Cell. The probe value of the thread whose update failed is then regenerated, so the next time it will very likely land on a different Cell. If the program runs long enough, the threads running on the same core will eventually tend to hash to the same Cell (with high probability, though not necessarily all on one Cell). The cells array therefore does not need to be very long; reaching the number of CPU cores is enough.

For example, the author's machine has 8 cores, so the cells array grows to a length of at most 8 and is not expanded beyond that.
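The bound can be checked with a few lines of arithmetic. The sketch below replays only the growth rule described above (start at length 2, double while n < NCPU); the class MaxCellsSketch and the method maxCellsLength are hypothetical names for the example, and the core counts other than 8 are made-up inputs.

```java
class MaxCellsSketch {
    // Largest cells length reachable for a given core count:
    // the array starts at 2 and doubles only while its length is below ncpu.
    static int maxCellsLength(int ncpu) {
        int n = 2;
        while (n < ncpu) {
            n <<= 1;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(maxCellsLength(8));  // prints 8
        System.out.println(maxCellsLength(6));  // prints 8
        System.out.println(maxCellsLength(12)); // prints 16
        System.out.println(maxCellsLength(1));  // prints 2
    }
}
```

On a machine whose core count is not a power of 2, the array can therefore end up slightly longer than the number of cores, and even a single-core machine gets the initial length of 2 once contention on base has occurred.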


Posted by cneale on Wed, 09 Oct 2019 04:55:17 -0700