If you need to implement a counter, experienced developers will quickly think of wrapping AtomicInteger or AtomicLong.
Counter updates involve memory visibility and contention between threads. The Atomic* classes hide these technical details, so we only need to call the corresponding methods to meet the business requirement.
Although the Atomic* classes are easy to use, the cost of these operations is magnified under high concurrency. Let's take a look at the implementation of getAndIncrement:
    public final long getAndIncrement() {
        return unsafe.getAndAddLong(this, valueOffset, 1L);
    }

    // Implementation in the Unsafe class
    public final long getAndAddLong(Object var1, long var2, long var4) {
        long var6;
        do {
            var6 = this.getLongVolatile(var1, var2);
        } while (!this.compareAndSwapLong(var1, var2, var6, var6 + var4));
        return var6;
    }
As the code shows, getAndAddLong guarantees a correct increment by retrying: it reads the current value, attempts a compare-and-swap (CAS), and loops again if another thread changed the value in the meantime. This is effectively a spin loop. Under high concurrency the CPU spends a lot of time on failed attempts; under low concurrency this cost is negligible.
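To make the retry cost visible, here is a minimal sketch that reproduces the same read-then-CAS loop with AtomicLong's public API and counts the attempts that failed. The class CasRetryDemo and its methods are hypothetical names used only for illustration, not JDK code, and the extra failure counter adds some contention of its own.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CasRetryDemo {

    // Hypothetical helper (not JDK code): the same read-then-CAS loop as
    // getAndAddLong, written against AtomicLong's public API, plus a counter
    // for the attempts that lost the race and had to be retried.
    public static long getAndIncrement(AtomicLong value, AtomicLong failedAttempts) {
        while (true) {
            long current = value.get();                      // volatile read
            if (value.compareAndSet(current, current + 1)) { // succeeds only if the value is unchanged
                return current;
            }
            failedAttempts.incrementAndGet();                // lost the race: spin and retry
        }
    }

    // Runs the loop from several threads; returns {final value, failed CAS attempts}.
    public static long[] run(int threads, int incrementsPerThread) throws InterruptedException {
        AtomicLong value = new AtomicLong();
        AtomicLong failedAttempts = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    getAndIncrement(value, failedAttempts);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return new long[] { value.get(), failedAttempts.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] result = run(8, 100_000);
        System.out.println("final value = " + result[0]);
        System.out.println("failed CAS attempts = " + result[1]);
    }
}
```

The final value is always threads × incrementsPerThread, because every increment eventually succeeds. The number of failed attempts varies from run to run and from machine to machine, and it tends to grow as more threads compete for the same variable.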
To address this performance drawback of the Atomic* classes, Doug Lea provided LongAdder. Its internal design is similar in spirit to the segmented locking of ConcurrentHashMap: updates are spread across several cells instead of a single variable. In the best case each thread updates its own cell, which greatly reduces contention.
Next, let's compare the performance of AtomicLong and LongAdder using JMH.
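The idea of spreading updates across cells can be illustrated with a deliberately simplified sketch. This is not how LongAdder is actually implemented. StripedCounterSketch is a hypothetical class with a fixed number of cells chosen by thread id, whereas the real class is more sophisticated (for example, it adds cells only when contention is detected). It only shows why writes to separate cells collide less and why reading requires summing them.

```java
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.concurrent.atomic.LongAdder;

public class StripedCounterSketch {

    // One independent counter per stripe; a fixed size is an assumption of this sketch.
    private final AtomicLongArray cells;

    public StripedCounterSketch(int stripes) {
        this.cells = new AtomicLongArray(stripes);
    }

    // Each thread updates the cell selected by its id, so threads mapped to
    // different cells never compete on the same memory location.
    public void increment() {
        int index = (int) (Thread.currentThread().getId() % cells.length());
        cells.incrementAndGet(index);
    }

    // The total is obtained by adding up all cells.
    public long sum() {
        long total = 0;
        for (int i = 0; i < cells.length(); i++) {
            total += cells.get(i);
        }
        return total;
    }

    // Increments the sketch from several threads and returns the final sum.
    public static long countWith(int threads, int perThread) throws InterruptedException {
        StripedCounterSketch counter = new StripedCounterSketch(16);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("striped sketch sum = " + countWith(8, 100_000));

        // The real class is used the same way: increment() to write, sum() to read.
        LongAdder adder = new LongAdder();
        adder.increment();
        adder.add(41);
        System.out.println("LongAdder sum = " + adder.sum());
    }
}
```

The trade-off is on the read side: the total has to be computed by visiting every cell, so a striped counter suits workloads that write often and read rarely.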
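    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.atomic.LongAdder;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    @BenchmarkMode(Mode.Throughput)
    public class Main {

        private static AtomicLong count = new AtomicLong();
        private static LongAdder longAdder = new LongAdder();

        public static void main(String[] args) throws Exception {
            Options options = new OptionsBuilder()
                    .include(Main.class.getName())
                    .forks(1)
                    .build();
            new Runner(options).run();
        }

        // AtomicLong under test
        @Benchmark
        @Threads(10)
        public void run0() {
            count.getAndIncrement();
        }

        // LongAdder under test
        @Benchmark
        @Threads(10)
        public void run1() {
            longAdder.increment();
        }
    }

1. Set the benchmark mode to Mode.Throughput to measure throughput.
2. Set the benchmark mode to Mode.AverageTime to measure the average time per operation.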
1 thread
1. Throughput
    Benchmark   Mode  Cnt    Score    Error   Units
    Main.run0  thrpt    5  154.525 ±  9.767  ops/us
    Main.run1  thrpt    5   89.599 ±  7.951  ops/us
2. Average time
    Benchmark  Mode  Cnt  Score    Error  Units
    Main.run0  avgt    5  0.007 ±  0.001  us/op
    Main.run1  avgt    5  0.011 ±  0.001  us/op
With a single thread:
AtomicLong is ahead on both throughput (154.5 vs 89.6 ops/us) and average time (0.007 vs 0.011 us/op).
10 threads
1. Throughput
    Benchmark   Mode  Cnt    Score     Error   Units
    Main.run0  thrpt    5   37.780 ±   1.891  ops/us
    Main.run1  thrpt    5  464.927 ± 143.207  ops/us
2. Average time
    Benchmark  Mode  Cnt  Score    Error  Units
    Main.run0  avgt    5  0.290 ±  0.038  us/op
    Main.run1  avgt    5  0.021 ±  0.001  us/op
With 10 concurrent threads:
The throughput of LongAdder is more than 10 times that of AtomicLong (464.9 vs 37.8 ops/us, although the LongAdder measurement has a large error margin). The average time per operation of LongAdder is less than one tenth that of AtomicLong (0.021 vs 0.290 us/op).
30 threads
1. Throughput
    Benchmark   Mode  Cnt    Score    Error   Units
    Main.run0  thrpt    5   36.215 ±  2.341  ops/us
    Main.run1  thrpt    5  486.630 ± 26.894  ops/us
2. Average time
    Benchmark  Mode  Cnt  Score    Error  Units
    Main.run0  avgt    5  0.792 ±  0.021  us/op
    Main.run1  avgt    5  0.063 ±  0.002  us/op
With 30 threads:
The throughput of LongAdder is again more than 10 times that of AtomicLong (486.6 vs 36.2 ops/us).
The average time per operation of LongAdder is also less than one tenth that of AtomicLong (0.063 vs 0.792 us/op).
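Summary
In high-concurrency scenarios where a counter is updated by many threads, such as a rate-limiting counter, it is recommended to replace AtomicLong with LongAdder. In the tests above this raised throughput by more than 10 times with 10 and 30 threads, while AtomicLong remained faster with a single thread.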
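As an illustration of the rate-limiting case, the following is a minimal sketch of a per-window request counter built on LongAdder. The class WindowCounterSketch and its methods are hypothetical, and the sketch assumes that some external timer calls rollWindow() at the end of each window. It is not a complete rate limiter.

```java
import java.util.concurrent.atomic.LongAdder;

public class WindowCounterSketch {

    // Requests seen in the current window; written by many threads.
    private final LongAdder hits = new LongAdder();

    // Called on every request (the hot, highly concurrent path).
    public void record() {
        hits.increment();
    }

    // Approximate check against a limit. sum() is not an atomic snapshot,
    // so increments happening concurrently may or may not be included.
    public boolean isOverLimit(long limit) {
        return hits.sum() > limit;
    }

    // Called once per window (assumed to be driven by an external timer):
    // returns the count for the finished window and starts a new one.
    public long rollWindow() {
        return hits.sumThenReset();
    }

    public static void main(String[] args) {
        WindowCounterSketch counter = new WindowCounterSketch();
        for (int i = 0; i < 5; i++) {
            counter.record();
        }
        System.out.println("over limit of 3? " + counter.isOverLimit(3));
        System.out.println("requests in window: " + counter.rollWindow());
        System.out.println("after reset: " + counter.rollWindow());
    }
}
```

This fits LongAdder well because record() runs far more often than the reads. If the code needs the exact value returned by each individual increment, AtomicLong is still the appropriate choice, since LongAdder's increment() returns nothing.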