Multithreaded learning

Keywords: Java Back-end Multithreading

1, Java Memory Model
The Java Memory Model (JMM) defines the abstract concepts of main memory and working memory. Underneath, these correspond to CPU registers, caches, hardware memory, CPU instruction optimizations, and so on.
The JMM is concerned with three properties:
1. Atomicity - an operation is not affected by thread context switching;
2. Visibility - an operation is not affected by CPU caches;
3. Ordering - an operation is not affected by the CPU's parallel optimization (reordering) of instructions;
Note: synchronized guarantees both atomicity and visibility, but it is a comparatively heavyweight operation with lower performance;
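
To make the atomicity point concrete, here is a minimal sketch (the class and method names are made up for illustration) in which two threads increment a shared counter. Because both the increment and the read go through synchronized methods on the same object, no update is lost and the final value is exact:

```java
class SyncCounterDemo {
    private int count = 0;               // shared state, guarded by the lock on "this"

    synchronized void increment() {      // the read-modify-write runs as one atomic step
        count++;
    }

    synchronized int get() {             // same lock, so the latest value is visible
        return count;
    }

    // Runs two threads that each increment the counter perThread times.
    static int run(int perThread) {
        SyncCounterDemo counter = new SyncCounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 200000
    }
}
```

If synchronized is removed from increment, count++ is no longer atomic and the result will usually be lower than 200000.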

2, Balking (a synchronization pattern)
The balking pattern applies when a thread finds that another thread, or the thread itself, has already done a piece of work. The thread then has no need to do it again and simply returns.
For example:

public class MonitorService{
    // Indicates whether the monitoring thread has already been started
    private volatile boolean starting;
    
    public void start(){
        System.out.println("Attempt to start monitoring thread...");
        synchronized(this){
            // Balk: someone has already started it, so return immediately
            if(starting){
                return;
            }
            starting = true;
        }
        // Actually start the monitoring thread
    }
}

3, volatile
Use:
When reading a shared variable, declare it volatile to guarantee its visibility. volatile can be applied to instance fields and static fields. It prevents a thread from reading the variable's value from its own working cache: the value must be fetched from main memory, and in the JMM abstraction threads operate on volatile variables directly in main memory. As a result, a modification of a volatile variable by one thread is visible to other threads.
Note:
volatile only guarantees the visibility of a shared variable, so that other threads see the latest value. It does not solve the problem of instruction interleaving (atomicity is not guaranteed);
CAS relies on volatile to read the latest value of the shared variable in order to achieve the effect of "compare and swap".
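
A typical use of volatile is a stop flag. The following is a small sketch (names are hypothetical) in which the main thread clears a volatile flag and a spinning worker observes the change and exits:

```java
class VolatileFlagDemo {
    // Without volatile, the worker could keep using a stale value and might never stop.
    private static volatile boolean running = true;

    // Starts a spinning worker, clears the flag, and reports whether the worker stopped.
    static boolean stopWorker() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy work; every iteration re-reads the volatile flag
            }
        });
        worker.start();
        try {
            Thread.sleep(100);      // let the worker spin for a while
            running = false;        // volatile write, visible to the worker's next read
            worker.join(5_000);     // wait up to 5 seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopWorker());
    }
}
```

Note that the flag is written by only one thread here. If several threads did running = !running or count++ on a volatile field, updates could still be lost, because volatile does not make a read-modify-write sequence atomic.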

Principle:
volatile is implemented with memory barriers.
1. A write barrier is added after a write to a volatile variable (the write barrier ensures that changes to shared variables made before the barrier are synchronized to main memory);
2. A read barrier is added before a read of a volatile variable (the read barrier ensures that reads of shared variables after the barrier load the latest data from main memory);
Happens-before principle
Happens-before specifies when a write to a shared variable is visible to reads by other threads. It is a summary of a set of rules about visibility and ordering. Outside the happens-before rules, the JMM does not guarantee that a write to a shared variable by one thread is visible to reads of that variable by other threads.
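
As an illustration of two of these rules, the sketch below (hypothetical names) publishes a plain field through a volatile flag. The write to data comes before the volatile write to ready, and the reader's volatile read of ready comes before its read of data, so the reader is guaranteed to see 42. Thread.start() and Thread.join() provide the remaining happens-before edges:

```java
class HappensBeforeDemo {
    private static int data = 0;                   // plain field, not volatile
    private static volatile boolean ready = false; // volatile flag used for publication

    static int publishAndRead() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield();                    // wait for the volatile write
            }
            // The volatile read of ready saw the volatile write, so the
            // earlier ordinary write to data is guaranteed to be visible here.
            seen[0] = data;
        });
        reader.start();   // everything written before start() is visible to the reader
        data = 42;        // 1. ordinary write
        ready = true;     // 2. volatile write publishes everything before it
        try {
            reader.join(); // the reader's writes are visible after join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```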

4, CAS
Underneath, CAS is the lock cmpxchg instruction (on the x86 architecture), which guarantees the atomicity of compare-and-swap on both single-core and multi-core CPUs. On a multi-core CPU, when a core executes an instruction with the lock prefix, the CPU locks the bus (on modern processors usually only the affected cache line) and releases it when the instruction completes. The instruction cannot be interrupted by thread scheduling, which keeps memory operations by multiple threads correct. It is atomic.
Characteristics of CAS
Combining CAS with volatile enables lock-free concurrency, which suits scenarios with few threads and a multi-core CPU.
1. CAS is based on the idea of optimistic locking: it assumes the best case, in which other threads do not modify the shared variable. If one does, it does not matter: just retry.
2. synchronized is based on the idea of pessimistic locking: it assumes the worst case and prevents other threads from modifying the shared variable. While the lock is held, other threads cannot access the variable.
3. CAS embodies lock-free, non-blocking concurrency. Because synchronized is not used, threads never block, which is one reason for its efficiency. However, under heavy contention retries become frequent and efficiency suffers.
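
The optimistic retry loop can be sketched with AtomicInteger.compareAndSet. The account class below is a made-up example, not code from the JDK: each thread reads the current balance, computes the new one locally, and only publishes it if no other thread changed the balance in the meantime.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasRetryDemo {
    private final AtomicInteger balance;

    CasRetryDemo(int initial) {
        this.balance = new AtomicInteger(initial);
    }

    // Lock-free withdrawal: read, compute, then publish only if nobody interfered.
    void withdraw(int amount) {
        while (true) {
            int prev = balance.get();                 // volatile read of the latest value
            int next = prev - amount;                 // computed on a local copy
            if (balance.compareAndSet(prev, next)) {
                return;                               // success: the value was still prev
            }
            // failure: another thread changed the value first, so loop and retry
        }
    }

    int current() {
        return balance.get();
    }

    // Starts the given number of threads; each withdraws amount once.
    static int run(int threads, int initial, int amount) {
        CasRetryDemo account = new CasRetryDemo(initial);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> account.withdraw(amount));
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.current();
    }

    public static void main(String[] args) {
        System.out.println(run(100, 1_000, 10)); // prints 0
    }
}
```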

5, Common atomic operations
1. Atomic array
AtomicIntegerArray, AtomicLongArray, AtomicReferenceArray

2. Field updater
AtomicReferenceFieldUpdater, AtomicIntegerFieldUpdater, AtomicLongFieldUpdater

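import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class Test19 {
    public static void main(String[] args) {
        Student student = new Student();
        // The updated field must be volatile and accessible to the caller
        AtomicReferenceFieldUpdater<Student, String> updater =
                AtomicReferenceFieldUpdater.newUpdater(Student.class, String.class, "name");

        // Set name to "Zhang San" only if it is still null
        updater.compareAndSet(student, null, "Zhang San");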

        System.out.println(student);
    }
}
class Student{
    volatile String name;

    @Override
    public String toString() {
        return "Student{" +
                "name='" + name + '\'' +
                '}';
    }
}

3. Atomic accumulator (LongAdder)
AtomicLong compared with LongAdder:
When performing 500,000 accumulations, LongAdder was four to five times faster than AtomicLong. The reason for the improvement is simple. When there is contention, LongAdder sets up multiple accumulation cells (no more than the number of CPU cores): Thread-0 accumulates into Cell[0], Thread-1 into Cell[1], and so on, and the results are summed at the end. Because the threads operate on different Cell variables while accumulating, fewer CAS attempts fail and retry, which improves performance.

Several key fields in the LongAdder class:

// Accumulation cell array, lazy initialization
transient volatile Cell[] cells;

// Base value. If there is no contention, CAS is used to accumulate into this field
transient volatile long base;

// Set to 1 while cells is being created or resized, indicating that it is locked
transient volatile int cellsBusy;
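
A rough way to try the comparison yourself is sketched below. The thread count and iteration count are arbitrary assumptions, and this is not a rigorous benchmark (no warm-up, single run), so the measured ratio will vary by machine and JVM. What is guaranteed is that both counters end up with the same total.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class AdderCompareDemo {
    // Runs the increment action perThread times on each thread; returns elapsed milliseconds.
    static long time(int threads, int perThread, Runnable increment) {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    increment.run();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();
        long atomicMs = time(4, 500_000, atomic::incrementAndGet); // every thread CASes one field
        long adderMs = time(4, 500_000, adder::increment);         // threads spread over base and cells
        System.out.println("AtomicLong: " + atomic.get() + " in " + atomicMs + " ms");
        System.out.println("LongAdder:  " + adder.sum() + " in " + adderMs + " ms");
    }
}
```

LongAdder.sum() adds base and all cells, so it should be called after the writers have finished if an exact total is needed.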

6, Implementing a CAS lock

import java.util.concurrent.atomic.AtomicInteger;

/**
 * A lock implemented with CAS. Do not use this approach in production!
 * Waiting threads spin in a busy loop, which wastes CPU time and easily causes problems.
 * In production, this kind of locking is done at the lower levels of the JVM
 */
public class Test20 {

    /**
     * 0: No lock
     * 1: Lock
     */
    private AtomicInteger state = new AtomicInteger(0);

    /**
     * Lock
     */
    public void lock(){
        while(true){
            if(state.compareAndSet(0, 1)){
                break;
            }
        }
    }

    /**
     * Unlock
     */
    public void unLock(){
        state.set(0);
    }
}
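
For completeness, here is a sketch of how such a lock might be used to guard a plain counter. It repeats the same lock logic in a self-contained class (the names are hypothetical), releases the lock in a finally block, and calls Thread.yield() while spinning, which is my own addition and not part of the listing above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SpinLockDemo {
    // 0: unlocked, 1: locked (same convention as the lock above)
    private final AtomicInteger state = new AtomicInteger(0);
    private int counter = 0;            // plain field, protected by the spin lock

    void lock() {
        while (!state.compareAndSet(0, 1)) {
            Thread.yield();             // give other threads a chance while spinning
        }
    }

    void unlock() {
        state.set(0);
    }

    // Each thread increments the shared counter perThread times under the lock.
    static int run(int threads, int perThread) {
        SpinLockDemo demo = new SpinLockDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    demo.lock();
                    try {
                        demo.counter++; // critical section
                    } finally {
                        demo.unlock();  // always release, even if the body throws
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.counter;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 50_000)); // prints 200000
    }
}
```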

7, False sharing (pseudo sharing)
Source code of the Cell class used by LongAdder:

@sun.misc.Contended
static final class Cell{
	volatile long value;
	Cell(long x){
		value= x;
	}
	
	// The most important method is used for cas accumulation. prev represents the old value and next represents the new value
	final boolean cas(long prev, long next){
		return UNSAFE.compareAndSwapLong(this, valueOffset, prev, next);
	}
	...
}

1. Because cells is an array, its elements are stored contiguously in memory. One Cell occupies 24 bytes (a 16-byte object header plus an 8-byte value), so a single cache line can hold two Cell objects. This creates a problem: Core-0 needs to modify Cell[0] and Core-1 needs to modify Cell[1], and whichever succeeds invalidates the other core's copy of the cache line. For example, suppose both cores have cached Cell[0] = 6000 and Cell[1] = 8000. When Core-0 accumulates Cell[0] to 6001 (Cell[1] is still 8000), the cache line held by Core-1 becomes invalid.
2. @sun.misc.Contended is used to solve this problem. It adds 128 bytes of padding before and after the annotated object or field, so that when the CPU prefetches objects into the cache they occupy different cache lines and do not invalidate each other's cache lines.
3. False sharing (pseudo sharing): one cache line holds multiple Cell objects. @sun.misc.Contended is used to prevent false sharing.
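
The effect can be approximated without the annotation by choosing how far apart two hot counters sit in memory. The sketch below is only an illustration under the assumption of 64-byte cache lines: two threads increment two slots of one AtomicLongArray, first adjacent slots (8 bytes apart, so very likely on the same line) and then slots 16 elements apart (128 bytes, so on different lines). The class name and the numbers are made up, and the timings are a rough indication, not a benchmark.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class FalseSharingDemo {
    // Two threads increment slots i and j of the same array; returns elapsed milliseconds.
    static long run(AtomicLongArray slots, int i, int j, int iterations) {
        Thread t1 = new Thread(() -> {
            for (int n = 0; n < iterations; n++) {
                slots.incrementAndGet(i);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int n = 0; n < iterations; n++) {
                slots.incrementAndGet(j);
            }
        });
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int iterations = 5_000_000;
        // Adjacent slots: the two cores keep invalidating each other's cache line.
        long adjacentMs = run(new AtomicLongArray(32), 0, 1, iterations);
        // Slots 128 bytes apart: each core works on its own cache line.
        long spreadMs = run(new AtomicLongArray(32), 0, 16, iterations);
        System.out.println("adjacent slots: " + adjacentMs + " ms");
        System.out.println("spread slots:   " + spreadMs + " ms");
    }
}
```

In both cases each thread writes only its own slot, so the final counts are correct either way. Only the speed differs.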

Causes of false sharing:
1. Because CPU and memory speeds differ greatly, data is read into caches to improve efficiency.
2. The cache is organized in units of cache lines. Each cache line corresponds to a block of memory, usually 64 bytes (8 longs).
3. Caching creates copies of data: the same data can be cached in the cache lines of different cores.
4. The CPU must keep data consistent. If a thread on one CPU core changes data, the entire corresponding cache line on the other CPU cores must be invalidated.

8, Immutable class
The fields of an immutable class are declared final and its objects cannot change after construction, so they are thread safe. Take the String class as an example:

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
   
    private final char value[];

    private int hash; // Default to 0
	...
}

Note that the value field, which holds the string's content, is final (hash is not final, but it only caches the lazily computed hash code):
1. The field is declared final, so it is read-only and cannot be reassigned;
2. The class is declared final, so its methods cannot be overridden, which prevents a subclass from inadvertently breaking immutability;
3. Defensive copying is used: a method that appears to modify the object does not change it but creates a new object instead. This keeps modification-related methods, such as substring, thread safe;

    public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }

As shown, substring internally calls a String constructor to create a new String. Looking inside that constructor shows whether the final char[] value has been modified:

   public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

It has not. When a new String object is constructed, a new char[] value is created and the content is copied into it. This technique of avoiding sharing by creating copies of objects is called defensive copying;
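
The same technique can be applied to your own classes. The following is a small sketch (the Scores class is invented for illustration): the constructor copies the incoming array, and the "modifying" method returns a new object instead of touching the existing one.

```java
import java.util.Arrays;

final class Scores {
    private final int[] values;

    Scores(int[] values) {
        // Copy on the way in: later changes to the caller's array do not affect us
        this.values = Arrays.copyOf(values, values.length);
    }

    int get(int index) {
        return values[index];
    }

    // "Modification" creates a new object; this object is left untouched
    Scores withValue(int index, int value) {
        int[] copy = Arrays.copyOf(values, values.length);
        copy[index] = value;
        return new Scores(copy);
    }

    public static void main(String[] args) {
        int[] raw = {1, 2, 3};
        Scores original = new Scores(raw);
        raw[0] = 99;                                   // mutate the caller's array
        Scores changed = original.withValue(1, 20);
        System.out.println(original.get(0));           // prints 1
        System.out.println(original.get(1));           // prints 2
        System.out.println(changed.get(1));            // prints 20
    }
}
```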

9, Flyweight pattern (sharing element mode)
Applicable scenario: when a limited number of objects of the same type need to be reused.
Example: wrapper classes
In the JDK, wrapper classes such as Boolean, Byte, Short, Integer, Long, and Character provide valueOf methods. For example, Long.valueOf caches the Long objects between -128 and 127 and reuses them within this range. Outside this range, a new Long object is created:

public static Long valueOf(long l) {
        final int offset = 128;
        if (l >= -128 && l <= 127) { // will cache
            return LongCache.cache[(int)l + offset];
        }
        return new Long(l);
}

private static class LongCache {
        private LongCache(){}

        static final Long cache[] = new Long[-(-128) + 127 + 1];

        static {
            for(int i = 0; i < cache.length; i++)
                cache[i] = new Long(i - 128);
        }
}

Note:
1. The cache range of Byte, Short and Long is -128 ~ 127;
2. The cache range of Character is 0 ~ 127;
3. The default cache range of Integer is -128 ~ 127. The minimum cannot be changed, but the maximum can be changed with the virtual machine parameter -Djava.lang.Integer.IntegerCache.high;
4. Boolean caches TRUE and FALSE;
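
A quick way to observe the cache is to compare references with ==. In the sketch below (class and method names are made up), values inside the cached range yield the same object on every call. For values outside the range, the implementation shown above creates a new object each time, so == should not be relied on and equals should be used instead.

```java
class WrapperCacheDemo {
    // True if two valueOf calls for the same number return the very same object.
    static boolean sameObject(long v) {
        return Long.valueOf(v) == Long.valueOf(v);
    }

    public static void main(String[] args) {
        System.out.println(sameObject(127L));   // prints true: served from the cache
        System.out.println(sameObject(-128L));  // prints true: served from the cache
        System.out.println(sameObject(128L));   // outside the cached range: do not rely on ==

        Long c = Long.valueOf(128L);
        Long d = Long.valueOf(128L);
        System.out.println(c.equals(d));        // prints true: equals compares the numeric value
    }
}
```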

10, How final works
The mechanism behind final can be divided into two parts:
1. How a final variable is set:

public class TestFinal{
	final int a = 20;
}

A final field is assigned with the putfield instruction and, as with volatile, a write barrier is added after this instruction so that other threads cannot read the field as 0;
2. How a final variable is read:
When another class reads a final constant, a small value is copied directly into the reading class's code, and a larger value is placed in the reading class's constant pool. An ordinary variable has to be read from the heap, so reading a final constant is more efficient than reading an ordinary variable;

11, Stateless
Classes without any member variables are thread safe. The data stored in member variables is also called state information, so a class without member variables is called stateless.

12, Implementing an atomic class

import java.lang.reflect.Field;
import sun.misc.Unsafe;

/**
 * A hand-written atomic integer class
 */
class MyAtomicInteger {
    // The value to protect
    private volatile int value;
    // Offset of the value field within the object
    private static final long valueOffset;
    // The Unsafe instance used for CAS
    static final Unsafe UNSAFE;

    static {
        UNSAFE = UnsafeAccessor.getUnsafe();
        try {
            valueOffset = UNSAFE.objectFieldOffset(MyAtomicInteger.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }

    public MyAtomicInteger(int value) {
        this.value = value;
    }

    public int getValue(){
        return value;
    }

    // Atomic subtraction
    public void decrement(int amount){
        while(true){
            int prev = this.value;
            int next = prev - amount;
            if(UNSAFE.compareAndSwapInt(this, valueOffset, prev, next)){
                break;
            }
        }

    }

}

class UnsafeAccessor {
    private static final Unsafe unsafe;
    
    static {
        try{
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            unsafe = (Unsafe) theUnsafe.get(null);
        }catch (Exception e){
            throw new Error(e);
        }
    }

    public static Unsafe getUnsafe(){
        return unsafe;
    }
}

13, Implementing a database connection pool (flyweight pattern)

import java.sql.Connection;
import java.util.concurrent.atomic.AtomicIntegerArray;

class Pool{
    // 1. Connection pool size
    private final int poolSize;

    // 2. Connection object array
    private Connection[] connections;

    // 3. Connection status array 0: indicates idle, 1: indicates busy
    private AtomicIntegerArray states;

    // 4. Initialization of construction method
    public Pool(int poolSize){
        this.poolSize = poolSize;
        this.connections = new Connection[poolSize];
        this.states = new AtomicIntegerArray(new int[poolSize]);
        for(int i = 0; i < poolSize; i++){
            connections[i] = new MockConnection();
        }
    }

    // Borrow connection
    public Connection borrow(){
        while(true){
            for(int i = 0; i < poolSize; i++){
                // Get idle connection
                if (states.get(i) == 0) {
                    // Change the i-th position of the array from 0 to 1
                    if(states.compareAndSet(i, 0, 1)){
                        return connections[i];
                    }
                }
            }
           // If there is no idle connection, the current thread enters wait
           synchronized (this){
               try{
                   this.wait();
               }catch (InterruptedException e){
                   e.printStackTrace();
               }
           }
        }
    }

    // Return connection
    public void free(Connection conn){
        for(int i = 0; i < poolSize; i++){
            if(connections[i] == conn){
                // There is only one thread holding the connection, so cas is not used
                states.set(i, 0);
                synchronized (this){
                    this.notifyAll();
                }
                break;
            }
        }
    }
}

class MockConnection implements Connection{
 // ...
}
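
The core of the pool is the CAS claim on the state array. The sketch below isolates that idea in a simplified, self-contained form. It is a variation, not the pool above: plain strings stand in for connections, and the hypothetical tryBorrow method returns null instead of waiting when every slot is busy.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class SlotPoolDemo {
    private final String[] resources;          // stand-ins for real connections
    private final AtomicIntegerArray states;   // 0: idle, 1: busy

    SlotPoolDemo(int size) {
        resources = new String[size];
        states = new AtomicIntegerArray(size); // all slots start at 0 (idle)
        for (int i = 0; i < size; i++) {
            resources[i] = "conn-" + i;
        }
    }

    // Non-blocking borrow: returns null when every slot is busy.
    String tryBorrow() {
        for (int i = 0; i < resources.length; i++) {
            // Cheap check first, then an atomic claim; only one thread can win the CAS
            if (states.get(i) == 0 && states.compareAndSet(i, 0, 1)) {
                return resources[i];
            }
        }
        return null;
    }

    // Marks the slot of the given resource as idle again.
    void free(String resource) {
        for (int i = 0; i < resources.length; i++) {
            if (resources[i] == resource) {    // identity check: must be our own object
                states.set(i, 0);
                return;
            }
        }
    }

    public static void main(String[] args) {
        SlotPoolDemo pool = new SlotPoolDemo(2);
        String a = pool.tryBorrow();
        String b = pool.tryBorrow();
        System.out.println(a + ", " + b);      // prints conn-0, conn-1
        System.out.println(pool.tryBorrow());  // prints null: both slots are busy
        pool.free(a);
        System.out.println(pool.tryBorrow());  // prints conn-0
    }
}
```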

Posted by miro on Sun, 07 Nov 2021 14:33:22 -0800