Questions
(1) What is Phaser?
(2) What are the characteristics of Phaser?
(3) What are Phaser's advantages over CyclicBarrier and CountDownLatch?
Introduction
Phaser (the name comes from "phase") suits scenarios where one large task is completed in several phases: the tasks within a phase can be executed concurrently by multiple threads, but every task of one phase must finish before any task of the next phase can start.
Although this scenario can also be implemented with CyclicBarrier or CountDownLatch, doing so is considerably more complex. First, the number of phases may change, and second, the number of tasks in each phase may change as well. Phaser handles both cases more flexibly and conveniently than CyclicBarrier and CountDownLatch.
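As a rough illustration of the first point, the sketch below lets the number of phases be decided while the program runs: onAdvance() returns true to terminate the phaser once a limit has been reached. The class name DynamicPhasesDemo, the helper runPhases and its limit parameter are made up for this example; only the java.util.concurrent.Phaser calls are real API.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

class DynamicPhasesDemo {

    // Runs a single-party phaser until onAdvance() terminates it and returns
    // how many phases were completed. "limit" is a hypothetical parameter.
    static int runPhases(int limit) {
        AtomicInteger completed = new AtomicInteger();
        Phaser phaser = new Phaser(1) {
            @Override
            protected boolean onAdvance(int phase, int registeredParties) {
                completed.incrementAndGet();
                // Returning true terminates the phaser; phase numbers start at 0
                return phase + 1 >= limit || registeredParties == 0;
            }
        };
        while (!phaser.isTerminated()) {
            // The only party is always the last to arrive, so this never blocks
            phaser.arriveAndAwaitAdvance();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("phases completed: " + runPhases(3));
    }
}
```

With a limit of 3, onAdvance() runs for phases 0, 1 and 2 and terminates the phaser on the third call.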
Usage
Let's look at the simplest use case:
    import java.util.concurrent.Phaser;

    public class PhaserTest {

        public static final int PARTIES = 3;
        public static final int PHASES = 4;

        public static void main(String[] args) {
            Phaser phaser = new Phaser(PARTIES) {
                // Called once per phase by the last party to arrive
                @Override
                protected boolean onAdvance(int phase, int registeredParties) {
                    System.out.println("=======phase: " + phase + " finished=============");
                    return super.onAdvance(phase, registeredParties);
                }
            };

            for (int i = 0; i < PARTIES; i++) {
                new Thread(() -> {
                    for (int j = 0; j < PHASES; j++) {
                        System.out.println(String.format("%s: phase: %d", Thread.currentThread().getName(), j));
                        phaser.arriveAndAwaitAdvance();
                    }
                }, "Thread " + i).start();
            }
        }
    }
Here we define a large task that takes four phases to complete, with three subtasks in each phase, and we start three threads to run those subtasks. The output is as follows (the order of the threads within a phase may differ between runs):
    Thread 0: phase: 0
    Thread 2: phase: 0
    Thread 1: phase: 0
    =======phase: 0 finished=============
    Thread 2: phase: 1
    Thread 0: phase: 1
    Thread 1: phase: 1
    =======phase: 1 finished=============
    Thread 1: phase: 2
    Thread 0: phase: 2
    Thread 2: phase: 2
    =======phase: 2 finished=============
    Thread 0: phase: 3
    Thread 2: phase: 3
    Thread 1: phase: 3
    =======phase: 3 finished=============
As you can see, all three threads complete a phase before any of them enters the next one. How does this work? Let's find out.
Guessing the principle
Drawing on how AQS works, we can make a rough guess at how Phaser is implemented.
First, it needs to store the current phase, the number of tasks (participants) in the current phase, and the number of participants that have not yet finished. All three values can be packed into a single variable, state.
Second, it needs a queue to hold the participants that finish early; when the last participant finishes its task, it must wake up the participants waiting in the queue.
That is about all.
Applying this to the example above:
Initially, the current phase is 0, the number of participants is 3, and the number of participants that have not yet finished is 3.
The first thread reaches phaser.arriveAndAwaitAdvance() and enters the queue;
The second thread reaches phaser.arriveAndAwaitAdvance() and enters the queue;
The third thread reaches phaser.arriveAndAwaitAdvance(); it first runs onAdvance(), the wrap-up hook for this phase, and then wakes up the first two threads so that they can continue with the next phase.
That all seems to make sense. To find out whether this is really how it works, let's look at the source code.
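Before reading the real source, the guess can be turned into a toy class. This is only a sketch of the guess, not how java.util.concurrent.Phaser is written: it uses a monitor with wait()/notifyAll() in place of a CAS-updated state and an explicit queue, and the party count is fixed. ToyPhaser and runDemo are names invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ToyPhaser {
    private final int parties;   // number of participants (fixed in this toy)
    private int unarrived;       // participants that have not arrived yet
    private int phase;           // current phase number

    ToyPhaser(int parties) {
        if (parties <= 0) throw new IllegalArgumentException("parties must be positive");
        this.parties = parties;
        this.unarrived = parties;
    }

    // Hook run by the last participant to arrive in each phase
    protected void onAdvance(int finishedPhase) { }

    synchronized int arriveAndAwaitAdvance() throws InterruptedException {
        int arrivalPhase = phase;
        if (--unarrived == 0) {
            // Last to arrive: run the hook, reset the count, advance, wake the waiters
            onAdvance(arrivalPhase);
            unarrived = parties;
            phase++;
            notifyAll();
        } else {
            // Early arrivals wait (the monitor's wait set plays the role of the queue)
            while (phase == arrivalPhase) wait();
        }
        return arrivalPhase + 1;
    }

    // Starts the given number of threads, lets each pass through every phase,
    // and returns how many times onAdvance() ran.
    static int runDemo(int parties, int phases) {
        AtomicInteger advances = new AtomicInteger();
        ToyPhaser phaser = new ToyPhaser(parties) {
            @Override
            protected void onAdvance(int finishedPhase) { advances.incrementAndGet(); }
        };
        Thread[] workers = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < phases; j++) phaser.arriveAndAwaitAdvance();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return advances.get();
    }

    public static void main(String[] args) {
        System.out.println("advances: " + runDemo(3, 4));
    }
}
```

The hook runs exactly once per phase because only the thread that drops unarrived to zero calls it.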
Source code analysis
Major internal classes
    static final class QNode implements ForkJoinPool.ManagedBlocker {
        final Phaser phaser;
        final int phase;
        final boolean interruptible;
        final boolean timed;
        boolean wasInterrupted;
        long nanos;
        final long deadline;
        volatile Thread thread; // nulled to cancel wait
        QNode next;

        QNode(Phaser phaser, int phase, boolean interruptible,
              boolean timed, long nanos) {
            this.phaser = phaser;
            this.phase = phase;
            this.interruptible = interruptible;
            this.nanos = nanos;
            this.timed = timed;
            this.deadline = timed ? System.nanoTime() + nanos : 0L;
            thread = Thread.currentThread();
        }
    }
Participants that finish early are placed in a queue of QNode nodes. Here we only need to pay attention to the two fields thread and next: this is clearly a singly linked list whose nodes hold the waiting threads.
Main attributes
    // State variable that packs the current phase, the number of participants (parties)
    // and the number of participants that have not arrived yet (unarrived)
    private volatile long state;

    // Maximum number of participants, i.e. the maximum number of tasks per phase
    private static final int MAX_PARTIES = 0xffff;
    // Maximum phase number
    private static final int MAX_PHASE = Integer.MAX_VALUE;
    // Shift of the number of participants
    private static final int PARTIES_SHIFT = 16;
    // Shift of the current phase
    private static final int PHASE_SHIFT = 32;
    // Mask for the number of unarrived participants, low 16 bits
    private static final int UNARRIVED_MASK = 0xffff;      // to mask ints
    // Mask for the number of participants, middle 16 bits
    private static final long PARTIES_MASK = 0xffff0000L;  // to mask longs
    // Mask for the counts, i.e. the '|' of the parties mask and the unarrived mask
    private static final long COUNTS_MASK = 0xffffffffL;
    // Highest bit, set once the phaser has terminated
    private static final long TERMINATION_BIT = 1L << 63;

    // One participant arrives
    private static final int ONE_ARRIVAL = 1;
    // Used when adding or removing one participant
    private static final int ONE_PARTY = 1 << PARTIES_SHIFT;
    // Used when deregistering: one fewer participant and one fewer unarrived
    private static final int ONE_DEREGISTER = ONE_ARRIVAL|ONE_PARTY;
    // Special value of the counts when there are no participants
    private static final int EMPTY = 1;

    // Number of participants that have not arrived yet
    private static int unarrivedOf(long s) {
        int counts = (int)s;
        return (counts == EMPTY) ? 0 : (counts & UNARRIVED_MASK);
    }

    // Number of participants (middle 16 bits); note where the cast to int is applied
    private static int partiesOf(long s) {
        return (int)s >>> PARTIES_SHIFT;
    }

    // Current phase (high 32 bits); note where the cast to int is applied
    private static int phaseOf(long s) {
        return (int)(s >>> PHASE_SHIFT);
    }

    // Number of participants that have already arrived
    private static int arrivedOf(long s) {
        int counts = (int)s; // low 32 bits
        return (counts == EMPTY) ? 0 :
            (counts >>> PARTIES_SHIFT) - (counts & UNARRIVED_MASK);
    }

    // Queues holding the threads of participants that have already arrived;
    // which one is used depends on the parity of the current phase
    private final AtomicReference<QNode> evenQ;
    private final AtomicReference<QNode> oddQ;
The main attributes are state, evenQ and oddQ:
(1) state, the state variable: the high 32 bits store the current phase, the middle 16 bits store the number of participants (parties), and the low 16 bits store the number of participants that have not yet finished (unarrived).
(2) evenQ and oddQ, the queues that hold the participants that have already finished; after the last participant finishes its task, it wakes up the participants in the queue so that they can continue with the next phase or end the task.
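The two queues can be pictured with a simplified sketch: a lock-free push onto a singly linked list whose head is an AtomicReference, chosen by the parity of the phase. Everything here (ParityQueuesDemo, Node, enqueue, release) is invented for illustration. The node carries a label instead of a Thread, and release() detaches the whole list in one step and only counts the nodes, where the real Phaser removes nodes one by one and unparks their threads.

```java
import java.util.concurrent.atomic.AtomicReference;

class ParityQueuesDemo {

    // Hypothetical, stripped-down stand-in for Phaser.QNode
    static final class Node {
        final String waiter;   // stands in for the waiting thread
        final int phase;
        Node next;

        Node(String waiter, int phase) {
            this.waiter = waiter;
            this.phase = phase;
        }
    }

    private final AtomicReference<Node> evenQ = new AtomicReference<>();
    private final AtomicReference<Node> oddQ = new AtomicReference<>();

    // Even phases use evenQ, odd phases use oddQ
    private AtomicReference<Node> queueFor(int phase) {
        return (phase & 1) == 0 ? evenQ : oddQ;
    }

    // Lock-free push: link the node in front of the current head, retry if the CAS fails
    void enqueue(String waiter, int phase) {
        AtomicReference<Node> head = queueFor(phase);
        Node node = new Node(waiter, phase);
        do {
            node.next = head.get();
        } while (!head.compareAndSet(node.next, node));
    }

    // Detaches the list for the given phase and returns how many waiters it held
    int release(int phase) {
        int released = 0;
        for (Node n = queueFor(phase).getAndSet(null); n != null; n = n.next) {
            released++;
        }
        return released;
    }

    public static void main(String[] args) {
        ParityQueuesDemo queues = new ParityQueuesDemo();
        queues.enqueue("Thread 0", 0);
        queues.enqueue("Thread 1", 0);
        queues.enqueue("Thread 2", 1);
        System.out.println("released for phase 0: " + queues.release(0));
        System.out.println("released for phase 1: " + queues.release(1));
    }
}
```

Using two queues means waiters of the phase that just ended can still be released while threads arriving for the next phase are already enqueuing on the other list.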
Constructors
    public Phaser() {
        this(null, 0);
    }

    public Phaser(int parties) {
        this(null, parties);
    }

    public Phaser(Phaser parent) {
        this(parent, 0);
    }

    public Phaser(Phaser parent, int parties) {
        if (parties >>> PARTIES_SHIFT != 0)
            throw new IllegalArgumentException("Illegal number of parties");
        int phase = 0;
        this.parent = parent;
        if (parent != null) {
            final Phaser root = parent.root;
            this.root = root;
            this.evenQ = root.evenQ;
            this.oddQ = root.oddQ;
            if (parties != 0)
                phase = parent.doRegister(1);
        }
        else {
            this.root = this;
            this.evenQ = new AtomicReference<QNode>();
            this.oddQ = new AtomicReference<QNode>();
        }
        // state is stored in three segments: phase, parties and unarrived
        this.state = (parties == 0) ? (long)EMPTY :
            ((long)phase << PHASE_SHIFT) |
            ((long)parties << PARTIES_SHIFT) |
            ((long)parties);
    }
The constructor also deals with parent and root, which are used to build tiered (multi-level) phasers and are not covered in this article.
The key point is still the assignment of state: the high 32 bits store the current phase, the middle 16 bits store the number of participants, and the low 16 bits store the number of participants that have not yet finished.
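The three-segment layout can be checked with a small stand-alone sketch. It copies the shift and mask values from the source above, but StateLayoutDemo and pack are names made up for this example, and the special EMPTY value used when there are no participants is left out for brevity.

```java
class StateLayoutDemo {
    static final int PARTIES_SHIFT = 16;
    static final int PHASE_SHIFT = 32;
    static final int UNARRIVED_MASK = 0xffff;

    // Packs the three values the same way the Phaser constructor does
    static long pack(int phase, int parties, int unarrived) {
        return ((long) phase << PHASE_SHIFT)
             | ((long) parties << PARTIES_SHIFT)
             | (long) unarrived;
    }

    // High 32 bits
    static int phaseOf(long s) {
        return (int) (s >>> PHASE_SHIFT);
    }

    // Middle 16 bits: truncate to the low 32 bits first, then shift
    static int partiesOf(long s) {
        return (int) s >>> PARTIES_SHIFT;
    }

    // Low 16 bits
    static int unarrivedOf(long s) {
        return (int) s & UNARRIVED_MASK;
    }

    public static void main(String[] args) {
        long s = pack(2, 3, 1);
        // prints 0x0000000200030001 phase=2 parties=3 unarrived=1
        System.out.println(String.format("0x%016x phase=%d parties=%d unarrived=%d",
                s, phaseOf(s), partiesOf(s), unarrivedOf(s)));
    }
}
```

Because unarrived sits in the lowest bits, an arrival is a plain subtraction of 1 from the whole long, which is what makes a single CAS on state sufficient.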
Let's look at the source code of several main methods:
register() method
Registers a participant. If onAdvance() is executing when this method is called, the method waits until it has finished.
    public int register() {
        return doRegister(1);
    }

    private int doRegister(int registrations) {
        // The amount to add to state; this adds to parties and unarrived at the same time
        long adjust = ((long)registrations << PARTIES_SHIFT) | registrations;
        final Phaser parent = this.parent;
        int phase;
        for (;;) {
            // The value of state
            long s = (parent == null) ? state : reconcileState();
            // The low 32 bits of state hold parties and unarrived
            int counts = (int)s;
            // The value of parties
            int parties = counts >>> PARTIES_SHIFT;
            // The value of unarrived
            int unarrived = counts & UNARRIVED_MASK;
            // Check for overflow
            if (registrations > MAX_PARTIES - parties)
                throw new IllegalStateException(badRegister(s));
            // Current phase
            phase = (int)(s >>> PHASE_SHIFT);
            if (phase < 0)
                break;
            // Not the first participant
            if (counts != EMPTY) {                  // not 1st registration
                if (parent == null || reconcileState() == s) {
                    // unarrived == 0 means onAdvance() is running for the current phase,
                    // so wait for it to finish
                    if (unarrived == 0)             // wait out advance
                        root.internalAwaitAdvance(phase, null);
                    // Otherwise add adjust to state and leave the loop if the CAS succeeds
                    else if (UNSAFE.compareAndSwapLong(this, stateOffset,
                                                       s, s + adjust))
                        break;
                }
            }
            // The first participant
            else if (parent == null) {              // 1st root registration
                // Compute the new value of state
                long next = ((long)phase << PHASE_SHIFT) | adjust;
                // Update state and leave the loop if the CAS succeeds
                if (UNSAFE.compareAndSwapLong(this, stateOffset, s, next))
                    break;
            }
            else {
                // Tiered phaser handling
                synchronized (this) {               // 1st sub registration
                    if (state == s) {               // recheck under lock
                        phase = parent.doRegister(1);
                        if (phase < 0)
                            break;
                        // finish registration whenever parent registration
                        // succeeded, even when racing with termination,
                        // since these are part of the same "transaction".
                        while (!UNSAFE.compareAndSwapLong
                               (this, stateOffset, s,
                                ((long)phase << PHASE_SHIFT) | adjust)) {
                            s = state;
                            phase = (int)(root.state >>> PHASE_SHIFT);
                            // assert (int)s == EMPTY;
                        }
                        break;
                    }
                }
            }
        }
        return phase;
    }

    // Waits for onAdvance() to finish.
    // It spins a number of times first; if the next phase begins in the meantime, it returns directly.
    // If the next phase has not begun after spinning, the current thread is queued
    // and is woken up once onAdvance() has finished.
    private int internalAwaitAdvance(int phase, QNode node) {
        // Make sure the queue of the previous phase has been drained
        releaseWaiters(phase-1);          // ensure old queue clean
        boolean queued = false;           // true when node is enqueued
        int lastUnarrived = 0;            // to increase spins upon change
        // Number of spins
        int spins = SPINS_PER_ARRIVAL;
        long s;
        int p;
        // Loop while the phase is unchanged; once it changes, the next phase has begun
        // and there is no need to keep spinning
        while ((p = (int)((s = state) >>> PHASE_SHIFT)) == phase) {
            // node is null, which is what register() passes in
            if (node == null) {           // spinning in noninterruptible mode
                // Number of unarrived participants
                int unarrived = (int)s & UNARRIVED_MASK;
                // unarrived has changed, so allow more spins
                if (unarrived != lastUnarrived &&
                    (lastUnarrived = unarrived) < NCPU)
                    spins += SPINS_PER_ARRIVAL;
                boolean interrupted = Thread.interrupted();
                // The spins are used up (or the thread was interrupted), so create a node
                if (interrupted || --spins < 0) { // need node to record intr
                    node = new QNode(this, phase, false, false, 0L);
                    node.wasInterrupted = interrupted;
                }
            }
            else if (node.isReleasable()) // done or aborted
                break;
            else if (!queued) {           // push onto queue
                // Enqueue the node
                AtomicReference<QNode> head = (phase & 1) == 0 ? evenQ : oddQ;
                QNode q = node.next = head.get();
                if ((q == null || q.phase == phase) &&
                    (int)(state >>> PHASE_SHIFT) == phase) // avoid stale enq
                    queued = head.compareAndSet(q, node);
            }
            else {
                try {
                    // Block the current thread until it is woken up,
                    // similar to calling LockSupport.park()
                    ForkJoinPool.managedBlock(node);
                } catch (InterruptedException ie) {
                    node.wasInterrupted = true;
                }
            }
        }

        // Reaching here with a node means the wait is over (woken up or aborted)
        if (node != null) {
            // Clear the thread stored in the node
            if (node.thread != null)
                node.thread = null;       // avoid need for unpark()
            if (node.wasInterrupted && !node.interruptible)
                Thread.currentThread().interrupt();
            if (p == phase && (p = (int)(state >>> PHASE_SHIFT)) == phase)
                return abortWait(phase); // possibly clean up on abort
        }
        // Wake up the threads still waiting on this phase
        releaseWaiters(phase);
        return p;
    }
The overall logic of registering a participant is as follows:
(1) Registering a participant increases both parties and unarrived, i.e. the middle 16 bits and the low 16 bits of state, at the same time.
(2) If it is the first participant, try to update state atomically and exit on success.
(3) If it is not the first participant, check whether onAdvance() is executing. If it is, wait for onAdvance() to finish; if it is not, try to update state atomically, retrying until the update succeeds.
(4) Waiting for onAdvance() to finish is done by spinning first and then queuing, which reduces thread context switches.
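Point (1) can be observed from the outside through Phaser's public query methods. The following is a small single-threaded sketch; RegisterDemo and counts are names chosen for this example.

```java
import java.util.concurrent.Phaser;

class RegisterDemo {

    // Returns {registered, unarrived, arrived} after five registrations and one arrival
    static int[] counts() {
        Phaser phaser = new Phaser();   // no participants yet
        phaser.register();              // first participant: parties = 1, unarrived = 1
        phaser.register();              // parties = 2, unarrived = 2
        phaser.bulkRegister(3);         // three at once: parties = 5, unarrived = 5
        phaser.arrive();                // one arrival, no waiting: unarrived = 4
        return new int[] {
            phaser.getRegisteredParties(),
            phaser.getUnarrivedParties(),
            phaser.getArrivedParties()
        };
    }

    public static void main(String[] args) {
        int[] c = counts();
        // prints registered=5 unarrived=4 arrived=1
        System.out.println("registered=" + c[0] + " unarrived=" + c[1] + " arrived=" + c[2]);
    }
}
```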
arriveAndAwaitAdvance() method
The current thread has finished the current phase and waits for the other threads to finish it too.
If the current thread is the last to arrive in this phase, it executes onAdvance() and wakes up the other threads so that they move on to the next phase.
    public int arriveAndAwaitAdvance() {
        // Specialization of doArrive+awaitAdvance eliminating some reads/paths
        final Phaser root = this.root;
        for (;;) {
            // The value of state
            long s = (root == this) ? state : reconcileState();
            // Current phase
            int phase = (int)(s >>> PHASE_SHIFT);
            if (phase < 0)
                return phase;
            // The values of parties and unarrived
            int counts = (int)s;
            // The value of unarrived (low 16 bits of state)
            int unarrived = (counts == EMPTY) ? 0 : (counts & UNARRIVED_MASK);
            if (unarrived <= 0)
                throw new IllegalStateException(badArrive(s));
            // Update state: one fewer unarrived participant
            if (UNSAFE.compareAndSwapLong(this, stateOffset, s,
                                          s -= ONE_ARRIVAL)) {
                // Not the last to arrive: call internalAwaitAdvance() to spin or queue
                if (unarrived > 1)
                    // Return its result directly; internalAwaitAdvance() was analyzed
                    // in the register() section
                    return root.internalAwaitAdvance(phase, null);

                // From here on, this is the last participant to arrive
                if (root != this)
                    return parent.arriveAndAwaitAdvance();
                // n keeps only the parties part of state, i.e. the middle 16 bits
                long n = s & PARTIES_MASK;  // base of next state
                // The value of parties, i.e. how many participants must arrive in the next phase
                int nextUnarrived = (int)n >>> PARTIES_SHIFT;
                // Run onAdvance(); a return value of true means the phaser should terminate
                if (onAdvance(phase, nextUnarrived))
                    n |= TERMINATION_BIT;
                else if (nextUnarrived == 0)
                    n |= EMPTY;
                else
                    // Add unarrived to n
                    n |= nextUnarrived;
                // The next phase is the current phase plus 1
                int nextPhase = (phase + 1) & MAX_PHASE;
                // Add the next phase to n
                n |= (long)nextPhase << PHASE_SHIFT;
                // Set state to n
                if (!UNSAFE.compareAndSwapLong(this, stateOffset, s, n))
                    return (int)(state >>> PHASE_SHIFT); // terminated
                // Wake up the other participants so that they enter the next phase
                releaseWaiters(phase);
                // Return the number of the next phase
                return nextPhase;
            }
        }
    }
The general logic of arriveAndAwaitAdvance() is:
(1) Decrement the unarrived part of state by 1;
(2) If the caller is not the last to arrive, call internalAwaitAdvance() to spin or queue while waiting;
(3) If the caller is the last to arrive, call onAdvance(), then set state to the value for the next phase and wake up the other waiting threads;
(4) Return the number of the next phase.
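The return value in step (4) is easy to observe with a single-party phaser, where every arrival is also the last one. The sketch below also shows the negative value returned once the phaser has terminated, which corresponds to the `phase < 0` check at the top of the method. ArriveReturnDemo and returnValues are names made up for this example.

```java
import java.util.concurrent.Phaser;

class ArriveReturnDemo {

    // Returns the values of three consecutive arriveAndAwaitAdvance() calls,
    // the last one made after the phaser has been terminated
    static int[] returnValues() {
        Phaser phaser = new Phaser(1);                         // one party, phase 0
        int first = phaser.arriveAndAwaitAdvance();            // phase 0 done, returns 1
        int second = phaser.arriveAndAwaitAdvance();           // phase 1 done, returns 2
        phaser.forceTermination();                             // sets the termination bit
        int afterTermination = phaser.arriveAndAwaitAdvance(); // returns a negative value
        return new int[] { first, second, afterTermination };
    }

    public static void main(String[] args) {
        int[] r = returnValues();
        System.out.println("first=" + r[0] + " second=" + r[1]
                + " terminated=" + (r[2] < 0));
    }
}
```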
Summary
(1) Phaser suits multi-phase, multi-task scenarios and allows fine-grained control over the tasks in each phase.
(2) Phaser implements all of its logic with a state variable and queues.
(3) The high 32 bits of state store the current phase, the middle 16 bits store the number of participants (tasks) in the current phase, parties, and the low 16 bits store the number of participants that have not yet finished, unarrived.
(4) One of two queues is selected according to the parity of the current phase.
(5) A participant that is not the last to arrive spins or enters the queue to wait until all participants have finished their tasks.
(6) When the last participant finishes its task, it wakes up the threads in the queue and advances to the next phase.
Bonus
What are Phaser's advantages over CyclicBarrier and CountDownLatch?
Answer: There are two main advantages:
(1) Phaser supports tasks with any number of phases and tracks the phase number itself. A CountDownLatch is one-shot and cannot be reset, and although a CyclicBarrier can be reused, it does not track a phase number.
(2) The number of tasks in each phase of a Phaser can be changed at run time, whereas the count of a CyclicBarrier or CountDownLatch cannot be changed once it has been set.
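The second advantage can be sketched in a few lines: a participant leaves with arriveAndDeregister(), and the next phase then needs one arrival fewer. CyclicBarrier has no comparable call, since its party count is fixed in the constructor. DeregisterDemo and shrink are names invented for this example.

```java
import java.util.concurrent.Phaser;

class DeregisterDemo {

    // Returns {phase at deregistration, parties left, phase after the remaining party arrives}
    static int[] shrink() {
        Phaser phaser = new Phaser(2);                         // two parties, phase 0
        int phaseAtDeregister = phaser.arriveAndDeregister();  // one party leaves during phase 0
        int partiesLeft = phaser.getRegisteredParties();       // 1
        // The remaining party is now the last to arrive, so the phase advances without blocking
        int nextPhase = phaser.arriveAndAwaitAdvance();        // 1
        return new int[] { phaseAtDeregister, partiesLeft, nextPhase };
    }

    public static void main(String[] args) {
        int[] r = shrink();
        // prints deregistered in phase 0, parties left 1, now in phase 1
        System.out.println("deregistered in phase " + r[0]
                + ", parties left " + r[1] + ", now in phase " + r[2]);
    }
}
```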
This article is an original work from "Tong Ge read source code". Please support original content. Thank you!