Disruptor RingBuffer: single-producer writes

Keywords: Java Big Data disruptor

The previous chapter covered how consumers read data from the RingBuffer. This chapter covers how a single producer writes data to it: how the RingBuffer avoids wrapping around onto unread entries during a write, how consumers are notified once the write is done, how the producer can claim entries in batches, and how multiple producers work together.

Writing data to the RingBuffer is a two-phase commit:

1) The producer claims the next node in the buffer.

2) Once the producer has finished writing data to that node, it calls publish to make the data visible to consumers.
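The two phases can be sketched with a toy, single-threaded ring. This is not the Disruptor API: ToyRing and its claim, write, publish and read methods are hypothetical names used only to show that a slot stays invisible to readers until it is published. Wrap protection is deliberately left out here.

```java
// Toy model of claim-then-publish; not the Disruptor API.
final class ToyRing {
    private final String[] slots;
    private long claimed = -1;   // last sequence handed to the producer
    private long cursor = -1;    // last sequence visible to consumers

    ToyRing(int size) {
        slots = new String[size];
    }

    // Phase 1: reserve the next sequence number.
    long claim() {
        return ++claimed;
    }

    // The producer fills the slot that belongs to the claimed sequence.
    void write(long sequence, String value) {
        slots[(int) (sequence % slots.length)] = value;
    }

    // Phase 2: make everything up to `sequence` visible to consumers.
    void publish(long sequence) {
        cursor = sequence;
    }

    long cursor() {
        return cursor;
    }

    // Consumers may only read sequences at or below the cursor.
    String read(long sequence) {
        if (sequence > cursor) {
            throw new IllegalStateException("sequence " + sequence + " is not published yet");
        }
        return slots[(int) (sequence % slots.length)];
    }
}

class TwoPhaseDemo {
    public static void main(String[] args) {
        ToyRing ring = new ToyRing(8);
        long seq = ring.claim();             // phase 1
        ring.write(seq, "order-1");
        System.out.println(ring.cursor());   // still -1: nothing published yet
        ring.publish(seq);                   // phase 2
        System.out.println(ring.read(seq));  // order-1
    }
}
```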

1. Single-producer writes with SingleProducerSequencer

Behind the scenes, the producer's sequencer handles all the interaction needed to find the next node in the RingBuffer; only then is the producer allowed to write data to it.

In the figure, one producer writes to the RingBuffer, and the SingleProducerSequencer object holds gatingSequences, a list of the sequences of all consumers that read from the RingBuffer. (This differs from a queue, which has to track a head and a tail that sometimes point to the same location.) In the Disruptor, the consumers are responsible for reporting which sequence number they have processed; the RingBuffer does not track it for them.

To make sure the producer does not wrap around the RingBuffer and overwrite unread entries, the sequencer has to check how far every consumer has read. In the figure above there are two consumers. One has read everything up to the highest sequence number, 13 (highlighted in blue); the second is lagging and has stopped at sequence number 6. Consumer 2 therefore has almost a full lap of the RingBuffer to go before it catches up with consumer 1.

The producer now wants to write to the node that holds sequence number 6, because that is the node after the RingBuffer's current cursor. The SingleProducerSequencer knows it cannot write there yet, because a consumer is still using that node. So it stops and spins, waiting until the consumer has moved on.

2. Claim the next node

Now imagine that consumer 2 has processed a batch of nodes and moved its sequence number forward. It might move to 9 (in reality, because consumers process in batches, I would expect it to reach 13).

The figure above shows what happens when consumer 2 moves to sequence number 9. The SingleProducerSequencer sees that the next node, the one that held sequence number 6, is now available. It claims the Entry on that node (I have not introduced the Entry object yet; basically it is the bucket in the RingBuffer that holds the data for one sequence number), sets the Entry's sequence number to the next sequence number (14), and returns the Entry to the producer. The producer can then write its data into the Entry.
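Both situations (consumer 2 at 6, then at 9) come down to one comparison, sketched below with the numbers from the figures. Two things here are assumptions rather than statements from the text: the buffer size of 8 is inferred from sequence 14 landing on the node of sequence 6, and a consumer's sequence is taken to be the last sequence it has finished, so a consumer still working on 6 reports 5. WrapCheck.mustWait is a hypothetical helper, not a Disruptor method.

```java
final class WrapCheck {
    // True when claiming `nextSequence` would overwrite a node that the
    // slowest consumer has not finished with yet.
    static boolean mustWait(long nextSequence, int bufferSize, long minGatingSequence) {
        // The sequence that currently sits in the node we want to reuse.
        long wrapPoint = nextSequence - bufferSize;
        return wrapPoint > minGatingSequence;
    }

    public static void main(String[] args) {
        int bufferSize = 8;   // assumption inferred from the figure
        long next = 14;       // the cursor is 13, so the producer wants 14

        // Consumer 2 is still working on 6, so it has finished 5.
        System.out.println(mustWait(next, bufferSize, 5)); // true: node of 6 is busy

        // Consumer 2 has moved on and finished 9.
        System.out.println(mustWait(next, bufferSize, 9)); // false: node is free
    }
}
```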

3. Submit new data

The producer submits the data it has written and notifies the consumers.

Green marks the newly written Entry with sequence number 14. The publish method commits it: it sets the RingBuffer's cursor to 14 and notifies consumers that 14 has been written and can be read (how the notification happens depends on the WaitStrategy implementation, in particular on whether it blocks). Consumer 2 can now read Entry 14 and consume it.

With the principle covered, let's look at how SingleProducerSequencer obtains a sequence number and publishes data.

4. SingleProducerSequencer producer class diagram

  • SingleProducerSequencer inherits from AbstractSequencer and implements the Sequencer interface.

  • Sequencer provides methods for adding and removing consumer sequences, creating a SequenceBarrier, and obtaining the minimum sequence number and the highest published sequence number.

  • Cursored returns the current cursor.

  • Sequenced provides the methods for getting the RingBuffer size, claiming a sequence number, and publishing data.

5. Direct relationship between consumers and producers

First, take a look at the definitions in AbstractSequencer

// The current cursor position of the producer
protected final Sequence cursor = new Sequence(Sequencer.INITIAL_CURSOR_VALUE);
// The sequences that track how far each consumer has processed
protected volatile Sequence[] gatingSequences = new Sequence[0];

volatile guarantees visibility and prevents reordering, but it does not make a read-modify-write atomic. If several threads add or remove consumer sequences at the same time, one thread's update to the array could overwrite another's.

private static final AtomicReferenceFieldUpdater<AbstractSequencer, Sequence[]> SEQUENCE_UPDATER = AtomicReferenceFieldUpdater.newUpdater(AbstractSequencer.class, Sequence[].class, "gatingSequences");

AtomicReferenceFieldUpdater solves the problem of multiple threads updating gatingSequences.

The actual update is a CAS loop implemented in SequenceGroups.

public final void addGatingSequences(Sequence... gatingSequences) {
    SequenceGroups.addSequences(this, SEQUENCE_UPDATER, this, gatingSequences);
}
public boolean removeGatingSequence(Sequence sequence) {
    return SequenceGroups.removeSequence(this, SEQUENCE_UPDATER, sequence);
}
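The general shape of such a CAS update on a volatile array field is sketched below. This is not the SequenceGroups source: GatingHolder, its add method and the concurrentAdds helper are hypothetical, AtomicLong stands in for Sequence, and everything else the real method does is left out. The only point is the copy, modify, compareAndSet, retry loop.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

final class GatingHolder {
    // The field updater requires a volatile field.
    volatile AtomicLong[] gating = new AtomicLong[0];

    private static final AtomicReferenceFieldUpdater<GatingHolder, AtomicLong[]> UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(GatingHolder.class, AtomicLong[].class, "gating");

    // Copy the current array, append the new sequence, and swap the
    // reference in with CAS. If another thread won the race, retry.
    void add(AtomicLong sequence) {
        AtomicLong[] current;
        AtomicLong[] updated;
        do {
            current = gating;
            updated = Arrays.copyOf(current, current.length + 1);
            updated[current.length] = sequence;
        } while (!UPDATER.compareAndSet(this, current, updated));
    }

    // Runs `threads` writers that each register `perThread` sequences
    // and returns the final array length.
    static int concurrentAdds(int threads, int perThread) {
        GatingHolder holder = new GatingHolder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    holder.add(new AtomicLong(-1));
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return holder.gating.length;
    }

    public static void main(String[] args) {
        // No registration is lost even though four threads race on the array.
        System.out.println(concurrentAdds(4, 500)); // 2000
    }
}
```

With a plain assignment to the volatile field instead of compareAndSet, two threads could copy the same array and one of the two additions would disappear.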

6. The producer uses next to get the next available sequence number

public long next(int n) {
    if (n < 1) {
        throw new IllegalArgumentException("n must be > 0");
    }
    // Last sequence number claimed by this producer (with a single producer it acts as the producer's private cursor)
    long nextValue = this.nextValue;
    // Highest sequence number being claimed by this call
    long nextSequence = nextValue + n;
    // Wrap point: the sequence that occupied the target node one lap earlier
    long wrapPoint = nextSequence - bufferSize;
    // Cached minimum of the consumer sequences
    long cachedGatingSequence = this.cachedValue;
    // wrapPoint > cachedGatingSequence:
    // the wrap point is ahead of the cached consumer sequence, so some consumer may not have finished with the node and its data must not be overwritten yet
    // cachedGatingSequence > nextValue:
    // only occurs in the case described in https://github.com/LMAX-Exchange/disruptor/issues/76
    if (wrapPoint > cachedGatingSequence || cachedGatingSequence > nextValue) {
        long minSequence;
        // Loop until the slowest consumer has passed the wrap point
        while (wrapPoint > (minSequence = Util.getMinimumSequence(gatingSequences, nextValue))) {
            // Wake any consumers blocked in the wait strategy so they can make progress
            waitStrategy.signalAllWhenBlocking();
            // Back off briefly before checking again (a wait strategy could be used here instead)
            LockSupport.parkNanos(1L);
        }
        // Cache the minimum consumer sequence (capped at nextValue) for later calls
        this.cachedValue = minSequence;
    }
    // Record the highest sequence number claimed by the producer
    this.nextValue = nextSequence;
    return nextSequence;
}
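The loop above depends on Util.getMinimumSequence. A minimal stand-in is sketched below, assuming it returns the smallest consumer sequence, capped at the value passed in; MinSequence.minimum is a hypothetical name and AtomicLong replaces Sequence. It also shows why a producer with no registered consumers never waits: the minimum is then nextValue itself, and wrapPoint is always smaller than that as long as n does not exceed bufferSize.

```java
import java.util.concurrent.atomic.AtomicLong;

final class MinSequence {
    // Smallest value among the consumer sequences, never larger than
    // `startingMinimum` (the producer's own position).
    static long minimum(AtomicLong[] sequences, long startingMinimum) {
        long min = startingMinimum;
        for (AtomicLong sequence : sequences) {
            min = Math.min(min, sequence.get());
        }
        return min;
    }

    public static void main(String[] args) {
        // Two consumers: one has finished 13, the slower one has finished 5.
        AtomicLong[] consumers = { new AtomicLong(13), new AtomicLong(5) };
        System.out.println(minimum(consumers, 13));          // 5

        // No consumers registered: the producer's own value comes back.
        System.out.println(minimum(new AtomicLong[0], 13));  // 13
    }
}
```

Reading every consumer sequence is a volatile read per consumer, which is why next keeps the last result in cachedValue and only rescans when the cached value no longer proves that the target node is free.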

7. Producers use publish to publish data

public void publish(long sequence) {
    // Set the producer's cursor to the published sequence number
    cursor.set(sequence);
    // Wake any consumers blocked in the wait strategy
    waitStrategy.signalAllWhenBlocking();
}

Once the data is published, the consumer's sequenceBarrier.waitFor(nextSequence) call returns the highest available sequence number in the RingBuffer (availableSequence), and the consumer can process everything up to it.
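The handoff between publish and the waiting consumer can be sketched with two threads and one volatile cursor. This is a simplified model, not Disruptor code: CursorHandoff and produceAndConsume are hypothetical, an AtomicLong plays the cursor, the consumer busy-spins in place of a WaitStrategy, and the array is as large as the number of events so that wrapping does not come into play.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CursorHandoff {
    // The producer writes `count` values and publishes each one by moving
    // the cursor. The consumer reads everything up to the cursor in batches
    // and returns the sum of what it saw.
    static long produceAndConsume(int count) {
        long[] slots = new long[count];           // large enough: no wrap
        AtomicLong cursor = new AtomicLong(-1);   // last published sequence

        Thread producer = new Thread(() -> {
            for (int seq = 0; seq < count; seq++) {
                slots[seq] = seq * 2L;   // write the data first
                cursor.set(seq);         // then publish it (volatile write)
            }
        });
        producer.start();

        long sum = 0;
        long next = 0;                   // next sequence the consumer wants
        while (next < count) {
            long available = cursor.get();        // volatile read of the cursor
            if (available < next) {
                Thread.yield();                   // nothing new yet, spin
                continue;
            }
            // Batch: consume every sequence published so far.
            for (; next <= available; next++) {
                sum += slots[(int) next];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // 2 * (0 + 1 + ... + 999) = 999000
        System.out.println(produceAndConsume(1000));
    }
}
```

Because the slot is written before the volatile write to the cursor, and the consumer reads the cursor before the slot, the consumer always sees the data that belongs to a published sequence.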

8. Consumers consume the data

Recall the waitFor method of ProcessingSequenceBarrier, which calls sequencer.getHighestPublishedSequence(sequence, availableSequence).

public long waitFor(final long sequence)
        throws AlertException, InterruptedException, TimeoutException {
    // Throw AlertException if the barrier has been alerted
    checkAlert();
    // Ask the WaitStrategy for the available sequence number. cursorSequence is the producer's cursor; dependentSequence represents the sequences this consumer depends on
    long availableSequence = waitStrategy.waitFor(sequence, cursorSequence, dependentSequence, this);
    // The available sequence is lower than the requested one; the sequence number may have been reset to an older value
    // See https://github.com/LMAX-Exchange/disruptor/issues/76
    if (availableSequence < sequence) {
        return availableSequence;
    }
    // Return the highest published sequence, which may be lower than the requested sequence.
    // This happens with multiple producers: producer 1 claims sequence 13 and producer 2 claims 14. If producer 2 publishes but producer 1 has not, the highest published sequence returned is 12 while the requested sequence is 13
    return sequencer.getHighestPublishedSequence(sequence, availableSequence);
}
// In SingleProducerSequencer
public long getHighestPublishedSequence(long lowerBound, long availableSequence) {
    return availableSequence;
}

SingleProducerSequencer's getHighestPublishedSequence simply returns availableSequence, because a single producer publishes in order and leaves no gaps, so consumers can consume everything up to it. Through the steps above, producers and consumers work together.
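For contrast, the multi-producer case from the comment in waitFor (13 claimed but not published, 14 published) needs a scan for the first gap. The sketch below is a single-threaded toy under stated assumptions: PublishedTracker is a hypothetical class, and a HashSet stands in for whatever structure a multi-producer sequencer uses to record published sequences.

```java
import java.util.HashSet;
import java.util.Set;

final class PublishedTracker {
    private final Set<Long> published = new HashSet<>();

    void publish(long sequence) {
        published.add(sequence);
    }

    // Walk from lowerBound to availableSequence and stop just before the
    // first sequence that has been claimed but not published.
    long highestPublished(long lowerBound, long availableSequence) {
        for (long sequence = lowerBound; sequence <= availableSequence; sequence++) {
            if (!published.contains(sequence)) {
                return sequence - 1;
            }
        }
        return availableSequence;
    }

    public static void main(String[] args) {
        PublishedTracker tracker = new PublishedTracker();
        tracker.publish(14);   // producer 2 publishes first
        System.out.println(tracker.highestPublished(13, 14)); // 12: 13 is a gap

        tracker.publish(13);   // producer 1 catches up
        System.out.println(tracker.highestPublished(13, 14)); // 14
    }
}
```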

Posted by The-Last-Escape on Wed, 27 Oct 2021 19:48:18 -0700