ConsumeQueue construction process analysis

Keywords: RocketMQ

1. Preface

Theoretically, RocketMQ can run normally as long as there is a CommitLog file. Why maintain the ConsumeQueue file?

ConsumeQueue is the consume queue, introduced to speed up message consumption. RocketMQ is built on a Topic subscription model, and a consumer usually cares only about the Topics it subscribes to. If every consume operation had to search the CommitLog file, which mixes the messages of all Topics, performance would be very poor. With ConsumeQueue, a consumer can quickly locate each message through its recorded offset in the CommitLog file.

As mentioned in previous articles, the Broker writes the messages sent by the client to the CommitLog file for persistent storage. That write path never touches the ConsumeQueue file, so how is the ConsumeQueue file built?

2. ReputMessageService

ReputMessageService is what I will call the message replay service. When the Broker starts, it starts a thread that executes the doReput() method once every millisecond.

Its purpose is to "replay" the message written to the CommitLog file. It has a property reputFromOffset, which records the offset of message replay. It will be assigned when the MessageStore is started.

Its working principle is to read the message to be replayed in the CommitLog according to the replay offset reputFromOffset, build the DispatchRequest object, and then distribute the DispatchRequest object to each CommitLogDispatcher for processing.

MessageStore maintains a collection of CommitLogDispatcher objects. Currently, there are only three processors:

  1. CommitLogDispatcherBuildConsumeQueue: builds the ConsumeQueue index.
  2. CommitLogDispatcherBuildIndex: builds the Index file.
  3. CommitLogDispatcherCalcBitMap: builds a bloom filter to speed up SQL92 filtering.
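
To make the dispatch step concrete, here is a minimal sketch of the fan-out described above. The types Request and Dispatcher are simplified stand-ins that I made up for illustration; they are not RocketMQ's real DispatchRequest and CommitLogDispatcher classes, and the real dispatchers do far more work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration only; not RocketMQ's classes.
class DispatchSketch {
    // Carries only the fields this sketch needs.
    static final class Request {
        final String topic;
        final int queueId;
        final long commitLogOffset;
        final int msgSize;

        Request(String topic, int queueId, long commitLogOffset, int msgSize) {
            this.topic = topic;
            this.queueId = queueId;
            this.commitLogOffset = commitLogOffset;
            this.msgSize = msgSize;
        }
    }

    // One method per dispatcher, mirroring the idea of a dispatcher interface.
    interface Dispatcher {
        void dispatch(Request request);
    }

    // Hands the same request to every registered dispatcher, in registration order.
    static void doDispatch(List<Dispatcher> dispatchers, Request request) {
        for (Dispatcher dispatcher : dispatchers) {
            dispatcher.dispatch(request);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Dispatcher> dispatchers = new ArrayList<>();
        dispatchers.add(r -> log.add("consumeQueue:" + r.topic + "-" + r.queueId));
        dispatchers.add(r -> log.add("index:" + r.commitLogOffset));
        doDispatch(dispatchers, new Request("TopicA", 0, 1024L, 200));
        System.out.println(log);
    }
}
```

Each dispatcher receives the same request object, so adding a new kind of index only means registering one more dispatcher.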

This article mainly analyzes CommitLogDispatcherBuildConsumeQueue to see how RocketMQ builds the ConsumeQueue.

3. Source code analysis

The author drew a sequence diagram of the construction process of ConsumeQueue. The whole construction process is not complicated.

1. The doReput() method is executed once every 1 ms. Its body is a for loop: as long as reputFromOffset has not reached the maximum offset of the CommitLog file, messages continue to be replayed.

private boolean isCommitLogAvailable() {
    return this.reputFromOffset < DefaultMessageStore.this.commitLog.getMaxOffset();
}
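
The loop condition can be illustrated with a small in-memory model. This is only a sketch under my own assumptions: the "CommitLog" here is a map from start offset to message size, and the names append and dispatchedOffsets are hypothetical helpers that do not exist in RocketMQ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the replay loop; the storage and helper names are assumptions.
class ReputLoopSketch {
    // Start offset of each message -> message size in bytes.
    private final Map<Long, Integer> sizeByOffset = new HashMap<>();
    private long maxOffset = 0;        // end of the written data
    private long reputFromOffset = 0;  // replay progress
    final List<Long> dispatchedOffsets = new ArrayList<>();

    // Simulates the Broker appending a message to the CommitLog.
    void append(int msgSize) {
        sizeByOffset.put(maxOffset, msgSize);
        maxOffset += msgSize;
    }

    boolean isCommitLogAvailable() {
        return reputFromOffset < maxOffset;
    }

    // Replays every message between reputFromOffset and maxOffset.
    void doReput() {
        while (isCommitLogAvailable()) {
            int size = sizeByOffset.get(reputFromOffset);
            dispatchedOffsets.add(reputFromOffset); // stands in for building and dispatching a request
            reputFromOffset += size;                // advance past the replayed message
        }
    }

    long getReputFromOffset() {
        return reputFromOffset;
    }

    public static void main(String[] args) {
        ReputLoopSketch sketch = new ReputLoopSketch();
        sketch.append(100);
        sketch.append(250);
        sketch.doReput();
        System.out.println(sketch.dispatchedOffsets + " next=" + sketch.getReputFromOffset());
    }
}
```

Calling doReput() again without new appends does nothing, because reputFromOffset already equals the maximum offset.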

First, it slices a ByteBuffer out of the CommitLog file starting at reputFromOffset. This buffer holds the message data to be replayed.

public SelectMappedBufferResult getData(final long offset, final boolean returnFirstOnNotFound) {
    // Size of a single CommitLog file
    int mappedFileSize = this.defaultMessageStore.getMessageStoreConfig().getMappedFileSizeCommitLog();
    // Locate the file that holds the replay offset. Each file is named after its starting offset, so it can be found by walking the file list
    MappedFile mappedFile = this.mappedFileQueue.findMappedFileByOffset(offset, returnFirstOnNotFound);
    if (mappedFile != null) {
        // Calculates the read pointer position of Offset in the current file
        int pos = (int) (offset % mappedFileSize);
        /**
         * A ByteBuffer object is derived from MappedByteBuffer based on MappedFile
         * Share the same memory, but have their own pointers
         */
        SelectMappedBufferResult result = mappedFile.selectMappedBuffer(pos);
        return result;
    }
    return null;
}
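
The offset arithmetic is easier to see with numbers. The sketch below assumes a CommitLog file size of 1 GiB and a file name that is the starting offset zero-padded to 20 digits; both are assumptions on my part, since the file size is configurable, and the class and method names are hypothetical.

```java
// Worked example of mapping a global CommitLog offset to a file and a position.
// MAPPED_FILE_SIZE and the 20-digit file name format are assumptions of this sketch.
class CommitLogOffsetSketch {
    static final int MAPPED_FILE_SIZE = 1024 * 1024 * 1024; // assumed 1 GiB per file

    // Starting offset of the file that contains the given global offset.
    static long fileFromOffset(long offset) {
        return offset - (offset % MAPPED_FILE_SIZE);
    }

    // Read position inside that file, the same modulo as in getData().
    static int positionInFile(long offset) {
        return (int) (offset % MAPPED_FILE_SIZE);
    }

    // File name derived from the starting offset.
    static String fileName(long offset) {
        return String.format("%020d", fileFromOffset(offset));
    }

    public static void main(String[] args) {
        long offset = 1073741824L + 100;
        System.out.println(fileName(offset) + " pos=" + positionInFile(offset));
    }
}
```

An offset 100 bytes past the first gigabyte therefore lands at position 100 of the second file.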

The properties of SelectMappedBufferResult class are as follows:

// Start offset
private final long startOffset;
// buffer
private final ByteBuffer byteBuffer;
// length
private int size;
// Associated MappedFile object
private MappedFile mappedFile;

2. With SelectMappedBufferResult, you can read the message data. Since the message replay does not need to know the content of the message Body, it will not read the message Body, but only read the relevant properties and build the DispatchRequest object. The properties read are as follows:

// Topic to which the message belongs
private final String topic;
// Queue ID to which the message belongs
private final int queueId;
// The offset of the message in the CommitLog file
private final long commitLogOffset;
// Message size
private int msgSize;
// Message Tag hash code
private final long tagsCode;
// Message save time
private final long storeTimestamp;
// Logical offset of the message in the consume queue
private final long consumeQueueOffset;
private final String keys;
private final boolean success;
// Message unique key
private final String uniqKey;
// Message system flag
private final int sysFlag;
// Transaction message offset
private final long preparedTransactionOffset;
// Message properties
private final Map<String, String> propertiesMap;

3. With the DispatchRequest object, the next step is to call the doDispatch method to distribute the request. At this point, CommitLogDispatcherBuildConsumeQueue is triggered, and it forwards the request to the DefaultMessageStore for execution.

DefaultMessageStore.this.putMessagePositionInfo(request);

4. The MessageStore locates the ConsumeQueue file according to the message Topic and QueueID, and then appends the index entry to the file.

public void putMessagePositionInfo(DispatchRequest dispatchRequest) {
    // Navigate to the ConsumeQueue file according to the Topic and QueueID
    ConsumeQueue cq = this.findConsumeQueue(dispatchRequest.getTopic(), dispatchRequest.getQueueId());
    // Append index to file
    cq.putMessagePositionInfoWrapper(dispatchRequest);
}

Before writing the index entry, it first checks that the message store is writable:

boolean canWrite = this.defaultMessageStore.getRunningFlags().isCQWriteable();

Then, initialize a ByteBuffer with a capacity of 20 bytes, and write the message Offset, size and tagsCode into it in turn.

// Each index entry is 20 bytes long, and byteBufferIndex is reused for every entry
this.byteBufferIndex.flip();
this.byteBufferIndex.limit(CQ_STORE_UNIT_SIZE);
/**
 * Index structure: offset (8 bytes) + size (4 bytes) + tagsCode (8 bytes)
 */
this.byteBufferIndex.putLong(offset);
this.byteBufferIndex.putInt(size);
this.byteBufferIndex.putLong(tagsCode);
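
As a self-contained illustration of the 20-byte layout, the following sketch encodes one entry and decodes it again. It allocates a fresh buffer each time for simplicity, whereas the code above reuses one buffer; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Round trip of one 20-byte index entry; names are made up for this sketch.
class CqEntrySketch {
    static final int CQ_STORE_UNIT_SIZE = 20;

    // Layout: 8-byte CommitLog offset, 4-byte message size, 8-byte tag hash code.
    static byte[] encode(long commitLogOffset, int size, long tagsCode) {
        ByteBuffer buffer = ByteBuffer.allocate(CQ_STORE_UNIT_SIZE);
        buffer.putLong(commitLogOffset);
        buffer.putInt(size);
        buffer.putLong(tagsCode);
        return buffer.array();
    }

    // Returns {commitLogOffset, size, tagsCode}, read in the same order they were written.
    static long[] decode(byte[] entry) {
        ByteBuffer buffer = ByteBuffer.wrap(entry);
        long commitLogOffset = buffer.getLong();
        int size = buffer.getInt();
        long tagsCode = buffer.getLong();
        return new long[] {commitLogOffset, size, tagsCode};
    }

    public static void main(String[] args) {
        byte[] entry = encode(1024L, 200, "TagA".hashCode());
        long[] fields = decode(entry);
        System.out.println(entry.length + " bytes, offset=" + fields[0] + ", size=" + fields[1]);
    }
}
```

Because every entry has the same fixed length, the n-th entry of a queue always starts at byte n * 20.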

Next, compute the logical position at which the index entry should be written from the consume queue offset and the length of a single entry. Because writes are sequential, fetch the latest ConsumeQueue file; if that file is full, a new one is created and writing continues there.

final long expectLogicOffset = cqOffset * CQ_STORE_UNIT_SIZE;
MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile(expectLogicOffset);
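
A quick worked example of this calculation follows. It assumes each ConsumeQueue file holds 300,000 entries (6,000,000 bytes); that figure is an assumption of the sketch, as the file size is configurable, and the helper names are hypothetical.

```java
// Maps a consume queue offset (entry index) to a byte position; sizes are assumptions.
class CqLocationSketch {
    static final int CQ_STORE_UNIT_SIZE = 20;
    static final int ENTRIES_PER_FILE = 300_000; // assumed entries per ConsumeQueue file
    static final long FILE_SIZE = (long) ENTRIES_PER_FILE * CQ_STORE_UNIT_SIZE;

    // Logical byte offset across all files of this queue.
    static long expectLogicOffset(long cqOffset) {
        return cqOffset * CQ_STORE_UNIT_SIZE;
    }

    // Starting offset of the file that should receive the entry.
    static long fileFromOffset(long cqOffset) {
        long logicOffset = expectLogicOffset(cqOffset);
        return logicOffset - (logicOffset % FILE_SIZE);
    }

    // Write position inside that file.
    static long positionInFile(long cqOffset) {
        return expectLogicOffset(cqOffset) % FILE_SIZE;
    }

    public static void main(String[] args) {
        long cqOffset = 300_001;
        System.out.println(expectLogicOffset(cqOffset) + " -> file " + fileFromOffset(cqOffset)
            + ", pos " + positionInFile(cqOffset));
    }
}
```

Under these assumptions, entry 300,001 is the second entry of the second file.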

Before writing, check whether the expected offset equals the current logical offset of the file. Normally they are equal. If the expected offset is smaller, the entry has already been built and the write is skipped. If they differ in any other way, the order of the queue may be wrong, and a warning is logged.

if (cqOffset != 0) {
    // Offset: write pointer position of current file + file start offset (file name)
    long currentLogicOffset = mappedFile.getWrotePosition() + mappedFile.getFileFromOffset();

    // Under normal circumstances, expectLogicOffset and currentLogicOffset should be equal
    if (expectLogicOffset < currentLogicOffset) {
        log.warn("Build  consume queue repeatedly, expectLogicOffset: {} currentLogicOffset: {} Topic: {} QID: {} Diff: {}",
                 expectLogicOffset, currentLogicOffset, this.topic, this.queueId, expectLogicOffset - currentLogicOffset);
        return true;
    }
    if (expectLogicOffset != currentLogicOffset) {
        LOG_ERROR.warn(
            "[BUG]logic queue order maybe wrong, expectLogicOffset: {} currentLogicOffset: {} Topic: {} QID: {} Diff: {}",
            expectLogicOffset,
            currentLogicOffset,
            this.topic,
            this.queueId,
            expectLogicOffset - currentLogicOffset
        );
    }
}
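
The three possible outcomes of this check can be summarized in a tiny decision function. This sketch only mirrors the branches of the excerpt above; the enum Outcome and the method check are hypothetical names, not part of RocketMQ.

```java
// Classifies the offset check into its three outcomes; names are hypothetical.
class CqOffsetCheckSketch {
    static final int CQ_STORE_UNIT_SIZE = 20;

    enum Outcome { APPEND, SKIP_DUPLICATE, APPEND_WITH_WARNING }

    static Outcome check(long cqOffset, long currentLogicOffset) {
        long expectLogicOffset = cqOffset * CQ_STORE_UNIT_SIZE;
        if (cqOffset != 0) {
            if (expectLogicOffset < currentLogicOffset) {
                return Outcome.SKIP_DUPLICATE;      // entry was already built
            }
            if (expectLogicOffset != currentLogicOffset) {
                return Outcome.APPEND_WITH_WARNING; // gap: order may be wrong
            }
        }
        return Outcome.APPEND;                      // offsets match, or first entry
    }

    public static void main(String[] args) {
        System.out.println(check(5, 100)); // expected 100, current 100
        System.out.println(check(5, 120)); // expected 100, current 120
        System.out.println(check(5, 80));  // expected 100, current 80
    }
}
```

Note that the first entry of a queue (cqOffset equal to 0) bypasses the comparison entirely.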

After the check passes, the entry can be written. First update maxPhysicOffset, the maximum CommitLog offset covered by the current ConsumeQueue, and then write the 20 bytes of index data to the file.

// Update the maximum CommitLog offset of the messages recorded by the current ConsumeQueue
this.maxPhysicOffset = offset + size;
// Writes 20 bytes of index data to a file
return mappedFile.appendMessage(this.byteBufferIndex.array());

At this point, the synchronization of the messages in the CommitLog to the indexes in the ConsumeQueue file is completed.

ConsumeQueue index entry structure:

Length (bytes)    Description
8                 Offset of the message in the CommitLog file
4                 Message length
8                 Hash code of the message Tag, used to filter messages by Tag
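
To show how such an entry might be used on the read side, here is a rough sketch of looking up an entry by its consume queue offset and filtering it by Tag. It assumes the tag hash is the value of String.hashCode() for the tag; that assumption, along with the class and method names, is mine. A hash match can also be a collision, so a real consumer would still need to compare the actual tag.

```java
import java.nio.ByteBuffer;

// Read-side sketch: fixed-size entries make lookup a simple multiplication.
// The tag hash function and all names here are assumptions of this sketch.
class CqLookupSketch {
    static final int CQ_STORE_UNIT_SIZE = 20;

    // Appends one entry at the buffer's current position.
    static void put(ByteBuffer queue, long commitLogOffset, int size, String tag) {
        queue.putLong(commitLogOffset).putInt(size).putLong(tag.hashCode());
    }

    // Returns {commitLogOffset, size} if the entry's tag hash matches, otherwise null.
    static long[] lookup(ByteBuffer queue, long cqOffset, String subscribedTag) {
        int pos = (int) (cqOffset * CQ_STORE_UNIT_SIZE);
        long commitLogOffset = queue.getLong(pos);  // bytes 0..7
        int size = queue.getInt(pos + 8);           // bytes 8..11
        long tagsCode = queue.getLong(pos + 12);    // bytes 12..19
        if (tagsCode != subscribedTag.hashCode()) {
            return null; // filtered out without touching the CommitLog
        }
        return new long[] {commitLogOffset, size};
    }

    public static void main(String[] args) {
        ByteBuffer queue = ByteBuffer.allocate(2 * CQ_STORE_UNIT_SIZE);
        put(queue, 0L, 100, "TagA");
        put(queue, 100L, 150, "TagB");
        long[] hit = lookup(queue, 1, "TagB");
        System.out.println("read " + hit[1] + " bytes at CommitLog offset " + hit[0]);
    }
}
```

The entry for a non-matching tag is rejected using only the 20 bytes of the index, which is what makes the filtering cheap.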

4. Summary

ConsumeQueue is an index file that RocketMQ uses to speed up consumption. It is a logical consume queue: it stores not the message itself but an index entry for it. Each entry is 20 bytes long and records the offset of the message in the CommitLog file, the message length, and the hash value of the message Tag. When consuming, the Consumer can quickly filter messages by the Tag hash value, locate a message by its offset, and then read the complete message according to the message length.

After the Broker writes a message to the CommitLog, it does not write the ConsumeQueue immediately. Instead, an asynchronous thread, ReputMessageService, replays the message. During the replay, CommitLogDispatcherBuildConsumeQueue builds the index entry for the message into the ConsumeQueue file. The build runs once every millisecond, which is almost real time, so there is no need to worry about consumption delay.

Posted by russy on Tue, 07 Sep 2021 13:10:43 -0700