How does RocketMQ Consumer get and maintain consumption progress?

Background

The Consumer's message consumption process is relatively complex, and the following modules are the most important: maintenance of consumption progress, retrieval of messages, message filtering, load balancing, message processing, and sending failed messages back to the Broker. Due to space constraints, this article mainly introduces how the Consumer obtains and maintains its consumption progress. Since the above steps are closely connected, the discussion will occasionally touch on the others.

The consumption progress file

As mentioned in our previous article, How does RocketMQ build the ConsumeQueue?, a service thread asynchronously reads written messages from the CommitLog and saves key information such as the message position, size and tagsCode into the corresponding ConsumeQueue. The ConsumeQueue is an index file that stores, for each queue of a topic, the positions of its messages in the CommitLog. When we consume messages, we first read the message position from the ConsumeQueue and then fetch the message from the CommitLog, a typical space-for-time trade-off. At the same time, we need to record how far we have read, so that the next read starts where the last one ended. Therefore a file is needed to store the consumption progress. In RocketMQ this file is consumerOffset.json, located in the store/config directory.

The content in consumerOffset.json is similar to this:

test_url@sub_localtest is the primary key; the rule is topic@consumerGroup, and the value is the consumption progress of each ConsumeQueue. For example, Queue 0 should consume offset 2599 next, Queue 1 should consume 2602 next, and Queue 6 should consume 102 next. This progress increases by 1 per message, each unit corresponding to a fixed 20-byte ConsumeQueue entry.
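Putting that together, the file looks roughly like this (a reconstruction based on the values above; the real file is written by the Broker and may contain more queues and entries):

{
    "offsetTable":{
        "test_url@sub_localtest":{
            0:2599, 1:2602, 6:102
        }
    }
}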
Now let's do an experiment. First, we send a message through a Producer (the code is very simple, so we won't paste it). The sending result is as follows:

SendResult [sendStatus=SEND_OK, msgId=C0A84D05091058644D4664E1427C0000, offsetMsgId=C0A84D0000002A9F00000000001ED9BA, messageQueue=MessageQueue [topic=test_url, brokerName=broker-a, queueId=6], queueOffset=102]

As you can see from the result, the message is saved at offset 102 in MessageQueue 6 of broker-a. Then we start the Consumer and consume this message:

2020-01-20 14:08:04.230 INFO  [MessageThread_3] c.c.r.example.simplest.Consumer - Receive message:topic=test_url,msgId=C0A84D05091058644D4664E1427C0000

The message has been consumed successfully. Now let's look at the contents of the consumerOffset.json file again:

Compared with the earlier content, the consumption progress of Queue 6 has changed from 102 to 103: one message has been successfully consumed.

Now let's dig deeper: how is the consumption progress obtained and maintained while consuming messages?

Where does consumption start?

The Consumer is started by calling

consumer.start();

During startup, the consumption progress store is initialized and loaded, as follows:

// DefaultMQPushConsumerImpl.start() method
if (this.defaultMQPushConsumer.getOffsetStore() != null) {
    this.offsetStore = this.defaultMQPushConsumer.getOffsetStore(); // Use the user-supplied OffsetStore directly
} else {
    switch (this.defaultMQPushConsumer.getMessageModel()) {
        case BROADCASTING:
            this.offsetStore = new LocalFileOffsetStore(this.mQClientFactory, this.defaultMQPushConsumer.getConsumerGroup()); // Broadcasting mode: progress is kept in a local file
            break;
        case CLUSTERING:
            this.offsetStore = new RemoteBrokerOffsetStore(this.mQClientFactory, this.defaultMQPushConsumer.getConsumerGroup()); // Cluster mode: a RemoteBrokerOffsetStore is created, and progress is kept on the Broker side
            break;
        default:
            break;
    }
    this.defaultMQPushConsumer.setOffsetStore(this.offsetStore);
}
this.offsetStore.load(); // Load the local progress (a no-op for RemoteBrokerOffsetStore)

The offsetStore object mentioned above is an interface, as follows:

public interface OffsetStore {
    void load() throws MQClientException; // Load the consumption progress (from a local file in broadcasting mode)
    void updateOffset(final MessageQueue mq, final long offset, final boolean increaseOnly); // Update the in-memory progress
    long readOffset(final MessageQueue mq, final ReadOffsetType type); // Read the progress (from memory or from the Broker)
    void persistAll(final Set<MessageQueue> mqs); // Persist the progress of all queues
    void persist(final MessageQueue mq); // Persist the progress of one queue
    void removeOffset(MessageQueue mq); // Remove the progress of a queue
    Map<MessageQueue, Long> cloneOffsetTable(String topic); // Clone the offset table of a topic
    void updateConsumeOffsetToBroker(MessageQueue mq, long offset, boolean isOneway) throws RemotingException,
        MQBrokerException, InterruptedException, MQClientException; // Report the progress to the Broker
}

OffsetStore provides the methods for operating on consumption progress, such as loading, reading and updating it. In cluster consumption mode the progress is not persisted on the Consumer side but on the remote Broker side, which is what the RemoteBrokerOffsetStore used above is for:

public class RemoteBrokerOffsetStore implements OffsetStore {
    private final static InternalLogger log = ClientLogger.getLog();
    private final MQClientInstance mQClientFactory; // Client instance
    private final String groupName; // Consumer group name
    private ConcurrentMap<MessageQueue, AtomicLong> offsetTable =
        new ConcurrentHashMap<MessageQueue, AtomicLong>(); // Consumption progress of each Queue
}

Because the consumption progress is stored on the Broker side, the load method of RemoteBrokerOffsetStore, which would otherwise load local progress, is empty and does nothing. The actual reading of the consumption progress happens in the readOffset method.
So far we know that the consumption progress lives in a consumerOffset.json file on the Broker side, and that the Consumer learns where to start consuming by querying it through the readOffset method.

Brief introduction to load balancing

Before taking a deeper look at the readOffset method that reads the consumption progress, we need to know when it is called.
At the beginning of this article we listed the modules involved in consuming messages, one of which is load balancing. A topic has multiple consumption queues, and several Consumers in the same group may subscribe to the topic, so these queues must be allocated among the Consumers of the group according to some strategy, for example the average allocation below:

In the figure above, the topic has 5 queues and 2 Consumers have subscribed. With average allocation, Consumer1 consumes three queues and Consumer2 consumes two. This is load balancing.
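To make the average strategy concrete, here is a minimal, self-contained sketch of the idea (this is only an illustration of average allocation, not the actual AllocateMessageQueueAveragely source):

import java.util.ArrayList;
import java.util.List;

public class AverageAllocationDemo {
    // Split queue indexes evenly among consumers; the first consumers get one extra queue when it doesn't divide evenly
    static List<Integer> allocate(int queueCount, int consumerCount, int consumerIndex) {
        List<Integer> result = new ArrayList<>();
        int base = queueCount / consumerCount;      // every consumer gets at least this many
        int remainder = queueCount % consumerCount; // the first 'remainder' consumers get one more
        int size = base + (consumerIndex < remainder ? 1 : 0);
        int start = consumerIndex * base + Math.min(consumerIndex, remainder);
        for (int i = 0; i < size; i++) {
            result.add(start + i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Consumer1 gets queues " + allocate(5, 2, 0)); // [0, 1, 2]
        System.out.println("Consumer2 gets queues " + allocate(5, 2, 1)); // [3, 4]
    }
}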
When a Consumer is assigned several message queues, a message processing queue (ProcessQueue, used while consuming) is created for each of them, and a pull request (PullRequest, used while requesting messages) is generated. This request does not go to the Broker immediately; it is put into a blocking queue, from which a dedicated pull-message service thread later takes it, assembles the real request for messages, and sends it to the Broker (asynchronously, of course). The next consumption offset is temporarily kept in this PullRequest, as shown below:
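A simplified sketch of that step, paraphrasing RebalanceImpl.updateProcessQueueTableInRebalance (details trimmed; names may differ slightly between versions):

// Simplified from RebalanceImpl.updateProcessQueueTableInRebalance
List<PullRequest> pullRequestList = new ArrayList<PullRequest>();
for (MessageQueue mq : mqSet) {
    if (!this.processQueueTable.containsKey(mq)) {
        ProcessQueue pq = new ProcessQueue();            // Message processing queue for this MessageQueue
        long nextOffset = this.computePullFromWhere(mq); // Where to start consuming (see below)
        if (nextOffset >= 0 && this.processQueueTable.putIfAbsent(mq, pq) == null) {
            PullRequest pullRequest = new PullRequest();
            pullRequest.setConsumerGroup(consumerGroup);
            pullRequest.setNextOffset(nextOffset);       // The next consumption offset is kept here
            pullRequest.setMessageQueue(mq);
            pullRequest.setProcessQueue(pq);
            pullRequestList.add(pullRequest);
        }
    }
}
this.dispatchPullRequest(pullRequestList);               // Ends up in PullMessageService's blocking queue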

A PullRequest is only created here; afterwards, every pull for this queue under the consumerGroup reuses the same PullRequest object, only updating nextOffset and a few other fields. The PullRequest class is as follows:

public class PullRequest {
    private String consumerGroup; // Consumer group
    private MessageQueue messageQueue; // Message queue to pull from
    private ProcessQueue processQueue; // Message processing queue
    private long nextOffset; // Next consumption offset
    private boolean lockedFirst = false; // Whether the queue has been locked
}

Calculate consumption progress

When a PullRequest is generated, where to start consuming is calculated (computePullFromWhere). RocketMQ's default starting point is CONSUME_FROM_LAST_OFFSET, so the computePullFromWhere method goes into this case:

case CONSUME_FROM_LAST_OFFSET: {
    long lastOffset = offsetStore.readOffset(mq, ReadOffsetType.READ_FROM_STORE); // Read the offset from the Broker
    if (lastOffset >= 0) { // A normal offset was found
        result = lastOffset;
    }
    // First start, no offset yet
    else if (-1 == lastOffset) { // First start: the Broker has no offset, readOffset returns -1
        if (mq.getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
            result = 0L; // Retry topic: start from 0
        } else {
            try {
                result = this.mQClientFactory.getMQAdminImpl().maxOffset(mq); // Start from the max offset of the message queue
            } catch (MQClientException e) {
                result = -1;
            }
        }
    } else {
        result = -1; // On exception, readOffset returns -2, and -1 is returned here
    }
    break;
}

This is the readOffset method we mentioned above!
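Incidentally, the starting point can be configured on the consumer before start(); a minimal example (the group name here is just the one from our experiment, used as a placeholder):

DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("sub_localtest");
consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_LAST_OFFSET); // The default; CONSUME_FROM_FIRST_OFFSET and CONSUME_FROM_TIMESTAMP are the alternatives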

readOffset

The input parameters of the readOffset method are the currently assigned MessageQueue and a fixed ReadOffsetType.READ_FROM_STORE, which means reading the consumption progress of the MessageQueue from the remote Broker. We therefore go into the READ_FROM_STORE branch, as follows:

case READ_FROM_STORE: {
    try {
        long brokerOffset = this.fetchConsumeOffsetFromBroker(mq); // Get the consumption progress from the Broker
        AtomicLong offset = new AtomicLong(brokerOffset);
        this.updateOffset(mq, offset.get(), false); // Update the local cache, i.e. RemoteBrokerOffsetStore.offsetTable
        return brokerOffset;
    }
    // No offset on the Broker
    catch (MQBrokerException e) {
        return -1;
    }
    // Other exceptions
    catch (Exception e) {
        log.warn("fetchConsumeOffsetFromBroker exception, " + mq, e);
        return -2;
    }
}

fetchConsumeOffsetFromBroker sends a request to the Broker where the MessageQueue lives to fetch the consumption progress. The earlier article on the underlying communication already covered this, so we won't go into detail here. On the Broker side, the consumption progress is kept in the ConsumerOffsetManager:
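Its core is an in-memory offsetTable; the field declaration looks roughly like this (paraphrased from the Broker source, so details may vary by version):

// ConsumerOffsetManager (Broker side)
private ConcurrentMap<String /* topic@group */, ConcurrentMap<Integer /* queueId */, Long /* offset */>> offsetTable =
    new ConcurrentHashMap<String, ConcurrentMap<Integer, Long>>(512);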

The key is topic@consumerGroup, and the value stores the mapping between queueId and offset. The code that looks up the consumption progress is as follows:

long offset =
            this.brokerController.getConsumerOffsetManager().queryOffset(
                requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueId());
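queryOffset itself is a simple lookup in the offsetTable above; a simplified sketch (paraphrased, not the exact Broker source):

// ConsumerOffsetManager.queryOffset, simplified
public long queryOffset(final String group, final String topic, final int queueId) {
    String key = topic + "@" + group; // Same topic@consumerGroup key as in consumerOffset.json
    ConcurrentMap<Integer, Long> map = this.offsetTable.get(key);
    if (map != null) {
        Long offset = map.get(queueId);
        if (offset != null) {
            return offset;
        }
    }
    return -1; // No offset recorded yet for this group and queue
}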

From the key we can see that the consumption progress of a topic is separated per ConsumerGroup: the progress of the same MessageQueue under different ConsumerGroups does not affect each other, which is easy to understand.

Pulling messages and updating the consumption progress

So far we know that the consumption progress of each MessageQueue has a counterpart on the Broker side. When the rebalance service balances the queues of each topic, it creates a PullRequest and reads the consumption offset for it, then puts the PullRequest into a PullRequest queue from which the dedicated pull-message service takes it and initiates the real pull request. Updating the consumption progress is closely tied to fetching messages, so this part also touches on how messages are fetched.

PullMessageService is a service thread dedicated to pulling messages. Its run method is as follows:

@Override
public void run() {
    log.info(this.getServiceName() + " service started");

    while (!this.isStopped()) {
        try {
            PullRequest pullRequest = this.pullRequestQueue.take(); // Take a PullRequest from the blocking queue
            this.pullMessage(pullRequest); // Pull messages for it
        } catch (InterruptedException ignored) {
        } catch (Exception e) {
            log.error("Pull Message Service Run Method exception", e);
        }
    }

    log.info(this.getServiceName() + " service end");
}
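The PullRequest gets back into this blocking queue through methods like the one below (a simplified sketch of PullMessageService.executePullRequestImmediately):

public void executePullRequestImmediately(final PullRequest pullRequest) {
    try {
        this.pullRequestQueue.put(pullRequest); // The same PullRequest object is re-queued after each pull
    } catch (InterruptedException e) {
        log.error("executePullRequestImmediately pullRequestQueue.put", e);
    }
}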

It first takes a PullRequest from the blocking queue, then calls the pullMessage method to fetch messages. The call eventually reaches the pullMessage method of DefaultMQPushConsumerImpl. There is a lot of code there and how messages are pulled is not the focus this time, so we only show the final call that sends the pull request, to understand its parameters and what a pull request contains.

try {
    this.pullAPIWrapper.pullKernelImpl(
        pullRequest.getMessageQueue(), // Message queue
        subExpression, // Subscription expression, such as "tagA"
        subscriptionData.getExpressionType(), // Expression type, such as "TAG"
        subscriptionData.getSubVersion(), // Version number
        pullRequest.getNextOffset(), // Next consumption offset
        this.defaultMQPushConsumer.getPullBatchSize(), // How many messages to pull at a time, 32 by default
        sysFlag, // A set of flag bits, not our concern here
        commitOffsetValue, // Consumption offset committed to the Broker along with this pull
        BROKER_SUSPEND_MAX_TIME_MILLIS, // Hold time for long polling, 15 seconds by default
        CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND, // Call timeout, 30 seconds by default
        CommunicationMode.ASYNC, // Asynchronous communication
        pullCallback // Callback
    );
} catch (Exception e) {
    log.error("pullKernelImpl exception", e);
    this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_EXCEPTION);
}

Consumer-side update of the consumption progress

Since the pull request is sent asynchronously and the response is handled in the pullCallback, we can safely guess that the Consumer-side update of the consumption progress also happens there. Indeed, the nextOffset of the PullRequest is updated in the pullCallback:
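A simplified sketch of that callback (paraphrasing the PullCallback inside DefaultMQPushConsumerImpl.pullMessage; only the part relevant to the offset is shown):

// Inside the PullCallback (simplified)
public void onSuccess(PullResult pullResult) {
    if (pullResult != null) {
        switch (pullResult.getPullStatus()) {
            case FOUND:
                pullRequest.setNextOffset(pullResult.getNextBeginOffset()); // Advance to the nextBeginOffset returned by the Broker
                // ... hand the messages to the consume service, then re-queue the PullRequest for the next pull
                break;
            default:
                break;
        }
    }
}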

Broker-side update of the consumption progress

Since message lookup is not the focus this time, we will not dig into how the Broker side fetches the messages. We just need to know that after the Broker fetches them, it calculates a nextBeginOffset, which is the next consumption offset, and returns it to the Consumer side so the Consumer can update its progress, as shown below:

nextBeginOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE); // i is the number of ConsumeQueue bytes traversed; each entry is CQ_STORE_UNIT_SIZE (20) bytes

Besides that, while processing the pull request the Broker also stores the consumption offset committed along with it, as follows:

boolean storeOffsetEnable = brokerAllowSuspend; // brokerAllowSuspend is true by default; if there are no messages the request is held (long polling)
storeOffsetEnable = storeOffsetEnable && hasCommitOffsetFlag; // Only if the pull request carries the commit-offset flag
storeOffsetEnable = storeOffsetEnable
    && this.brokerController.getMessageStoreConfig().getBrokerRole() != BrokerRole.SLAVE; // Only the master Broker stores the offset
if (storeOffsetEnable) {
    this.brokerController.getConsumerOffsetManager().commitOffset(RemotingHelper.parseChannelRemoteAddr(channel),
        requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueId(), requestHeader.getCommitOffset()); // Save the consumption progress into offsetTable
}
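commitOffset simply writes the new offset into the offsetTable shown earlier; a simplified sketch (paraphrased from ConsumerOffsetManager, details may differ by version):

// ConsumerOffsetManager.commitOffset, simplified
private void commitOffset(final String clientHost, final String key, final int queueId, final long offset) {
    ConcurrentMap<Integer, Long> map = this.offsetTable.get(key); // key is topic@consumerGroup
    if (map == null) {
        map = new ConcurrentHashMap<Integer, Long>(32);
        map.put(queueId, offset);
        this.offsetTable.put(key, map);
    } else {
        Long storeOffset = map.put(queueId, offset);
        if (storeOffset != null && offset < storeOffset) {
            log.warn("committed offset {} is smaller than the stored offset {}, clientHost={}, key={}, queueId={}",
                offset, storeOffset, clientHost, key, queueId);
        }
    }
}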

Consumption progress persistence

When the Broker updates the consumption progress it only updates the in-memory offsetTable, not the consumerOffset.json file. During Broker initialization, a scheduled task is started that periodically persists the offsetTable to the consumerOffset.json file, as shown below:

this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
    @Override
    public void run() {
        try {
            BrokerController.this.consumerOffsetManager.persist(); // Persist the offsetTable to the file
        } catch (Throwable e) {
            log.error("schedule persist consumerOffset error.", e);
        }
    }
}, 1000 * 10, this.brokerConfig.getFlushConsumerOffsetInterval(), TimeUnit.MILLISECONDS);

flushConsumerOffsetInterval defaults to 5000 ms, which means the consumption progress is saved to the file every 5 seconds. When persisting, the original file is first backed up to consumerOffset.json.bak, and the new content is then written to consumerOffset.json, conceptually like the sketch below.
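An illustrative sketch of that back-up-then-overwrite pattern (this is only a demonstration of the idea, not the actual RocketMQ persistence code):

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class OffsetFilePersistDemo {
    public static void persist(Path file, String newJson) throws Exception {
        if (file.getParent() != null) {
            Files.createDirectories(file.getParent());
        }
        if (Files.exists(file)) {
            // Keep the previous content as consumerOffset.json.bak
            Files.copy(file, file.resolveSibling(file.getFileName() + ".bak"), StandardCopyOption.REPLACE_EXISTING);
        }
        // Write the new offset table content to consumerOffset.json
        Files.write(file, newJson.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        persist(Paths.get("store/config/consumerOffset.json"), "{\"offsetTable\":{}}");
    }
}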
At this point, the maintenance of the consumption progress is complete.

Summary

For each topic in RocketMQ, the consumption progress of every MessageQueue under each ConsumerGroup lives in a consumerOffset.json file on the Broker side. When the Consumer starts and rebalances, a PullRequest is created for each assigned queue, and a request is sent to the Broker to read the next consumption offset, which the Broker returns to the Consumer. A dedicated service thread then takes the PullRequest and pulls messages accordingly. After serving a pull request, the Broker updates the consumption progress in time. In addition, a separate scheduled task regularly saves the consumption progress to the file, backing up the previous version first.


Posted by zeroecko on Mon, 20 Jan 2020 20:44:12 -0800