Discussion on RocketMQ message consumption and rebalancing

Keywords: Java Apache kafka

The best way to learn is to discuss things with other people. Recently I talked through some questions about RocketMQ message pulling and rebalancing with other readers, and this post is my summary.

How messages are pulled in a loop in push mode

A previous article on rebalancing, "Kafka rebalancing mechanism", mentioned that RocketMQ rebalances every 20s: the consumer fetches the consumer IDs and subscription information of its consumer group from any Broker node, allocates queues based on that subscription information, wraps each allocation in a pullRequest object, and puts it into the pullRequestQueue blocking queue. The pull thread then wakes up and performs the pull task. [Figure: flow chart of the rebalancing and pull process]

But that article left out some details. For example, does a consumer have to wait 20s before every pull? A reader asked me exactly this question.

Evidently his project consumes messages in push mode. To answer the question, we need to start with how RocketMQ pulls messages:

RocketMQ's push mode is implemented on top of pull mode. It only adds a wrapper layer around pull, so it is not a true "push mode". In push mode, once a consumer has pulled messages it starts the next pull task immediately; it does not wait 20s for the next rebalance before pulling again. To see how push mode is implemented, let's look for the answer in the source code.

An earlier article of mine, "Why does RocketMQ ensure the consistency of subscription relationship?", explained that the pull thread takes PullRequest objects from the pullRequestQueue blocking queue and pulls messages for them. But how does a PullRequest get into the pullRequestQueue blocking queue?

RocketMQ provides the following method:

org.apache.rocketmq.client.impl.consumer.PullMessageService#executePullRequestImmediately:

public void executePullRequestImmediately(final PullRequest pullRequest) {
  try {
    this.pullRequestQueue.put(pullRequest);
  } catch (InterruptedException e) {
    log.error("executePullRequestImmediately pullRequestQueue.put", e);
  }
}

The call chain shows that, besides being called during rebalancing, this method is also called in push mode by the onSuccess method of the PullCallback callback object after a pull returns:

org.apache.rocketmq.client.consumer.PullCallback#onSuccess:

case FOUND:

// If the pull message is empty, continue to put pullRequest into the blocking queue
if (pullResult.getMsgFoundList() == null || pullResult.getMsgFoundList().isEmpty()) {
  DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
} else {
  // Put the message into the consumer thread to execute
  DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(//
    pullResult.getMsgFoundList(), //
    processQueue, //
    pullRequest.getMessageQueue(), //
    dispathToConsume);
  // Put pullRequest in the blocking queue
  DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);  
}

After a pull returns from the broker, if every pulled message was filtered out (the message list is empty), the pullRequest is put straight back into the blocking queue so that the pull task runs again. Otherwise, the messages are submitted to the consumer thread pool for execution, and the pullRequest is then put back into the blocking queue.

case NO_NEW_MESSAGE:

case NO_MATCHED_MSG:

pullRequest.setNextOffset(pullResult.getNextBeginOffset());
DefaultMQPushConsumerImpl.this.correctTagsOffset(pullRequest);
DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);

If the broker has no new messages, or none of its messages match, the pullRequest is likewise put back into the blocking queue to continue the pull task.

The logic above shows that as soon as a pull result has been handled, the pullRequest is put back into the blocking queue immediately. This explains why push mode can pull messages continuously:

In push mode, once the pulled messages have been handed to the consumer thread pool, this method puts the PullRequest object back into the pullRequestQueue blocking queue, so messages are pulled from the broker continuously and the result looks like a push.
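The loop described above can be sketched in a few lines. This is a minimal single-threaded model, not RocketMQ's code: the class PullLoopSketch, the Request type, and the runRounds method are hypothetical names, and the broker call is replaced by simply advancing an offset.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PullLoopSketch {

    // Hypothetical stand-in for PullRequest: it only tracks the next offset.
    static final class Request {
        long nextOffset;
    }

    // Simulates the pull thread for a fixed number of rounds and
    // returns the offset reached at the end.
    static long runRounds(int rounds, int batchSize) {
        BlockingQueue<Request> pullRequestQueue = new LinkedBlockingQueue<>();
        try {
            // Rebalancing enqueues the first request for a newly assigned queue.
            pullRequestQueue.put(new Request());

            for (int i = 0; i < rounds; i++) {
                // The pull thread blocks here until a request is available.
                Request request = pullRequestQueue.take();
                // Pretend a batch was pulled; the real client calls the broker.
                request.nextOffset += batchSize;
                // The callback re-enqueues the same request right away, so the
                // next take() returns without waiting for another rebalance.
                pullRequestQueue.put(request);
            }
            return pullRequestQueue.take().nextOffset;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runRounds(5, 32)); // prints 160
    }
}
```

The only thing the first request needs from rebalancing is to be enqueued once. After that, the request keeps itself alive by being re-enqueued on every round.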

What happens to a queue that rebalancing has assigned to another consumer?

Consider another problem. If, after rebalancing, a queue has been assigned to a different consumer, what should happen? The current consumer cannot keep pulling messages from that queue, right?

After rebalancing, RocketMQ checks whether each cached queue is still in the newly allocated list. If not, the queue is marked as dropped. Before pulling, isDropped() is called on the pullRequest's ProcessQueue to find out whether it has been dropped:

org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl#pullMessage:

final ProcessQueue processQueue = pullRequest.getProcessQueue();
if (processQueue.isDropped()) {
  log.info("the pull request[{}] is dropped.", pullRequest.toString());
  return;
}

Before pulling, the consumer first checks whether the queue has been dropped. If it has, the pull task is abandoned immediately.

When is the queue dropped?

org.apache.rocketmq.client.impl.consumer.RebalanceImpl#updateProcessQueueTableInRebalance:

Iterator<Entry<MessageQueue, ProcessQueue>> it = this.processQueueTable.entrySet().iterator();
while (it.hasNext()) {
  Entry<MessageQueue, ProcessQueue> next = it.next();
  MessageQueue mq = next.getKey();
  ProcessQueue pq = next.getValue();

  if (mq.getTopic().equals(topic)) {
    // Determine whether the current cached MessageQueue is included in the latest mqSet. If not, discard the queue
    if (!mqSet.contains(mq)) {
      pq.setDropped(true);
      if (this.removeUnnecessaryMessageQueue(mq, pq)) {
        it.remove();
        changed = true;
        log.info("doRebalance, {}, remove unnecessary mq, {}", consumerGroup, mq);
      }
    } else if (pq.isPullExpired()) {
      // Discard if queue pull expires
      switch (this.consumeType()) {
        case CONSUME_ACTIVELY:
          break;
        case CONSUME_PASSIVELY:
          pq.setDropped(true);
          if (this.removeUnnecessaryMessageQueue(mq, pq)) {
            it.remove();
            changed = true;
            log.error("[BUG]doRebalance, {}, remove unnecessary mq, {}, because pull is pause, so try to fixed it",
                      consumerGroup, mq);
          }
          break;
        default:
          break;
      }
    }
  }
}

The updateProcessQueueTableInRebalance method runs during rebalancing to update processQueueTable, the current consumer's cache of assigned queues. The logic above checks whether each cached MessageQueue is still contained in the latest mqSet. If it is not, this rebalance has assigned the queue to another consumer. If the queue is still assigned but the pull has expired because too much time has passed since the last pull, the queue is also removed in push mode (CONSUME_PASSIVELY). In both cases setDropped(true) is called to mark the queue as dropped.

You may ask how processQueueTable relates to the processQueue in a pullRequest. Read on:
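The first branch of that logic can be reduced to a small sketch. It is only an illustration under simplifying assumptions: RebalanceDropSketch, QueueStub, and update are hypothetical names, queue names are plain strings, and the pull-expiry branch and removeUnnecessaryMessageQueue are left out.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class RebalanceDropSketch {

    // Hypothetical stand-in for ProcessQueue: only the dropped flag is modelled.
    static final class QueueStub {
        volatile boolean dropped;
    }

    // Marks and removes every cached queue that is missing from the new
    // assignment. Returns true if the table changed.
    static boolean update(Map<String, QueueStub> table, Set<String> assigned) {
        boolean changed = false;
        Iterator<Map.Entry<String, QueueStub>> it = table.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, QueueStub> entry = it.next();
            if (!assigned.contains(entry.getKey())) {
                // Set the flag first so in-flight tasks see it, then forget the queue.
                entry.getValue().dropped = true;
                it.remove();
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, QueueStub> table = new ConcurrentHashMap<>();
        QueueStub q0 = new QueueStub();
        QueueStub q1 = new QueueStub();
        table.put("queue-0", q0);
        table.put("queue-1", q1);

        // After rebalancing, only queue-0 is still assigned to this consumer.
        boolean changed = update(table, Collections.singleton("queue-0"));
        System.out.println(changed + " " + q0.dropped + " " + q1.dropped + " " + table.keySet());
        // prints: true false true [queue-0]
    }
}
```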

org.apache.rocketmq.client.impl.consumer.RebalanceImpl#updateProcessQueueTableInRebalance:

// New ProcessQueue 
ProcessQueue pq = new ProcessQueue();
long nextOffset = this.computePullFromWhere(mq);
if (nextOffset >= 0) {
  // Put ProcessQueue into processQueueTable
  ProcessQueue pre = this.processQueueTable.putIfAbsent(mq, pq);
  if (pre != null) {
    log.info("doRebalance, {}, mq already exists, {}", consumerGroup, mq);
  } else {
    log.info("doRebalance, {}, add a new mq, {}", consumerGroup, mq);
    PullRequest pullRequest = new PullRequest();
    pullRequest.setConsumerGroup(consumerGroup);
    pullRequest.setNextOffset(nextOffset);
    pullRequest.setMessageQueue(mq);
    // Put ProcessQueue into pullRequest pull task object
    pullRequest.setProcessQueue(pq);
    pullRequestList.add(pullRequest);
    changed = true;
  }
}

As shown, during rebalancing a ProcessQueue object is created, put into the processQueueTable cache, and then set on the pullRequest pull task object. In other words, the ProcessQueue in processQueueTable and the ProcessQueue in the pullRequest are the same object.
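Why the shared reference matters can be shown with a small sketch. The names SharedProcessQueueSketch, ProcessQueueStub, PullTask, and shouldPull are hypothetical; the point is only that a flag set through the table is visible to the pull task because both hold the same object.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedProcessQueueSketch {

    // Hypothetical stand-in for ProcessQueue.
    static final class ProcessQueueStub {
        volatile boolean dropped;
    }

    // Hypothetical stand-in for PullRequest: it keeps a reference to the queue state.
    static final class PullTask {
        final ProcessQueueStub processQueue;

        PullTask(ProcessQueueStub processQueue) {
            this.processQueue = processQueue;
        }

        // Mirrors the guard at the top of pullMessage: no pull once dropped.
        boolean shouldPull() {
            return !processQueue.dropped;
        }
    }

    public static void main(String[] args) {
        Map<String, ProcessQueueStub> processQueueTable = new HashMap<>();
        ProcessQueueStub pq = new ProcessQueueStub();
        processQueueTable.put("queue-0", pq);
        PullTask task = new PullTask(pq); // same object as the table entry

        System.out.println(task.shouldPull()); // prints true

        // Rebalancing flips the flag through the table, not through the task.
        processQueueTable.get("queue-0").dropped = true;

        System.out.println(task.shouldPull()); // prints false
    }
}
```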

Will rebalancing lead to repeated message consumption?

Earlier, a reader in the group chat asked whether rebalancing can cause messages to be consumed more than once.

At the time I replied that RocketMQ does not normally consume messages repeatedly, but I later found that in some cases it does.

As mentioned earlier, when messages are pulled they are submitted to the consumer thread pool for execution. The code is as follows:

org.apache.rocketmq.client.consumer.PullCallback#onSuccess:

DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(//
  pullResult.getMsgFoundList(), //
  processQueue, //
  pullRequest.getMessageQueue(), //
  dispathToConsume);

The ConsumeMessageService interface defines the message consumption logic. It has two implementation classes:

// Concurrent message consumption logic implementation class
org.apache.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService;
// Sequential message consumption logic implementation class
org.apache.rocketmq.client.impl.consumer.ConsumeMessageOrderlyService;

First look at the processing logic related to concurrent message consumption:

ConsumeMessageConcurrentlyService:

org.apache.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService.ConsumeRequest#run:

if (this.processQueue.isDropped()) {
  log.info("the message queue not be able to consume, because it's dropped. group={} {}", ConsumeMessageConcurrentlyService.this.consumerGroup, this.messageQueue);
  return;
}

// Message consumption logic
// ...

// If the queue is set to drop, no message consumption progress is submitted
if (!processQueue.isDropped()) {
    ConsumeMessageConcurrentlyService.this.processConsumeResult(status, context, this);
} else {
    log.warn("processQueue is dropped without process consume result. messageQueue={}, msgs={}", messageQueue, msgs);
}

ConsumeRequest is a class that implements Runnable and contains the core message consumption logic. The submitConsumeRequest method submits a ConsumeRequest to the consumer thread pool. Its run method shows that if a node joins while the consumption logic is executing and rebalancing assigns the queue to another node, the queue is dropped on the current node and the consumption progress is not committed. Because those messages have already been consumed once, the node that takes over the queue consumes them again.

Now let's look at the processing logic for sequential consumption:
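The duplicate can be reproduced in a toy model. The sketch below is an assumption-laden simplification: DuplicateConsumeSketch, QueueState, and simulate are hypothetical names, the two consumers run one after the other in a single thread, and offsets stand in for messages.

```java
import java.util.ArrayList;
import java.util.List;

public class DuplicateConsumeSketch {

    // Hypothetical stand-in for ProcessQueue on consumer A.
    static final class QueueState {
        volatile boolean dropped;
    }

    // Returns every offset handled by consumer A and then by consumer B.
    static List<Long> simulate(long committedOffset, int batchSize, boolean rebalanceDuringConsume) {
        List<Long> handled = new ArrayList<>();
        QueueState stateOnA = new QueueState();
        long committed = committedOffset;

        // Consumer A runs its listener on one batch.
        for (long offset = committed; offset < committed + batchSize; offset++) {
            handled.add(offset);
        }
        // A rebalance that reassigns the queue sets the flag while A is busy.
        stateOnA.dropped = rebalanceDuringConsume;
        // A commits its progress only if the queue is still its own.
        if (!stateOnA.dropped) {
            committed += batchSize;
        }

        // Consumer B takes over and starts from the last committed offset.
        for (long offset = committed; offset < committed + batchSize; offset++) {
            handled.add(offset);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(simulate(100, 3, true));  // prints [100, 101, 102, 100, 101, 102]
        System.out.println(simulate(100, 3, false)); // prints [100, 101, 102, 103, 104, 105]
    }
}
```

Since a window like this exists, message listeners for concurrent consumption are usually written to be idempotent.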

ConsumeMessageOrderlyService:

org.apache.rocketmq.client.impl.consumer.ConsumeMessageOrderlyService.ConsumeRequest#run:

public void run() {
  // Determine whether the queue is discarded
  if (this.processQueue.isDropped()) {
    log.warn("run, the message queue not be able to consume, because it's dropped. {}", this.messageQueue);
    return;
  }

  final Object objLock = messageQueueLock.fetchLockObject(this.messageQueue);
  synchronized (objLock) {
    // Proceed if in broadcast mode, or if the queue is locked and the lock has not expired
    if (MessageModel.BROADCASTING.equals(ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.messageModel())
        || (this.processQueue.isLocked() && !this.processQueue.isLockExpired())) {
      final long beginTime = System.currentTimeMillis();
      for (boolean continueConsume = true; continueConsume; ) {
        // Judge whether the queue is discarded again
        if (this.processQueue.isDropped()) {
          log.warn("the message queue not be able to consume, because it's dropped. {}", this.messageQueue);
          break;
        }
        
        // Message consumption processing logic
        // ... (elided; it includes the opening of the if block that is closed below)
        
          continueConsume = ConsumeMessageOrderlyService.this.processConsumeResult(msgs, status, context, this);
        } else {
          continueConsume = false;
        }
      }
    } else {
      if (this.processQueue.isDropped()) {
        log.warn("the message queue not be able to consume, because it's dropped. {}", this.messageQueue);
        return;
      }
      ConsumeMessageOrderlyService.this.tryLockLaterAndReconsume(this.messageQueue, this.processQueue, 100);
    }
  }
}

RocketMQ sequential consumption locks the queue, and a queue can only be consumed while its lock is held. Therefore, even if a node joins during consumption and rebalancing assigns the queue to another node, the queue is dropped on the current node and repeated consumption does not occur.
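The local half of that locking, the fetchLockObject call followed by synchronized, can be sketched as one monitor object per queue. This is an assumption about how such a lookup could work, not RocketMQ's implementation: QueueLockSketch and consumeInOrder are hypothetical names, queue names are plain strings, and the broker-side lock checked by isLocked() is not modelled at all.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class QueueLockSketch {

    // One monitor object per queue name; the String key is a hypothetical
    // stand-in for MessageQueue.
    private final ConcurrentMap<String, Object> lockTable = new ConcurrentHashMap<>();

    // Returns the same lock object every time it is asked about the same queue.
    Object fetchLockObject(String queueName) {
        return lockTable.computeIfAbsent(queueName, key -> new Object());
    }

    // Runs the task while holding the queue's monitor, unless the queue has
    // been dropped. Returns true if the task ran.
    boolean consumeInOrder(String queueName, boolean dropped, Runnable task) {
        synchronized (fetchLockObject(queueName)) {
            if (dropped) {
                return false;
            }
            task.run();
            return true;
        }
    }

    public static void main(String[] args) {
        QueueLockSketch sketch = new QueueLockSketch();
        // Same queue, same monitor; different queue, different monitor.
        System.out.println(sketch.fetchLockObject("queue-0") == sketch.fetchLockObject("queue-0")); // prints true
        System.out.println(sketch.fetchLockObject("queue-0") == sketch.fetchLockObject("queue-1")); // prints false
        System.out.println(sketch.consumeInOrder("queue-0", true, () -> { })); // prints false
    }
}
```

Keying the monitor by queue means two consume tasks for the same queue run one at a time, while tasks for different queues do not block each other.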

Posted by benbox on Mon, 04 Nov 2019 23:17:48 -0800