Detailed explanation of Kafka partition reassignment

Keywords: kafka

When a node in the cluster suddenly goes offline, any single-replica partitions on that node become unavailable, and their data cannot be read or written until the node is restored. For multi-replica partitions, the leadership of any leader replica on the node is transferred to a follower replica on another node. Either way, the partition replicas on the failed node stop functioning. Kafka does not automatically migrate these failed replicas to the remaining available broker nodes in the cluster. If left unchecked, this affects not only the load balance of the whole cluster but also the availability and reliability of the overall service.

When a node needs to be taken offline in a planned way, we likewise want to migrate its partition replicas to other available nodes so that partitions and replicas stay reasonably allocated.

When a broker node is added to the cluster, only newly created topic partitions can be allocated to it. Existing topic partitions are not automatically moved to the new node, because it did not exist when they were created, so the load on the new node is seriously unbalanced relative to the original nodes.

To solve these problems, the partition replicas need to be reasonably allocated again, which is known as partition reassignment. Kafka provides the kafka-reassign-partitions.sh script for this purpose. It can migrate partitions in both the cluster expansion and the broker node failure scenarios.

1. Let the script automatically generate candidate schemes

Using the kafka-reassign-partitions.sh script takes three steps: first, create a JSON file containing the topic list; then generate a reassignment scheme from the topic list and a broker node list; finally, execute the reassignment according to that scheme.

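The original partition assignment is shown below. The goal in this example is to take broker 0 offline, so its partitions are moved to brokers 1 and 2.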

[xuhaixing@hadoop kafka_2.12-2.3.1]$ kafka-topics.sh --zookeeper hadoop:2181/kafka --describe --topic topic-demo02
Topic:topic-demo02	PartitionCount:3	ReplicationFactor:1	Configs:
	Topic: topic-demo02	Partition: 0	Leader: 0	Replicas: 0	Isr: 0
	Topic: topic-demo02	Partition: 1	Leader: 0	Replicas: 0	Isr: 0
	Topic: topic-demo02	Partition: 2	Leader: 2	Replicas: 2	Isr: 2

Step 1: create a reassign.json file

{
        "topics":[
                {
                        "topic": "topic-demo02"
                }
        ],
        "version":1
}
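
For clusters with many topics, writing this file by hand gets tedious. The following is a minimal Python sketch that produces the same structure; the function name build_topics_to_move is made up for illustration and is not part of Kafka.

```python
import json


def build_topics_to_move(topics):
    """Return JSON text in the shape shown above for --topics-to-move-json-file.

    `topics` is a list of topic names. This helper is hypothetical; only the
    resulting JSON layout comes from the example in this article.
    """
    doc = {"topics": [{"topic": name} for name in topics], "version": 1}
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    # Print the document; redirect it to reassign.json to use it with the script.
    print(build_topics_to_move(["topic-demo02"]))
```

Redirecting the printed text to reassign.json gives a file equivalent to the one shown in step 1.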

Step 2: generate a candidate reassignment scheme according to the JSON file and the specified list of broker nodes to be allocated

[xuhaixing@hadoop kafka_2.12-2.3.1]$ kafka-reassign-partitions.sh --zookeeper hadoop:2181/kafka --generate --topics-to-move-json-file reassign.json --broker-list 1,2
Current partition replica assignment
{"version":1,"partitions":[{"topic":"topic-demo02","partition":1,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-demo02","partition":0,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-demo02","partition":2,"replicas":[2],"log_dirs":["any"]}]}

Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"topic-demo02","partition":2,"replicas":[2],"log_dirs":["any"]},{"topic":"topic-demo02","partition":1,"replicas":[1],"log_dirs":["any"]},{"topic":"topic-demo02","partition":0,"replicas":[2],"log_dirs":["any"]}]}

--generate is an action option of the kafka-reassign-partitions.sh script, similar to --create and --list in kafka-topics.sh. It generates a candidate scheme for the reassignment.

--topics-to-move-json-file specifies the path of the file listing the topics whose partitions are to be reassigned.

--broker-list specifies the broker nodes that the partitions may be allocated to.

The example above prints two JSON documents: the current partition assignment and the candidate scheme for the reassignment. At this point the scheme has only been generated, not executed. The algorithm that generates the candidate scheme is the same one used to assign replicas when a topic is created.
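
To see at a glance what the candidate scheme would change, the two JSON documents printed above can be compared. The sketch below is only an illustration in Python; the helper names are invented here, and the two strings are copied from the output above.

```python
import json

# The two JSON documents printed by --generate in the example above.
CURRENT = (
    '{"version":1,"partitions":['
    '{"topic":"topic-demo02","partition":1,"replicas":[0],"log_dirs":["any"]},'
    '{"topic":"topic-demo02","partition":0,"replicas":[0],"log_dirs":["any"]},'
    '{"topic":"topic-demo02","partition":2,"replicas":[2],"log_dirs":["any"]}]}'
)
PROPOSED = (
    '{"version":1,"partitions":['
    '{"topic":"topic-demo02","partition":2,"replicas":[2],"log_dirs":["any"]},'
    '{"topic":"topic-demo02","partition":1,"replicas":[1],"log_dirs":["any"]},'
    '{"topic":"topic-demo02","partition":0,"replicas":[2],"log_dirs":["any"]}]}'
)


def index_by_partition(plan_text):
    """Map (topic, partition) to its replica list."""
    plan = json.loads(plan_text)
    return {(p["topic"], p["partition"]): p["replicas"] for p in plan["partitions"]}


def moved_partitions(current_text, proposed_text):
    """Return {(topic, partition): (old_replicas, new_replicas)} for changed partitions."""
    current = index_by_partition(current_text)
    proposed = index_by_partition(proposed_text)
    return {
        key: (current.get(key), replicas)
        for key, replicas in sorted(proposed.items())
        if current.get(key) != replicas
    }


if __name__ == "__main__":
    for (topic, partition), (old, new) in moved_partitions(CURRENT, PROPOSED).items():
        print(f"{topic}-{partition}: {old} -> {new}")
```

For the output above, only partitions 0 and 1 move (both away from broker 0); partition 2 already lives on broker 2 and stays where it is.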

Save the second JSON document (the proposed configuration) to a JSON file, here project.json.

Step 3: execute the reassignment

[xuhaixing@hadoop kafka_2.12-2.3.1]$ kafka-reassign-partitions.sh --zookeeper hadoop:2181/kafka --execute --reassignment-json-file project.json 
Current partition replica assignment

{"version":1,"partitions":[{"topic":"topic-demo02","partition":1,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-demo02","partition":0,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-demo02","partition":2,"replicas":[2],"log_dirs":["any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
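
The output advises saving the current assignment so that it can be used for rollback. As a rough sketch, the JSON line can be pulled out of the captured output with a few lines of Python; the function name extract_rollback_plan is hypothetical, and the text below is the output shown above.

```python
import json

# Output of the --execute run above, captured as text.
EXECUTE_OUTPUT = """\
Current partition replica assignment

{"version":1,"partitions":[{"topic":"topic-demo02","partition":1,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-demo02","partition":0,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-demo02","partition":2,"replicas":[2],"log_dirs":["any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
"""


def extract_rollback_plan(output):
    """Return the parsed JSON that follows the 'Current partition replica assignment' header."""
    lines = output.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "Current partition replica assignment":
            # The JSON document is the first line after the header that starts with '{'.
            for candidate in lines[i + 1:]:
                if candidate.strip().startswith("{"):
                    return json.loads(candidate)
    raise ValueError("no current assignment found in output")


if __name__ == "__main__":
    plan = extract_rollback_plan(EXECUTE_OUTPUT)
    # Writing json.dumps(plan) to a file gives a rollback file for --reassignment-json-file.
    print(json.dumps(plan))
```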

Describe the topic again to check the result:

[xuhaixing@hadoop kafka_2.12-2.3.1]$ kafka-topics.sh --zookeeper hadoop:2181/kafka --describe --topic topic-demo02
Topic:topic-demo02	PartitionCount:3	ReplicationFactor:1	Configs:
	Topic: topic-demo02	Partition: 0	Leader: 2	Replicas: 2	Isr: 2
	Topic: topic-demo02	Partition: 1	Leader: 1	Replicas: 1	Isr: 1
	Topic: topic-demo02	Partition: 2	Leader: 2	Replicas: 2	Isr: 2

--execute is the action option that performs the reassignment.

--reassignment-json-file specifies the path of the file containing the partition reassignment scheme.

2. User-defined reassignment scheme

Besides letting the script generate a candidate scheme, users can also write the reassignment scheme by hand, in which case steps 1 and 2 are not needed.
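
A hand-written scheme uses the same JSON layout that the script prints. The following is a minimal sketch of building one in Python, assuming a three-partition topic and brokers 3, 4 and 5; the helper build_reassignment and its duplicate-broker check are illustrative additions, not something Kafka provides.

```python
import json


def build_reassignment(topic, replicas_by_partition):
    """Build a reassignment document for one topic.

    `replicas_by_partition` maps partition number to the desired replica list.
    Hypothetical helper: only the output layout mirrors the JSON in this article.
    """
    partitions = []
    for partition, replicas in sorted(replicas_by_partition.items()):
        # A broker can hold at most one replica of a given partition.
        if len(set(replicas)) != len(replicas):
            raise ValueError(f"partition {partition}: duplicate broker in {replicas}")
        partitions.append({
            "topic": topic,
            "partition": partition,
            "replicas": list(replicas),
            "log_dirs": ["any"] * len(replicas),  # one entry per replica
        })
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    plan = build_reassignment(
        "topic-demo02",
        {0: [3, 4, 5], 1: [5, 3, 4], 2: [4, 5, 3]},
    )
    print(json.dumps(plan))
```

Rotating the replica lists, as in the usage above, puts a different broker first in each list.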

The basic principle of partition reassignment is that the controller first adds a new replica for each partition (temporarily increasing the replication factor), and the new replica copies all data from the partition's leader replica. Because the data is copied over the network, this can take some time depending on the size of the partition. After the copy completes, the controller removes the old replica from the replica list (restoring the original replication factor). Make sure the target brokers have enough disk space for the reassignment.
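
The add-then-remove sequence described above can be modelled in a few lines. This is a simplified Python sketch for reasoning about the intermediate state, not the controller's actual code; the function reassignment_phases and its dictionary keys are invented for illustration.

```python
def reassignment_phases(current, target):
    """Model the replica list of one partition before, during and after a reassignment.

    Simplified: brokers in `target` but not in `current` are added first,
    and brokers in `current` but not in `target` are removed at the end.
    """
    adding = [b for b in target if b not in current]
    removing = [b for b in current if b not in target]
    return {
        "adding": adding,
        "removing": removing,
        # While data is being copied, both old and new replicas exist.
        "during": list(target) + removing,
        "final": list(target),
    }


if __name__ == "__main__":
    phases = reassignment_phases(current=[0], target=[2])
    print(phases)
    # During the copy the partition temporarily has more replicas than before
    # or after, which is why extra disk space is needed.
    print(len(phases["during"]), "replicas during,", len(phases["final"]), "after")
```

In this model, moving a single-replica partition from broker 0 to broker 2 briefly keeps two copies of the data, one on each broker.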

Replace --execute with --verify in the command to check the progress of the partition reassignment.
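
Since the two invocations differ only in the action option, a script that drives the tool can build both from one template. The sketch below assumes a Kafka version where the tool accepts --bootstrap-server, as in the commands later in this section; the helper name reassign_command and the address localhost:9092 are placeholders.

```python
def reassign_command(action, bootstrap_servers, plan_path):
    """Return the argument list for kafka-reassign-partitions.sh.

    `action` is "--execute" to start the reassignment or "--verify" to check it.
    """
    if action not in ("--execute", "--verify"):
        raise ValueError(f"unsupported action: {action}")
    return [
        "kafka-reassign-partitions.sh",
        "--bootstrap-server", bootstrap_servers,
        action,
        "--reassignment-json-file", plan_path,
    ]


if __name__ == "__main__":
    # Placeholder address; replace it with the real broker list.
    print(" ".join(reassign_command("--execute", "localhost:9092", "project.json")))
    print(" ".join(reassign_command("--verify", "localhost:9092", "project.json")))
```

The returned list can be passed to subprocess.run on a machine where the Kafka scripts are on the PATH.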

Add replicas (increase the replication factor from 2 to 3)

[xuhaixing@xhx151 cluster]$ cat project.json 
{"version":1,"partitions":[{"topic":"topic-demo02","partition":2,"replicas":[4,5,3],"log_dirs":["any","any","any"]},{"topic":"topic-demo02","partition":1,"replicas":[5,3,4],"log_dirs":["any","any","any"]},{"topic":"topic-demo02","partition":0,"replicas":[3,4,5],"log_dirs":["any","any","any"]}]}
[xuhaixing@xhx151 cluster]$ kafka-topics.sh --bootstrap-server 192.168.94.151:9093,192.168.94.151:9094,192.168.94.151:9095 --topic topic-demo02 --describe
Topic: topic-demo02	PartitionCount: 3	ReplicationFactor: 2	Configs: segment.bytes=1073741824
	Topic: topic-demo02	Partition: 0	Leader: 3	Replicas: 3,4	Isr: 3,4
	Topic: topic-demo02	Partition: 1	Leader: 4	Replicas: 4,5	Isr: 5,4
	Topic: topic-demo02	Partition: 2	Leader: 5	Replicas: 5,3	Isr: 5,3
[xuhaixing@xhx151 cluster]$ kafka-reassign-partitions.sh --bootstrap-server 192.168.94.151:9093,192.168.94.151:9094,192.168.94.151:9095 --execute --reassignment-json-file project.json 
Current partition replica assignment

{"version":1,"partitions":[{"topic":"topic-demo02","partition":0,"replicas":[3,4],"log_dirs":["any","any"]},{"topic":"topic-demo02","partition":1,"replicas":[4,5],"log_dirs":["any","any"]},{"topic":"topic-demo02","partition":2,"replicas":[5,3],"log_dirs":["any","any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started partition reassignments for topic-demo02-0,topic-demo02-1,topic-demo02-2
[xuhaixing@xhx151 cluster]$ kafka-topics.sh --bootstrap-server 192.168.94.151:9093,192.168.94.151:9094,192.168.94.151:9095 --topic topic-demo02 --describe
Topic: topic-demo02	PartitionCount: 3	ReplicationFactor: 3	Configs: segment.bytes=1073741824
	Topic: topic-demo02	Partition: 0	Leader: 3	Replicas: 3,4,5	Isr: 3,4,5
	Topic: topic-demo02	Partition: 1	Leader: 4	Replicas: 5,3,4	Isr: 5,4,3
	Topic: topic-demo02	Partition: 2	Leader: 5	Replicas: 4,5,3	Isr: 5,3,4
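
To confirm that the new replicas have caught up, the describe output above can be parsed and each partition's ISR compared with its replica list. This is a small Python sketch under the assumption that the output has the tab-separated layout shown above; the regular expression and function names are illustrative only.

```python
import re

# The final describe output from above (tabs written as \t).
DESCRIBE_OUTPUT = (
    "Topic: topic-demo02\tPartitionCount: 3\tReplicationFactor: 3\tConfigs: segment.bytes=1073741824\n"
    "\tTopic: topic-demo02\tPartition: 0\tLeader: 3\tReplicas: 3,4,5\tIsr: 3,4,5\n"
    "\tTopic: topic-demo02\tPartition: 1\tLeader: 4\tReplicas: 5,3,4\tIsr: 5,4,3\n"
    "\tTopic: topic-demo02\tPartition: 2\tLeader: 5\tReplicas: 4,5,3\tIsr: 5,3,4\n"
)

# Matches the per-partition lines; "PartitionCount:" on the summary line does not match.
PARTITION_LINE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\S+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def parse_describe(text):
    """Return {partition: {"replicas": [...], "isr": [...]}} from describe output."""
    state = {}
    for match in PARTITION_LINE.finditer(text):
        partition = int(match.group(1))
        replicas = [int(x) for x in match.group(3).split(",")]
        isr = [int(x) for x in match.group(4).split(",") if x]
        state[partition] = {"replicas": replicas, "isr": isr}
    return state


def fully_in_sync(state):
    """True when every replica of every partition is in the ISR."""
    return all(sorted(p["replicas"]) == sorted(p["isr"]) for p in state.values())


if __name__ == "__main__":
    state = parse_describe(DESCRIBE_OUTPUT)
    print(state)
    print("all replicas in sync:", fully_in_sync(state))
```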

Posted by itguysam on Sun, 28 Nov 2021 14:20:29 -0800