Kafka Quick Start (6) - Kafka Cluster Deployment

1. Kafka Cluster Deployment Plan

1. Operating system selection

In general, production environments should deploy Kafka clusters on Linux operating systems for the following reasons:
(1) I/O model: under the hood, the Kafka client uses Java's selector, which is implemented with epoll on Linux but with select on Windows, so deploying Kafka on Linux achieves more efficient I/O performance.
(2) Network transfer efficiency: Kafka moves large amounts of data between disk and network. Deploying Kafka on Linux lets it take advantage of Zero Copy technology for fast data transfer.
(3) Community support: the Apache Kafka community currently makes no commitment to fix bugs found on the Windows platform.

2. Disk

(1) Kafka already provides redundancy at the software level (replication for high reliability, partitioning for load balancing), so Kafka's storage can be built from ordinary disks without RAID.
(2) Mechanical disks are workable for a Kafka production environment, but SSDs clearly perform better.

3. Disk capacity

When planning disk capacity, consider the number of new messages, message retention time, average message size, number of backups, whether compression is enabled, and so on.
Suppose a company's business sends 100 million messages to the Kafka cluster every day, each message is kept in two copies to prevent data loss, messages are retained for the default 7 days, the average message size is 1 KB, and Kafka achieves a data compression ratio of 0.75.
100 million 1 KB messages per day, kept in two copies at a 0.75 compression ratio, comes to 150 GB (100,000,000 × 1 KB × 2 × 0.75 / 1,000,000). Reserving an extra 10% of disk space for the Kafka cluster's index data brings the daily total to 165 GB. With 7 days of retention, the planned disk capacity is 1,155 GB (165 GB × 7).
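As a quick sanity check, the arithmetic above can be reproduced in the shell (awk is used here only as a calculator; the figures are exactly the assumptions just stated):
awk 'BEGIN { daily = 100000000 * 1 * 2 * 0.75 / 1000 / 1000; print daily " GB/day" }'
awk 'BEGIN { print 150 * 1.1 " GB/day with 10% index overhead; " 150 * 1.1 * 7 " GB for 7 days" }'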

4. Network bandwidth

Suppose the data center has a gigabit network (1 Gbps) and the business must process 1 TB of data within one hour. Assume a Kafka Broker can safely use up to 70% of its bandwidth, since beyond that threshold network packet loss becomes likely; that gives about 700 Mbps per Broker. It is also prudent to keep two thirds of that in reserve for other services, leaving roughly 240 Mbps (700 Mbps / 3) that can be allocated to Kafka itself. Processing 1 TB in an hour requires about 2,330 Mbps (1024 × 1024 × 8 / 3600); dividing by 240 gives roughly 10 servers. With two additional replicas, multiply the server count by 3, giving 30.
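The server-count estimate can be checked the same way (assumptions as above: 1 Gbps NICs, 70% usable, 240 Mbps of that allocated to Kafka):
awk 'BEGIN { need = 1024 * 1024 * 8 / 3600; printf "%.0f Mbps needed; %d servers; x3 with two extra copies = %d\n", need, need / 240 + 1, (int(need / 240) + 1) * 3 }'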

2. Configuration of Kafka cluster parameters

1. Broker End Parameters

Broker-side parameters are also called static parameters (Static Configs): they can only be set in Kafka's configuration file, server.properties, and the Broker process must be restarted for them to take effect.
log.dirs: Specifies the file directory paths the Broker uses. There is no default value, so it must be set explicitly. In production, always configure multiple paths for log.dirs and, if conditions permit, mount each directory on a different physical disk. The advantages: improved read/write performance, since multiple physical disks can read and write data simultaneously for higher throughput; and failover, introduced in Kafka 1.1, which automatically transfers the data on a damaged disk to other healthy disks so the Broker keeps working. This failover mechanism is what lets Kafka do without RAID.
zookeeper.connect: A comma-separated (CSV) parameter, e.g. zk1:2181,zk2:2181,zk3:2181. Multiple Kafka clusters sharing one ZooKeeper ensemble can be distinguished with a chroot, e.g. zk1:2181,zk2:2181,zk3:2181/kafka1; the chroot is written only once, at the end.
listeners: Set up a listener for intranet access to the Kafka service.
advertised.listeners: Set up a listener for external network access to Kafka services.
auto.create.topics.enable: Whether topics may be created automatically.
unclean.leader.election.enable: Whether unclean leader elections are allowed.
auto.leader.rebalance.enable: Whether periodic leader rebalancing is allowed; false is recommended in production.
log.retention.{hours|minutes|ms}: Controls how long message data is kept. Priority: the ms setting is highest, minutes next, hours lowest.
log.retention.bytes: Specifies the total disk capacity the Broker may use for messages.
message.max.bytes: Controls the maximum message size a Broker can accept.
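Putting the parameters above together, a minimal server.properties sketch might look like the following; the paths and hostnames are illustrative assumptions, not values from a real deployment:
log.dirs=/data/kafka1,/data/kafka2,/data/kafka3
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka1
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://kafka-broker1.example.com:9092
auto.create.topics.enable=false
unclean.leader.election.enable=false
auto.leader.rebalance.enable=false
log.retention.hours=168
log.retention.bytes=-1
message.max.bytes=10485760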

2. Topic level parameters

If both a topic-level parameter and the corresponding global Broker parameter are set, the topic-level value overrides the global one; each topic can carry its own settings.
In production, topics belonging to different departments should be allowed to set their own retention times according to their business needs. If only the global Broker parameter were available, you would have to take the maximum retention time across all business lines as the global value; letting topic-level parameters override the Broker default is the better choice.
retention.ms: Specifies how long a topic's messages are kept; the default is 7 days, meaning only the last 7 days of messages are retained. It overrides the Broker-side global value.
retention.bytes: Specify how much disk space to reserve for Topic.Usually used in multi-tenant Kafka clusters, the default value is -1, indicating unlimited disk space usage.
max.message.bytes: Specifies the maximum message size the Kafka Broker accepts for this topic.
Topic-level parameters can be set either when a topic is created or when it is modified; setting them at modification time is recommended. The Apache Kafka community may eventually standardize on the kafka-configs script for setting all topic-level parameters.
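For example (the topic name, parameter values, and the zookeeper_host placeholder below are illustrative only), a topic-level parameter can be supplied with --config at creation time, or changed later with the kafka-configs script mentioned above:
kafka-topics.sh --zookeeper zookeeper_host:2181 --create --topic transaction --partitions 1 --replication-factor 2 --config retention.ms=15552000000 --config max.message.bytes=5242880
kafka-configs.sh --zookeeper zookeeper_host:2181 --entity-type topics --entity-name transaction --alter --add-config max.message.bytes=10485760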

3. JVM parameters

Kafka 2.0.0 officially dropped support for Java 7.
When interacting with clients, a Kafka Broker creates a large number of ByteBuffer instances on the JVM heap, so the JVM heap size must not be too small; 6 GB is a recommended setting.
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"
An important JVM-side setting is the garbage collector. On Java 7, the CMS collector is recommended if the Broker machine has ample CPU resources; enable it with -XX:+UseConcMarkSweepGC. Otherwise, use the throughput collector, enabled with -XX:+UseParallelGC. On Java 9 and later, the default G1 collector performs better than CMS without any tuning, chiefly through fewer Full GCs and fewer parameters to adjust, so simply using G1 is fine.
export KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true"

4. Operating system parameters

File descriptor limit: ulimit -n. Set it to a very large value, such as ulimit -n 1000000.
File system type: according to the test report on the official website, XFS outperforms ext4.
Swappiness: a small value such as 1 is recommended. Setting swappiness to 0 completely forbids the Kafka Broker process from using swap; then, once physical memory is exhausted, the operating system invokes the OOM killer, which kills a process at random with no warning. With a small nonzero value, Broker performance drops sharply once swap starts being used, but that at least buys time to diagnose and tune the problem.
Commit time (flush interval): a send to Kafka is not considered successful only once the data reaches disk; it succeeds as soon as the data is written to the operating system's page cache, from which the OS periodically flushes dirty pages to the physical disk. The flush period is governed by the commit time, which defaults to 5 seconds and can reasonably be increased to reduce physical disk writes. If the machine goes down before page-cache data is written to disk, that data is lost; but since Kafka already provides multi-replica redundancy at the software level, trading a longer commit interval for performance is reasonable.
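Concretely, the settings above map to commands like these on Linux (the values are the examples from this section; vm.dirty_writeback_centisecs is the kernel knob behind the 5-second default flush interval):
ulimit -n 1000000
sysctl vm.swappiness=1
echo 3000 > /proc/sys/vm/dirty_writeback_centisecs   # raise the flush interval from 5 s to 30 s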

3. Docker Image Selection

1. Install docker

Install Docker: sudo yum install docker
Start Docker: sudo systemctl start docker
Check the Docker version: docker version

2. docker-compose installation

Download docker-compose: sudo curl -L https://github.com/docker/compose/releases/download/1.23.0-rc3/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
Make it executable: sudo chmod +x /usr/local/bin/docker-compose
Check the docker-compose version: docker-compose version

3. docker image selection

ZooKeeper image:
docker search zookeeper
Choose the image with the most stars: docker.io/zookeeper
Kafka image:
docker search kafka
Choose the image with the most stars: docker.io/wurstmeister/kafka
kafka-manager image:
docker search kafka-manager
Choose the image: kafkamanager/kafka-manager

4. Kafka Single-machine Deployment Scheme

1. Write the docker-compose.yml file

# Single machine zookeeper + kafka + kafka-manager cluster
version: '2'

services:
  # Define zookeeper service
  zookeeper-test:
    image: zookeeper # zookeeper image
    restart: always
    hostname: zookeeper-test
    ports:
      - "12181:2181" # Host port: docker internal port
    container_name: zookeeper-test # Container name

  # Define kafka services
  kafka-test:
    image: wurstmeister/kafka # kafka image
    restart: always
    hostname: kafka-test
    ports:
      - "9092:9092" # Expose Port Number
      - "9999:9999" # Exposure to JMX_PORT
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.0.105 # Host machine IP (adjust to your environment)
      KAFKA_ADVERTISED_PORT: 9092 # Advertised broker port
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-test:2181 # zookeeper service
      KAFKA_ZOOKEEPER_CONNECTION_TIMEOUT_MS: 30000 # zookeeper connection timeout
      KAFKA_LOG_CLEANUP_POLICY: "delete"
      KAFKA_LOG_RETENTION_HOURS: 120 # Set the maximum time to save message data to 120 hours
      KAFKA_MESSAGE_MAX_BYTES: 10000000 # Maximum number of bytes in message body
      KAFKA_REPLICA_FETCH_MAX_BYTES: 10000000 # Maximum bytes per replica fetch request
      KAFKA_GROUP_MAX_SESSION_TIMEOUT_MS: 60000 # Maximum consumer group session timeout
      KAFKA_NUM_PARTITIONS: 1 # Default number of partitions
      KAFKA_DELETE_RETENTION_MS: 10000 # Retention time for delete tombstones
      KAFKA_BROKER_ID: 1 # Broker ID
      KAFKA_COMPRESSION_TYPE: lz4
      KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=192.168.0.105 -Dcom.sun.management.jmxremote.rmi.port=9999"  # Import KAFKA_JMX_OPTS environment variable
      JMX_PORT: 9999  # Import JMX_PORT environment variable
    depends_on:
      - zookeeper-test # depends on the zookeeper service
    container_name: kafka-test

  # Define kafka-manager service
  kafka-manager-test:
    image: kafkamanager/kafka-manager # kafka-manager image
    restart: always
    container_name: kafka-manager-test
    hostname: kafka-manager-test
    ports:
      - "9000:9000"  # Expose ports, provide web access
    depends_on:
      - kafka-test # depends on the kafka service
    environment:
      ZK_HOSTS: zookeeper-test:2181 # zookeeper address
      KAFKA_BROKERS: kafka-test:9092 # kafka broker address
      KAFKA_MANAGER_AUTH_ENABLED: "true"  # Turn on security authentication
      KAFKA_MANAGER_USERNAME: kafka-manager  # Kafka Manager Logon User
      KAFKA_MANAGER_PASSWORD: 123456  # Kafka Manager login password

Before starting, confirm that none of these ports are already occupied.
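For example, on Linux the ports used above can be checked with ss; no output means they are free:
ss -lntp | grep -E ':(12181|9092|9999|9000)\b'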

2. Start Services

Create a kafka directory, place the docker-compose.yml file in it, and run the commands below from that directory.
Start:
docker-compose up -d
Close:
docker-compose down
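To verify that the containers came up, and to follow the broker's startup log (service names as defined in the compose file above):
docker-compose ps
docker-compose logs -f kafka-test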

3. Checking the kafka Service

Enter the kafka container (container_name from the compose file):
docker exec -it kafka-test /bin/bash
Create a topic:
kafka-topics.sh --create --zookeeper zookeeper-test:2181 --replication-factor 1 --partitions 3 --topic test
List topics:
kafka-topics.sh --list --zookeeper zookeeper-test:2181
Produce messages:
kafka-console-producer.sh --broker-list kafka-test:9092 --topic test
Consume messages:
kafka-console-consumer.sh --bootstrap-server kafka-test:9092 --topic test --from-beginning
Open two terminals, run the producer command in one and the consumer command in the other. Every message produced appears in the consumer terminal, demonstrating the message queue.
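To see how the three partitions and their replicas were assigned, the topic can also be described from inside the container:
kafka-topics.sh --describe --zookeeper zookeeper-test:2181 --topic test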

4. Kafka Version Query

In the wurstmeister/kafka image, Kafka is installed under /opt; the kafka_2.12-2.4.0 directory there is the Kafka installation directory.
Scala version: 2.12
Kafka version: 2.4.0
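The directory name itself encodes both versions, following the kafka_<scala version>-<kafka version> convention:
ls /opt
# kafka_2.12-2.4.0 -> Scala 2.12, Kafka 2.4.0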

5. kafka-manager monitoring

Web access: http://127.0.0.1:9000

5. Error Resolution

1. Container Deletion Failed

Attempt to delete the dead container:
docker rm -f $(docker ps -a --filter status=dead -q | head -n 1)
Error message:
ERROR: for f78856fb92e9_zoo1 Driver overlay2 failed to remove root filesystem f78856fb92e97f75ff4c255077de544b39351a4a2a3319737ada2a54df568032: remove /var/lib/docker/overlay2/2c257b8071b6a3d79e216838522f76ba7263d466a470dc92cdbef25c4dd04dc3/merged: device or resource busy
Find the PID of the process still holding the container's mount (replace containerid with the ID from the error message):
grep docker /proc/*/mountinfo | grep containerid | awk -F ":" '{print $1}' | awk -F "/" '{print $3}'
Kill that process (3119 here stands for the PID found above):
sudo kill -9 3119

2. The kafka service keeps restarting

Error message:
Error response from daemon: Container 9b3f9af8a1196f2ad3cf74fe2b1eeb7ccbd231fe2a93ec09f594d3a0fbb5783c is restarting, wait until the container is running
Error Reason:
The docker-compose.yml file configures restart: always for the kafka service, so if the service fails to start it will keep restarting. View the startup log of the kafka service with docker logs kafka-test to find the cause of the error.

6. Kafka Cluster Parameter Reference (server.properties)

############################# System ######################
# The ID that uniquely identifies the broker within the cluster; must be a positive number
broker.id = 0
# Service port, default 9092
port = 9092
# Listening address; bind a specific host rather than all addresses
host.name = debugo01

# Maximum number of threads handling network requests  
num.network.threads = 2
# Number of threads processing disk I/O  
num.io.threads = 8
# Number of background threads  
background.threads = 4
# Maximum number of request queues waiting to be processed by IO threads  
queued.max.requests = 500

# Socket send buffer (SO_SNDBUF)
socket.send.buffer.bytes = 1048576
# Socket receive buffer (SO_RCVBUF)
socket.receive.buffer.bytes = 1048576
# Maximum number of bytes in a socket request. To prevent memory overflow, message.max.bytes must be smaller than this value
socket.request.max.bytes = 104857600

############################# Topic ########################
# Number of partitions per topic; more partitions produce more segment files
num.partitions = 2
# Whether automatic topic creation is allowed; if false, topics must be created by command
auto.create.topics.enable = true
# Default replication factor for a topic's partitions; cannot exceed the number of brokers in the cluster
default.replication.factor = 1
# Maximum size of the message body, in bytes
message.max.bytes = 1000000

############################# ZooKeeper ####################
# ZooKeeper quorum settings; multiple hosts are separated by commas
zookeeper.connect = debugo01:2181, debugo02, debugo03
# Timeout for connecting to zk
zookeeper.connection.timeout.ms = 1000000
# Synchronization time between leader and followers in the ZooKeeper cluster
zookeeper.sync.time.ms = 2000

############################# Log #########################
# Log storage directories, multiple directories separated by commas  
log.dirs = /var/log/kafka

# When this number of messages accumulates, the data is flushed to the log file. Default 10000
# log.flush.interval.messages=10000
# When this time (ms) elapses, a flush is forced. Whichever of log.flush.interval.ms and log.flush.interval.messages is reached first triggers the flush. Default 3000 ms
# log.flush.interval.ms=1000
# Interval for checking whether a log flush is needed
log.flush.scheduler.interval.ms = 3000

# Log Cleanup Policy (delete|compact)  
log.cleanup.policy = delete
# Log retention time (hours|minutes); defaults to 7 days (168 hours). Older data is processed according to the cleanup policy. Whichever of the bytes and time limits is reached first triggers cleanup.
log.retention.hours = 168
# Maximum number of bytes of log data to retain; beyond this, data is processed according to the policy.
# log.retention.bytes=1073741824

# Controls the size of a log segment file; beyond this size a new segment file is created (-1 means no limit)
log.segment.bytes = 536870912
# When the following time is reached, a new segment is forced to be created
log.roll.hours = 24 * 7
# Interval for checking whether log segment files meet the deletion policy (log.retention.hours or log.retention.bytes)
log.retention.check.interval.ms = 60000

# Whether the log cleaner (log compaction) is enabled
log.cleaner.enable = false
# Maximum time to retain deletion markers in compacted logs
log.cleaner.delete.retention.ms = 1 day

# Index file size limit for a segment log
log.index.size.max.bytes = 10 * 1024 * 1024
# Interval at which entries are added to the offset index; normally does not need to be set
log.index.interval.bytes = 4096

############################# replica #######################
# Timeout for communication between the partition-management controller and replicas
controller.socket.timeout.ms = 30000
# Size of the controller-to-broker channel message queue
controller.message.queue.size = 10
# Maximum wait time for a replica to respond to the leader; a replica exceeding it is removed from management
replica.lag.time.max.ms = 10000
# Whether controlled shutdown of the broker is allowed; if true, all leaders on this broker are closed and transferred to other brokers before shutdown
controlled.shutdown.enable = false
# Number of attempts for controller shutdown  
controlled.shutdown.max.retries = 3
# Time interval for each shutdown attempt  
controlled.shutdown.retry.backoff.ms = 5000

# If a replica falls too far behind, it is considered dead. In general, because of network latency and similar factors, message synchronization in replicas always lags; if messages are severely delayed, the leader assumes the replica has high network latency or limited message throughput. In environments with few brokers or an under-provisioned network, increasing this value is recommended.
replica.lag.max.messages = 4000
# Socket timeout between the leader and replicas
replica.socket.timeout.ms = 30 * 1000
# Socket receive buffer size for replication from the leader
replica.socket.receive.buffer.bytes = 64 * 1024
# Maximum number of bytes per replica fetch
replica.fetch.max.bytes = 1024 * 1024
# Maximum wait time for communication between replicas and the leader; failures are retried
replica.fetch.wait.max.ms = 500
# Minimum data size for each fetch; if the unsynchronized data on the leader is smaller, the fetch waits until it reaches this size
replica.fetch.min.bytes = 1
# Number of fetcher threads for replicating from the leader; increasing it increases the replica's I/O
num.replica.fetchers = 1
# Interval at which each replica checkpoints its high watermark to disk
replica.high.watermark.checkpoint.interval.ms = 5000

# Whether to automatically rebalance leadership among brokers
auto.leader.rebalance.enable = false
# Allowed percentage of leader imbalance per broker; if exceeded, partitions are rebalanced
leader.imbalance.per.broker.percentage = 10
# Interval for checking leader imbalance
leader.imbalance.check.interval.seconds = 300
# Maximum space for clients to retain offset information  
offset.metadata.max.bytes = 1024

############################# Consumer #####################
# The core consumer-side configurations are group.id and zookeeper.connect
# Processes that set the same group ID declare themselves part of the same consumer group
group.id
# Consumer ID; auto-generated if not set
consumer.id
# An ID used for tracking; preferably identical to group.id
client.id = < group_id >

# ZooKeeper cluster specification; must match the zk configuration used by the brokers
zookeeper.connect = debugo01:2182,debugo02:2182,debugo03:2182
# ZooKeeper heartbeat timeout; a consumer exceeding it is considered dead
zookeeper.session.timeout.ms = 6000
# ZooKeeper connection wait timeout
zookeeper.connection.timeout.ms = 6000
# Synchronization time between zookeeper's follower and leader  
zookeeper.sync.time.ms = 2000
# When there is no initial offset in the zookeeper, or when the offset limit is exceeded.  
# smallest: reset to the minimum
# largest: reset to the maximum
# anything else: throw an exception to the consumer
auto.offset.reset = largest

# Socket timeout; the actual timeout is max.fetch.wait + socket.timeout.ms
socket.timeout.ms = 30 * 1000
# Socket receive buffer size
socket.receive.buffer.bytes = 64 * 1024
# Message size limit per partition fetch
fetch.message.max.bytes = 1024 * 1024

# When true, the consumer commits offsets to ZooKeeper after consuming messages, so that when a consumer fails, a new consumer can resume from the latest offset stored in ZooKeeper
auto.commit.enable = true
# Interval between automatic offset commits
auto.commit.interval.ms = 60 * 1000

# Maximum number of message chunks buffered for consumption; each chunk can be up to fetch.message.max.bytes
queued.max.message.chunks = 10

# When a new consumer joins the group, a rebalance is attempted to migrate partitions to the new consumer; this is the number of attempts
rebalance.max.retries = 4
# Interval between rebalance attempts
rebalance.backoff.ms = 2000
# Backoff time before refreshing the leader after re-election
refresh.leader.backoff.ms

# Minimum amount of data the server sends to the consumer; if this much is not yet available, the server waits until it is. The default of 1 means respond immediately.
fetch.min.bytes = 1
# Maximum wait time for a consumer request when fetch.min.bytes is not yet satisfied
fetch.wait.max.ms = 100
# If no new message is available within this time, an exception is thrown; the default of -1 means wait indefinitely
consumer.timeout.ms = -1

############################# Producer ######################
# Core configurations include:  
# metadata.broker.list  
# request.required.acks  
# producer.type  
# serializer.class  

# Address from which the producer obtains message metadata (topics, partitions and replicas), in the format host1:port1,host2:port2; an external VIP can also be used
metadata.broker.list

# Message acknowledgment mode
# 0: no delivery acknowledgment, send only; lowest latency, but messages can be lost when a server fails
# 1: send the message and wait for the leader to acknowledge receipt; reliable
# -1: send the message, wait for the leader's acknowledgment, and return only after replication completes; most reliable
request.required.acks = 0

# Maximum wait time for message to be sent  
request.timeout.ms = 10000
# Socket buffer size
send.buffer.bytes = 100 * 1024
# Serializer class for the key; if not set, the same as serializer.class
key.serializer.class
# Partitioning strategy; the default partitions by modulo
partitioner.class = kafka.producer.DefaultPartitioner
# Compression codec for messages; none by default, gzip and snappy are available
compression.codec = none
# Compression can be limited to specific topics; null means no restriction
compressed.topics = null
# Number of retries after message sending failure  
message.send.max.retries = 3
# Interval after each failure  
retry.backoff.ms = 100
# Interval at which the producer periodically refreshes topic metadata; if set to 0, metadata is refreshed after every message sent
topic.metadata.refresh.interval.ms = 600 * 1000
# User-specified ID, must not be duplicated; used mainly for tracing messages in logs
client.id = ""

# Maximum time to buffer data in asynchronous mode. Setting it to 100, for example, collects messages for 100 ms before sending; this improves throughput but increases send latency
queue.buffering.max.ms = 5000
# Maximum number of messages buffered in asynchronous mode; same trade-off as above
queue.buffering.max.messages = 10000
# Wait time for a message to enter the queue in asynchronous mode. If 0, the message does not wait and is discarded outright when the queue is full
queue.enqueue.timeout.ms = -1
# In asynchronous mode, the number of messages sent per batch; the producer triggers a send when either queue.buffering.max.messages or queue.buffering.max.ms is reached
batch.num.messages = 200
