Kafka Learning (1): Cluster Installation and Configuration

Keywords: ZooKeeper, Kafka, Java

1 Install Java

mkdir /usr/local/java
cp jdk-8u20-linux-x64.tar.gz /usr/local/java
cd /usr/local/java
tar zxvf jdk-8u20-linux-x64.tar.gz
vim /etc/profile

JAVA_HOME=/usr/local/java/jdk1.8.0_20
JRE_HOME=$JAVA_HOME/jre
CLASS_PATH=.:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib:$JAVA_HOME/lib/dt.jar
PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export JAVA_HOME JRE_HOME PATH CLASS_PATH

source /etc/profile
java -version
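The profile lines above are sensitive to a missing dollar sign. A minimal sketch of how the exported variables expand, using the same JDK path as above:

```shell
# How the /etc/profile variables expand once JAVA_HOME is set.
# Note that JRE_HOME must reference $JAVA_HOME (with the dollar sign),
# otherwise it becomes the literal string "JAVA_HOME/jre".
JAVA_HOME=/usr/local/java/jdk1.8.0_20
JRE_HOME=$JAVA_HOME/jre
CLASS_PATH=.:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib:$JAVA_HOME/lib/dt.jar
PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export JAVA_HOME JRE_HOME PATH CLASS_PATH

echo "$JRE_HOME"   # /usr/local/java/jdk1.8.0_20/jre
```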

2 Install zookeeper

2.1 Deployment Configuration

mkdir /usr/local/zookeeper-cluster
cp zookeeper-3.5.2-alpha.tar.gz /usr/local/zookeeper-cluster/
cd /usr/local/zookeeper-cluster
tar zxvf zookeeper-3.5.2-alpha.tar.gz
cd zookeeper-3.5.2-alpha
mv conf/zoo_sample.cfg conf/zoo.cfg
mkdir data
mkdir datalog
vim conf/zoo.cfg

clientPort=2181
dataDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node1/data
dataLogDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node1/datalog
syncLimit=5
initLimit=10
tickTime=2000
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

cd /usr/local/zookeeper-cluster
mv zookeeper-3.5.2-alpha/ zookeeper-3.5.2-node1
cp -R zookeeper-3.5.2-node1 zookeeper-3.5.2-node2
cp -R zookeeper-3.5.2-node1 zookeeper-3.5.2-node3
node2's conf/zoo.cfg:

clientPort=2182
dataDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node2/data
dataLogDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node2/datalog
syncLimit=5
initLimit=10
tickTime=2000
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

node3's conf/zoo.cfg:

clientPort=2183
dataDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node3/data
dataLogDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node3/datalog
syncLimit=5
initLimit=10
tickTime=2000
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

Write each node's myid file:

#node1
echo "1" > zookeeper-3.5.2-node1/data/myid
#node2
echo "2" > zookeeper-3.5.2-node2/data/myid
#node3
echo "3" > zookeeper-3.5.2-node3/data/myid
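The three nearly identical configs differ only in clientPort, the data paths, and myid, so they can be generated in a loop. A minimal sketch, writing into a scratch directory rather than /usr/local so it is safe to run anywhere:

```shell
# Generate the three-node layout described above. BASE stands in for
# /usr/local/zookeeper-cluster so the script can be rerun anywhere.
BASE=$(mktemp -d)
for i in 1 2 3; do
  node=$BASE/zookeeper-3.5.2-node$i
  mkdir -p "$node/conf" "$node/data" "$node/datalog"
  cat > "$node/conf/zoo.cfg" <<EOF
clientPort=$((2180 + i))
dataDir=$node/data
dataLogDir=$node/datalog
syncLimit=5
initLimit=10
tickTime=2000
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
EOF
  # myid must agree with the server.$i line in zoo.cfg
  echo "$i" > "$node/data/myid"
done
```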

2.2 Configuration Notes

2.2.1 myid file and server.myid

The myid file, stored in the snapshot (data) directory, identifies the server; it is the key identifier the members of the zk cluster use to discover each other.

2.2.2 zoo.cfg

The file is the zookeeper configuration file in the conf directory.

2.2.3 log4j.properties

This file, in the conf directory, controls zk's log output. Programs written in Java generally have one thing in common: they use log4j to manage logging.

2.2.4 zkEnv.sh and zkServer.sh files

zkServer.sh is the master control script for starting and managing the server.
zkEnv.sh is the main environment script; it configures the environment variables used when the zookeeper cluster starts.

2.3 Parameter Descriptions

tickTime=2000:
tickTime is the basic time unit, in milliseconds, used as the heartbeat interval between Zookeeper servers and between clients and servers; a heartbeat is sent every tickTime.

initLimit=10:
initLimit limits how many heartbeat intervals (tickTime) a connection may take during initialization. The "client" here is not a user connection but a Follower server connecting to the cluster Leader.
If the Leader has not received a response after 10 heartbeats, the connection is considered failed. The total time is 10*2000 = 20 seconds.

syncLimit=5:
syncLimit limits the request/response time for messages between the Leader and a Follower to this many tickTime intervals; here the total is 5*2000 = 10 seconds.
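The two derived timeouts can be checked with shell arithmetic, using the values configured above:

```shell
# The two timeouts derived from tickTime, matching the zoo.cfg above.
tickTime=2000   # milliseconds per tick
initLimit=10    # ticks allowed for a Follower's initial sync with the Leader
syncLimit=5     # ticks allowed for a Leader<->Follower request/response

init_timeout_ms=$((tickTime * initLimit))
sync_timeout_ms=$((tickTime * syncLimit))
echo "initial sync timeout: ${init_timeout_ms} ms"   # 20000 ms = 20 s
echo "request timeout:      ${sync_timeout_ms} ms"   # 10000 ms = 10 s
```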

dataDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node1/data
dataDir is, as the name implies, the directory where Zookeeper stores its data; by default the transaction logs are written here as well.
dataLogDir=/usr/local/zookeeper-cluster/zookeeper-3.5.2-node1/datalog
If no separate transaction-log path is configured, transaction logs go into the directory specified by dataDir, which can seriously hurt zk performance: under high throughput, too many transaction logs and snapshots accumulate in one place.
clientPort=2181
clientPort is the port clients use to connect to the Zookeeper server; Zookeeper listens on this port and accepts client requests.

server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
server.A=B:C:D
A is a number identifying the server (it matches that server's myid file); B is the server's ip address.
C is the port used to exchange information within the cluster, i.e. the port on which this server talks to the cluster Leader.
D is the port used solely for electing a new leader when the Leader goes down.
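A small sketch of splitting a server.A=B:C:D entry into its four fields with plain parameter expansion:

```shell
# Split a server.A=B:C:D entry into its fields with parameter expansion.
line="server.1=localhost:2887:3887"
id=${line%%=*}; id=${id#server.}   # A: server id, matches data/myid
rest=${line#*=}
host=${rest%%:*}                   # B: the server's address
ports=${rest#*:}
peer_port=${ports%%:*}             # C: Follower<->Leader data exchange port
election_port=${ports#*:}          # D: leader-election port
echo "id=$id host=$host peer=$peer_port election=$election_port"
# id=1 host=localhost peer=2887 election=3887
```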

#autopurge.purgeInterval  Cleaning frequency, in hours. Set it to 1 or a larger integer; the default is 0, which disables the automatic purge.
#autopurge.snapRetainCount  Used together with the parameter above; it specifies how many snapshot files to retain. The default is 3.

2.4 Start and Test

Start:

zookeeper-3.5.2-node1/bin/zkServer.sh start
zookeeper-3.5.2-node2/bin/zkServer.sh start
zookeeper-3.5.2-node3/bin/zkServer.sh start

Test:

[root@paasagento zookeeper-cluster]# zookeeper-3.5.2-node1/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-cluster/zookeeper-3.5.2-node1/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader
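To check all three nodes, it helps to extract just the Mode line from the status output. A sketch using the captured output above as sample input; against a live cluster you would pipe in the real `zkServer.sh status` output instead:

```shell
# Extract the Mode line from `zkServer.sh status` output. The sample text
# is the output captured above; with a live cluster, pipe the real command
# output in instead of the here-string.
status_output='ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-cluster/zookeeper-3.5.2-node1/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader'
mode=$(printf '%s\n' "$status_output" | sed -n 's/^Mode: //p')
echo "$mode"   # leader
```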

Connect

zookeeper-3.5.2-node1/bin/zkCli.sh -server 127.0.0.1:2181
ls /

3 Install Kafka

3.1 Deployment Configuration

mkdir /usr/local/kafka
cp kafka_2.11-0.10.1.1.tgz /usr/local/kafka/
cd /usr/local/kafka
tar zxvf kafka_2.11-0.10.1.1.tgz
cd kafka_2.11-0.10.1.1
vim config/server.properties

broker.id=1  #Unique id of this broker in the cluster, analogous to zookeeper's myid.
#Each broker's id must be a unique positive integer; if the server's IP later changes, keeping broker.id the same leaves consumer state unaffected.
port=9092  #Kafka's default port is 9092.
host.name=192.168.1.172 #The broker binds to this address if set; if unset it binds to all interfaces and publishes one of them to ZK. Usually left unset.
num.network.threads=3 #Maximum number of threads the broker uses to process network requests, usually the number of cpu cores.
num.io.threads=8 #Number of threads the broker uses for disk IO, usually twice the number of cpu cores.
socket.send.buffer.bytes=102400 #Socket send buffer size; data is buffered until a certain size is reached before being sent, which improves performance.
socket.receive.buffer.bytes=102400 #Socket receive buffer size; received data is flushed to disk once it reaches a certain size.
socket.request.max.bytes=104857600 #Maximum size of a request sent to or fetched from Kafka; it must not exceed the java heap size.
log.dirs=/tmp/kafka-logs_1      #Where Kafka stores its data; multiple comma-separated directories (e.g. /data/kafka-logs-1,/data/kafka-logs-2) spread over different disks improve read/write performance.
num.partitions=3 #Default number of partitions per topic, used when none is specified at topic creation; a value given at creation time overrides it.
num.recovery.threads.per.data.dir=1 #Number of threads used for log recovery at startup; the default is 1.
log.retention.hours=168 #Default maximum retention time for messages: 168 hours, i.e. 7 days.
log.segment.bytes=1073741824    #Topic partitions are stored as a series of segment files; this controls each segment's size and can be overridden at topic creation.
log.retention.check.interval.ms=300000 #Every 300,000 ms, check whether logs have exceeded the retention period configured above (log.retention.hours=168) and delete any expired segments.
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
zookeeper.connection.timeout.ms=6000 #ZooKeeper connection timeout.

mv config/server.properties config/server1.properties
cp config/server1.properties config/server2.properties
cp config/server1.properties config/server3.properties
kafka2 (config/server2.properties):

broker.id=2
port=9093
host.name=192.168.1.172
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs_2
num.partitions=3
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
zookeeper.connection.timeout.ms=6000

kafka3 (config/server3.properties):

broker.id=3
port=9094
host.name=192.168.1.172
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs_3
num.partitions=3
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
zookeeper.connection.timeout.ms=6000
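Since the three broker files differ only in broker.id, port and log.dirs, they can be derived from one base file instead of edited by hand, which also avoids the easy mistake of two brokers sharing the same log.dirs. A sketch using an abbreviated base config in a scratch directory:

```shell
# Derive server2/server3.properties from a base file; only broker.id,
# port and log.dirs change, and log.dirs must be unique per broker.
# WORK stands in for the real config directory, and the base file is
# abbreviated to the properties that differ.
WORK=$(mktemp -d)
cat > "$WORK/server1.properties" <<'EOF'
broker.id=1
port=9092
log.dirs=/tmp/kafka-logs_1
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
EOF
for i in 2 3; do
  sed -e "s|^broker.id=.*|broker.id=$i|" \
      -e "s|^port=.*|port=$((9091 + i))|" \
      -e "s|^log.dirs=.*|log.dirs=/tmp/kafka-logs_$i|" \
      "$WORK/server1.properties" > "$WORK/server$i.properties"
done
```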

3.2 Start Server

bin/kafka-server-start.sh config/server1.properties &
bin/kafka-server-start.sh config/server2.properties &
bin/kafka-server-start.sh config/server3.properties &

[root@paasagento kafka_1]# jobs
[1]   Running               bin/kafka-server-start.sh config/server1.properties &
[2]-  Running               bin/kafka-server-start.sh config/server2.properties &
[3]+  Running               bin/kafka-server-start.sh config/server3.properties &

bin/zkCli.sh -server 192.168.1.172:2182
[zk: 192.168.1.172:2182(CONNECTED) 8] ls /
[admin, brokers, cluster, config, consumers, controller, controller_epoch, isr_change_notification, zookeeper]
[zk: 192.168.1.172:2182(CONNECTED) 5] get /brokers/ids/1
{"jmx_port":-1,"timestamp":"1484654956028","endpoints":["PLAINTEXT://192.168.1.172:9092"],"host":"192.168.1.172","version":3,"port":9092}
[zk: 192.168.1.172:2182(CONNECTED) 6] get /brokers/ids/2
{"jmx_port":-1,"timestamp":"1484655055260","endpoints":["PLAINTEXT://192.168.1.172:9093"],"host":"192.168.1.172","version":3,"port":9093}
[zk: 192.168.1.172:2182(CONNECTED) 7] get /brokers/ids/3
{"jmx_port":-1,"timestamp":"1484655071043","endpoints":["PLAINTEXT://192.168.1.172:9094"],"host":"192.168.1.172","version":3,"port":9094}

3.3 Kafka test

Create the topic with three partitions and three replicas:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 3 --topic test_topic

[root@paasagento kafka_1]# bin/kafka-topics.sh --list --zookeeper localhost:2181
test_topic
[root@paasagento kafka_1]# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test_topic                                         
Topic:test_topic        PartitionCount:3        ReplicationFactor:3     Configs:
        Topic: test_topic       Partition: 0    Leader: 1       Replicas: 1,2,3 Isr: 1,2,3
        Topic: test_topic       Partition: 1    Leader: 2       Replicas: 2,3,1 Isr: 2,3,1
        Topic: test_topic       Partition: 2    Leader: 3       Replicas: 3,1,2 Isr: 3,1,2

Produce messages:
bin/kafka-console-producer.sh --broker-list 192.168.1.172:9092,192.168.1.172:9093,192.168.1.172:9094 --topic test_topic
Consume messages:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test_topic
View the topic in zookeeper:
bin/zkCli.sh -server 192.168.1.172:2182

[zk: 192.168.1.172:2182(CONNECTED) 2] get /brokers/topics/test_topic
{"version":1,"partitions":{"2":[3,1,2],"1":[2,3,1],"0":[1,2,3]}}
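The JSON above maps each partition id to its replica list. A quick sketch of counting the partitions from that output without a JSON tool:

```shell
# Count the partitions in the assignment JSON shown above. Each partition
# key is a quoted number followed by ":[", so counting those matches gives
# the partition count without a JSON parser.
topic_json='{"version":1,"partitions":{"2":[3,1,2],"1":[2,3,1],"0":[1,2,3]}}'
partition_count=$(printf '%s' "$topic_json" | grep -o '"[0-9][0-9]*":\[' | wc -l)
partition_count=$((partition_count))   # strip any padding wc adds
echo "partitions: $partition_count"    # partitions: 3
```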

3.4 Log Files

server.log
kafka's runtime log.
state-change.log
kafka keeps its state in zookeeper, so state (such as partition leadership) may switch; those state changes are logged here.
controller.log
Kafka elects one node as the "controller", which is responsible for electing a new partition leader among the remaining replicas when a node goes down. This lets Kafka manage the leader/follower relationships of all partitions efficiently and in batches. If the controller itself goes down, one of the surviving nodes takes over as the new controller.

4 Supervisor Management

4.1 Managing zookeeper

vim bin/zkEnv.sh and add the environment variables at the top:

JAVA_HOME=/usr/local/java/jdk1.8.0_20
export JAVA_HOME

vim /etc/supervisor/zookeeper.conf

[program:zookeeper]
command=/usr/local/zookeeper-3.5.2-alpha/bin/zkServer.sh start-foreground
autostart=true
autorestart=true
startsecs=10
stdout_logfile=/var/log/zookeeper.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10
stdout_capture_maxbytes=1MB
stderr_logfile=/var/log/zookeeper.log
stderr_logfile_maxbytes=1MB
stderr_logfile_backups=10
stderr_capture_maxbytes=1MB

supervisorctl reload
bin/zkServer.sh status
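The supervisor program blocks for zookeeper and kafka are almost identical, so they can be generated from one helper to keep them consistent. A sketch writing into a scratch directory; the command is the zookeeper one from the config above:

```shell
# Generate a supervisor [program:...] block from parameters so the
# zookeeper and kafka entries cannot drift apart. Log-rotation settings
# follow the values used in this section.
write_program() {  # usage: write_program <name> <command> <outfile>
  name=$1; cmd=$2; out=$3
cat > "$out" <<EOF
[program:$name]
command=$cmd
autostart=true
autorestart=true
startsecs=10
stdout_logfile=/var/log/$name.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10
stderr_logfile=/var/log/$name.log
stderr_logfile_maxbytes=1MB
stderr_logfile_backups=10
EOF
}

CONF=$(mktemp -d)   # stand-in for /etc/supervisor
write_program zookeeper \
  "/usr/local/zookeeper-3.5.2-alpha/bin/zkServer.sh start-foreground" \
  "$CONF/zookeeper.conf"
```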

4.2 Management of kafka

vim bin/kafka-run-class.sh and add the environment variables at the top:

JAVA_HOME=/usr/local/java/jdk1.8.0_20
export JAVA_HOME

vim /etc/supervisor/kafka.conf

[program:kafka]
command=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
user=root
autostart=true
autorestart=true
startsecs=10
stdout_logfile=/var/log/kafka.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10
stdout_capture_maxbytes=1MB
stderr_logfile=/var/log/kafka.log
stderr_logfile_maxbytes=1MB
stderr_logfile_backups=10
stderr_capture_maxbytes=1MB

supervisorctl reload

5 Development

Python development will follow in Part 2 and C# in Part 3.

Posted by icesolid on Fri, 22 Mar 2019 09:27:52 -0700