I. Functionality
One Flume agent monitors a log file and sends each new line to Kafka; a second Flume agent then consumes that data from Kafka.
II. Implementation steps
1. Environment
Flume 1.6.0
Kafka 0.8.2.1 (kafka_2.10-0.8.2.1, the Scala 2.10 build)
ZooKeeper 3.4.5
2. Configuration file test1_1.6.conf: Flume monitors the log and sends the data to Kafka
[reference: http://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.7.0/FlumeUserGuide.html]
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /opt/datas/access.log
a1.sources.r1.channels = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = beifeng1
a1.sinks.k1.brokerList = bigdata.ibeifeng.com:9092,bigdata.ibeifeng.com:9093
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1
3. Configuration file test2_1.6.conf: Flume consumes the data from Kafka
[reference: http://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.7.0/FlumeUserGuide.html]
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Configure the source (Kafka source)
a2.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a2.sources.r1.channels = c1
a2.sources.r1.topic = beifeng1
a2.sources.r1.groupId = flume
a2.sources.r1.zookeeperConnect = localhost:2181/kafka08

# Configure the channel
a2.channels.c1.type = memory

# Configure the sink (logger sink, prints events to the agent's log)
a2.sinks.k1.type = logger
a2.sinks.k1.channel = c1
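The two agents only connect if the sink in test1_1.6.conf and the source in test2_1.6.conf name the same topic. As a rough sanity check, the sketch below reads both values and compares them. conf_value and same_topic are hypothetical helper names, not part of Flume, and the sketch assumes plain "key = value" lines with no line continuations.

```shell
# Hypothetical helper (not part of Flume): print the value of one
# "key = value" property from a properties file, ignoring comment lines.
conf_value() {
  # $1 = properties file, $2 = exact property key
  awk -F= -v key="$2" '
    /^[ \t]*#/ { next }
    {
      k = $1
      gsub(/^[ \t]+|[ \t]+$/, "", k)          # trim the key
      if (k == key) {
        v = substr($0, index($0, "=") + 1)    # everything after the first "="
        gsub(/^[ \t]+|[ \t]+$/, "", v)        # trim the value
        print v
      }
    }' "$1"
}

# Hypothetical helper: succeed only if the Kafka sink of agent a1 and the
# Kafka source of agent a2 point at the same topic.
same_topic() {
  # $1 = producer-side conf (a1), $2 = consumer-side conf (a2)
  [ "$(conf_value "$1" a1.sinks.k1.topic)" = "$(conf_value "$2" a2.sources.r1.topic)" ]
}

# Usage, run from the Flume home directory:
#   same_topic conf/test1_1.6.conf conf/test2_1.6.conf && echo "topics match"
```

The check is purely textual; it does not confirm that the topic exists in Kafka.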
4. Create the Kafka topic beifeng1
bin/kafka-topics.sh --create --topic beifeng1 --zookeeper bigdata.ibeifeng.com:2181/kafka08 --partitions 5 --replication-factor 2
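To confirm that the topic was created with the expected partitions and replicas, kafka-topics.sh can also describe it. The wrapper below is only a sketch: describe_topic and the KAFKA_TOPICS override are hypothetical names introduced here, and the default path assumes the command is run from the Kafka home directory.

```shell
# Hypothetical wrapper: describe a topic using the same ZooKeeper address
# that was used to create it.
# KAFKA_TOPICS is an assumed override for the script path; it defaults to
# bin/kafka-topics.sh relative to the Kafka home directory.
describe_topic() {
  # $1 = topic name, $2 = ZooKeeper connect string (including any chroot)
  "${KAFKA_TOPICS:-bin/kafka-topics.sh}" --describe --topic "$1" --zookeeper "$2"
}

# Usage:
#   describe_topic beifeng1 bigdata.ibeifeng.com:2181/kafka08
```

A replication factor of 2 needs both brokers to be running when the topic is created.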
III. Test
1. Start ZooKeeper
cd /opt/modules/zookeeper-3.4.5
bin/zkServer.sh start
2. Start the two Kafka brokers (each command runs in the foreground, so use a separate terminal for each)
bin/kafka-server-start.sh config/server.properties
bin/kafka-server-start.sh config/server1.properties
3. Start the Flume agent (test1_1.6.conf) that monitors the log and sends the data to Kafka
bin/flume-ng agent --name a1 --conf ./conf/ --conf-file ./conf/test1_1.6.conf -Dflume.root.logger=INFO,console
4. Start the Flume agent (test2_1.6.conf) that consumes the data from Kafka
bin/flume-ng agent --name a2 --conf ./conf/ --conf-file ./conf/test2_1.6.conf -Dflume.root.logger=INFO,console
5. Start a Kafka console consumer
bin/kafka-console-consumer.sh --topic beifeng1 --zookeeper bigdata.ibeifeng.com:2181/kafka08
6. Append data to the test file
echo "liuming gerry tom" >> /opt/datas/access.log
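A single line is easy to miss in the console output, so it can help to append several numbered lines and watch them arrive in order at the console consumer and in the a2 agent's log. The function below is a minimal sketch; append_test_lines and the "test-line-N" prefix are made up for this example.

```shell
# Hypothetical helper: append a given number of numbered test lines to a log file.
append_test_lines() {
  # $1 = log file to append to, $2 = number of lines
  i=1
  while [ "$i" -le "$2" ]; do
    echo "test-line-$i liuming gerry tom" >> "$1"
    i=$((i + 1))
  done
}

# Usage, writing to the file tailed by agent a1:
#   append_test_lines /opt/datas/access.log 5
```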