Apache Kafka Study Notes


Quick start

Installation

Run a single Kafka broker with Docker
docker pull wurstmeister/zookeeper
docker pull wurstmeister/kafka
docker run -d --name zookeeper -p 2181:2181  wurstmeister/zookeeper
docker run -d --name kafka -p 9092:9092 -e KAFKA_BROKER_ID=0 -e KAFKA_ZOOKEEPER_CONNECT=<host-ip>:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:9092 -e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092 wurstmeister/kafka
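Note: KAFKA_ZOOKEEPER_CONNECT must point at the host machine's IP, since localhost inside the container would refer to the container itself. KAFKA_ADVERTISED_LISTENERS is the address the broker hands back to clients; replace 127.0.0.1 with the host IP if clients connect from other machines.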

Commands

Start ZooKeeper
> bin/zookeeper-server-start.sh config/zookeeper.properties
Start Kafka
> bin/kafka-server-start.sh config/server.properties
Create a topic
> bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
List topics
> bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Send messages
> bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test

Consume messages
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

Development

Gradle dependencies
dependencies {
    implementation 'org.springframework.kafka:spring-kafka'
    implementation 'org.apache.kafka:kafka-clients:2.4.0'
    testImplementation('org.springframework.boot:spring-boot-starter-test') {
        exclude group: 'org.junit.vintage', module: 'junit-vintage-engine'
    }
    testImplementation 'org.springframework.kafka:spring-kafka-test'
}
Configuration parameters (application.yml)
spring:
  kafka:
    bootstrap-servers: 127.0.0.1:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      retries: 0
      batch-size: 16384
      buffer-memory: 33554432
    consumer:
      group-id: dev-consumer-group
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      auto-offset-reset: earliest
      enable-auto-commit: true
      auto-commit-interval: 100
Java Config
import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.TopicBuilder;

@EnableKafka
@Configuration
public class KafkaConfig {

    // Declaring a NewTopic bean lets Spring create the topic on startup if it does not exist
    @Bean
    public NewTopic sampleTopic() {
        return TopicBuilder.name(TopicConsts.SAMPLE_TOPIC)
                .build();
    }
}
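TopicConsts is referenced but not shown in these notes; a minimal sketch, with the actual topic names as assumptions:

public final class TopicConsts {

    // topic names are assumptions; the original notes do not show them
    public static final String SAMPLE_TOPIC = "sample-topic";
    public static final String USER_TOPIC = "user-topic";

    private TopicConsts() {
    }
}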
Send messages
import java.util.UUID;

import com.alibaba.fastjson.JSON;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Slf4j
@Component
public class UserProducer {

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void send(Object message) {
        // send(topic, key, value): the first argument is the topic, not the key
        kafkaTemplate.send(TopicConsts.SAMPLE_TOPIC, UUID.randomUUID().toString(), JSON.toJSONString(message));
    }

}
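KafkaTemplate.send() is asynchronous; in spring-kafka 2.x it returns a ListenableFuture, so a result callback can be attached. A minimal sketch of a method that could be added to UserProducer (sendWithCallback is a hypothetical name):

import org.springframework.kafka.support.SendResult;
import org.springframework.util.concurrent.ListenableFuture;

public void sendWithCallback(String key, String json) {
    ListenableFuture<SendResult<String, String>> future =
            kafkaTemplate.send(TopicConsts.SAMPLE_TOPIC, key, json);
    // log the assigned offset on success, the exception on failure
    future.addCallback(
            result -> log.info("sent, offset={}", result.getRecordMetadata().offset()),
            ex -> log.error("send failed", ex));
}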
Listen for messages
import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Slf4j
@Component
public class UserConsumer {

    // receive only the deserialized value
    @KafkaListener(topics = TopicConsts.SAMPLE_TOPIC)
    public void receive(String message) {
        log.info("message {}", message);
    }

    // receive the full record, including key, partition and offset
    @KafkaListener(topics = TopicConsts.USER_TOPIC)
    public void receive2(ConsumerRecord<String, String> record) {
        log.info("K:{} V:{}", record.key(), record.value());
    }

}

Core concepts

Design

File
Persistence / asynchronous disk flushing
  • Memory is scarce and the data volume is huge, so Kafka does not keep much data in memory, even though memory access would be faster
  • All data is written to the file system, but it does not have to be flushed to disk immediately; the OS page cache absorbs the writes
Lazy deletion
  • Because disk access is sequential, retaining large amounts of data does not degrade performance, so consumed data is kept for a configurable retention period and only then deleted
Zero-copy
  • Normal file-to-network transfer process
    1. The operating system reads data from disk into the page cache in kernel space
    2. The application reads data from kernel space into a user-space buffer
    3. The application writes the data back to a socket buffer in kernel space
    4. The operating system copies data from the socket buffer to the NIC buffer and sends it over the network
  • Linux sendfile
    1. With sendfile, the repeated copies are avoided: the OS sends data directly from the page cache to the network (see the Java sketch below)
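In Java this corresponds to FileChannel.transferTo, which maps to sendfile on Linux and is what Kafka uses to serve log segments. A minimal sketch, with the file name and target address as placeholder assumptions:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws IOException {
        // placeholder file and address; not from the original notes
        try (FileChannel file = FileChannel.open(Paths.get("segment.log"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9000))) {
            long position = 0;
            long size = file.size();
            while (position < size) {
                // transferTo lets the OS move bytes from the page cache to the NIC
                // without copying them through user space
                position += file.transferTo(position, size - position, socket);
            }
        }
    }
}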

Producer

Load balancing: the producer decides which partition each record goes to, either round-robin or by hashing the record key, spreading load across brokers
Asynchronous batching: records are buffered in memory and sent in batches, trading a little latency for much higher throughput (see the sketch below)
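A minimal sketch with the plain kafka-clients producer (version 2.4.0 from the Gradle section); the batch.size and linger.ms values here are illustrative assumptions:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // max bytes per batch
        props.put(ProducerConfig.LINGER_MS_CONFIG, 5);      // wait up to 5 ms to fill a batch
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // records with the same key are hashed to the same partition
            producer.send(new ProducerRecord<>("test", "user-1", "hello"));
        }
    }
}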

Consumer

Consumption model
  • Push model: the broker pushes messages to consumers, which risks overwhelming slow consumers
  • Pull model: consumers pull batches of records from the broker at their own pace; this is the model Kafka uses (see the poll loop below)
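A minimal poll-loop sketch with the plain kafka-clients consumer, reusing the group id and topic from the sections above:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PullLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "dev-consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                // the consumer pulls a batch at its own pace; poll blocks up to 500 ms
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("offset=%d key=%s value=%s%n", r.offset(), r.key(), r.value());
                }
            }
        }
    }
}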

Distributed model

  • Leader replica: handles all reads and writes for its partition
  • Follower replica: replicates the leader's log and is promoted if the leader fails (the sketch after this list shows how to inspect the assignment)
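Partition leadership and the in-sync replica set can be inspected with the admin client. A minimal sketch, assuming the test topic created earlier:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class DescribeReplicas {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singletonList("test"))
                    .all().get().get("test");
            // each partition reports its leader and its in-sync replicas (ISR)
            desc.partitions().forEach(p ->
                    System.out.printf("partition=%d leader=%s isr=%s%n",
                            p.partition(), p.leader(), p.isr()));
        }
    }
}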

Kafka

Replicated log
Partition model
Replica management
Log compaction
Log cleanup
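These subtopics are only listed in the original notes. As one concrete example, log compaction keeps the latest record per key and is enabled per topic via cleanup.policy=compact; a minimal admin-client sketch, with the topic name as an assumption:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("user-snapshots", 1, (short) 1)
                    // keep only the latest record per key instead of deleting by time
                    .configs(Collections.singletonMap("cleanup.policy", "compact"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}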
Explanation of terms
Term              Explanation
producer          Client that publishes records to topics
consumer          Client that subscribes to topics and processes records
connector         Kafka Connect component that moves data between topics and external systems
stream processor  Application that consumes input topics and produces output topics (Kafka Streams)
broker            Kafka server that stores records and serves producers and consumers
topic             Named category to which records are published
partition         Ordered, immutable sequence of records within a topic
partitioned log   A topic's data split across partitions, each an append-only log
commit log        The append-only log structure in which Kafka stores records
offset            Sequential ID that uniquely identifies each record within a partition
zero-copy         Sending data from the page cache to the network without copying through user space
