Exploring Kafka from a PHP perspective: implementing a simple producer
Code first
    public function actionProducer()
    {
        $conf = new \RdKafka\Conf();
        $conf->set('metadata.broker.list', 'broker address');

        /**
         * Kafka delivery report callback. $message looks like this:
         * RdKafka\Message Object
         * (
         *     [err] => 0                   // 0 means the delivery succeeded
         *     [topic_name] => topic_name   // Name of the topic the message was sent to
         *     [timestamp] => 1637121418036 // Time of delivery
         *     [partition] => 8             // Partition the message was sent to
         *     [payload] => Message 100     // Delivered message
         *     [len] => 11                  // Length of the message
         *     [key] =>                     // Optional message key; it can carry a business meaning (e.g. key1 = create, key2 = update)
         *     [offset] => 1                // Offset
         *     [headers] =>
         *     [opaque] =>
         * )
         */
        $conf->setDrMsgCb(function ($kafka, $message) {
            if ($message->err) {
                echo 'Message permanent failure'."\n";
            } else {
                echo 'Message sent successfully'."\n";
            }
        });

        $producer = new \RdKafka\Producer($conf);
        $topic = $producer->newTopic("topic_name");

        for ($i = 0; $i < 10; $i++) {
            // RD_KAFKA_PARTITION_UA = -1
            $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Message {$i}");
        }

        // Serve delivery callbacks until the out queue is empty
        while ($producer->getOutQLen() > 0) {
            $producer->poll(1000);
        }

        for ($flushRetries = 0; $flushRetries < 10; $flushRetries++) {
            $result = $producer->flush(500);
            // RD_KAFKA_RESP_ERR_NO_ERROR = 0
            if (RD_KAFKA_RESP_ERR_NO_ERROR === $result) {
                break;
            }
        }

        if (RD_KAFKA_RESP_ERR_NO_ERROR !== $result) {
            throw new \RuntimeException('Unable to flush, messages may be lost!');
        }
    }
Code breakdown
Module 1: configuration
    $conf = new \RdKafka\Conf();
    $conf->set('metadata.broker.list', 'broker address');
    $conf->setDrMsgCb(function ($kafka, $message) {
        if ($message->err) {
            echo 'Message permanent failure'."\n";
        } else {
            echo 'Message sent successfully'."\n";
        }
    });
- The first statement creates a new Kafka configuration instance.
- The second statement sets the Kafka broker address. It is optional at this point: you can also add brokers later with the \RdKafka\Producer::addBrokers() method.
- The third statement registers the delivery report callback. You must set it if you need to know the delivery result. It is called after a message has been delivered or has failed, and ==it only runs when used together with the poll() method==. The message object passed to the callback looks like this:
    RdKafka\Message Object
    (
        [err] => 0                   // 0 means the delivery succeeded
        [topic_name] => topic_name   // Name of the topic the message was sent to
        [timestamp] => 1637121418036 // Time of delivery
        [partition] => 8             // Partition the message was sent to
        [payload] => Message 100     // Delivered message
        [len] => 11                  // Length of the message
        [key] =>                     // Optional message key; it can carry a business meaning (e.g. key1 = create, key2 = update)
        [offset] => 1                // Offset
        [headers] =>
        [opaque] =>
    )
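To make the fields above more concrete, here is a minimal sketch of a formatter for delivery reports. The name `describeDelivery()` and the log format are my own assumptions, not part of php-rdkafka; the function only reads the properties shown in the dump, so it can be tried without a broker.

```php
<?php
// Hypothetical helper: turns a delivery report into one log line.
// It only reads the public properties listed in the dump above,
// so any object with those properties works.
function describeDelivery(object $message): string
{
    if ($message->err !== 0) {
        // rd_kafka_err2str() ships with php-rdkafka; fall back to the raw
        // code when the extension is not loaded.
        $reason = function_exists('rd_kafka_err2str')
            ? rd_kafka_err2str($message->err)
            : 'error code ' . $message->err;
        return sprintf('FAILED %s: %s', $message->topic_name, $reason);
    }

    return sprintf(
        'OK %s[%d] @ offset %d: %s',
        $message->topic_name,
        $message->partition,
        $message->offset,
        $message->payload
    );
}

// Sample report mirroring the dump above (values are illustrative).
$report = (object) [
    'err' => 0,
    'topic_name' => 'topic_name',
    'partition' => 8,
    'payload' => 'Message 100',
    'offset' => 1,
];

echo describeDelivery($report), "\n"; // OK topic_name[8] @ offset 1: Message 100
```

In a real producer it could be wired in with `$conf->setDrMsgCb(function ($kafka, $message) { echo describeDelivery($message), "\n"; });`.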
Module 2: producing (the messages may not be pushed yet)
    $producer = new \RdKafka\Producer($conf);
    // $producer->addBrokers('broker address');
    $topic = $producer->newTopic("topic_name");

    for ($i = 0; $i < 10; $i++) {
        // RD_KAFKA_PARTITION_UA = -1
        $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Message {$i}");
    }
- The first statement creates a producer instance. Its constructor takes a Conf object, that is, the \RdKafka\Conf built in module 1.
- The second (commented-out) statement does the same job as the broker setting in the second statement of module 1. Use either one of the two.
- The third statement creates a topic.
The official documentation says:
Creates a new topic instance for topic_name.
- The fourth statement produces and sends the data. ==This method is very important.==
The official documentation says:
Produce and send a single message
Note:
Since producing is asynchronous, you should call flush before you destroy the producer. Otherwise, any outstanding messages will be silently discarded.
==If the code ends here, there is roughly a 99% chance that messages are lost. Because producing is asynchronous, the script can finish before the messages have actually been delivered.==
In my experiment I made the script sleep for 2 seconds, and the messages were in fact pushed. With a large number of messages, however, 2 seconds may not be enough. I have not tested that case.
    $producer = new \RdKafka\Producer($conf);
    // $producer->addBrokers('broker address');
    $topic = $producer->newTopic("topic_name");

    for ($i = 0; $i < 10; $i++) {
        // RD_KAFKA_PARTITION_UA = -1
        $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Message {$i}");
    }

    sleep(2); // Let the script sleep so the background delivery has time to run
Module 3: confirming delivery
    while ($producer->getOutQLen() > 0) {
        $producer->poll(1000);
    }

    for ($flushRetries = 0; $flushRetries < 10; $flushRetries++) {
        $result = $producer->flush(500);
        // RD_KAFKA_RESP_ERR_NO_ERROR = 0
        if (RD_KAFKA_RESP_ERR_NO_ERROR === $result) {
            break;
        }
    }

    if (RD_KAFKA_RESP_ERR_NO_ERROR !== $result) {
        throw new \RuntimeException('Unable to flush, messages may be lost!');
    }
Part I
- The while loop condition: $producer->getOutQLen()
The official documentation says the following.
poll(): Polls for events, cause application provided callbacks to be called.
Note:
An application using a sub-class of RdKafka should make sure to call poll() at regular intervals to serve any queued callbacks waiting to be called.
getOutQLen(): Returns the number of messages in the out queue.
- The loop body: $producer->poll(). If poll() is never called, the delivery callback registered in module 1 is never executed.
==As for the argument of poll(), it is the maximum time, in milliseconds, that the call blocks while waiting for events. With poll(1000), if no delivery report arrives within 1 second the call simply returns. The pending callback is not discarded; a later poll() or flush() will serve it. If anything here is wrong, please correct me.==
The official documentation describes the return value as:
Returns the number of events served.
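The while loop above never gives up if the broker stays unreachable. As a sketch of one way to bound it, the helper below polls until the out queue is empty or a deadline passes. `drainQueue()`, its parameters and the `FakeQueueProducer` stand-in are hypothetical names introduced for illustration; the helper only relies on getOutQLen() and poll(), which \RdKafka\Producer provides.

```php
<?php
// Hypothetical helper: poll until the out queue is empty or a deadline passes.
// $producer only needs getOutQLen() and poll(), as \RdKafka\Producer has.
function drainQueue($producer, int $maxWaitMs = 10000, int $pollMs = 100): int
{
    $deadline = microtime(true) + $maxWaitMs / 1000;

    while ($producer->getOutQLen() > 0 && microtime(true) < $deadline) {
        // Blocks for at most $pollMs; queued delivery callbacks run in here.
        $producer->poll($pollMs);
    }

    // Messages still waiting; 0 means the queue was drained in time.
    return $producer->getOutQLen();
}

// Stand-in for \RdKafka\Producer so the sketch runs without a broker:
// each poll() "delivers" one queued message.
final class FakeQueueProducer
{
    private int $queued;

    public function __construct(int $queued)
    {
        $this->queued = $queued;
    }

    public function getOutQLen(): int
    {
        return $this->queued;
    }

    public function poll(int $timeoutMs): int
    {
        if ($this->queued === 0) {
            return 0; // nothing to serve
        }
        $this->queued--;
        return 1; // one event served
    }
}

echo drainQueue(new FakeQueueProducer(3)), "\n"; // 0
```

A non-zero return value tells the caller how many messages were still queued when the deadline hit, which is a reasonable point to log a warning before moving on to flush().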
Part II
- For the number of iterations of the for loop, I suggest using the $producer->getOutQLen() method to decide it.
- The flush() method in the loop body
==As for the flush() method, my personal understanding is:== this method is equivalent to a thorough poll(). Even if poll() was not called beforehand, flush() calls poll() for us, so that all messages get pushed. A return value of 0 means every queued message has been handled (whether each one succeeded is reported by the delivery callback). Any other value means messages may be lost.
The official documentation says:
Wait until all outstanding produce requests, et.al, are completed. This should typically be done prior to destroying a producer instance to make sure all queued and in-flight produce requests are completed before terminating. This function will call poll() and thus trigger callbacks.
In case of success returns RD_KAFKA_RESP_ERR_NO_ERROR, in case of timeout RD_KAFKA_RESP_ERR__TIMED_OUT and if not called on a producer instance RD_KAFKA_RESP_ERR__NOT_IMPLEMENTED.
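The retry loop from module 3 can be wrapped in a small function so that `$result` never leaks out of the loop. This is only a sketch: `flushOrFail()` and the `SlowFlushProducer` stand-in are hypothetical names, and the literal 0 corresponds to RD_KAFKA_RESP_ERR_NO_ERROR as noted in the code comments earlier.

```php
<?php
// Hypothetical helper wrapping the flush retry loop from module 3.
// 0 is RD_KAFKA_RESP_ERR_NO_ERROR, as noted in the comments of the original code.
function flushOrFail($producer, int $retries = 10, int $timeoutMs = 500): int
{
    for ($attempt = 1; $attempt <= $retries; $attempt++) {
        if ($producer->flush($timeoutMs) === 0) {
            return $attempt; // number of flush() calls it took
        }
    }

    throw new \RuntimeException('Unable to flush, messages may be lost!');
}

// Stand-in for \RdKafka\Producer so the sketch runs without a broker:
// flush() "times out" a fixed number of times before succeeding.
final class SlowFlushProducer
{
    private int $timeoutsLeft;

    public function __construct(int $timeouts)
    {
        $this->timeoutsLeft = $timeouts;
    }

    public function flush(int $timeoutMs): int
    {
        // -185 is just a non-zero code standing in for RD_KAFKA_RESP_ERR__TIMED_OUT.
        return $this->timeoutsLeft-- > 0 ? -185 : 0;
    }
}

echo flushOrFail(new SlowFlushProducer(2)), "\n"; // 3
```

With a real producer the call would simply be `flushOrFail($producer);` right before the script ends.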
The above is my analysis of the most basic producer code, combined with the official documentation. If anything is wrong, I hope you will correct me!
Rdkafka documentation: arnaud.le-blanc.net/php-rdkafka-do...