Necessary for Architects: skillfully use Canal to realize asynchronous and decoupled architecture

Keywords: MySQL architecture canal

This paper introduces how to use Canal to realize asynchronous and decoupled architecture. Later, I will write an article to analyze the principle and source code of Canal.

Introduction to Canal

Canal is a middleware used to obtain database changes.
Disguise yourself as a MySQL slave database, pull the main database binlog, analyze and process it. The processing results can be sent to MQ to facilitate other services to obtain database change messages, which is very useful. Some typical uses are described below.

Among them, Canal+MQ as a whole is a data pipeline service from the outside, as shown in the figure below.

Typical use of Canal

Heterogeneous data (e.g. ES, HBase, DB with different routing key s)

Through the adapter provided by Canal, heterogeneous data can be synchronized to ES and HBase without cumbersome data conversion and synchronization. The adapter here is a typical adapter mode, which converts the data into the corresponding format and writes it to the heterogeneous storage system.

Of course, you can also synchronize the data to the DB, or even build a database with piecemeal routing according to different fields.
For example, when placing an order, the order record is divided into database and table by user id, and then with the help of the Canal data channel, an order record divided into database and table by merchant id is constructed for B-end business (for example, merchants query which orders they receive).

Cache refresh

The conventional method of cache refresh is to update the DB first, then delete the cache, and then delay the deletion (i.e. cache side pattern + delayed double deletion). This multi-step operation may fail and the implementation is relatively complex. With the help of Canal cache refresh, the main service and main process do not need to care about the consistency problems such as cache update, so as to ensure the final consistency.

Important business news such as price change

Downstream services can immediately perceive price changes.
The general practice is to modify the price before sending the message. The difficulty here is to ensure that the message is sent successfully and how to deal with it if it is not sent successfully. With Canal, you don't have to worry about message loss at the business level.

Database migration

  • Multi machine room data synchronization
  • Dismantling Library
    Although you can implement double write logic in the code and then process the historical data, the historical data may also be updated, which requires continuous iterative comparison and update. In short, it is very complex.

Real time reconciliation

The routine practice is that regular tasks run the reconciliation logic, with low timeliness, and inconsistency can not be found in time. With Canal, reconciliation logic can be triggered in real time.
The general process is as follows:

  • Receive data change message
  • Write hbase as pipeline record
  • After a period of window time, trigger the comparison and compare the opposite end data

Canal client demo code analysis

The following example is an example of a client connecting to Canal. It is modified from the official github example. The landlord has made some optimizations and added comments to the key code lines. If Canal sends the data change message to MQ, the writing method is different. The difference is that one is to subscribe to Canal and the other is to subscribe to MQ, but the parsing and processing logic are basically the same.


public void process() {
    // Number of pieces processed per batch
    int batchSize = 1024;
    while (running) {
        try {
            // Connect to Canal service
            // Subscribe to data (such as a table)
            while (running) {
                // Batch acquisition of data change records
                Message message = connector.getWithoutAck(batchSize);
                long batchId = message.getId();
                int size = message.getEntries().size();
                if (batchId == -1 || size == 0) {
                    // Unexpected situation, exception handling is required
                } else {
                    // Print data change details

                if (batchId != -1) {
                    // ack with batchId: indicates that the batch processing is completed and the consumption progress on the Canal side is updated
        } catch (Throwable e) {
            logger.error("process error!", e);
            try {
            } catch (InterruptedException e1) {
                // ignore

            // Processing failed, rollback progress
        } finally {
            // Disconnect

private void printEntry(List<Entry> entrys) {
    for (Entry entry : entrys) {
        long executeTime = entry.getHeader().getExecuteTime();
        long delayTime = new Date().getTime() - executeTime;
        Date date = new Date(entry.getHeader().getExecuteTime());
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        // Only care about the type of data change
        if (entry.getEntryType() == EntryType.ROWDATA) {
            RowChange rowChange = null;
            try {
                // Parse data change object
                rowChange = RowChange.parseFrom(entry.getStoreValue());
            } catch (Exception e) {
                throw new RuntimeException("parse event has an error , data:" + entry.toString(), e);

            EventType eventType = rowChange.getEventType();

                new Object[] { entry.getHeader().getLogfileName(),
                        String.valueOf(entry.getHeader().getLogfileOffset()), entry.getHeader().getSchemaName(),
                        entry.getHeader().getTableName(), eventType,
                        String.valueOf(entry.getHeader().getExecuteTime()), simpleDateFormat.format(date),
                        entry.getHeader().getGtid(), String.valueOf(delayTime) });

            // Don't care about queries and DDL changes
            if (eventType == EventType.QUERY || rowChange.getIsDdl()) {
      "ddl : " + rowChange.getIsDdl() + " ,  sql ----> " + rowChange.getSql() + SEP);

            for (RowData rowData : rowChange.getRowDatasList()) {
                if (eventType == EventType.DELETE) {
                    // When the data change type is delete, the column value before the change is printed
                } else if (eventType == EventType.INSERT) {
                    // When the data change type is insert, the changed column value is printed
                } else {
                    // When the data change type is other (i.e. update), the column values before and after the change are printed

// Print column values
private void printColumn(List<Column> columns) {
    for (Column column : columns) {
        StringBuilder builder = new StringBuilder();
        try {
            if (StringUtils.containsIgnoreCase(column.getMysqlType(), "BLOB")
                || StringUtils.containsIgnoreCase(column.getMysqlType(), "BINARY")) {
                // get value bytes
                builder.append(column.getName() + " : "
                               + new String(column.getValue().getBytes("ISO-8859-1"), "UTF-8"));
            } else {
                builder.append(column.getName() + " : " + column.getValue());
        } catch (UnsupportedEncodingException e) {
        builder.append("    type=" + column.getMysqlType());
        if (column.getUpdated()) {
            builder.append("    update=" + column.getUpdated());


Posted by tinker on Fri, 26 Nov 2021 18:27:04 -0800