Read only instance (slave master-slave) delay troubleshooting

Keywords: MySQL

The following methods are applicable to real-time viewing of read-only delay (master-slave delay), that is, the problem can be confirmed only when the delay occurs. Historical delay is not applicable. Parallel replication has been enabled in the following environments.

Chapter 1: how to judge no delay
Step 1: get the master instance information

The primary instance (primary node) is executed under the high permission account:

1,show processlist

2,show master status\G

3,show variables like '%uuid%';

Step 2: read only instance information acquisition

The read-only instance (slave Slave) is executed under the high permission account:

1ï¼ show processlist

2ï¼ show slave status\G

mysql>show slave status\G

*************************** 1. row ***************************

           Slave_IO_State: Waiting for master to send event

              Master_Host: 10.151.83.26

              Master_User: replicator

              Master_Port: 3106

            Connect_Retry: 60

          Master_Log_File: mysql-bin.000379

      Read_Master_Log_Pos: 53064

           Relay_Log_File: slave-relay.000375

            Relay_Log_Pos: 53237

    Relay_Master_Log_File: mysql-bin.000379

         Slave_IO_Running: Yes

        Slave_SQL_Running: Yes

          Replicate_Do_DB:

      Replicate_Ignore_DB:

       Replicate_Do_Table:

   Replicate_Ignore_Table:

  Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

               Last_Errno: 0

               Last_Error:

             Skip_Counter: 0

      Exec_Master_Log_Pos: 53064

          Relay_Log_Space: 53527

          Until_Condition: None

           Until_Log_File:

            Until_Log_Pos: 0

       Master_SSL_Allowed: No

       Master_SSL_CA_File:

       Master_SSL_CA_Path:

          Master_SSL_Cert:

        Master_SSL_Cipher:

           Master_SSL_Key:

    Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

            Last_IO_Errno: 0

            Last_IO_Error:

           Last_SQL_Errno: 0

           Last_SQL_Error:

Replicate_Ignore_Server_Ids:

         Master_Server_Id: 999098802

              Master_UUID: a23bf85f-f20b-11e8-8268-7cd30abda240

         Master_Info_File: mysql.slave_master_info

                SQL_Delay: 0

      SQL_Remaining_Delay:

  Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

       Master_Retry_Count: 86400

              Master_Bind:

  Last_IO_Error_Timestamp:

 Last_SQL_Error_Timestamp:

      Last_SQL_Error_Gtid:

           Master_SSL_Crl:

       Master_SSL_Crlpath:

       Retrieved_Gtid_Set: a23bf85f-f20b-11e8-8268-7cd30abda240:391216-838127

        Executed_Gtid_Set: 937ca3af-f20b-11e8-9447-7cd30ab8a5d8:1-178841,

a23bf85f-f20b-11e8-8268-7cd30abda240:1-838127

            Auto_Position: 1

     Replicate_Rewrite_DB:

             Channel_Name:

       Master_TLS_Version:

3,show variables like '%uuid%';

Step 3: comparison and confirmation

1ï ¼ Confirm whether the process status of the primary instance is master has sent all binlog to slave; Waiting for more updates (marked by the red box in the picture). If it is in this state, it indicates that the master has sent all binlogs to slave, and the primary instance is normal

2ï ¼ Confirm whether the process status of the read-only instance (slave Slave Library) is: Waiting for master to send event, and

Slave has read all relay log; waiting for more updates. If the status is changed, it means that the slave and the binlog sent by the master are applied and waiting for the master to continue sending

3ï ¼ Compare the executed results obtained by the read-only instance executing show slave status\G_ Gtid_ Set and the master instance execute show master status\G_ Gtid_ Whether the sets are consistent or similar. If they are consistent, it indicates that the transactions executed by the primary instance and the read-only instance have also been executed.

By comparing the above three steps, it can be confirmed that there is no delay between the master instance and the read-only instance (slave Slave Library)

Chapter II causes of delay and Solutions

According to the conclusion in Chapter 1, it can be concluded that there is no delay in the read-only instance. When there is a read-only delay, how to confirm the cause and solve it, please refer to the following specific examples:

2.1: mass update
2.1.1 reproduction method:

   The primary instance performs update operations such as mass writes: insert into MOCK_DATA3 select * from aaaaa.MOCK_DATA,among MOCK_DATA Table quantity 1000 w Line, resulting in a delay

2.1.2 occurrence of delay:

show processlist result:

21 system user Connect 41786 Waiting for master to send event

22 system user Connect Waiting for Slave Workers to free pending events

23 system user Connect 236 Executing event

24 system user Connect 571 Waiting for an event from Coordinator

25 system user Connect 41786 Waiting for an event from Coordinator

26 system user Connect 41786 Waiting for an event from Coordinator

27 system user Connect 41786 Waiting for an event from Coordinator

28 system user Connect 41786 Waiting for an event from Coordinator

29 system user Connect 41786 Waiting for an event from Coordinator

30 system user Connect 41786 Waiting for an event from Coordinator

show slave status\G result:

          Master_SSL_Key :

          Seconds_Behind_Master : 0

          Master_SSL_Verify_Server_Cert : No

          Last_IO_Errno : 0

          Last_IO_Error :

          Last_SQL_Errno : 0

          Last_SQL_Error :

          Replicate_Ignore_Server_Ids :

          Master_Server_Id : 999098802

          Master_UUID : a23bf85f-f20b-11e8-8268-7cd30abda240

          Master_Info_File : mysql.slave_master_info

          SQL_Delay : 0

          SQL_Remaining_Delay : undefined

          Slave_SQL_Running_State : Waiting for Slave Workers to free pending events

          Master_Retry_Count : 86400

          Master_Bind :

          Last_IO_Error_Timestamp :

          Last_SQL_Error_Timestamp :

          Last_SQL_Error_Gtid :

          Master_SSL_Crl :

          Master_SSL_Crlpath :

          Retrieved_Gtid_Set : a23bf85f-f20b-11e8-8268-7cd30abda240:1-861466

           Executed_Gtid_Set : 937ca3af-f20b-11e8-9447-7cd30ab8a5d8:1-178841,a23bf85f-f20b-11e8-8268-7cd30abda240:1-861455

show engine innodb status\G result:

—TRANSACTION 2819387, ACTIVE 65 sec inserting

mysql tables in use 1, locked 1

1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 5919707

MySQL thread id 23, OS thread handle 140536255158016, query id 102150 Executing event

2.1.3 solutions

It is recommended to wait for the read-only instance to apply the update of the main database, and it will be restored automatically.

2.2: update without primary key
2.2.1 reproduction method:

The primary instance performs update without primary key for more data: MySQL > Update mock_ DATA set id=90 where id<10;, Cause delay

2.2.2 occurrence of delay:

show processlist result:

show slave status\G result:

2.2.3 solutions

   It is suggested to increase the effective index to improve the update efficiency.

2.3: blocking (ddl)
2.3.1 reproduction method

The read-only instance starts an explicit transaction, executes the following query, and does not commit the transaction:

The main instance executes truncate table mock_data, read-only delay generation

2.3.2 delay generation

show processlist result:

show slave status\G result:

2.3.3 solutions

   Such problems are generally caused by other affairs ddl Operation blocked, recommended kill Drop blocked transactions,reference resources: https://help.aliyun.com/knowledge_detail/41723.html?spm=5176.11065259.1996646101.searchclickresult.3c2e73bf9s0jmI

2.4: Waiting for Slave Worker to release partition
reference resources: https://yq.aliyun.com/articles/188482?spm=a2c4e.11155435.0.0.120f63b5YRAxtQ

Posted by JoeyT2007 on Sat, 30 Oct 2021 10:37:38 -0700