Failed on my_net_write()

Keywords: Attribute MySQL SQL network

Introduction

Log error

2018-07-16T16:48:58.391994+08:00 77 [Note] Aborted connection 77 to db: 'unconnected' user: 'ashe' host: '127.0.0.1' (Failed on my_net_write())

mysqld is a multi-threaded client/server network application: it reads and writes data over the network, so a write can fail. When this message appears in the MySQL error log, it means that mysqld failed to send a network packet to the client. In a replication setup the client is the slave, so the message means that the master failed to write a network packet while pushing binlog events to the slave.

Demonstration of TCP congestion

The following steps demonstrate TCP congestion caused by the slave pausing its reading of network packets during master-slave replication:

  • On the slave, use gdb to set a breakpoint on the read_event function. Once the breakpoint is hit, the slave stops reading network packets.

  • Keep writing to the master while observing the TCP connections.

  • Check the connections and the error log on the master.
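The mechanism behind these steps can be reproduced without MySQL. The following is a minimal sketch, not MySQL code: it assumes a POSIX system, uses a local `socketpair` as a stand-in for the master-slave TCP connection, and the helper name `write_until_stalled` is made up for illustration. One end never reads, so the kernel buffers fill up and the writer is eventually refused. The sketch uses a non-blocking socket so that the refusal shows up immediately as `EAGAIN`; in mysqld the wait for a write is instead bounded by the `net_write_timeout` setting.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Outcome of writing to a peer that never reads.
struct StallResult {
  size_t bytes_accepted;  // bytes the kernel buffered before refusing more
  int last_errno;         // errno reported by the write that failed
};

// Hypothetical helper (not part of MySQL): keep writing to a connected
// stream socket whose peer never reads, until the kernel refuses more data.
StallResult write_until_stalled() {
  int fds[2];
  int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
  assert(rc == 0);
  (void)rc;

  // Non-blocking mode: a full buffer yields EAGAIN instead of blocking.
  int flags = fcntl(fds[0], F_GETFL, 0);
  fcntl(fds[0], F_SETFL, flags | O_NONBLOCK);

  char chunk[4096] = {0};
  StallResult result{0, 0};
  for (;;) {
    // fds[1] plays the suspended slave: nobody ever reads from it.
    ssize_t n = write(fds[0], chunk, sizeof(chunk));
    if (n < 0) {
      result.last_errno = errno;
      break;
    }
    result.bytes_accepted += static_cast<size_t>(n);
  }
  close(fds[0]);
  close(fds[1]);
  return result;
}
```

Calling `write_until_stalled()` returns how many bytes the kernel accepted before the buffers filled, which corresponds to the growing send queue one would expect to see on the master's connection while the slave is stopped in gdb.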

View master dump thread logic

We need to understand how the master's dump thread exits when sending the binlog fails. Following the message in the error log, we can locate the relevant code.
The error is raised at the following location:

inline int Binlog_sender::send_packet()
{
  DBUG_ENTER("Binlog_sender::send_packet");
  DBUG_PRINT("info",
             ("Sending event of type %s", Log_event::get_type_str(
                (Log_event_type)m_packet.ptr()[1 + EVENT_TYPE_OFFSET])));
  // We should always use the same buffer to guarantee that the reallocation
  // logic is not broken.
  if (DBUG_EVALUATE_IF("simulate_send_error", true,
                       my_net_write(
                         m_thd->get_protocol_classic()->get_net(),
                         (uchar*) m_packet.ptr(), m_packet.length())))
  {
    set_unknow_error("Failed on my_net_write()");
    DBUG_RETURN(1);
  }
  // ... (rest of the function omitted in this excerpt)
}

The call stack is:

(gdb) bt
#0  Binlog_sender::send_packet (this=0x7fea741655d0) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:1158
#1  0x000000000190f74e in Binlog_sender::send_packet_and_flush (this=0x7fea741655d0) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:1182
#2  0x000000000190e181 in Binlog_sender::send_heartbeat_event (this=0x7fea741655d0, log_pos=504) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:1143
#3  0x000000000190ee01 in Binlog_sender::wait_with_heartbeat (this=0x7fea741655d0, log_pos=504) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:633
#4  0x000000000190ecd7 in Binlog_sender::wait_new_events (this=0x7fea741655d0, log_pos=504) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:599
#5  0x000000000190e938 in Binlog_sender::get_binlog_end_pos (this=0x7fea741655d0, log_cache=0x7fea74165020) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:365
#6  0x000000000190c5e0 in Binlog_sender::send_binlog (this=0x7fea741655d0, log_cache=0x7fea74165020, start_pos=123) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:313
#7  0x000000000190c1b4 in Binlog_sender::run (this=0x7fea741655d0) at /data/mysql-server-explain_ddl/sql/rpl_binlog_sender.cc:225

The non-zero return value is passed back layer by layer until it reaches Binlog_sender::run.
Roughly, the logic of Binlog_sender::run is:

void Binlog_sender::run()
{
  // ...
  while (!has_error() && !m_thd->killed)
  {
    // ...
    if (send_binlog(&log_cache, start_pos))
      break;
  }
  // ...
}

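With that, the picture is clear: when my_net_write() fails, send_packet() records the error and returns 1, the non-zero return value propagates up to send_binlog(), the loop in Binlog_sender::run() breaks, and the dump thread exits.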
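The propagation can be sketched with a toy model. This is an illustration only, not MySQL code: the class `ToyBinlogSender`, the `fail_at` parameter, and `fake_net_write()` are hypothetical stand-ins, and the real call chain has more layers (see the backtrace above). Only the shape of the loop and the error message are taken from the excerpts.

```cpp
#include <cassert>
#include <string>

// Toy model (not MySQL code) of how a failed write ends the dump thread loop.
class ToyBinlogSender {
 public:
  // fail_at: hypothetical knob, the index of the packet whose write fails.
  explicit ToyBinlogSender(int fail_at) : m_fail_at(fail_at) {
    assert(fail_at >= 0);  // otherwise the simulated write never fails
  }

  // Same shape as the loop in Binlog_sender::run(): stop on error or kill.
  void run() {
    while (!has_error() && !m_killed) {
      if (send_binlog()) break;
    }
  }

  bool has_error() const { return !m_errmsg.empty(); }
  const std::string& errmsg() const { return m_errmsg; }
  int packets_sent() const { return m_sent; }

 private:
  // Stand-in for my_net_write(): returns true on failure.
  bool fake_net_write() const { return m_sent == m_fail_at; }

  // Records the error message and returns 1 on failure, like send_packet().
  int send_packet() {
    if (fake_net_write()) {
      m_errmsg = "Failed on my_net_write()";
      return 1;
    }
    ++m_sent;
    return 0;
  }

  // Intermediate layers simply forward the non-zero return value upward.
  int send_binlog() { return send_packet(); }

  int m_fail_at;
  int m_sent = 0;
  bool m_killed = false;  // never set in this toy; kept to mirror the loop
  std::string m_errmsg;
};
```

For example, `ToyBinlogSender s(3); s.run();` sends three packets, fails on the fourth write, and leaves `run()` with the error message set, which mirrors the point at which the real dump thread stops.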

Posted by wedge00 on Mon, 04 Feb 2019 15:36:16 -0800