The nginx data receiving process in detail

Keywords: Programming Nginx Attribute socket

As described in "nginx Event Driven Process Details Based on epoll Model", once epoll reports an accept event, nginx calls back the ngx_event_accept() method. This method does two main things:

  • Obtain the handle of the client connection returned by accept(), and initialize an ngx_connection_t structure to represent the connection;
  • Check whether the new connection already has readable data. If so, read and process the data; otherwise, add the connection handle to the epoll framework to listen for its readable events.

The first step was explained in detail in "nginx Event Driven Process Details Based on epoll Model", so here we focus on how the second step is implemented. Toward the end of the ngx_event_accept() method, the following piece of code is called:

void ngx_event_accept(ngx_event_t *ev) {
  do {
    // Omit...
    
    // Callback invoked after a new connection is established. It is set up once the HTTP
    // configuration block has been parsed in the ngx_http_block() method: it is initialized
    // in the ngx_http_optimize_servers() method and points to the ngx_http_init_connection() method
    ls->handler(c);

    // Omit...

  } while (ev->available);
}

The ls->handler() callback here points to ngx_http_init_connection(), which, as the name implies, initializes the newly created ngx_connection_t connection. In essence, this method is the entry point for setting up event handling on a client connection. Its source code is as follows:

/**
 * The current method is stored in the handler property of ngx_listening_t and is called mainly in the ngx_event_accept() method.
 * That is, whenever a new client connection is accepted, the ngx_event_accept() method invokes the current method to initialize the connection
 */
void ngx_http_init_connection(ngx_connection_t *c) {
  ngx_uint_t i;
  ngx_event_t *rev;
  struct sockaddr_in *sin;
  ngx_http_port_t *port;
  ngx_http_in_addr_t *addr;
  ngx_http_log_ctx_t *ctx;
  ngx_http_connection_t *hc;
#if (NGX_HAVE_INET6)
  struct sockaddr_in6 *sin6;
  ngx_http_in6_addr_t *addr6;
#endif

  hc = ngx_pcalloc(c->pool, sizeof(ngx_http_connection_t));
  if (hc == NULL) {
    ngx_http_close_connection(c);
    return;
  }

  c->data = hc;
  port = c->listening->servers;

  // naddrs is the number of addresses configured on the current listening port
  // The main purpose of the if-else branch here is to find the address configuration that matches the current connection,
  // and then assign it to the addr_conf attribute of the ngx_http_connection_t structure
  if (port->naddrs > 1) {
    // Get the local (server-side) address on which the connection was accepted
    if (ngx_connection_local_sockaddr(c, NULL, 0) != NGX_OK) {
      ngx_http_close_connection(c);
      return;
    }
    
    switch (c->local_sockaddr->sa_family) {

#if (NGX_HAVE_INET6)
      case AF_INET6:
        sin6 = (struct sockaddr_in6 *) c->local_sockaddr;
        addr6 = port->addrs;
        for (i = 0; i < port->naddrs - 1; i++) {
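          // Compare the connection's local address with each configured address to find the one that matches.
          // The last entry is the wildcard "*" address, so it is the fallback when no earlier entry matches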
          if (ngx_memcmp(&addr6[i].addr6, &sin6->sin6_addr, 16) == 0) {
            break;
          }
        }

        // Assign the corresponding server configuration found to the addr_conf attribute of the ngx_http_connection_t structure
        hc->addr_conf = &addr6[i].conf;
        break;
#endif

      default: /* AF_INET */
        sin = (struct sockaddr_in *) c->local_sockaddr;
        addr = port->addrs;
        for (i = 0; i < port->naddrs - 1; i++) {
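          // Compare the connection's local address with each configured address to find the one that matches.
          // The last entry is the wildcard "*" address, so it is the fallback when no earlier entry matches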
          if (addr[i].addr == sin->sin_addr.s_addr) {
            break;
          }
        }

        // Assign the corresponding server configuration found to the addr_conf attribute of the ngx_http_connection_t structure
        hc->addr_conf = &addr[i].conf;
        break;
    }
  } else {
    // The else branch means that only one address is configured on the current port, so its configuration is used directly
    switch (c->local_sockaddr->sa_family) {
#if (NGX_HAVE_INET6)
      case AF_INET6:
        addr6 = port->addrs;
        hc->addr_conf = &addr6[0].conf;
        break;
#endif
      default: /* AF_INET */
        addr = port->addrs;
        hc->addr_conf = &addr[0].conf;
        break;
    }
  }

  hc->conf_ctx = hc->addr_conf->default_server->ctx;
  ctx = ngx_palloc(c->pool, sizeof(ngx_http_log_ctx_t));
  if (ctx == NULL) {
    ngx_http_close_connection(c);
    return;
  }

  ctx->connection = c;
  ctx->request = NULL;
  ctx->current_request = NULL;
  
  c->log->connection = c->number;
  c->log->handler = ngx_http_log_error;
  c->log->data = ctx;
  c->log->action = "waiting for request";
  c->log_error = NGX_ERROR_INFO;

  rev = c->read;
  // Set the handler that is called when a read event occurs
  rev->handler = ngx_http_wait_request_handler;
  // Initially there is nothing to write, so if a write event is triggered unexpectedly, the empty handler set here is called and does nothing.
  c->write->handler = ngx_http_empty_handler;

  if (hc->addr_conf->proxy_protocol) {
    hc->proxy_protocol = 1;
    c->log->action = "reading PROXY protocol";
  }

  // If the current read event is ready, process it directly
  if (rev->ready) {
    // If the accept mutex is in use, post the read event to the ngx_posted_events queue so that it is handled later
    if (ngx_use_accept_mutex) {
      ngx_post_event(rev, &ngx_posted_events);
      return;
    }

    // If the accept mutex is not in use, call the handler() method of the read event directly to process it.
    // Here the handler points to the ngx_http_wait_request_handler() method set above
    rev->handler(rev);
    return;
  }

  // Reaching this point means the read event is not ready yet: arm a timeout timer for it
  // and mark the connection as reusable
  ngx_add_timer(rev, c->listening->post_accept_timeout);
  ngx_reusable_connection(c, 1);

  // Register the read event of the current connection with the epoll handle
  if (ngx_handle_read_event(rev, 0) != NGX_OK) {
    ngx_http_close_connection(c);
    return;
  }
}
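
The address lookup in the middle of this function can be isolated into a small sketch. The type and function names below (sketch_in_addr_t, sketch_find_addr) are hypothetical and not part of nginx, and the address is treated as an opaque 32-bit value. The point is only the loop bound: the search stops one entry short of the end, so the last entry acts as the fallback when nothing else matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one configured listen address. */
typedef struct {
    uint32_t    addr;   /* IPv4 address as an opaque 32-bit value */
    const char *conf;   /* placeholder for the per-address configuration */
} sketch_in_addr_t;

/*
 * Return the entry whose address equals `local`. Only the first
 * naddrs - 1 entries are compared, so if none of them matches the
 * loop ends with i == naddrs - 1 and the last entry is returned.
 */
const sketch_in_addr_t *
sketch_find_addr(const sketch_in_addr_t *addrs, size_t naddrs, uint32_t local)
{
    size_t i;

    assert(addrs != NULL && naddrs > 0);

    for (i = 0; i < naddrs - 1; i++) {
        if (addrs[i].addr == local) {
            break;
        }
    }

    return &addrs[i];
}
```

With a single configured address the loop body never runs and that one entry is returned, which corresponds to the else branch in the function above.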

The overall logic of ngx_http_init_connection() above is relatively simple, and it mainly does the following:

  • Look up the addresses configured for the port on which the connection arrived, compare them with the connection's local address, and store the matching configuration in hc->addr_conf. The default server of that address supplies the initial configuration context.
  • Set the read event callback of the current connection to the ngx_http_wait_request_handler() method. This is the key method for reading data and is explained in more detail below.
  • Set the write event callback of the current connection to the ngx_http_empty_handler() method, which does nothing. This guards against accidentally triggered write events, which need no handling at this stage because the request data is still being read.
  • If the read event is already ready, call rev->handler(rev) directly to process it (or, when the accept mutex is in use, post the event to the ngx_posted_events queue), where handler() is the ngx_http_wait_request_handler() method set in step 2;
  • If the read event is not ready, set a timeout for it by calling ngx_add_timer(), and register the read event of the current connection with the epoll handle by calling the ngx_handle_read_event() method.
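
The last two steps amount to a three-way decision, which the following simplified model tries to make explicit. It is only a sketch: every name in it (sketch_event_t, sketch_init_connection, and so on) is hypothetical, and the timer, the posted queue and the epoll registration are reduced to plain flags rather than real nginx or epoll calls.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sketch_event_s sketch_event_t;
typedef void (*sketch_handler_pt)(sketch_event_t *ev);

/* Hypothetical, flag-only model of a read event. */
struct sketch_event_s {
    int               ready;       /* data is already waiting on the socket */
    int               posted;      /* queued for later processing */
    int               timer_set;   /* a read timeout is armed */
    int               registered;  /* read event is registered with the poller */
    int               calls;       /* how many times the read handler ran */
    sketch_handler_pt handler;
};

typedef enum {
    SKETCH_HANDLED_NOW,
    SKETCH_POSTED,
    SKETCH_WAITING
} sketch_outcome_t;

/* Stand-in for the read handler: it only counts invocations. */
void sketch_wait_request(sketch_event_t *ev)
{
    ev->calls++;
}

sketch_outcome_t
sketch_init_connection(sketch_event_t *rev, int use_accept_mutex)
{
    assert(rev != NULL);

    rev->handler = sketch_wait_request;

    if (rev->ready) {
        if (use_accept_mutex) {
            rev->posted = 1;          /* defer: handle it from the posted queue */
            return SKETCH_POSTED;
        }

        rev->handler(rev);            /* data is there: read it right away */
        return SKETCH_HANDLED_NOW;
    }

    rev->timer_set = 1;               /* models arming the read timeout */
    rev->registered = 1;              /* models registering the read event */
    return SKETCH_WAITING;
}
```

In this model a ready event never touches the timer or the registration flags, which mirrors the early return in the ready branch of the real function.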

The relationship between the two key methods here, ngx_http_init_connection() and ngx_http_wait_request_handler(), deserves emphasis. TCP delivers data as a byte stream: when the nginx server receives data from a client, the data arrives in segments, and the size of each segment is not known in advance. Each time data arrives, a read event fires on the epoll handle to tell the nginx server that there is more data to receive. Correspondingly, ngx_http_init_connection() registers the read event on the epoll handle for the client connection, while ngx_http_wait_request_handler() is the callback invoked when the read event fires, until request data has actually been read. Note, however, that ngx_http_init_connection() first checks whether the read event is already ready: if it is, the data is read from the connection directly, and only if no data is available is the read event registered. Here is how the ngx_http_wait_request_handler() method works:

static void ngx_http_wait_request_handler(ngx_event_t *rev) {
  u_char *p;
  size_t size;
  ssize_t n;
  ngx_buf_t *b;
  ngx_connection_t *c;
  ngx_http_connection_t *hc;
  ngx_http_core_srv_conf_t *cscf;

  c = rev->data;

  // Close the connection if the read event timed out
  if (rev->timedout) {
    ngx_log_error(NGX_LOG_INFO, c->log, NGX_ETIMEDOUT, "client timed out");
    ngx_http_close_connection(c);
    return;
  }

  // If the connection has been marked for closing, close it
  if (c->close) {
    ngx_http_close_connection(c);
    return;
  }

  hc = c->data;
  cscf = ngx_http_get_module_srv_conf(hc->conf_ctx, ngx_http_core_module);
  size = cscf->client_header_buffer_size;

  b = c->buffer;
  if (b == NULL) {
    // Initialize buffer if buffer is empty
    b = ngx_create_temp_buf(c->pool, size);
    if (b == NULL) {
      ngx_http_close_connection(c);
      return;
    }

    c->buffer = b;
  } else if (b->start == NULL) {
    // b->start being NULL means that an earlier call got NGX_AGAIN and released the buffer memory,
    // so the buffer memory is allocated again here
    b->start = ngx_palloc(c->pool, size);
    if (b->start == NULL) {
      ngx_http_close_connection(c);
      return;
    }

    b->pos = b->start;
    b->last = b->start;
    b->end = b->last + size;
  }

  // c->recv is set in the ngx_event_accept() method (c->recv = ngx_recv;), where ngx_recv is a macro
  // whose value is ngx_io.recv. ngx_io is assigned when ngx_epoll_module is initialized
  // and points to ngx_os_io, which is defined as:
  // ngx_os_io_t ngx_os_io = {
  //    ngx_unix_recv,
  //    ngx_readv_chain,
  //    ngx_udp_unix_recv,
  //    ngx_unix_send,
  //    ngx_udp_unix_send,
  //    ngx_udp_unix_sendmsg_chain,
  //    ngx_writev_chain,
  //    0
  // };
  // That is, c->recv here points to the ngx_os_io.recv attribute, which is the ngx_unix_recv() method
  // The main purpose of this call is to read data from the socket into the current buffer
  n = c->recv(c, b->last, size);

  // NGX_AGAIN means that no data is available yet: the connection has been established properly, but the client's data has not arrived.
  // So here we just keep listening for read events on the connection and release the allocated buffer memory until data really arrives
  if (n == NGX_AGAIN) {

    // If no timeout timer is set for the current event, set one
    if (!rev->timer_set) {
      ngx_add_timer(rev, c->listening->post_accept_timeout);
      ngx_reusable_connection(c, 1);
    }

    // Add read events to the epoll handle to continue listening for read events
    if (ngx_handle_read_event(rev, 0) != NGX_OK) {
      ngx_http_close_connection(c);
      return;
    }

    // No data was read this time, so the buffer memory is released. The next time this method is called, b->start is checked,
    // and if it is NULL the memory is allocated again. The memory is released so that it can be used by other requests
    if (ngx_pfree(c->pool, b->start) == NGX_OK) {
      b->start = NULL;
    }

    return;
  }

  // Close connection if reading data fails
  if (n == NGX_ERROR) {
    ngx_http_close_connection(c);
    return;
  }

  // If the length of the read data is 0, the client has closed the connection, so the connection structure is closed here
  if (n == 0) {
    ngx_log_error(NGX_LOG_INFO, c->log, 0, "client closed connection");
    ngx_http_close_connection(c);
    return;
  }

  // Update read data length
  b->last += n;
  
  if (hc->proxy_protocol) {
    hc->proxy_protocol = 0;
    p = ngx_proxy_protocol_read(c, b->pos, b->last);
    if (p == NULL) {
      ngx_http_close_connection(c);
      return;
    }

    b->pos = p;
    if (b->pos == b->last) {
      c->log->action = "waiting for request";
      b->pos = b->start;
      b->last = b->start;
      ngx_post_event(rev, &ngx_posted_events);
      return;
    }
  }

  c->log->action = "reading client request line";
  ngx_reusable_connection(c, 0);

  // Now that client data has been read, create the ngx_http_request_s structure to represent the current request
  c->data = ngx_http_create_request(c);
  if (c->data == NULL) {
    ngx_http_close_connection(c);
    return;
  }

  // Point the handler of the read event to the ngx_http_process_request_line() method,
  // whose main purpose is to read and parse the complete request line.
  // Note that at this point some of the client's data has been read, but it is not necessarily complete
  rev->handler = ngx_http_process_request_line;
  ngx_http_process_request_line(rev);
}
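
The buffer handling in this function (allocate on demand, release when nothing was read, allocate again on the next call) can be sketched independently of nginx. The example below is a minimal sketch under stated assumptions: it uses malloc()/free() in place of the nginx pool functions, and sketch_buf_t, sketch_buf_prepare() and sketch_buf_release() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer with the same four pointers the function above uses. */
typedef struct {
    unsigned char *start;
    unsigned char *pos;
    unsigned char *last;
    unsigned char *end;
} sketch_buf_t;

/*
 * Make sure *bp exists and has `size` bytes of storage.
 * Covers both cases: no buffer at all, and a buffer whose
 * storage was released earlier (start == NULL).
 * Returns 0 on success, -1 on allocation failure.
 */
int sketch_buf_prepare(sketch_buf_t **bp, size_t size)
{
    sketch_buf_t *b;

    assert(bp != NULL && size > 0);

    b = *bp;
    if (b == NULL) {
        b = calloc(1, sizeof(*b));
        if (b == NULL) {
            return -1;
        }
        *bp = b;
    }

    if (b->start == NULL) {
        b->start = malloc(size);
        if (b->start == NULL) {
            return -1;
        }
        b->pos = b->start;
        b->last = b->start;
        b->end = b->start + size;
    }

    return 0;
}

/* Release only the storage; the buffer header stays for the next call. */
void sketch_buf_release(sketch_buf_t *b)
{
    assert(b != NULL);

    free(b->start);
    b->start = b->pos = b->last = b->end = NULL;
}
```

Unlike this sketch, which resets all four pointers on release, the function above only sets b->start to NULL and lets the next call reinitialize the others.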

Since ngx_http_wait_request_handler() receives client data, it needs a buffer to store that data. This is c->buffer, whose type is ngx_buf_t. The ngx_http_wait_request_handler() method mainly does the following:

  • Check whether c->buffer is initialized, and if not, initialize it. There are two scenarios:
    • If the buffer has never been created, create it and allocate its memory directly;
    • If the buffer has been created but c->buffer->start is NULL, allocate its memory again. This happens because of the later check on the read result: if the read returns NGX_AGAIN, the connection is established but the client has not sent any data yet, so the allocated memory is freed at that point, that is, c->buffer->start is released and set to NULL;
  • Call c->recv(c, b->last, size) to read data from the connection handle. A return value greater than 0 is the length of the data read; 0 means the client closed the connection; -1 (NGX_ERROR) means the read failed; -2 (NGX_AGAIN) means no data is available yet and the read must be retried later;
  • Check the return value of the read in turn, and if it is not positive, handle it according to the specific case.
  • Once data has been read successfully, call the ngx_http_create_request(c) method to create an ngx_http_request_s structure to represent the current request.
  • Set the callback of the read event to ngx_http_process_request_line() and call it immediately. The main purpose of this method is to read data and parse the request line in the client data, such as "GET /index HTTP/1.1".
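
The four kinds of return value described in the list can be reproduced with a plain non-blocking recv() call. The following is a minimal sketch, not the nginx implementation: try_recv() and set_nonblocking() are hypothetical helpers, and SKETCH_ERROR and SKETCH_AGAIN simply reuse the numeric values -1 and -2 mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_ERROR ((ssize_t) -1)   /* read failed */
#define SKETCH_AGAIN ((ssize_t) -2)   /* no data yet, try again later */

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1) {
        return -1;
    }

    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read once from a non-blocking socket.
 *   > 0           number of bytes read
 *   0             the peer closed the connection
 *   SKETCH_AGAIN  nothing to read right now
 *   SKETCH_ERROR  any other failure
 */
ssize_t try_recv(int fd, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL && len > 0);

    do {
        n = recv(fd, buf, len, 0);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted by a signal */

    if (n >= 0) {
        return n;
    }

    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return SKETCH_AGAIN;
    }

    return SKETCH_ERROR;
}
```

A caller would typically react to SKETCH_AGAIN by waiting for the next read event rather than looping, which is what the handler above does by returning after re-registering the event.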

This article explained how nginx receives data from a client, focusing on the overall process: the connection is established, nginx waits for and receives the client's data, and finally the data is handed off for processing.

Posted by SeaJones on Mon, 09 Mar 2020 17:53:00 -0700