High-efficiency I/O: I/O multiplexing with select

Keywords: network select

Contents

1, Concept

2, select function

        2.1 function prototype

        2.2 detailed introduction of parameters

                2.2.1 nfds

                2.2.2 readfds, writefds, errorfds

                2.2.3 timeout

        2.3 Ready conditions for read / write / exception in the network

        2.4 select features

        2.5 select disadvantages

3, Use of select

1, Concept

        I/O consists of two steps: waiting for a condition to become ready, and copying data. Efficient I/O means reducing the share of time spent waiting.

        I/O multiplexing is one form of efficient I/O. By calling select, poll or epoll, a program waits on multiple file descriptors at the same time, and the call returns as soon as at least one of them is ready for an I/O operation.

        Waiting on many file descriptors at once raises the probability that some condition is ready at any given moment, and so reduces the time spent waiting.

        select, poll and epoll all implement multiplexing; the rest of this article covers select.
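As a minimal sketch of the idea, the function below waits on two descriptors with a single select call and reports which one became readable. The names wait_either_readable and demo_wait_either are made up for this illustration, and the sketch assumes both descriptors are below FD_SETSIZE. The arguments to select are explained in section 2.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <algorithm>
#include <cassert>

// Blocks until fd_a or fd_b is readable and returns the ready descriptor
// (fd_a if both are ready). Returns -1 if select() fails.
int wait_either_readable(int fd_a, int fd_b) {
    assert(fd_a >= 0 && fd_a < FD_SETSIZE);
    assert(fd_b >= 0 && fd_b < FD_SETSIZE);

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);

    int nfds = std::max(fd_a, fd_b) + 1;
    if (select(nfds, &readfds, nullptr, nullptr, nullptr) <= 0) {
        return -1;
    }
    return FD_ISSET(fd_a, &readfds) ? fd_a : fd_b;
}

// Usage sketch: two pipes, with data written only to the second one.
// Returns true if select() reports the second pipe as the ready one.
bool demo_wait_either() {
    int a[2], b[2];
    if (pipe(a) != 0) return false;
    if (pipe(b) != 0) { close(a[0]); close(a[1]); return false; }
    bool ok = write(b[1], "x", 1) == 1 &&
              wait_either_readable(a[0], b[0]) == b[0];
    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ok;
}
```

The caller never has to guess which pipe to read first: the single select call covers both, and only the descriptor that actually holds data is reported.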

2, select function

        2.1 function prototype

#include <sys/select.h>

 int select(int nfds, fd_set *restrict readfds,fd_set *restrict writefds,
             fd_set *restrict errorfds,struct timeval *restrict timeout);

Function: select monitors the state of multiple file descriptors. The calling program blocks here until one or more of them becomes ready, the timeout expires, or a signal interrupts the call.

Parameters:

parameter   effect
nfds        The highest-numbered file descriptor to monitor, plus 1
readfds     Type fd_set *; set of file descriptors to check for readability; input/output parameter
writefds    Type fd_set *; set of file descriptors to check for writability; input/output parameter
errorfds    Type fd_set *; set of file descriptors to check for exceptional conditions; input/output parameter
timeout     Type struct timeval *; the maximum time select waits; input/output parameter

Return value:

  • On success, returns the number of ready file descriptors, counted across all three sets.
  • Returns 0 if the timeout expired before any file descriptor became ready.
  • On error, returns -1 and stores the reason in errno. The contents of readfds, writefds, errorfds and timeout are then unspecified.
    • The error code may be:
      • EBADF: a file descriptor in one of the sets is invalid or has been closed
      • EINTR: the call was interrupted by a signal
      • EINVAL: nfds is negative, or the value in timeout is invalid
      • ENOMEM: the kernel could not allocate memory for its internal tables
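The return values above suggest a typical calling pattern: retry on EINTR, and treat any other -1 as a real error. The following is a minimal sketch of that pattern, not the only way to do it. The names read_when_ready and demo_errno_for_closed_fd are hypothetical, and the sketch assumes the descriptor is below FD_SETSIZE.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cerrno>
#include <cassert>

// Waits until fd is readable, then reads from it. EINTR is retried.
// Returns read()'s result, or -1 with errno set if select() fails.
ssize_t read_when_ready(int fd, char *buf, size_t len) {
    assert(fd >= 0 && fd < FD_SETSIZE);
    for (;;) {
        // Rebuild the set on every attempt: after an error its
        // contents are unspecified.
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        int n = select(fd + 1, &readfds, nullptr, nullptr, nullptr);
        if (n > 0) return read(fd, buf, len);
        if (n < 0 && errno != EINTR) return -1;
    }
}

// Usage sketch: returns the errno produced by selecting on a closed
// descriptor (expected to be EBADF), or 0 if no error was reported.
int demo_errno_for_closed_fd() {
    int p[2];
    if (pipe(p) != 0) return -1;
    close(p[0]);
    close(p[1]);
    char buf[8];
    errno = 0;
    ssize_t n = read_when_ready(p[0], buf, sizeof(buf));
    return n == -1 ? errno : 0;
}
```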

         2.2 detailed introduction of parameters

                2.2.1 nfds

         nfds: the highest-numbered file descriptor to be monitored, plus 1.

If the file descriptors to be monitored are 1, 2, 3 and 4, nfds is 5. If they are 1 and 5, nfds is 6.
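The rule can be written as a one-line helper. This is only a sketch; compute_nfds is a hypothetical name, and returning 0 for an empty list is an assumption made here for convenience.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the nfds argument for select(): the largest descriptor plus 1,
// or 0 when there is nothing to monitor.
int compute_nfds(const std::vector<int> &fds) {
    if (fds.empty()) return 0;
    for (int fd : fds) {
        assert(fd >= 0);  // valid descriptors are never negative
    }
    return *std::max_element(fds.begin(), fds.end()) + 1;
}
```

With the two examples from the text, compute_nfds({1, 2, 3, 4}) gives 5 and compute_nfds({1, 5}) gives 6. Note that nfds depends on the largest descriptor value, not on how many descriptors are monitored.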

                2.2.2 readfds, writefds, errorfds

         readfds, writefds and errorfds all have type fd_set * and are used in the same way.

About the fd_set structure (as defined by glibc):

typedef struct
{
    /* XPG4.2 requires this member name.  Otherwise avoid the name
       from the global namespace.  */
#ifdef __USE_XOPEN
    __fd_mask fds_bits[__FD_SETSIZE / __NFDBITS];
# define __FDS_BITS(set) ((set)->fds_bits)
#else
    __fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];
# define __FDS_BITS(set) ((set)->__fds_bits)
#endif
} fd_set;

         fd_set is a set of file descriptors; the structure is in effect a bitmap.

         readfds, writefds and errorfds are input/output parameters. On input, the user tells the kernel which file descriptors to monitor. On output, the kernel tells the user which of those file descriptors are ready.

        Each bit position in the bitmap corresponds to one file descriptor. On input, a set bit marks a file descriptor to be monitored; on output, a set bit marks a file descriptor whose condition is ready.

         For example, take readfds and look at only 8 bits, with file descriptor 0 as the rightmost bit. An input of 1001 0101 tells the kernel to monitor read events on file descriptors 0, 2, 4 and 7. An output of 1000 0001 tells the user that file descriptors 0 and 7 are ready for reading.

        Because the size of fd_set is fixed, the number of file descriptors that select can monitor is limited. The kernel checks the monitored file descriptors by polling them.

Because fd_set may be implemented differently on different systems (as an array or as a structure), a set of macros is provided for manipulating the bitmap.

void FD_CLR(int fd, fd_set *fdset);  //Clear the bit for fd in fdset
int FD_ISSET(int fd, fd_set *fdset); //Test whether the bit for fd in fdset is set
void FD_SET(int fd, fd_set *fdset);  //Set the bit for fd in fdset
void FD_ZERO(fd_set *fdset);         //Clear all bits in fdset; equivalent to initialization
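As a small illustration of these macros, the sketch below rebuilds the 8-bit example from the text using only FD_ZERO, FD_SET, FD_CLR and FD_ISSET, so it does not depend on how fd_set is laid out internally. The helper names make_set and low8_mask are invented for this example.

```cpp
#include <sys/select.h>
#include <cassert>
#include <initializer_list>

// Builds an fd_set containing the listed descriptors.
fd_set make_set(std::initializer_list<int> fds) {
    fd_set set;
    FD_ZERO(&set);
    for (int fd : fds) {
        assert(fd >= 0 && fd < FD_SETSIZE);
        FD_SET(fd, &set);
    }
    return set;
}

// Renders descriptors 0..7 of the set as an 8-bit mask,
// where bit i is 1 if descriptor i is in the set.
unsigned low8_mask(fd_set set) {
    unsigned mask = 0;
    for (int fd = 0; fd < 8; ++fd) {
        if (FD_ISSET(fd, &set)) {
            mask |= 1u << fd;
        }
    }
    return mask;
}
```

Here make_set({0, 2, 4, 7}) corresponds to the input bitmap 1001 0101 (0x95). Clearing descriptors 2 and 4 with FD_CLR leaves 1000 0001 (0x81), the output bitmap in the example.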

        Note: the file descriptors that the kernel reports as ready on output are always a subset of the file descriptors the user asked it to monitor on input.

                2.2.3 timeout

         timeout has type struct timeval * and sets the maximum time select waits.

About the timeval structure:

struct timeval
{
    time_t      tv_sec;     /* seconds */
    suseconds_t tv_usec;    /* microseconds */
};

Value of parameter timeout:

  1. NULL: select blocks indefinitely until at least one file descriptor is ready.
  2. 0 (tv_sec and tv_usec are both zero): non-blocking. select returns immediately whether or not any condition is ready. Use this to poll the state of the monitored file descriptors.
  3. A specific time value: select waits for at most that long. It returns as soon as a file descriptor becomes ready within that time; if the time runs out first, select returns 0.

timeout is also an input/output parameter. On input, the user tells the kernel how long to wait. On output, Linux updates timeout to the time that was left, so after a full timeout it is 0; other systems may leave it unchanged.
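The three timeout modes can be folded into one helper, sketched below. The function wait_readable and its millisecond convention (negative means "block forever") are assumptions made for this illustration, not part of the select API. A fresh timeval is built on every call, so it does not matter whether the system modifies it.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>

// Waits until fd is readable.
//   timeout_ms < 0  : block indefinitely (timeout pointer is NULL)
//   timeout_ms == 0 : poll and return immediately
//   timeout_ms > 0  : wait at most that many milliseconds
// Returns select()'s result: 1 if ready, 0 on timeout, -1 on error.
int wait_readable(int fd, long timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv;
    struct timeval *tvp = nullptr;
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    return select(fd + 1, &readfds, nullptr, nullptr, tvp);
}

// Usage sketch on a pipe: an empty pipe times out, a pipe holding data is ready.
bool demo_timeouts() {
    int p[2];
    if (pipe(p) != 0) return false;
    bool ok = wait_readable(p[0], 0) == 0     // poll: nothing to read yet
           && wait_readable(p[0], 50) == 0    // bounded wait expires
           && write(p[1], "x", 1) == 1
           && wait_readable(p[0], -1) == 1;   // blocking wait returns at once
    close(p[0]);
    close(p[1]);
    return ok;
}
```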

         Note: because readfds, writefds, errorfds and timeout are input/output parameters, a call to select may modify them. Before calling select again, reset all of them to the values you want.

        2.3 Ready conditions for read / write / exception in the network

Read ready:

  • The number of bytes in the socket's kernel receive buffer is greater than or equal to the low watermark SO_RCVLOWAT, so the socket can be read without blocking.
  • A listening socket has a new connection request. The pending connection is reported as read readiness and is collected with accept.
  • For a TCP socket, the peer has closed the connection. Reading the socket then returns 0.
  • There is an unhandled error on the socket.

Write ready:

  • The free space in the socket's kernel send buffer is greater than or equal to the low watermark SO_SNDLOWAT, so the socket can be written without blocking.
  • A non-blocking connect on the socket has completed, either successfully or with an error.
  • The write side of the socket has been closed (close or shutdown). Writing to such a socket raises the SIGPIPE signal.
  • There is an unhandled error on the socket.

Exception ready:

  • The socket has received out-of-band data.
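Some of these conditions can be observed directly. The sketch below uses a connected AF_UNIX socket pair as a stand-in for a TCP connection (an assumption made to keep the example self-contained) and checks three cases: an idle socket is write ready but not read ready, incoming data makes it read ready, and a peer close makes it read ready with recv returning 0. The names Readiness, poll_socket and demo_ready_conditions are hypothetical.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>

struct Readiness {
    bool readable;
    bool writable;
};

// Polls one socket (zero timeout) for read and write readiness.
Readiness poll_socket(int fd) {
    assert(fd >= 0 && fd < FD_SETSIZE);
    fd_set rfds, wfds;
    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_SET(fd, &rfds);
    FD_SET(fd, &wfds);
    struct timeval tv = {0, 0};

    Readiness r = {false, false};
    if (select(fd + 1, &rfds, &wfds, nullptr, &tv) > 0) {
        r.readable = FD_ISSET(fd, &rfds);
        r.writable = FD_ISSET(fd, &wfds);
    }
    return r;
}

// Usage sketch: returns true if all three observations match the list above.
bool demo_ready_conditions() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;

    // Idle connection: empty receive buffer, free send buffer.
    Readiness idle = poll_socket(sv[0]);
    bool ok = !idle.readable && idle.writable;

    // Peer sends one byte: the socket becomes read ready.
    ok = ok && send(sv[1], "x", 1, 0) == 1 && poll_socket(sv[0]).readable;
    char c;
    ok = ok && recv(sv[0], &c, 1, 0) == 1;

    // Peer closes: read ready again, and recv() returns 0.
    close(sv[1]);
    ok = ok && poll_socket(sv[0]).readable && recv(sv[0], &c, 1, 0) == 0;

    close(sv[0]);
    return ok;
}
```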

        2.4 select features

  • The number of file descriptors that select can monitor is limited by the number of bits in fd_set, that is, sizeof(fd_set)*8.
#include <stdio.h>
#include <sys/select.h>

int main(){
  //Number of bits in fd_set = number of file descriptors it can hold
  printf("%zu\n",sizeof(fd_set)*8); 

  return 0;
}

The value differs between systems; with glibc on Linux it is typically 1024.

  • Besides adding an fd to the set passed to select, you also need an array that stores every file descriptor being monitored.
    • After select returns, use FD_ISSET to check whether each file descriptor saved in the array is ready.
    • select modifies the fd_set it is given. Before each new call, add the file descriptors saved in the array back into the fd_set with FD_SET, and record the largest file descriptor for the nfds argument.

The server code in section 3 follows this pattern.
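The bookkeeping described above can be sketched as a helper that rebuilds the set from the array before each call. The names rebuild_set and kEmptySlot are hypothetical; the -1 marker for an unused slot is an assumption of this sketch.

```cpp
#include <sys/select.h>
#include <cassert>
#include <cstddef>

const int kEmptySlot = -1;  // hypothetical marker for an unused array slot

// Clears *set, adds every live descriptor from fds[0..count), and returns
// the largest descriptor added, or -1 if the array holds none.
// The caller passes (return value + 1) to select() as nfds.
int rebuild_set(const int *fds, std::size_t count, fd_set *set) {
    FD_ZERO(set);
    int maxfd = -1;
    for (std::size_t i = 0; i < count; ++i) {
        if (fds[i] == kEmptySlot) {
            continue;
        }
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);
        if (fds[i] > maxfd) {
            maxfd = fds[i];
        }
    }
    return maxfd;
}
```

Keeping the array as the source of truth and treating the fd_set as scratch space means nothing is lost when select overwrites the set with its results.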

        2.5 select disadvantages

  • Every time select is called, the programmer must rebuild the fd_set.
  • Every call copies the fd sets from user space to kernel space. This overhead becomes significant when there are many fds.
  • To find out which file descriptors are ready, the kernel must traverse every file descriptor passed in and poll each one. This overhead also becomes significant when there are many fds.
  • The number of file descriptors that select can monitor is limited.
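The last limitation has a practical consequence: a descriptor whose value is FD_SETSIZE or greater cannot be placed in an fd_set at all, and calling FD_SET on it is undefined behaviour. A guarded wrapper might look like the following sketch; try_fd_set is a hypothetical name.

```cpp
#include <sys/select.h>
#include <cassert>

// Adds fd to *set only if it fits. Returns false for descriptors
// outside [0, FD_SETSIZE), which select() cannot monitor.
bool try_fd_set(int fd, fd_set *set) {
    assert(set != nullptr);
    if (fd < 0 || fd >= FD_SETSIZE) {
        return false;
    }
    FD_SET(fd, set);
    return true;
}
```

A server that may hold more open descriptors than FD_SETSIZE would need to reject the extra connections, as this check allows, or use poll or epoll instead.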

3, Use of select

Write a single-process server with select that prints each message it receives from its clients.

Note:

  1. Every file descriptor to be monitored is saved in an array. This includes the listening socket and each connected socket returned by accept.
  2. When no connection is pending, accept blocks, so the listening socket must also be waited on with select.
  3. Connections arrive on the listening socket created by socket(). A pending connection makes the listening socket read ready.
  4. I/O reads and writes are done on the socket returned by accept. It is read ready when the number of bytes in the receive buffer reaches the low watermark, and write ready when the free space in the send buffer reaches the low watermark.
  5. After a connection is accepted, do not read or write it immediately; store the socket returned by accept in the array instead. Reading it directly would block if the client has not sent any data yet.

Socket wrapper (Sock.hpp):

#pragma once 

#include <iostream>
#include <string>
#include <stdlib.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>

//Maximum length of the pending-connection queue passed to listen()
#define BACKLOG 5
using namespace std;

class Sock{
  public:
  static int Socket(){
    int sock = 0;
    sock = socket(AF_INET, SOCK_STREAM, 0);
    if(sock < 0){
      cerr << "socket error"<<endl;
      exit(1);
    }
    return sock;
  }
  static void Bind(int sock, int port){
    //Zero-initialize so that no field of the address is left indeterminate
    struct sockaddr_in local = {};
    
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    if(bind(sock, (struct sockaddr *)&local, sizeof(local)) < 0){
      cerr << "bind error" <<endl;
      exit(3);
    }
  }
  static void Listen(int sock){
    if(listen(sock, BACKLOG) < 0){
      cerr << "listen error"<<endl;
      exit(4);
    }

  }
  static int Accept(int lsock){
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    return  accept(lsock, (struct sockaddr *)&peer, &len);
  }


};

select server (selectServer.hpp):

#pragma once 

#include "Sock.hpp"
#include <sys/select.h>

#define NUM (sizeof(fd_set)*8) //array size = maximum number of files that can be monitored
#define DET_FD -1 //array default file descriptor

class SelectServer{
  private:
    int _lsock;//Listening socket
    int _port;//Port number
    int array[NUM];//Save the file descriptor to monitor
  public:
    //The listening socket is created later, in InitServer
    SelectServer(int port = 8080)
    :_lsock(-1)
    ,_port(port)
    {}
    void InitServer(){
      for(size_t i =  0; i < NUM; i++){
        array[i] = DET_FD;
      }
      _lsock = Sock::Socket();
      //Port multiplexing                                                                                                                                         
      int opt = 1;                                                    
      setsockopt(_lsock, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)); 
      Sock::Bind(_lsock, _port);
      Sock::Listen(_lsock);
      array[0] = _lsock;

    }
    void AddtoArray(int fd){
      //Find an unoccupied position in the array
      size_t i = 0;
      for(; i < NUM; i++){
        if(array[i] == DET_FD){
          break;
        }
      }
      //Full
      if(i >= NUM){
        cout<<"select is full, close fd"<<endl;
        close(fd);
      }
      else{
        array[i] = fd;
      }
    }
    void Delete(size_t index){
      //index is unsigned, so only the upper bound needs checking
      if(index < NUM){
        array[index] = DET_FD;
      }
    }
    void Handle(int i){
      //IO condition ready
      char buf[10240];
      //Leave one byte free for the terminating '\0'
      ssize_t n = recv(array[i], buf, sizeof(buf) - 1, 0);
      if(n > 0){
        buf[n] = 0;
        cout<<buf<<endl;
      }
      else if(n == 0){
        //The peer closed the connection
        cout<<"client close..."<<endl;
        close(array[i]);
        //The file has been closed, so remove its descriptor from the array
        Delete(i);
      }
      else{
        cerr << "recv error"<<endl;
        close(array[i]);
        Delete(i);
      }
    }
    void Start(){
      while(1){
        int maxfd = DET_FD;
        //Rebuild the set to wait on, because select modifies it on every call
        fd_set readfds;
        //Initialize fd_set
        FD_ZERO(&readfds);
        //Find each file descriptor to be monitored and set its bit in the fd_set to 1
        for(size_t i =0; i < NUM; i++){
          if(array[i] == DET_FD){
            continue;
          }
          //Debug output: list the monitored file descriptors
          cout <<array[i]<<" ";
          FD_SET(array[i], &readfds);
          //Track the maximum file descriptor
          if(maxfd < array[i]){
            maxfd = array[i];
          }
        }
        cout<<endl;
        //struct timeval timeout = {5, 0};
        //Call select to wait on multiple file descriptors
        //Blocking wait (timeout is nullptr)
        int fdn = select(maxfd+1, &readfds, nullptr, nullptr, nullptr);
        if(fdn > 0){
          //At least one file descriptor is ready
          //Find out which ones
          for(size_t i =0; i < NUM; i++){
            if(array[i] != DET_FD && FD_ISSET(array[i] , &readfds)){
              if(array[i] == _lsock){
                //New connection
                int sock = Sock::Accept(array[i]);
                if(sock >= 0){
                  cout << "get a link...."<<endl;
                  //Add to array
                  AddtoArray(sock);
                }
                
              }
              else{
                //Perform IO operation
                Handle(i);
              }
            }
          }

        }
        //Timeout
        else if(fdn == 0){
          cerr << "select timeout..."<<endl;

        }
        //Error
        else{
          cerr <<"fdn:"<<fdn<< " select error"<<endl;
        }


      }
    }

    ~SelectServer(){
      for(size_t i = 0; i < NUM; i++){
        if(array[i] != DET_FD){
          close(array[i]);
        }
      }
    }


};

Main program:

#include "selectServer.hpp"

void Notice(string proc){
  //Show how to start the program: one argument, the port number
  cout<<"Usage:\n\t"<<proc<<" port"<<endl;
}

int main(int argc, char *argv[]){
  if(argc != 2){
    Notice(argv[0]);
    exit(1);
  }

  SelectServer *sser = new SelectServer(atoi(argv[1]));
  sser->InitServer();
  sser->Start();
  delete sser;

  return 0;
}

Posted by dhiren22 on Thu, 21 Oct 2021 17:43:59 -0700