Netty source code analysis: a detailed explanation of the Reactor thread model

Keywords: Java

In the previous article, we analyzed the initialization process during Netty server startup. In this article, we analyze the Reactor thread model in Netty.

Before analyzing the source code, let's look at where EventLoop is used:

  • Connection listening registration of NioServerSocketChannel
  • IO event registration for NioSocketChannel

NioServerSocketChannel connection listening

In the initAndRegister() method of the AbstractBootstrap class, once NioServerSocketChannel has been initialized, the line marked with a comment below is called to register the channel.

final ChannelFuture initAndRegister() {
    Channel channel = null;
    try {
        channel = channelFactory.newChannel();
        init(channel);
    } catch (Throwable t) {
        //Omit
    }
    //Register with the selector of the boss thread.
    ChannelFuture regFuture = config().group().register(channel);
    if (regFuture.cause() != null) {
        if (channel.isRegistered()) {
            channel.close();
        } else {
            channel.unsafe().closeForcibly();
        }
    }
    return regFuture;
}

AbstractNioChannel.doRegister

Following the call chain, execution eventually reaches the doRegister() method of AbstractNioChannel.

@Override
protected void doRegister() throws Exception {
    boolean selected = false;
    for (;;) {
        try {
            //Register the underlying JDK ServerSocketChannel with the selector of the boss thread. The interest ops are 0 for now, and this Netty channel is the attachment
            selectionKey = javaChannel().register(eventLoop().unwrappedSelector(), 0, this);
            return;
        } catch (CancelledKeyException e) {
            if (!selected) {
                // Force the Selector to select now as the "canceled" SelectionKey may still be
                // cached and not removed because no Select.select(..) operation was called yet.
                eventLoop().selectNow();
                selected = true;
            } else {
                // We forced a select operation on the selector before but the SelectionKey is still cached
                // for whatever reason. JDK bug ?
                throw e;
            }
        }
    }
}
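
The registration above passes 0 as the interest set and attaches the Netty channel to the key. The following is a minimal JDK-only sketch of that pattern, not Netty code: the class name RegisterWithZeroOps and the plain Object attachment are illustrative assumptions. It registers a channel without any interest ops and enables OP_ACCEPT afterwards, which is roughly what happens once the channel becomes active.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; only JDK NIO calls are used.
class RegisterWithZeroOps {

    /** Returns {interestOps before, interestOps after, 1 if the attachment was kept}. */
    static int[] demo() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false); // a channel must be non-blocking to be registered

            Object attachment = new Object(); // stands in for the Netty channel ("this")
            // Register with an empty interest set: the key exists, but no event is selected yet.
            SelectionKey key = server.register(selector, 0, attachment);
            int before = key.interestOps();

            // Later, the interest set is widened so that accept events are delivered.
            key.interestOps(SelectionKey.OP_ACCEPT);
            int after = key.interestOps();

            boolean kept = key.attachment() == attachment;
            return new int[] {before, after, kept ? 1 : 0};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("before=" + r[0] + " after=" + r[1] + " attachmentKept=" + (r[2] == 1));
    }
}
```

Because the key carries the attachment, the event loop can later recover the owning channel from a ready key with k.attachment(), as processSelectedKeysOptimized does further below.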

Startup process of NioEventLoop

Each NioEventLoop is bound to a single thread, which is started lazily when the first task is submitted. Its startup process is as follows.

In the doBind0 method of AbstractBootstrap, the NioEventLoop of the NioServerSocketChannel is obtained and used to run the port-binding task.

private static void doBind0(
    final ChannelFuture regFuture, final Channel channel,
    final SocketAddress localAddress, final ChannelPromise promise) {

    //Submitting the first task starts the NioEventLoop thread
    channel.eventLoop().execute(new Runnable() {
        @Override
        public void run() {
            if (regFuture.isSuccess()) {
                channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
            } else {
                promise.setFailure(regFuture.cause());
            }
        }
    });
}

SingleThreadEventExecutor.execute

The call chain then reaches the SingleThreadEventExecutor.execute method, which invokes startThread() to start the thread.

private void execute(Runnable task, boolean immediate) {
    boolean inEventLoop = inEventLoop();
    addTask(task);
    if (!inEventLoop) {
        startThread(); //Start thread
        if (isShutdown()) {
            boolean reject = false;
            try {
                if (removeTask(task)) {
                    reject = true;
                }
            } catch (UnsupportedOperationException e) {
                // The task queue does not support removal so the best thing we can do is to just move on and
                // hope we will be able to pick-up the task before its completely terminated.
                // In worst case we will log on termination.
            }
            if (reject) {
                reject();
            }
        }
    }

    if (!addTaskWakesUp && immediate) {
        wakeup(inEventLoop);
    }
}

startThread

private void startThread() {
    if (state == ST_NOT_STARTED) {
        if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
            boolean success = false;
            try {
                doStartThread(); //Perform the startup process
                success = true;
            } finally {
                if (!success) {
                    STATE_UPDATER.compareAndSet(this, ST_STARTED, ST_NOT_STARTED);
                }
            }
        }
    }
}

Next, doStartThread() submits a task through executor.execute, and the NioEventLoop thread is started inside that task.

private void doStartThread() {
    assert thread == null;
    executor.execute(new Runnable() { //Execute a task through the thread pool
        @Override
        public void run() {
            thread = Thread.currentThread();
            if (interrupted) {
                thread.interrupt();
            }

            boolean success = false;
            updateLastExecutionTime();
            try {
                SingleThreadEventExecutor.this.run(); //Call the run method of boss's NioEventLoop to start polling
            }
            //Omit
        }
    });
}
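
The combination of execute(), startThread() and the compare-and-set on the state field means the thread is created at most once, on the first task submitted from outside the loop. The sketch below illustrates that idea with plain JDK classes. LazyStartLoop and its fields are hypothetical names, and the real implementation differs in many details (state values, shutdown handling, wakeups).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified single-thread executor that starts its thread lazily.
class LazyStartLoop {
    private static final int NOT_STARTED = 1;
    private static final int STARTED = 2;

    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private final AtomicInteger state = new AtomicInteger(NOT_STARTED);
    final AtomicInteger threadsStarted = new AtomicInteger();
    private volatile Thread thread;

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    void execute(Runnable task) {
        tasks.add(task); // enqueue first, as in the original execute()
        // Only the caller that wins the CAS creates the thread; everyone else just enqueues.
        if (!inEventLoop() && state.compareAndSet(NOT_STARTED, STARTED)) {
            threadsStarted.incrementAndGet();
            Thread t = new Thread(this::run, "lazy-loop");
            t.setDaemon(true); // let the JVM exit when the demo is done
            t.start();
        }
    }

    private void run() {
        thread = Thread.currentThread(); // remember which thread owns the loop
        try {
            for (;;) {
                tasks.take().run(); // the real loop also polls the selector here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Submits two tasks and returns how many threads were started. */
    static int demo() {
        LazyStartLoop loop = new LazyStartLoop();
        CountDownLatch done = new CountDownLatch(2);
        loop.execute(done::countDown);
        loop.execute(done::countDown);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return loop.threadsStarted.get();
    }

    public static void main(String[] args) {
        System.out.println("threads started: " + demo());
    }
}
```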

Polling process for NioEventLoop

When the NioEventLoop thread is started, it directly enters the run method of NioEventLoop.

protected void run() {
    int selectCnt = 0;
    for (;;) {
        try {
            int strategy;
            try {
                strategy = selectStrategy.calculateStrategy(selectNowSupplier, hasTasks());
                switch (strategy) {
                    case SelectStrategy.CONTINUE:
                        continue;

                    case SelectStrategy.BUSY_WAIT:

                    case SelectStrategy.SELECT:
                        long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
                        if (curDeadlineNanos == -1L) {
                            curDeadlineNanos = NONE; // nothing on the calendar
                        }
                        nextWakeupNanos.set(curDeadlineNanos);
                        try {
                            if (!hasTasks()) {
                                strategy = select(curDeadlineNanos);
                            }
                        } finally {
                            // This update is just to help block unnecessary selector wakeups
                            // so use of lazySet is ok (no race condition)
                            nextWakeupNanos.lazySet(AWAKE);
                        }
                        // fall through
                    default:
                }
            } catch (IOException e) {
                // If we receive an IOException here its because the Selector is messed up. Let's rebuild
                // the selector and retry. https://github.com/netty/netty/issues/8566
                rebuildSelector0();
                selectCnt = 0;
                handleLoopException(e);
                continue;
            }

            selectCnt++;
            cancelledKeys = 0;
            needsToSelectAgain = false;
            final int ioRatio = this.ioRatio;
            boolean ranTasks;
            if (ioRatio == 100) {
                try {
                    if (strategy > 0) {
                        processSelectedKeys();
                    }
                } finally {
                    // Ensure we always run tasks.
                    ranTasks = runAllTasks();
                }
            } else if (strategy > 0) {
                final long ioStartTime = System.nanoTime();
                try {
                    processSelectedKeys();
                } finally {
                    // Ensure we always run tasks.
                    final long ioTime = System.nanoTime() - ioStartTime;
                    ranTasks = runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
                }
            } else {
                ranTasks = runAllTasks(0); // This will run the minimum number of tasks
            }

            if (ranTasks || strategy > 0) {
                if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                                 selectCnt - 1, selector);
                }
                selectCnt = 0;
            } else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
                selectCnt = 0;
            }
        } catch (CancelledKeyException e) {
            // Harmless exception - log anyway
            if (logger.isDebugEnabled()) {
                logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?",
                             selector, e);
            }
        } catch (Error e) {
            throw (Error) e;
        } catch (Throwable t) {
            handleLoopException(t);
        } finally {
            // Always handle shutdown even if the loop processing threw an exception.
            try {
                if (isShuttingDown()) {
                    closeAll();
                    if (confirmShutdown()) {
                        return;
                    }
                }
            } catch (Error e) {
                throw (Error) e;
            } catch (Throwable t) {
                handleLoopException(t);
            }
        }
    }
}

Execution process of NioEventLoop

The run method in NioEventLoop is an infinite loop that mainly does three things, as shown in Figure 9-1.

Figure 9-1

  • Poll for I/O events (select): poll the I/O ready events of all channels registered with the Selector.
  • Process I/O events: if any channels have ready I/O events, call processSelectedKeys to handle them.
  • Process asynchronous tasks (runAllTasks): the Reactor thread has another very important responsibility, which is to run the non-I/O tasks in the task queue. Netty provides the ioRatio parameter to adjust the ratio between the time spent on I/O and the time spent on tasks.

Polling for I/O ready events

Let's first look at the code related to I/O events:

  1. Get the current strategy through selectStrategy.calculateStrategy(selectNowSupplier, hasTasks()).
  2. The strategy then determines what is executed in each polling iteration.
protected void run() {
        int selectCnt = 0;
        for (;;) {
            try {
                int strategy;
                try {
                    strategy = selectStrategy.calculateStrategy(selectNowSupplier, hasTasks());
                    switch (strategy) {
                    case SelectStrategy.CONTINUE:
                        continue;

                    case SelectStrategy.BUSY_WAIT:
                        // fall-through to SELECT since the busy-wait is not supported with NIO

                    case SelectStrategy.SELECT:
                        long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
                        if (curDeadlineNanos == -1L) {
                            curDeadlineNanos = NONE; // nothing on the calendar
                        }
                        nextWakeupNanos.set(curDeadlineNanos);
                        try {
                            if (!hasTasks()) {
                                strategy = select(curDeadlineNanos);
                            }
                        } finally {
                            // This update is just to help block unnecessary selector wakeups
                            // so use of lazySet is ok (no race condition)
                            nextWakeupNanos.lazySet(AWAKE);
                        }
                        // fall through
                    default:
                    }
                }
                //Omit
            }
        }
}

selectStrategy processing logic

@Override
public int calculateStrategy(IntSupplier selectSupplier, boolean hasTasks) throws Exception {
    return hasTasks ? selectSupplier.get() : SelectStrategy.SELECT;
}

If hasTasks is true, there are asynchronous tasks pending in the current NioEventLoop thread and selectSupplier.get() is called; otherwise SELECT is returned directly.

The definition of selectSupplier.get() is as follows:

private final IntSupplier selectNowSupplier = new IntSupplier() {
    @Override
    public int get() throws Exception {
        return selectNow();
    }
};

This supplier calls selectNow(), a non-blocking method provided by the Selector that returns immediately:

  • If there are ready channels, the corresponding number of ready channels will be returned
  • Otherwise, return 0
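
The decision described above can be reproduced in a few lines. This is a sketch rather than Netty's class: java.util.function.IntSupplier stands in for Netty's own IntSupplier interface, and the constant -1 is assumed to match SelectStrategy.SELECT.

```java
import java.util.function.IntSupplier;

// Hypothetical stand-in for the default select strategy.
class SelectStrategySketch {
    // Assumed to mirror SelectStrategy.SELECT; the point is that it is negative,
    // so it can never be confused with a ready-channel count.
    static final int SELECT = -1;

    static int calculateStrategy(IntSupplier selectNow, boolean hasTasks) {
        // With pending tasks, never block: ask the selector for what is ready right now.
        // Without pending tasks, tell the loop that it may block in select().
        return hasTasks ? selectNow.getAsInt() : SELECT;
    }

    public static void main(String[] args) {
        System.out.println(calculateStrategy(() -> 3, true));  // ready-channel count
        System.out.println(calculateStrategy(() -> 3, false)); // SELECT
    }
}
```

Since a non-blocking select returns a count of 0 or more, the result with pending tasks is never negative, so the loop skips the blocking branch and goes straight to processing.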

Branch processing

After obtaining the strategy in the previous step, the code branches on the result.

  • CONTINUE indicates that a retry is required.
  • BUSY_WAIT: because busy-wait is not supported in NIO, BUSY_WAIT falls through to the same logic as SELECT.
  • SELECT means that the list of ready channels needs to be obtained through the select method. This strategy is returned when there are no asynchronous tasks in NioEventLoop, that is, when the task queue is empty.
switch (strategy) {
    case SelectStrategy.CONTINUE:
        continue;

    case SelectStrategy.BUSY_WAIT:
        // fall-through to SELECT since the busy-wait is not supported with NIO

    case SelectStrategy.SELECT:
        long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
        if (curDeadlineNanos == -1L) {
            curDeadlineNanos = NONE; // nothing on the calendar
        }
        nextWakeupNanos.set(curDeadlineNanos);
        try {
            if (!hasTasks()) {
                strategy = select(curDeadlineNanos);
            }
        } finally {
            // This update is just to help block unnecessary selector wakeups
            // so use of lazySet is ok (no race condition)
            nextWakeupNanos.lazySet(AWAKE);
        }
        // fall through
    default:
}

SelectStrategy.SELECT

When there is no asynchronous task in the NioEventLoop thread, the SELECT policy is executed

//Deadline of the next scheduled task; by default there is no scheduled task and -1L is returned
long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
if (curDeadlineNanos == -1L) {
    curDeadlineNanos = NONE; // nothing on the calendar
}
nextWakeupNanos.set(curDeadlineNanos);
try {
    if (!hasTasks()) {
        //Only when the task queue is empty does the loop block in select
        strategy = select(curDeadlineNanos);
    }
} finally {
    // This update is just to help block unnecessary selector wakeups
    // so use of lazySet is ok (no race condition)
    nextWakeupNanos.lazySet(AWAKE);
}

The select method is defined as follows. By default, deadlineNanos=NONE, so the select() method will be called to block.

private int select(long deadlineNanos) throws IOException {
    if (deadlineNanos == NONE) {
        return selector.select();
    }
    //Calculates the blocking timeout of the select() method
    long timeoutMillis = deadlineToDelayNanos(deadlineNanos + 995000L) / 1000000L;
    return timeoutMillis <= 0 ? selector.selectNow() : selector.select(timeoutMillis);
}

Finally, the number of ready channels will be returned. In subsequent logic, the execution logic will be determined according to the number of ready channels returned.
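
The timeout arithmetic in select(long) is easy to misread, so here is a small worked sketch. It takes the remaining delay in nanoseconds directly instead of an absolute deadline, which is a simplification; SelectTimeoutSketch and its method names are hypothetical.

```java
// Hypothetical helper that mirrors the rounding used when computing the select timeout.
class SelectTimeoutSketch {

    /** Converts a remaining delay in nanoseconds to a select timeout in milliseconds. */
    static long timeoutMillis(long delayNanos) {
        // Adding 0.995 ms before the integer division rounds up any delay of
        // at least 5 microseconds to the next full millisecond.
        long padded = Math.max(0L, delayNanos + 995_000L);
        return padded / 1_000_000L;
    }

    /** Describes which selector call the loop would make for the given delay. */
    static String action(long delayNanos) {
        long t = timeoutMillis(delayNanos);
        return t <= 0 ? "selectNow()" : "select(" + t + ")";
    }

    public static void main(String[] args) {
        System.out.println(action(2_500_000L)); // 2.5 ms until the next scheduled task
        System.out.println(action(5_000L));     // 5 microseconds
        System.out.println(action(4_999L));     // just under 5 microseconds
    }
}
```

With these inputs, a 2.5 ms delay becomes select(3), a 5 microsecond delay becomes select(1), and anything shorter falls back to the non-blocking selectNow().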

Business processing in NioEventLoop.run

The business-processing logic is relatively easy to understand:

  • If there are ready channels, their I/O events are processed.
  • After that, the tasks in the asynchronous task queue are executed on the same thread.
  • In addition, to work around the idling (empty polling) problem in Java NIO, the number of consecutive idle iterations is recorded in selectCnt. An iteration counts as idle if no I/O needed processing and no tasks were executed. If idling continues until selectCnt reaches a certain value, Netty assumes that the NIO bug has been triggered and handles it in unexpectedSelectorWakeup.

This is a known bug in Java NIO: the epoll empty-polling problem on Linux. Even when no channel is ready, select() can be woken up from an operation that should block; the loop then spins and CPU utilization reaches 100%.

@Override
protected void run() {
    int selectCnt = 0;
    for (;;) {
        //Omit
        selectCnt++;//selectCnt counts consecutive iterations in which select returned without useful work, that is, how often the eventLoop idled; it is used to work around the NIO bug
        cancelledKeys = 0;
        needsToSelectAgain = false;
        final int ioRatio = this.ioRatio;
        boolean ranTasks;
        if (ioRatio == 100) { //ioRatio is set to 100 (the default is 50)
            try {
                if (strategy > 0) { //Strategy > 0 indicates that there is a ready SocketChannel
                    processSelectedKeys(); //Perform the task of ready SocketChannel
                }
            } finally {
                //Note that setting ioRatio to 100 does not mean tasks are skipped; the whole task queue is run on every iteration, without a time limit
                ranTasks = runAllTasks(); //Ensure that tasks in the queue are always executed
            }
        } else if (strategy > 0) { //Strategy > 0 indicates that there is a ready SocketChannel
            final long ioStartTime = System.nanoTime(); //io processing start time
            try {
                processSelectedKeys(); //Start processing IO ready events
            } finally {
                // io event execution end time
                final long ioTime = System.nanoTime() - ioStartTime;
                //Based on the IO processing time of this cycle, ioRatio, calculate the upper limit of task execution time, that is, how long asynchronous tasks are allowed to be processed
                ranTasks = runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
            }
        } else {
            //This branch means strategy == 0 and ioRatio < 100. The task time limit is 0, so as few asynchronous tasks as possible are run
            //With ioTime = 0, the formula in the strategy > 0 branch would also yield a limit of 0, so this is the same rule written more simply
            ranTasks = runAllTasks(0); // This will run the minimum number of tasks
        }

        if (ranTasks || strategy > 0) { //ranTasks=true, or strategy > 0 indicates that eventLoop is working without idling. Clear selectCnt
            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
                logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                             selectCnt - 1, selector);
            }
            selectCnt = 0;
        } 
        //Handle an unexpected selector wakeup (the NIO bug)
        else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
            selectCnt = 0;
        }
    }
}
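
The time budget for tasks follows directly from the expression ioTime * (100 - ioRatio) / ioRatio used above. A short worked sketch (IoRatioBudget is a hypothetical helper, not part of Netty) makes the proportions concrete:

```java
// Hypothetical helper that evaluates the task time budget formula from the run loop.
class IoRatioBudget {

    /**
     * @param ioTimeNanos time spent processing I/O in this iteration
     * @param ioRatio     percentage of loop time intended for I/O, 1..99 in this sketch
     *                    (the value 100 takes a separate branch with no time limit)
     */
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio < 1 || ioRatio > 99) {
            throw new IllegalArgumentException("ioRatio must be in 1..99 for this formula");
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        // Default ratio of 50: tasks get as much time as I/O took.
        System.out.println(taskBudgetNanos(10_000_000L, 50));
        // Ratio of 80: 8 ms of I/O leaves 2 ms for tasks.
        System.out.println(taskBudgetNanos(8_000_000L, 80));
        // Ratio of 20: 2 ms of I/O leaves 8 ms for tasks.
        System.out.println(taskBudgetNanos(2_000_000L, 20));
    }
}
```

The budget scales with the I/O time measured in the same iteration, so a quiet iteration with little I/O also leaves little time for tasks.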

processSelectedKeys

The select step returns the number of ready I/O events; when that number is greater than 0, the processSelectedKeys method is executed.

private void processSelectedKeys() {
    if (selectedKeys != null) {
        processSelectedKeysOptimized();
    } else {
        processSelectedKeysPlain(selector.selectedKeys());
    }
}

When handling I/O events, there are two logical branches:

  • One handles the selectedKeys optimized by Netty.
  • The other is the plain processing logic.

The processSelectedKeys method chooses the branch according to whether selectedKeys is set. By default, the Netty-optimized selectedKeys is used, and its type is SelectedSelectionKeySet.

processSelectedKeysOptimized

private void processSelectedKeysOptimized() {
    for (int i = 0; i < selectedKeys.size; ++i) {
        //1. Fetch the IO event and the corresponding channel
        final SelectionKey k = selectedKeys.keys[i];
        selectedKeys.keys[i] = null;//Null out the array slot to help GC; it also marks this key as handled so that it is not processed again

        final Object a = k.attachment(); //Get the attachment saved on the key. For the boss NioEventLoop this is the NioServerSocketChannel
        //Process current channel
        if (a instanceof AbstractNioChannel) {
            //For the boss NioEventLoop, what is polled is basically connection (accept) events. The next step is to hand the connection to a worker NioEventLoop through its pipeline
            //For a worker NioEventLoop, what is polled is basically I/O read/write events. The next step is to pass the bytes read to each ChannelHandler through its pipeline
            processSelectedKey(k, (AbstractNioChannel) a);
        } else {
            @SuppressWarnings("unchecked")
            NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
            processSelectedKey(k, task);
        }
        
        if (needsToSelectAgain) {
            // null out entries in the array to allow to have it GC'ed once the Channel close
            // See https://github.com/netty/netty/issues/2363
            selectedKeys.reset(i + 1);

            selectAgain();
            i = -1;
        }
    }
}

processSelectedKey

private void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
    final AbstractNioChannel.NioUnsafe unsafe = ch.unsafe();
    if (!k.isValid()) {
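        final EventLoop eventLoop;
        try {
            eventLoop = ch.eventLoop();
        } catch (Throwable ignored) {
            // The channel has no event loop, so we cannot tell whether it belongs to this loop; do not close it
            return;
        }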
        if (eventLoop == this) {
            // close the channel if the key is not valid anymore
            unsafe.close(unsafe.voidPromise());
        }
        return;
    }

    try {
        int readyOps = k.readyOps(); //Gets the operation type of the current key
      
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {//If connection type
            int ops = k.interestOps();
            ops &= ~SelectionKey.OP_CONNECT;
            k.interestOps(ops);

            unsafe.finishConnect();
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) { //If it is write type
            ch.unsafe().forceFlush();
        }
        //If it is READ or ACCEPT type, call unsafe.read(). For NioServerSocketChannel, the unsafe instance is NioMessageUnsafe
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            unsafe.read();
        }
    } catch (CancelledKeyException ignored) {
        unsafe.close(unsafe.voidPromise());
    }
}
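
The three checks on readyOps are independent bit tests, so a single key can trigger more than one action in the same pass. The sketch below replays that dispatch on a plain int. ReadyOpsSketch is a hypothetical name, and the returned strings only label which unsafe method would be called.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the order of the readyOps checks in processSelectedKey.
class ReadyOpsSketch {

    static List<String> actions(int readyOps) {
        List<String> out = new ArrayList<>();
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            out.add("finishConnect");
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            out.add("forceFlush");
        }
        // readyOps == 0 is also routed to read(), matching the condition in the original code.
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            out.add("read");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(actions(SelectionKey.OP_READ | SelectionKey.OP_WRITE));
        System.out.println(actions(SelectionKey.OP_ACCEPT));
        System.out.println(actions(0));
    }
}
```

Writes are flushed before reads are handled, and an accept on the server channel goes through the same read() entry point as a data read on a client channel.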

NioMessageUnsafe.read()

Taking a client connection (an ACCEPT event on NioServerSocketChannel) as the example, the code executes as follows.

@Override
public void read() {
    assert eventLoop().inEventLoop();
    final ChannelConfig config = config();
    final ChannelPipeline pipeline = pipeline(); //When a connection is accepted, the pipeline of the server channel contains ServerBootstrapAcceptor
    final RecvByteBufAllocator.Handle allocHandle = unsafe().recvBufAllocHandle();
    allocHandle.reset(config);

    boolean closed = false;
    Throwable exception = null;
    try {
        try {
            do {
                int localRead = doReadMessages(readBuf);
                if (localRead == 0) {
                    break;
                }
                if (localRead < 0) {
                    closed = true;
                    break;
                }

                allocHandle.incMessagesRead(localRead);
            } while (continueReading(allocHandle));
        } catch (Throwable t) {
            exception = t;
        }

        int size = readBuf.size();
        for (int i = 0; i < size; i ++) {
            readPending = false;
            pipeline.fireChannelRead(readBuf.get(i));  //Call the channelRead method in the pipeline
        }
        readBuf.clear();
        allocHandle.readComplete();
        pipeline.fireChannelReadComplete();

        if (exception != null) {
            closed = closeOnReadError(exception);

            pipeline.fireExceptionCaught(exception); //Call the exceptionCaught method in the pipeline
        }

        if (closed) {
            inputShutdown = true;
            if (isOpen()) {
                close(voidPromise());
            }
        }
    } finally {
        if (!readPending && !config.isAutoRead()) {
            removeReadOp();
        }
    }
}

Optimization of SelectedSelectionKeySet

Netty implements its own SelectedSelectionKeySet to optimize the structure of the JDK's original selectedKeys. How is it optimized? Let's first look at its definition.

final class SelectedSelectionKeySet extends AbstractSet<SelectionKey> {

    SelectionKey[] keys;
    int size;

    SelectedSelectionKeySet() {
        keys = new SelectionKey[1024];
    }

    @Override
    public boolean add(SelectionKey o) {
        if (o == null) {
            return false;
        }

        keys[size++] = o;
        if (size == keys.length) {
            increaseCapacity();
        }

        return true;
    }
}

SelectedSelectionKeySet uses the SelectionKey array internally. In the processSelectedKeysOptimized method, you can directly traverse the array to get ready I/O events.

The JDK's original Set<SelectionKey> is a HashSet. By comparison, appending to a SelectionKey[] involves no hashing and no hash-collision handling, so add is a plain O(1) array write (apart from the occasional capacity increase), and the ready keys can later be traversed as a contiguous array.
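
To make the data structure concrete, here is a self-contained sketch of an array-backed set in the same spirit. ArrayKeySet is a hypothetical name, the initial capacity of 4 is chosen only to show growth quickly, and the doubling and reset logic are assumptions about how such a structure is typically written, since increaseCapacity() is not shown above.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical array-backed "set": add is an array append, iteration is an array walk.
class ArrayKeySet<T> extends AbstractSet<T> {
    Object[] keys = new Object[4];
    int size;

    @Override
    public boolean add(T o) {
        if (o == null) {
            return false;
        }
        keys[size++] = o; // no hashing and no duplicate check, as in the listing above
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, keys.length << 1); // double the capacity
        }
        return true;
    }

    /** Clears the filled slots so the same array can be reused for the next select. */
    void reset() {
        Arrays.fill(keys, 0, size, null);
        size = 0;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int idx;

            @Override
            public boolean hasNext() {
                return idx < size;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return (T) keys[idx++];
            }
        };
    }

    /** Adds five elements and returns {size, capacity}. */
    static int[] demo() {
        ArrayKeySet<String> set = new ArrayKeySet<>();
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            set.add(s);
        }
        return new int[] {set.size(), set.keys.length};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("size=" + r[0] + " capacity=" + r[1]);
    }
}
```

The trade-off is that this structure relies on its single producer not adding the same key twice, which a general-purpose Set cannot assume.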

Initialization of SelectedSelectionKeySet

Netty replaces the selectedKeys and publicSelectedKeys fields inside the Selector object with a SelectedSelectionKeySet through reflection.

The original two fields, selectedKeys and publicSelectedKeys, are of HashSet type. After replacement, they become SelectedSelectionKeySet. When there is a ready key, it will be directly filled into the array of SelectedSelectionKeySet. You only need to traverse later.

private SelectorTuple openSelector() {
    final Class<?> selectorImplClass = (Class<?>) maybeSelectorImplClass;
    final SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();
    //Use reflection
    Object maybeException = AccessController.doPrivileged(new PrivilegedAction<Object>() {
        @Override
        public Object run() {
            try {
                //selectedKeys field inside Selector
                Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
                //publicSelectedKeys field inside Selector
                Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");

                if (PlatformDependent.javaVersion() >= 9 && PlatformDependent.hasUnsafe()) {
                    //Gets the selectedKeysField offset
                    long selectedKeysFieldOffset = PlatformDependent.objectFieldOffset(selectedKeysField);
                    //Gets the offset of the publicSelectedKeys field
                    long publicSelectedKeysFieldOffset =
                        PlatformDependent.objectFieldOffset(publicSelectedKeysField);

                    if (selectedKeysFieldOffset != -1 && publicSelectedKeysFieldOffset != -1) {
                        //Replace with selectedKeySet
                        PlatformDependent.putObject(
                            unwrappedSelector, selectedKeysFieldOffset, selectedKeySet);
                        PlatformDependent.putObject(
                            unwrappedSelector, publicSelectedKeysFieldOffset, selectedKeySet);
                        return null;
                    }
                    // We could not retrieve the offset, lets try reflection as last-resort.
                }
                Throwable cause = ReflectionUtil.trySetAccessible(selectedKeysField, true);
                if (cause != null) {
                    return cause;
                }
                cause = ReflectionUtil.trySetAccessible(publicSelectedKeysField, true);
                if (cause != null) {
                    return cause;
                }
                selectedKeysField.set(unwrappedSelector, selectedKeySet);
                publicSelectedKeysField.set(unwrappedSelector, selectedKeySet);
                return null;
            } catch (NoSuchFieldException e) {
                return e;
            } catch (IllegalAccessException e) {
                return e;
            }
        }
    });
    if (maybeException instanceof Exception) {
        selectedKeys = null;
        Exception e = (Exception) maybeException;
        logger.trace("failed to instrument a special java.util.Set into: {}", unwrappedSelector, e);
        return new SelectorTuple(unwrappedSelector);
    }
    selectedKeys = selectedKeySet;
}
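
The reflective fallback path at the end of openSelector (trySetAccessible followed by Field.set) can be illustrated without touching JDK internals. In the sketch below, FakeSelector is a hypothetical stand-in for the selector implementation, and the swap follows the same convention of returning the failure instead of throwing it.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

// Hypothetical demo of replacing a private collection field through reflection.
class FieldSwapSketch {

    /** Stand-in for the JDK selector implementation; not a real selector. */
    static final class FakeSelector {
        private Collection<String> selectedKeys = new HashSet<>();

        void markReady(String key) {
            selectedKeys.add(key); // the "selector" fills whatever collection the field holds
        }
    }

    /** Returns null on success, or the failure cause, mirroring the maybeException pattern. */
    static Throwable swap(FakeSelector target, Collection<String> replacement) {
        try {
            Field f = FakeSelector.class.getDeclaredField("selectedKeys");
            f.setAccessible(true);
            f.set(target, replacement);
            return null;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return e;
        }
    }

    static List<String> demo() {
        FakeSelector selector = new FakeSelector();
        List<String> mine = new ArrayList<>();
        Throwable cause = swap(selector, mine);
        if (cause != null) {
            throw new IllegalStateException(cause);
        }
        selector.markReady("key-1"); // lands directly in our own collection
        return mine;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

After the swap, the caller holds a direct reference to the collection that the selector fills, which is what allows the event loop to read ready keys from its own array.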

Execution process of asynchronous tasks

After analyzing the above process, let's continue to look at the processing process for asynchronous tasks in the run method in NioEventLoop

@Override
protected void run() {
    int selectCnt = 0;
    for (;;) {
        ranTasks = runAllTasks();
    }
}

runAllTasks

Note that NioEventLoop can support the execution of scheduled tasks through nioEventLoop.schedule().

protected boolean runAllTasks() {
    assert inEventLoop();
    boolean fetchedAll;
    boolean ranAtLeastOne = false;

    do {
        fetchedAll = fetchFromScheduledTaskQueue(); //Move due scheduled tasks into the normal task queue
        if (runAllTasksFrom(taskQueue)) { //Loop through tasks in taskQueue
            ranAtLeastOne = true;
        }
    } while (!fetchedAll);

    if (ranAtLeastOne) { //If at least one task ran, record the last execution time
        lastExecutionTime = ScheduledFutureTask.nanoTime();
    }
    afterRunningAllTasks();//Perform closing tasks
    return ranAtLeastOne;
}

fetchFromScheduledTaskQueue

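Move the scheduled tasks that are due from the scheduledTaskQueue into the taskQueue. If the taskQueue is full, the task is put back into the scheduledTaskQueue and false is returned, so that runAllTasks drains the taskQueue and tries again.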

private boolean fetchFromScheduledTaskQueue() {
    if (scheduledTaskQueue == null || scheduledTaskQueue.isEmpty()) {
        return true;
    }
    long nanoTime = AbstractScheduledEventExecutor.nanoTime();
    for (;;) {
        Runnable scheduledTask = pollScheduledTask(nanoTime);
        if (scheduledTask == null) {
            return true;
        }
        if (!taskQueue.offer(scheduledTask)) {
            // No space left in the task queue add it back to the scheduledTaskQueue so we pick it up again.
            scheduledTaskQueue.add((ScheduledFutureTask<?>) scheduledTask);
            return false;
        }
    }
}
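
The interaction between a full task queue and the do/while loop in runAllTasks is easier to see with small numbers. The following sketch uses a PriorityQueue and a bounded ArrayBlockingQueue as stand-ins; Timed and fetchDue are hypothetical names, and the capacity of 2 is chosen only to force the "queue full" path.

```java
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical sketch of moving due scheduled tasks into a bounded task queue.
class ScheduledFetchSketch {

    static final class Timed implements Comparable<Timed> {
        final long deadlineNanos;
        final String name;

        Timed(long deadlineNanos, String name) {
            this.deadlineNanos = deadlineNanos;
            this.name = name;
        }

        @Override
        public int compareTo(Timed other) {
            return Long.compare(deadlineNanos, other.deadlineNanos);
        }
    }

    /** Returns true if every due task was moved, false if the task queue filled up. */
    static boolean fetchDue(PriorityQueue<Timed> scheduled, Queue<Timed> taskQueue, long nowNanos) {
        for (;;) {
            Timed head = scheduled.peek();
            if (head == null || head.deadlineNanos > nowNanos) {
                return true; // nothing left that is due
            }
            scheduled.poll();
            if (!taskQueue.offer(head)) {
                scheduled.add(head); // no space: put it back and let the caller drain first
                return false;
            }
        }
    }

    static String demo() {
        PriorityQueue<Timed> scheduled = new PriorityQueue<>();
        scheduled.add(new Timed(10, "a"));
        scheduled.add(new Timed(20, "b"));
        scheduled.add(new Timed(30, "c"));
        scheduled.add(new Timed(99, "later"));
        Queue<Timed> taskQueue = new ArrayBlockingQueue<>(2);

        boolean first = fetchDue(scheduled, taskQueue, 50);  // moves a and b, c does not fit
        taskQueue.clear();                                   // stands in for running the tasks
        boolean second = fetchDue(scheduled, taskQueue, 50); // moves c, "later" is not due yet
        return first + "," + second + "," + scheduled.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```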

Task add method execute

There are two very important asynchronous task queues in NioEventLoop, namely, normal task queue and scheduled task queue. For these two queues, two methods are provided to add tasks to the two queues respectively.

  • execute()
  • schedule()

The execute method is defined as follows.

private void execute(Runnable task, boolean immediate) {
    boolean inEventLoop = inEventLoop();
    addTask(task); //Add the task to the task queue
    if (!inEventLoop) { //If the caller is not the NioEventLoop thread
        startThread(); //Start thread
        if (isShutdown()) { //If the NioEventLoop has already been shut down
            boolean reject = false;
            try {
                if (removeTask(task)) { 
                    reject = true;
                }
            } catch (UnsupportedOperationException e) {
                // The task queue does not support removal so the best thing we can do is to just move on and
                // hope we will be able to pick-up the task before its completely terminated.
                // In worst case we will log on termination.
            }
            if (reject) {
                reject();
            }
        }
    }

    if (!addTaskWakesUp && immediate) {
        wakeup(inEventLoop);
    }
}

NIO's idling problem

Normally, when we call selector.select() and no SocketChannel is ready, the current thread blocks. Empty polling (idling) means that select() is woken up even though no SocketChannel is ready.

This wake-up is not caused by any read or write request, so the thread polls in vain, resulting in high CPU utilization.

The root cause of this problem is:

In some Linux 2.6 kernels, poll and epoll set the returned event set to POLLHUP or POLLERR for a socket whose connection is suddenly interrupted. Because the event set has changed, the Selector may be woken up. This behavior comes from the operating system. The JDK is meant to abstract over the various operating system platforms, but unfortunately the early versions of JDK 5 and JDK 6 (strictly speaking, only some versions) did not solve the problem and instead shifted the blame to the operating system. That is why the bug was not finally fixed until 2013, by which time its impact had become very wide.

How does Netty solve this problem? Let's go back to the run method of NioEventLoop. The listing keeps only the parts that deal with this bug.

@Override
protected void run() {
    int selectCnt = 0;
    for (;;) {
        //selectCnt counts consecutive select calls that returned without doing any work, i.e. how many times the eventLoop has spun idly; it exists to work around the NIO bug
        selectCnt++; 
        //ranTasks == true or strategy > 0 means the eventLoop did real work in this iteration, so reset selectCnt
        if (ranTasks || strategy > 0) {
            //If select returned prematurely more than MIN_PREMATURE_SELECTOR_RETURNS times in a row, log it
            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
                logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                             selectCnt - 1, selector);
            }
            selectCnt = 0;
        } 
        //Handle an unexpected selector wakeup (the NIO bug)
        else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
            selectCnt = 0;
        }
    }
}

unexpectedSelectorWakeup

private boolean unexpectedSelectorWakeup(int selectCnt) {
    if (Thread.interrupted()) {
        if (logger.isDebugEnabled()) {
            logger.debug("Selector.select() returned prematurely because " +
                         "Thread.currentThread().interrupt() was called. Use " +
                         "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
        }
        return true;
    }
    //If the auto-rebuild threshold is enabled (greater than 0; the default is 512) and the number of consecutive empty polls has reached it, rebuild the selector
    if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
        selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
        // The selector returned prematurely many times in a row.
        // Rebuild the selector to work around the problem.
        logger.warn("Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
                    selectCnt, selector);
        rebuildSelector();
        return true;
    }
    return false;
}
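The counting logic shared by run and unexpectedSelectorWakeup can be isolated in a few lines. The sketch below is illustrative only: `SpinCounterSketch`, `afterSelect`, and `didWork` are hypothetical names, the interrupt check is omitted, and the rebuild is replaced by a counter.

```java
// Hypothetical sketch of the counting logic; not Netty code.
final class SpinCounterSketch {
    private final int rebuildThreshold; // the text above gives 512 as Netty's default
    private int selectCnt;
    int rebuilds;

    SpinCounterSketch(int rebuildThreshold) {
        this.rebuildThreshold = rebuildThreshold;
    }

    // Call once per loop iteration, after select() returns.
    // didWork is true if tasks ran or at least one key was ready.
    void afterSelect(boolean didWork) {
        selectCnt++;
        if (didWork) {
            selectCnt = 0;              // useful wakeup: reset the counter
        } else if (rebuildThreshold > 0 && selectCnt >= rebuildThreshold) {
            rebuilds++;                 // stand-in for rebuildSelector()
            selectCnt = 0;
        }
    }

    int selectCnt() {
        return selectCnt;
    }
}
```

Only an unbroken run of empty wake-ups reaches the threshold; a single productive iteration resets the count.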

rebuildSelector()

public void rebuildSelector() {
    if (!inEventLoop()) { //If the caller is not the event loop thread, submit the rebuild to the event loop as a task
        execute(new Runnable() {
            @Override
            public void run() {
                rebuildSelector0();
            }
        });
        return;
    }
    rebuildSelector0();
}

rebuildSelector0

This method creates a new selector and replaces the one in the current event loop with it.

private void rebuildSelector0() {
    final Selector oldSelector = selector; //Get the old selector
    final SelectorTuple newSelectorTuple; //Define a new selector

    if (oldSelector == null) { //If the old selector is empty, return directly
        return;
    }

    try {
        newSelectorTuple = openSelector(); //Create a new selector
    } catch (Exception e) {
        logger.warn("Failed to create a new Selector.", e);
        return;
    }

    // Register all channels to the new Selector.
    int nChannels = 0;
    for (SelectionKey key: oldSelector.keys()) {//Iterate over all keys registered with the old selector
        Object a = key.attachment();
        try {
            //Skip the key if it is invalid or its channel is already registered with the new selector
            if (!key.isValid() || key.channel().keyFor(newSelectorTuple.unwrappedSelector) != null) {
                continue;
            }
            //Get the interest ops of the key
            int interestOps = key.interestOps();
            key.cancel();//Cancel the key on the old selector
            //Register the channel with the new selector, using the same interest ops and attachment
            SelectionKey newKey = key.channel().register(newSelectorTuple.unwrappedSelector, interestOps, a);
            if (a instanceof AbstractNioChannel) {//If the attachment is a NIO channel, update its selection key
                // Update SelectionKey
                ((AbstractNioChannel) a).selectionKey = newKey;
            }
            nChannels ++;
        } catch (Exception e) {
            logger.warn("Failed to re-register a Channel to the new Selector.", e);
            if (a instanceof AbstractNioChannel) {
                AbstractNioChannel ch = (AbstractNioChannel) a;
                ch.unsafe().close(ch.unsafe().voidPromise());
            } else {
                @SuppressWarnings("unchecked")
                NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
                invokeChannelUnregistered(task, key, e);
            }
        }
    }
    //Update current event loop selector
    selector = newSelectorTuple.selector;
    unwrappedSelector = newSelectorTuple.unwrappedSelector;

    try {
        // time to close the old selector as everything else is registered to the new one
        oldSelector.close(); //Close the original selector
    } catch (Throwable t) {
        if (logger.isWarnEnabled()) {
            logger.warn("Failed to close the old Selector.", t);
        }
    }

    if (logger.isInfoEnabled()) {
        logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
    }
}

From the process above, we can see that Netty works around the NIO empty polling problem by rebuilding the Selector. The core of the rebuild is re-registering every channel that has a valid SelectionKey on the old Selector with the new Selector, which neatly sidesteps the JDK epoll empty polling bug.
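The same migration can be tried with the JDK's java.nio classes alone. The following is a minimal sketch, assuming a single unbound ServerSocketChannel as the registered channel; `SelectorRebuildSketch`, `rebuild`, and `demo` are hypothetical names, and the error handling of the Netty method is reduced to rethrowing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.util.ArrayList;

// Hypothetical sketch using only java.nio; not Netty code.
final class SelectorRebuildSketch {
    // Moves every valid key from oldSelector to a fresh Selector and returns it.
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        for (SelectionKey key : new ArrayList<>(oldSelector.keys())) {
            if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                continue;
            }
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            key.cancel(); // detach from the old selector
            key.channel().register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    // Registers one channel, rebuilds, and returns the number of migrated keys (-1 on mismatch).
    static int demo() {
        try (ServerSocketChannel ch = ServerSocketChannel.open()) {
            ch.configureBlocking(false);
            Selector old = Selector.open();
            ch.register(old, SelectionKey.OP_ACCEPT, "attachment");
            Selector fresh = rebuild(old);
            try {
                SelectionKey k = ch.keyFor(fresh);
                boolean ok = !old.isOpen()
                        && k != null
                        && k.interestOps() == SelectionKey.OP_ACCEPT
                        && "attachment".equals(k.attachment());
                return ok ? fresh.keys().size() : -1;
            } finally {
                fresh.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The interest ops and the attachment are read before the key is cancelled, because a cancelled key no longer allows interestOps() to be called.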

Connection establishment and processing

Section 9.2.4.3 mentioned that when a client connects or a read event arrives at the server, the read() method of the NioMessageUnsafe class is called.

public void read() {
    assert eventLoop().inEventLoop();
    final ChannelConfig config = config();
    final ChannelPipeline pipeline = pipeline();
    final RecvByteBufAllocator.Handle allocHandle = unsafe().recvBufAllocHandle();
    allocHandle.reset(config);

    boolean closed = false;
    Throwable exception = null;
    try {
        try {
            do {
                //doReadMessages returns 1 if a client connection was accepted and 0 if there was none
                int localRead = doReadMessages(readBuf);
                if (localRead == 0) {
                    break;
                }
                if (localRead < 0) {
                    closed = true;
                    break;
                }
                
                allocHandle.incMessagesRead(localRead); //Cumulatively increase the number of read messages
            } while (continueReading(allocHandle));
        } catch (Throwable t) {
            exception = t;
        }

        int size = readBuf.size(); //Iterate over the accepted client connections
        for (int i = 0; i < size; i ++) {
            readPending = false;
            pipeline.fireChannelRead(readBuf.get(i)); //Call the channelRead method of the handlers in the pipeline
        }
        readBuf.clear(); //Clear the buffer
        allocHandle.readComplete();
        pipeline.fireChannelReadComplete(); //Trigger the channelReadComplete method of the handlers in the pipeline

        if (exception != null) {
            closed = closeOnReadError(exception);

            pipeline.fireExceptionCaught(exception);
        }

        if (closed) {
            inputShutdown = true;
            if (isOpen()) {
                close(voidPromise());
            }
        }
    } finally {
        if (!readPending && !config.isAutoRead()) {
            removeReadOp();
        }
    }
}
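The do/while loop in read() follows a common batching pattern: accept until nothing is pending or a per-read limit is reached, then hand the whole batch to the pipeline. A minimal sketch of that pattern follows, with a plain queue standing in for the server socket. `AcceptBatchSketch`, `readBatch`, and `maxMessagesPerRead` are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of the batching pattern; not Netty code.
final class AcceptBatchSketch {
    // Drains at most maxMessagesPerRead items from pending into one batch.
    static <T> List<T> readBatch(Queue<T> pending, int maxMessagesPerRead) {
        List<T> readBuf = new ArrayList<>();
        int messagesRead = 0;
        do {
            T next = pending.poll();   // stand-in for doReadMessages(readBuf)
            if (next == null) {
                break;                 // nothing left to accept
            }
            readBuf.add(next);
            messagesRead++;
        } while (messagesRead < maxMessagesPerRead); // stand-in for continueReading(...)
        return readBuf;
    }
}
```

Capping the batch keeps one busy listening channel from monopolizing the event loop thread; anything left over is picked up on the next read event.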

pipeline.fireChannelRead(readBuf.get(i))

Next, look at how the pipeline fires this event. The pipeline has already been built at this point. For a connection event, the handler that receives it is ServerBootstrap$ServerBootstrapAcceptor.

static void invokeChannelRead(final AbstractChannelHandlerContext next, Object msg) {
    final Object m = next.pipeline.touch(ObjectUtil.checkNotNull(msg, "msg"), next);
    EventExecutor executor = next.executor();
    if (executor.inEventLoop()) {
        next.invokeChannelRead(m); //Get the next node in the pipeline and call the channelRead method of the handler
    } else {
        executor.execute(new Runnable() {
            @Override
            public void run() {
                next.invokeChannelRead(m);
            }
        });
    }
}

ServerBootstrapAcceptor

ServerBootstrapAcceptor is a special handler in the pipeline of NioServerSocketChannel, dedicated to handling client connection events. The core purpose of its channelRead method is to add the handler chain configured for the SocketChannel to the pipeline of the newly accepted NioSocketChannel.

public void channelRead(ChannelHandlerContext ctx, Object msg) {
    final Channel child = (Channel) msg;

    child.pipeline().addLast(childHandler);  //Add the childHandler configured on the server to the pipeline in the current NioSocketChannel

    setChannelOptions(child, childOptions, logger); //Set NioSocketChannel properties
    setAttributes(child, childAttrs); 

    try {
        //Register the NioSocketChannel with a Selector of the child group. Registration is asynchronous, so a listener checks its result.
        childGroup.register(child).addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                if (!future.isSuccess()) {
                    forceClose(child, future.cause());
                }
            }
        });
    } catch (Throwable t) {
        forceClose(child, t);
    }
}

pipeline construction process

In section 9.6.2, child is actually a NioSocketChannel. NioServerSocketChannel creates this object when it accepts a new connection.

@Override
protected int doReadMessages(List<Object> buf) throws Exception {
    SocketChannel ch = SocketUtils.accept(javaChannel());

    try {
        if (ch != null) {
            buf.add(new NioSocketChannel(this, ch)); //here
            return 1;
        }
    } catch (Throwable t) {
        logger.warn("Failed to create a new channel from an accepted socket.", t);

        try {
            ch.close();
        } catch (Throwable t2) {
            logger.warn("Failed to close a socket.", t2);
        }
    }

    return 0;
}

During the construction of NioSocketChannel, the constructor of the parent class AbstractChannel is called, which initializes a pipeline.

protected AbstractChannel(Channel parent) {
    this.parent = parent;
    id = newId();
    unsafe = newUnsafe();
    pipeline = newChannelPipeline();
}

DefaultChannelPipeline

The default instance of pipeline is DefaultChannelPipeline. The construction method is as follows.

protected DefaultChannelPipeline(Channel channel) {
    this.channel = ObjectUtil.checkNotNull(channel, "channel");
    succeededFuture = new SucceededChannelFuture(channel, null);
    voidPromise =  new VoidChannelPromise(channel, true);

    tail = new TailContext(this);
    head = new HeadContext(this);

    head.next = tail;
    tail.prev = head;
}

The constructor initializes a head node and a tail node and links them into a doubly linked list, as shown in Figure 9-2.

Figure 9-2
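The head/tail structure is a doubly linked list with two sentinel nodes, and addLast inserts just before the tail. The following sketch shows the idea with handler names only. `MiniPipeline` and `Node` are hypothetical names, and nothing here is taken from DefaultChannelPipeline beyond the linking scheme.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a sentinel-based doubly linked list; not Netty code.
final class MiniPipeline {
    static final class Node {
        final String name;
        Node prev;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    final Node head = new Node("head");
    final Node tail = new Node("tail");

    MiniPipeline() {
        head.next = tail;
        tail.prev = head;
    }

    // Inserts the new node just before the tail sentinel.
    MiniPipeline addLast(String name) {
        Node node = new Node(name);
        Node prev = tail.prev;
        node.prev = prev;
        node.next = tail;
        prev.next = node;
        tail.prev = node;
        return this;
    }

    // Names from head to tail, inclusive.
    List<String> names() {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) {
            out.add(n.name);
        }
        return out;
    }
}
```

Because head and tail always exist, addLast never has to handle an empty list as a special case.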

Composition of handler chain in NioSocketChannel

Returning to the channelRead method of ServerBootstrapAcceptor: when a client connection is received, handlers are added to the pipeline of the NioSocketChannel.

The following code is the addLast method of DefaultChannelPipeline.

@Override
public final ChannelPipeline addLast(EventExecutorGroup executor, ChannelHandler... handlers) {
   ObjectUtil.checkNotNull(handlers, "handlers");

   for (ChannelHandler h: handlers) { //Iterate over the handlers. At this point the handler is the ChannelInitializer
       if (h == null) {
           break;
       }
       addLast(executor, null, h);
   }

   return this;
}

addLast

Add the ChannelHandler configured on the server side to the pipeline. Note that what is stored in the pipeline at this point is the ChannelInitializer, whose callback has not run yet.

@Override
public final ChannelPipeline addLast(EventExecutorGroup group, String name, ChannelHandler handler) {
    final AbstractChannelHandlerContext newCtx;
    synchronized (this) {
        checkMultiplicity(handler); //Reject a non-sharable handler that has already been added
        //Create a new DefaultChannelHandlerContext node
        newCtx = newContext(group, filterName(name, handler), handler);

        addLast0(newCtx);  //Add a new DefaultChannelHandlerContext to ChannelPipeline

      
        //The channel is not registered yet, so defer the handlerAdded callback until registration
        if (!registered) { 
            newCtx.setAddPending();
            callHandlerCallbackLater(newCtx, true);
            return this;
        }

        EventExecutor executor = newCtx.executor();
        if (!executor.inEventLoop()) {
            callHandlerAddedInEventLoop(newCtx, executor);
            return this;
        }
    }
    callHandlerAdded0(newCtx);
    return this;
}

When is this callback invoked? It happens when the NioSocketChannel is registered in the channelRead method of the ServerBootstrapAcceptor class:

childGroup.register(child).addListener(new ChannelFutureListener() { ... });

Following the same source-code path as in the previous article, we arrive at the register0 method in AbstractChannel (in its inner class AbstractUnsafe).

private void register0(ChannelPromise promise) {
    try {
        // check if the channel is still open as it could be closed in the mean time when the register
        // call was outside of the eventLoop
        if (!promise.setUncancellable() || !ensureOpen(promise)) {
            return;
        }
        boolean firstRegistration = neverRegistered;
        doRegister();
        neverRegistered = false;
        registered = true;
        //Run the handlerAdded callbacks that were deferred until the channel was registered
        pipeline.invokeHandlerAddedIfNeeded();
        // ... (rest of the method omitted)
    } catch (Throwable t) {
        // ... (error handling omitted)
    }
}

callHandlerAddedForAllHandlers

When pipeline.invokeHandlerAddedIfNeeded() executes, it ends up in the callHandlerAddedForAllHandlers method of the DefaultChannelPipeline class.

private void callHandlerAddedForAllHandlers() {
    final PendingHandlerCallback pendingHandlerCallbackHead;
    synchronized (this) {
        assert !registered;

        // This Channel itself was registered.
        registered = true;

        pendingHandlerCallbackHead = this.pendingHandlerCallbackHead;
        // Null out so it can be GC'ed.
        this.pendingHandlerCallbackHead = null;
    }
    //Take the task from the list of handler callbacks waiting to be called for execution.
    PendingHandlerCallback task = pendingHandlerCallbackHead;
    while (task != null) {
        task.execute();
        task = task.next;
    }
}

pendingHandlerCallbackHead is a singly linked list that is filled by the callHandlerCallbackLater method, and callHandlerCallbackLater is called from the addLast method. Together they form a complete asynchronous loop: addLast defers the callback, and registration later runs it.
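The defer-then-drain loop can be reduced to a few lines. This is a sketch under simplifying assumptions (single thread, handlers represented by names); `DeferredCallbackSketch`, `Pending`, and `onRegistered` are hypothetical names rather than Netty's.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of deferring callbacks until registration; not Netty code.
final class DeferredCallbackSketch {
    // Singly linked list node holding one pending callback.
    private static final class Pending {
        final Runnable callback;
        Pending next;

        Pending(Runnable callback) {
            this.callback = callback;
        }
    }

    private Pending pendingHead;
    private boolean registered;
    final List<String> log = new ArrayList<>();

    void addLast(String handlerName) {
        Runnable handlerAdded = () -> log.add("handlerAdded:" + handlerName);
        if (!registered) {
            append(new Pending(handlerAdded)); // not registered yet: remember for later
            return;
        }
        handlerAdded.run();                    // already registered: call right away
    }

    private void append(Pending p) {
        if (pendingHead == null) {
            pendingHead = p;
            return;
        }
        Pending last = pendingHead;
        while (last.next != null) {
            last = last.next;
        }
        last.next = p;
    }

    // Called once when the channel is registered; drains the pending list in order.
    void onRegistered() {
        registered = true;
        Pending task = pendingHead;
        pendingHead = null;
        while (task != null) {
            task.callback.run();
            task = task.next;
        }
    }
}
```

Handlers added before registration get their callback at registration time, in insertion order; handlers added afterwards get it immediately.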

ChannelInitializer.handlerAdded

The execution path of the task.execute() method is:

callHandlerAdded0 -> ctx.callHandlerAdded
    -> AbstractChannelHandlerContext.callHandlerAdded()
        -> ChannelInitializer.handlerAdded

handlerAdded calls the initChannel method to initialize the pipeline of the NioSocketChannel.

@Override
public void handlerAdded(ChannelHandlerContext ctx) throws Exception {
    if (ctx.channel().isRegistered()) {
        // This should always be true with our current DefaultChannelPipeline implementation.
        // The good thing about calling initChannel(...) in handlerAdded(...) is that there will be no ordering
        // surprises if a ChannelInitializer will add another ChannelInitializer. This is as all handlers
        // will be added in the expected order.
        if (initChannel(ctx)) {

            // We are done with init the Channel, removing the initializer now.
            removeState(ctx);
        }
    }
}

This private initChannel method in turn calls the abstract initChannel(C) method, which the concrete subclass implements.

private boolean initChannel(ChannelHandlerContext ctx) throws Exception {
    if (initMap.add(ctx)) { // Guard against re-entrance.
        try {
            initChannel((C) ctx.channel());
        } catch (Throwable cause) {
            // Explicitly call exceptionCaught(...) as we removed the handler before calling initChannel(...).
            // We do so to prevent multiple calls to initChannel(...).
            exceptionCaught(ctx, cause);
        } finally {
            ChannelPipeline pipeline = ctx.pipeline();
            if (pipeline.context(this) != null) {
                pipeline.remove(this);
            }
        }
        return true;
    }
    return false;
}

In our case, the implementation of ChannelInitializer is the anonymous inner class in our custom server. This callback therefore builds the pipeline of the current NioSocketChannel.

public static void main(String[] args){
    //Boss group: accepts client connections
    EventLoopGroup boss = new NioEventLoopGroup();
    //Worker group: handles read and write operations on accepted connections
    EventLoopGroup work = new NioEventLoopGroup();
    ServerBootstrap b = new ServerBootstrap();
    b.group(boss, work)    //Bind the two event loop groups
        .channel(NioServerSocketChannel.class)    //Use the NIO server channel
        //Initialize the pipeline of each accepted channel
        .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            protected void initChannel(SocketChannel sc) throws Exception {
                sc.pipeline()
                    .addLast(
                    new LengthFieldBasedFrameDecoder(1024,
                                                     9,4,0,0))
                    .addLast(new MessageRecordEncoder())
                    .addLast(new MessageRecordDecode())
                    .addLast(new ServerHandler());
            }
        });
}
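The one-shot behaviour of ChannelInitializer (install the real handlers, then remove itself from the pipeline) can be sketched without Netty. In the sketch below the pipeline is just a list of handler names; `InitializerSketch` and `register` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a one-shot initializer; not Netty code.
final class InitializerSketch {
    // Handlers are represented by plain names to keep the sketch small.
    final List<String> handlers = new ArrayList<>();

    // Adds a placeholder, runs the user's init callback, then removes the placeholder.
    void register(Consumer<List<String>> initChannel) {
        String placeholder = "initializer";
        handlers.add(placeholder);
        try {
            initChannel.accept(handlers);  // user code appends the real handlers
        } finally {
            handlers.remove(placeholder);  // the initializer never stays in the pipeline
        }
    }
}
```

As with the finally block in the private initChannel method shown earlier, the placeholder is removed even if the callback throws.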

Copyright notice: unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. When reprinting, please credit the source: Mic to take you to learn architecture!

Posted by True`Logic on Sun, 21 Nov 2021 20:55:12 -0800