Analysis of the Buffer implementation principles in Java NIO's three core components

Keywords: Java

At present, many high-performance Java RPC frameworks are built on Netty, and Netty's design is inseparable from Java NIO.

1. Buffer inheritance system

As shown in the figure above, for all basic types in Java, there will be a specific Buffer type corresponding to it. Generally, we most often use ByteBuffer.

2. Buffer operation API use case

Here is a use case for IntBuffer:

import java.nio.IntBuffer;

/**
 * @author csp
 * @date 2021-11-26 3:51 PM
 */
public class IntBufferDemo {
    public static void main(String[] args) {
        // Allocate a new int buffer. The parameter is the buffer capacity.
        // The current position of the new buffer is 0 and its limit (limit position) is its capacity. It has an underlying implementation array with an array offset of 0.
        IntBuffer buffer = IntBuffer.allocate(8);

        for (int i = 0; i < buffer.capacity(); i++) {
            int j = 2 * (i + 1);
            // Writes the given integer to the current position of this buffer, incrementing the current position.
            buffer.put(j);
        }

        // Flip this buffer: set the limit to the current position, and then set the current position to 0.
        buffer.flip();

        // Check whether there are elements between the current position and the limit:
        while (buffer.hasRemaining()){
            // Reads the integer of the current position of this buffer, and then increments the current position.
            int j = buffer.get();
            System.out.print(j + " ");
        }
    }
}

Operation results:

2 4 6 8 10 12 14 16

This case shows that an IntBuffer is essentially used as an array container: you write data into the container with the put method and read data from it with the get method.

3. Basic principle of Buffer

A Buffer is essentially a special kind of array object. Unlike an ordinary array, it has a built-in mechanism for tracking and recording its own state changes: whenever we read data with get() or write data with put(), the state of the buffer changes.

This state tracking is implemented through three fields:

  • position: the index of the next element to be written or read. It is updated automatically by the relative get()/put() methods, and it is initialized to 0 when a Buffer object is created.
  • limit: the index of the first element that must not be read or written. After flip(), it marks how much data can be taken out (when writing from the buffer to a channel); before that, it marks how much space is available for data (when reading from a channel into the buffer).
  • capacity: the maximum amount of data the buffer can hold. In effect it is the size of the underlying array, or at least the portion of the underlying array that we are allowed to use.

The source code is as follows:

public abstract class Buffer {
    // Numerical relationship between the three fields: 0 <= position <= limit <= capacity
    private int position = 0;
    private int limit;
    private int capacity;
    ...
}

If we create a new ByteBuffer object with a capacity of 10, then at initialization position is set to 0, and limit and capacity are both set to 10. As the ByteBuffer is used afterwards, capacity never changes, while the other two values change with use.
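To make these three fields concrete, here is a minimal sketch that prints them for a freshly allocated buffer and again after a few writes. The class name BufferStateSketch and the helper state() are illustrative only and are not part of the original example.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

class BufferStateSketch {
    // Hypothetical helper: formats the three tracked fields of any Buffer.
    static String state(Buffer b) {
        return "position=" + b.position() + ", limit=" + b.limit() + ", capacity=" + b.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        System.out.println(state(buffer)); // position=0, limit=10, capacity=10

        // Three relative puts advance position by three; limit and capacity stay put.
        buffer.put((byte) 1).put((byte) 2).put((byte) 3);
        System.out.println(state(buffer)); // position=3, limit=10, capacity=10

        // remaining() is simply limit - position.
        System.out.println(buffer.remaining()); // 7
    }
}
```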

Let's take a look at an example from the book "Netty core principles and handwritten RPC framework practice":

Prepare a txt document and store it in the project directory. Enter the following contents in the document:

Java

We use a piece of code to verify the change process of position, limit and capacity. The code is as follows:

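import java.io.FileInputStream;
import java.io.IOException;
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/**
 * @author csp
 * @date 2021-11-26 4:09 PM
 */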
public class BufferDemo {

    public static void main(String[] args) throws IOException {
        FileInputStream fileInputStream = new FileInputStream("/Users/csp/IdeaProjects/netty-study/test.txt");

        // Obtain the channel used to operate on the file
        FileChannel channel = fileInputStream.getChannel();

        // Allocate a buffer with a capacity of 10 (essentially a byte array with a capacity of 10)
        ByteBuffer buffer = ByteBuffer.allocate(10);

        output("initialization", buffer);
        channel.read(buffer);// Read data from the pipeline into the buffer container
        output("call read()", buffer);

        // Before preparing for operation, lock the operation range:
        buffer.flip();
        output("call flip()", buffer);

        // Determine whether there is readable data
        while (buffer.remaining() > 0){
            byte b = buffer.get();
        }
        output("call get()", buffer);

        // It can be understood as unlocking
        buffer.clear();
        output("call clear()", buffer);

        // Finally, close the stream (this also closes its channel)
        fileInputStream.close();
    }

    /**
     * Print out the real-time status in the buffer
     *
     * @param step
     * @param buffer
     */
    public static void output(String step, Buffer buffer) {
        System.out.println(step + " : ");
        // Capacity (array size):
        System.out.print("capacity" + buffer.capacity() + " , ");
        // The location of the current operation data can also be called a cursor:
        System.out.print("position" + buffer.position() + " , ");
        // Lock value, flip, data operation range index can only be between position - limit:
        System.out.println("limit" + buffer.limit());
        System.out.println();
    }
}

The output results are as follows:

initialization : 
capacity10 , position0 , limit10

call read() : 
capacity10 , position4 , limit10

call flip() : 
capacity10 , position0 , limit4

call get() : 
capacity10 , position4 , limit4

call clear() : 
capacity10 , position0 , limit10

Let's make a graphical analysis of the execution results of the above code (around the three field values of position, limit and capacity):

// Allocate a buffer with a capacity of 10 (essentially a byte array with a capacity of 10)
ByteBuffer buffer = ByteBuffer.allocate(10);

// Read data from the pipeline into the buffer container 
channel.read(buffer);
output("call read()", buffer);

First, read some data from the channel to the buffer (note that reading data from the channel is equivalent to writing data to the buffer). If 4 bytes of data are read, the value of position is 4, that is, the index of the next byte to be written is 4, and the limit is still 10, as shown in the following figure.

// Before preparing for operation, lock the operation range:
buffer.flip();
output("call flip()", buffer);

Next, write the read data to the output channel, which is equivalent to reading data from the buffer. Before that, you must call the flip() method. This method will accomplish the following two things:

  • First, set limit to the current position value.
  • Second, set position to 0.

Since position is set to 0, it can be ensured that the first byte of the buffer is read in the next output, and limit is set to the current position, which can ensure that the read data is exactly the data previously written to the buffer, as shown in the following figure.
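The two steps can be reproduced by hand. The following is a minimal sketch, assuming a heap buffer filled with the four bytes of "Java"; the class FlipSketch and the helper manualFlip() are hypothetical names introduced for illustration. It compares flip() with the manual equivalent.

```java
import java.nio.ByteBuffer;

class FlipSketch {
    // Hypothetical helper: does by hand what flip() does to limit and position.
    static ByteBuffer manualFlip(ByteBuffer b) {
        b.limit(b.position()); // first, limit = current position
        b.position(0);         // second, position = 0
        return b;
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.allocate(10);
        ByteBuffer b = ByteBuffer.allocate(10);
        a.put(new byte[] {74, 97, 118, 97}); // the bytes of "Java"
        b.put(new byte[] {74, 97, 118, 97});

        a.flip();
        manualFlip(b);

        System.out.println(a.position() + " " + a.limit()); // 0 4
        System.out.println(b.position() + " " + b.limit()); // 0 4
    }
}
```

One difference worth noting: flip() also discards any mark that was set, which the manual version does not always do.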

// Determine whether there is readable data
while (buffer.remaining() > 0){
    byte b = buffer.get();
}
output("call get()", buffer);

Calling get() reads data from the buffer (in a real program, the data would then be written to the output channel). Each call increases position and leaves limit unchanged, and position can never exceed limit. Therefore, after the 4 bytes previously written to the buffer have been read, position and limit are both 4, as shown in the following figure.
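The rule that position cannot pass limit is enforced at run time. As a small sketch (the class and method names are illustrative), a relative get() issued once position has reached limit throws BufferUnderflowException:

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

class UnderflowSketch {
    // Returns true if a relative get() past the limit raises BufferUnderflowException.
    static boolean readPastLimitFails() {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        buffer.put((byte) 'J').put((byte) 'a');
        buffer.flip();    // position=0, limit=2
        buffer.get();
        buffer.get();     // position=2, which now equals limit
        try {
            buffer.get(); // nothing left between position and limit
            return false;
        } catch (BufferUnderflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastLimitFails()); // true
    }
}
```

This is why the loop above checks remaining() before every get().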

// It can be understood as unlocking
buffer.clear();
output("call clear()", buffer);

// Finally, close the stream (this also closes its channel)
fileInputStream.close();

After the data has been read from the buffer, limit still has the value set by the flip() call. Calling clear() resets the state to the initial values (position becomes 0 and limit becomes capacity) without erasing the data in the underlying array. Finally, the stream is closed, as shown in the following figure.
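It is easy to assume that clear() wipes the data, so here is a short sketch (illustrative names, heap buffer assumed) showing that the bytes are still in place afterwards and can be read with an absolute get:

```java
import java.nio.ByteBuffer;

class ClearSketch {
    // Fills a buffer, clears it, and returns the byte still stored at index 0.
    static byte firstByteAfterClear() {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        buffer.put((byte) 'J').put((byte) 'a').put((byte) 'v').put((byte) 'a');
        buffer.clear();       // position=0, limit=capacity; contents untouched
        return buffer.get(0); // absolute get, does not move position
    }

    public static void main(String[] args) {
        System.out.println((char) firstByteAfterClear()); // J
    }
}
```

The old data is only overwritten once new data is put into the buffer.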

The cases above highlight that a Buffer is a special array container. It differs from an ordinary array in having three built-in "pointer variables", position, limit and capacity, which track and record the state changes of the Buffer.

4. The allocate method initializes a buffer of a specified size

When creating a buffer object, the static method allocate() is called to specify the buffer capacity. In fact, calling allocate() is equivalent to creating an array of a specified size and wrapping it as a buffer object.

The source code of allocate() is as follows:

// Under ByteBuffer
public static ByteBuffer allocate(int capacity) {
    if (capacity < 0)
        throw new IllegalArgumentException();
    // Create a new heap-backed ByteBuffer whose capacity and limit are both set to: capacity
    return new HeapByteBuffer(capacity, capacity);
}

// Under HeapByteBuffer, the parent class is: ByteBuffer
HeapByteBuffer(int cap, int lim) {
    super(-1, 0, lim, cap, new byte[cap], 0);// Call the parameterized constructor of ByteBuffer
}

// Under ByteBuffer, the parent class is Buffer
ByteBuffer(int mark, int pos, int lim, int cap,
                 byte[] hb, int offset){
    super(mark, pos, lim, cap);// Call Buffer constructor
    this.hb = hb;// final byte[] hb;  Immutable byte array
    this.offset = offset;// Offset
}

// Buffer constructor
Buffer(int mark, int pos, int lim, int cap) {
    if (cap < 0)
        throw new IllegalArgumentException("Negative capacity: " + cap);
    this.capacity = cap;// Array capacity
    limit(lim);// limit of array
    position(pos);// Position of array
    if (mark >= 0) {
        if (mark > pos)
            throw new IllegalArgumentException("mark > position: ("
                                               + mark + " > " + pos + ")");
        this.mark = mark;
    }
}

Essentially equivalent to the following code:

// Initialize a byte array
byte[] bytes = new byte[10];
// Wrap the array in ByteBuffer
ByteBuffer buffer = ByteBuffer.wrap(bytes);
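The relationship between a heap buffer and its array can be checked directly. The sketch below (class and method names are illustrative) wraps an array, writes through the buffer, and confirms that the change is visible in the array and that array() returns the very same object:

```java
import java.nio.ByteBuffer;

class WrapSketch {
    // Returns true if a write through the buffer is visible in the wrapped array.
    static boolean sharesArray() {
        byte[] bytes = new byte[10];
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        buffer.put(0, (byte) 42);           // absolute write through the buffer
        return bytes[0] == 42               // the array sees the write
                && buffer.array() == bytes  // same array object, not a copy
                && buffer.arrayOffset() == 0;
    }

    public static void main(String[] args) {
        // A buffer from allocate() is also backed by an accessible array.
        System.out.println(ByteBuffer.allocate(10).hasArray()); // true
        System.out.println(sharesArray());                      // true
    }
}
```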

5. Partitioning a buffer with the slice method

In Java NIO, you can create a sub Buffer based on an existing Buffer object. That is, a slice is cut out of the existing Buffer to serve as a new Buffer, and the existing Buffer and the sub Buffer share data at the level of the underlying array.

The example code is as follows:

import java.nio.ByteBuffer;

/**
 * @author csp
 * @date 2021-11-28 6:20 PM
 */
public class BufferSlice {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(10);

        // put data into buffer: 0 ~ 9
        for (int i = 0; i < buffer.capacity(); i++) {
            buffer.put((byte) i);
        }

        // Create a sub buffer covering array subscripts 3 (inclusive) to 7 (exclusive), i.e. elements 3 ~ 6
        buffer.position(3);
        buffer.limit(7);
        ByteBuffer slice = buffer.slice();

        // Change the contents of the sub buffer
        for (int i = 0; i < slice.capacity(); i++) {
            byte b = slice.get(i);
            b *= 10;
            slice.put(i, b);
        }

        // Position and limit are restored to the original position:
        buffer.position(0);
        buffer.limit(buffer.capacity());

        // Output the contents of the buffer container:
        while (buffer.hasRemaining()) {
            System.out.println(buffer.get());
        }
    }
}

In this example, a buffer with a capacity of 10 is allocated and filled with the values 0 ~ 9. A sub buffer is then created on top of it and its contents are changed. The final output shows that only the part of the data "visible" to the sub buffer (subscripts 3 to 6) has changed, which also shows that the sub buffer and the original buffer share data. The output results are as follows:

0
1
2
30
40
50
60
7
8
9
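The sub buffer has its own position, limit and capacity, independent of the original. As a rough sketch (illustrative names, heap buffer assumed), the slice taken between position 3 and limit 7 starts at position 0 with a capacity of 4, and its array offset points at subscript 3 of the shared array:

```java
import java.nio.ByteBuffer;

class SliceStateSketch {
    // Builds the same slice as the example above: subscripts 3 (inclusive) to 7 (exclusive).
    static ByteBuffer sliceOf3To7() {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        buffer.position(3);
        buffer.limit(7);
        return buffer.slice();
    }

    public static void main(String[] args) {
        ByteBuffer slice = sliceOf3To7();
        System.out.println(slice.position());    // 0
        System.out.println(slice.limit());       // 4
        System.out.println(slice.capacity());    // 4
        System.out.println(slice.arrayOffset()); // 3
    }
}
```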

6. Read only buffer

Read only buffer, as its name implies, can only read data from the buffer, not write data to it.

Convert an existing buffer to a read-only buffer by calling the asReadOnlyBuffer() method. This method returns a buffer exactly the same as the original buffer and shares data with the original buffer, but it is read-only. If the content of the original buffer changes, the content of the read-only buffer also changes.

The example code is as follows:

import java.nio.ByteBuffer;

/**
 * @author csp
 * @date 2021-11-28 6:33 PM
 */
public class ReadOnlyBuffer {
    public static void main(String[] args) {
        // Initialize a buffer with a capacity of 10
        ByteBuffer buffer = ByteBuffer.allocate(10);

        // put data into buffer: 0 ~ 9
        for (int i = 0; i < buffer.capacity(); i++) {
            buffer.put((byte) i);
        }

        // Convert the buffer to a read-only buffer
        ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();

        // Since buffer and readOnlyBuffer essentially share a byte [] array object,
        // Therefore, when changing the contents of the buffer, the contents of the read-only buffer readOnlyBuffer will also change.
        for (int i = 0; i < buffer.capacity(); i++) {
            byte b = buffer.get(i);
            b *= 10;
            buffer.put(i, b);
        }

        // Position and limit are restored to the original position:
        readOnlyBuffer.position(0);
        readOnlyBuffer.limit(buffer.capacity());

        // Output the contents of the readOnlyBuffer container:
        while (readOnlyBuffer.hasRemaining()) {
            System.out.println(readOnlyBuffer.get());
        }
    }
}

The output results are as follows:

0
10
20
30
40
50
60
70
80
90

If you try to modify the contents of the read-only buffer, a ReadOnlyBufferException is thrown. A regular buffer can be converted to a read-only buffer, but a read-only buffer cannot be converted back to a writable one.
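A minimal sketch of that failure case follows (the class and method names are illustrative): any put on the read-only view is rejected with ReadOnlyBufferException.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

class ReadOnlyWriteSketch {
    // Returns true if writing to a read-only view raises ReadOnlyBufferException.
    static boolean writeIsRejected() {
        ByteBuffer readOnly = ByteBuffer.allocate(10).asReadOnlyBuffer();
        try {
            readOnly.put(0, (byte) 1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(ByteBuffer.allocate(10).asReadOnlyBuffer().isReadOnly()); // true
        System.out.println(writeIsRejected());                                       // true
    }
}
```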

7. Direct buffer

Reference article: Direct buffer and indirect buffer in the learning chapter of Java NIO

Regarding the definition of a direct buffer, the book "understanding Java virtual machine in depth" gives the following description:

  • A Java NIO byte buffer is either direct or indirect (non-direct). For a direct byte buffer, the Java virtual machine makes a best effort to perform native I/O operations directly on it. That is, before and after each call to a native I/O operation of the underlying operating system, the virtual machine tries to avoid copying data between the kernel buffer and the user process buffer, in either direction.
  • A direct buffer can be created by calling the allocateDirect(int capacity) method of the buffer class. Allocating and deallocating a buffer returned by this method costs more than for an indirect buffer. The contents of direct buffers reside outside the garbage-collected heap, so they place little demand on the JVM heap, although they still consume native memory. It is therefore suggested that direct buffers be used for large, long-lived buffers (that is, buffers whose data will be reused). In general, it is best to allocate direct buffers only when they bring a clearly measurable benefit to program performance.
  • A direct buffer can also be created by mapping a region of a file into memory with FileChannel's map() method, which returns a MappedByteBuffer. An implementation of the Java platform may also support creating direct byte buffers from native code through JNI. If an instance of one of these buffers refers to an inaccessible region of memory, an attempt to access that region will not change the contents of the buffer and will cause an unspecified exception to be thrown, either at the time of the access or later.
  • Whether a byte buffer is direct or indirect can be determined by calling its isDirect() method.
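The last point is easy to try out. A minimal sketch, with the class name and the kind() helper invented for illustration:

```java
import java.nio.ByteBuffer;

class IsDirectSketch {
    // Hypothetical helper: labels a buffer according to isDirect().
    static String kind(ByteBuffer b) {
        return b.isDirect() ? "direct" : "indirect";
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by a byte[] on the heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // allocated outside the heap

        System.out.println(kind(heap));   // indirect
        System.out.println(kind(direct)); // direct
    }
}
```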

Case code:

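import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/**
 * @author csp
 * @date 2021-11-28 7:07 PM
 */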
public class DirectBuffer {
    public static void main(String[] args) throws IOException {
        // Read the contents of test.txt file from disk
        FileInputStream fileInputStream = new FileInputStream("/Users/csp/IdeaProjects/netty-study/test.txt");
        // Operation pipeline for creating files
        FileChannel inputStreamChannel = fileInputStream.getChannel();

        // Write the read content to a new file
        FileOutputStream fileOutputStream = new FileOutputStream("/Users/csp/IdeaProjects/netty-study/test2.txt");
        FileChannel outputStreamChannel = fileOutputStream.getChannel();
        
        // Create direct buffer
        ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1024);
        
        while (true){
            byteBuffer.clear();

            int read = inputStreamChannel.read(byteBuffer);
            
            if (read == -1){
                break;
            }
            
            byteBuffer.flip();
            
            outputStreamChannel.write(byteBuffer);
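        }

        // Close both streams (this also closes their channels)
        fileOutputStream.close();
        fileInputStream.close();
    }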
}

To allocate a direct buffer, call allocateDirect() instead of allocate(). The resulting buffer is used in the same way as an ordinary buffer.
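As a variation on the case code, the sketch below copies a file through a direct buffer using try-with-resources, so the channels are closed even if an exception occurs. It is only an illustration under stated assumptions: the class DirectCopySketch, the methods copy() and demo(), and the temporary files stand in for the fixed paths of the original example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DirectCopySketch {
    // Copies src to dst through a direct buffer; returns the number of bytes written.
    static long copy(Path src, Path dst) throws IOException {
        long total = 0;
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
            while (in.read(buffer) != -1) {
                buffer.flip();                 // switch from filling to draining
                while (buffer.hasRemaining()) {
                    total += out.write(buffer); // write() may not drain everything at once
                }
                buffer.clear();                // make the buffer ready for the next read
            }
        }
        return total;
    }

    // Hypothetical demo: copies a temporary file containing "Java".
    static long demo() {
        try {
            Path src = Files.createTempFile("nio-src", ".txt");
            Path dst = Files.createTempFile("nio-dst", ".txt");
            try {
                Files.write(src, "Java".getBytes(StandardCharsets.US_ASCII));
                return copy(src, dst);
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 4
    }
}
```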

8. Memory mapping

Memory mapping is a way of reading and writing file data that can be much faster than conventional stream-based or channel-based I/O. Memory-mapped file I/O works by making the data in a file appear as the contents of an in-memory array. At first this sounds like reading the whole file into memory, but it is not: in general, only the parts of the file that are actually read or written are loaded into memory. Let's look at the following example code:

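import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/**
 * @author csp
 * @date 2021-11-28 7:16 PM
 */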
public class MapperBuffer {
    static private final int start = 0;
    static private final int size = 10;

    public static void main(String[] args) throws IOException {
        RandomAccessFile randomAccessFile = new RandomAccessFile("/Users/csp/IdeaProjects/netty-study/test.txt", "rw");

        FileChannel channel = randomAccessFile.getChannel();

        // A mapping association is made between the buffer and the file system. As long as the contents in the buffer are operated, the file contents will change accordingly
        MappedByteBuffer mappedByteBuffer = channel.map(FileChannel.MapMode.READ_WRITE, start, size);

        mappedByteBuffer.put(4, (byte) 97);// a
        mappedByteBuffer.put(5, (byte) 122);// z

        randomAccessFile.close();
    }
}

The original test.txt file contains:

Java

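After executing the above code, the content of test.txt file is updated to the following. Because the mapped region (10 bytes) is longer than the original 4-byte file, the file is typically extended to 10 bytes, so further bytes that a text editor may not display can follow the visible text: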

Javaaz
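To try this without touching a fixed path, the following sketch repeats the experiment on a temporary file and reads the result back. The class name, the demo() method and the use of Files.createTempFile are assumptions made for illustration; force() is called to ask the operating system to write the mapped changes to the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedWriteSketch {
    // Writes "Java" to a temp file, patches bytes 4 and 5 through a mapping,
    // and returns the first six characters of the resulting file.
    static String demo() {
        try {
            Path file = Files.createTempFile("nio-map", ".txt");
            file.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
            Files.write(file, "Java".getBytes(StandardCharsets.US_ASCII));

            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, 10);
                mapped.put(4, (byte) 'a');
                mapped.put(5, (byte) 'z');
                mapped.force(); // flush the mapped changes to the file
            }

            byte[] all = Files.readAllBytes(file);
            return new String(all, 0, Math.min(6, all.length), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // Javaaz
    }
}
```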

Reference book: Netty core principles and handwritten RPC framework practice.

Recommended information: Java NIO buffer

Posted by NTFS on Sun, 28 Nov 2021 06:14:41 -0800