Building .NET Communication Programs with High-Performance Pipelines

Keywords: C# socket ascii less

.NET Standard supports a new set of APIs: System.Span<T>, System.Memory<T>, and System.IO.Pipelines. These new APIs greatly improve the performance of .NET programs, and many of the underlying .NET APIs will be rewritten on top of them in the future.

Pipelines aims to solve many of the difficulties of writing Socket communication programs in .NET. I believe readers have been annoyed by these difficulties too: with the stream model the problems can be solved, but doing so is a real chore.

System.IO.Pipelines manages data as simple memory segments, which greatly simplifies writing such programs. For a detailed introduction to Pipelines, see here. Kestrel, the server used in ASP.NET Core, already uses this API. (In fact, it appears that the Kestrel team created it.)

Perhaps because few scenarios use Socket directly (although it is still used a lot in the Internet of Things), there does not seem to be much material about Pipelines. The official example is based on an ASCII protocol with a fixed terminator. Here I take a custom binary protocol of the kind commonly used by Internet of Things devices as an example to explain the usual program structure based on Pipelines.

System.IO.Pipelines

Unlike Stream-based methods, Pipelines provides a pipe for storing data. The data in the pipe is held somewhat like a linked list and can be sliced by SequencePosition to obtain a ReadOnlySequence<T> object. The reader performs its own processing on that sequence and, once finished, tells the pipe how much data has been processed. The whole process requires no memory copying, so performance improves and a lot of trouble is avoided. On the server side, the process can be understood simply as two loops:

Receiving loop: receive data -> place it in the pipe -> tell the pipe how much data was put in
Processing loop: find one complete packet in the pipe -> hand it to the handler -> tell the pipe how much data was processed
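As a minimal sketch of these two loops, the following program pushes a few bytes through a Pipe once in each direction. The class and method names (PipeRoundTrip, RoundTripAsync) are made up for illustration, and both sides run in sequence on one thread only to keep the example short; a real program would run them as two concurrent loops.

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

public static class PipeRoundTrip
{
    // Hypothetical helper: writes the payload into a pipe, reads it back,
    // and returns the number of bytes the reader saw.
    public static async Task<long> RoundTripAsync(byte[] payload)
    {
        var pipe = new Pipe();

        // Receiving side: get memory from the pipe, fill it,
        // then tell the pipe how much was written.
        Memory<byte> memory = pipe.Writer.GetMemory(payload.Length);
        payload.CopyTo(memory);
        pipe.Writer.Advance(payload.Length);
        await pipe.Writer.FlushAsync();
        pipe.Writer.Complete();

        // Processing side: read what is available,
        // then tell the pipe how much was consumed.
        ReadResult result = await pipe.Reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        long seen = buffer.Length;
        pipe.Reader.AdvanceTo(buffer.End);
        pipe.Reader.Complete();

        return seen;
    }

    public static async Task Main()
    {
        long seen = await RoundTripAsync(new byte[] { 1, 2, 3, 4, 5 });
        Console.WriteLine(seen); // prints 5
    }
}
```

The payload is copied only because this demo starts from an array; when receiving from a socket, the data is written straight into the memory returned by GetMemory.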

Protocol

The device uses a binary protocol. Each data packet starts with the four bytes 0x75, 0xbd, 0x7e, 0x97, followed by a 2-byte packet length (fixed at 2400 bytes here; the same approach also works for variable lengths), followed by the data area. Once the connection is established, the device actively sends data to the PC.
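The layout can be sketched in code as follows. The article does not state the byte order of the length field or whether it counts the whole packet, so this sketch assumes a big-endian value equal to the total packet size; both are assumptions, and the names PacketLayout and TryReadHeader are hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public static class PacketLayout
{
    public static readonly byte[] Head = { 0x75, 0xbd, 0x7e, 0x97 };
    public const int HeadSize = 4;
    public const int LengthFieldSize = 2;

    // Returns true if the span starts with the head and contains the length field.
    public static bool TryReadHeader(ReadOnlySpan<byte> packet, out ushort declaredLength)
    {
        declaredLength = 0;
        if (packet.Length < HeadSize + LengthFieldSize)
        {
            return false;
        }
        if (!packet.Slice(0, HeadSize).SequenceEqual(Head))
        {
            return false;
        }
        // ASSUMPTION: big-endian byte order; check the device documentation.
        declaredLength = BinaryPrimitives.ReadUInt16BigEndian(packet.Slice(HeadSize, LengthFieldSize));
        return true;
    }

    public static void Main()
    {
        // Build a fake 2400-byte packet: head, length field, zeroed data area.
        var packet = new byte[2400];
        Head.CopyTo(packet, 0);
        BinaryPrimitives.WriteUInt16BigEndian(packet.AsSpan(HeadSize, LengthFieldSize), 2400);

        bool ok = TryReadHeader(packet, out ushort length);
        Console.WriteLine($"{ok} {length}"); // prints True 2400
    }
}
```

For a variable-length protocol, the declared length read here would replace the fixed 2400 when deciding whether a complete packet has arrived.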

Key code

Although Pipelines targets the .NET Core platform, it can also be installed from NuGet on .NET Framework 4.6.1 and above. Simply run

Install-Package System.IO.Pipelines

That is all that is needed. The Socket connection code is not shown here; only the key parts are listed.

The first step in the code is to declare the pipe.

private async void InitPipe(Socket socket)
{
    Pipe pipe = new Pipe();
    Task writing = FillPipeAsync(socket, pipe.Writer);
    Task reading = ReadPipeAsync(socket, pipe.Reader);

    await Task.WhenAll(reading, writing);
}

A Pipe has a reader and a writer. The reader reads data from the pipe and is used mainly in the data processing loop. The writer writes data into the pipe and is used mainly in the data receiving loop.

//Write loop
private async Task FillPipeAsync(Socket socket, PipeWriter writer)
{
    //The data rate is relatively high, so request a buffer of 1 MB
    const int minimumBufferSize = 1024 * 1024;

    while (running)
    {
        try
        {
            //Get a block of memory of at least the specified size from the writer
            Memory<byte> memory = writer.GetMemory(minimumBufferSize);

            //Convert the memory into an ArraySegment that the socket can use
            if (!MemoryMarshal.TryGetArray((ReadOnlyMemory<byte>)memory, out ArraySegment<byte> arraySegment))
            {
                throw new InvalidOperationException("Buffer backed by array was expected");
            }
            //Receive data
            int bytesRead = await SocketTaskExtensions.ReceiveAsync(socket, arraySegment, SocketFlags.None);
            if (bytesRead == 0)
            {
                break;
            }

            //The received data is already in the pipe's memory; tell the pipe how many bytes were written.
            writer.Advance(bytesRead);
        }
        catch
        {
            break;
        }

        // Make the data available to the reader, so that its pending ReadAsync() call can continue
        FlushResult result = await writer.FlushAsync();

        if (result.IsCompleted)
        {
            break;
        }
    }

    // Tell the pipe that no more data will be written
    writer.Complete();
}

//Read loop
private async Task ReadPipeAsync(Socket socket, PipeReader reader)
{
    while (running)
    {
        //Wait for the writer to write data
        ReadResult result = await reader.ReadAsync();
        //Get the memory area
        ReadOnlySequence<byte> buffer = result.Buffer;
        SequencePosition? position = null;

        do
        {
            //Find the location of the first byte of the head
            position = buffer.PositionOf((byte)0x75);
            if (position != null)
            {
                //The head is four consecutive bytes, so all four must be compared. If fewer than four bytes have arrived so far, leave the loop and wait for more data; slicing four bytes here would otherwise throw.
                if (buffer.Slice(position.Value).Length < 4)
                {
                    break;
                }
                //ToArray is used here for convenience even though it copies memory, which is not ideal.
                //Scenarios with higher performance requirements can compare the slice directly, so no memory copy is required.
                var headtoCheck = buffer.Slice(position.Value, 4).ToArray();
                //SequenceEqual on arrays requires using System.Linq
                if (headtoCheck.SequenceEqual(new byte[] { 0x75, 0xbd, 0x7e, 0x97 }))
                {
                    //The start of a packet has been found (at position.Value). The whole packet must now be cut out starting from there, so first check whether enough data has arrived.
                    if (buffer.Slice(position.Value).Length >= 2400)
                    {
                        //Enough data has arrived: slice out the packet as a ReadOnlySequence and work on it.
                        var mes = buffer.Slice(position.Value, 2400);
                        //Process the data here. See the official documentation for working with ReadOnlySequence; it uses Span, which performs better. For simplicity I call ToArray() here, which again copies memory but lets the handler work directly on a byte array.
                        await ProcessMessage(mes.ToArray());
                        //This packet is done; compute the position just past the whole packet, counted from its head.
                        var next = buffer.GetPosition(2400, position.Value);
                        //Drop the processed data by re-slicing buffer to the remainder
                        buffer = buffer.Slice(next);
                    }
                    else
                    {
                        //Not enough data: the packet is incomplete. Leave the loop and wait for the next batch of data to arrive and be appended.
                        break;
                    }
                }
                else
                {
                    //The first byte is 0x75 but the following bytes do not match, possibly because of a transmission problem. Discard this 0x75 and search for the next 0x75 starting from the byte after it.
                    var next = buffer.GetPosition(1, position.Value);
                    buffer = buffer.Slice(next);
                }
            }
        }
        while (position != null);

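        //Tell the pipe that everything before buffer.Start has been consumed and that everything up to buffer.End has been examined. The unprocessed remainder (an incomplete packet, or data in which no head was found) stays in the pipe until more data arrives.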
        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted)
        {
            break;
        }
    }

    reader.Complete();
}

The code above basically solves the following problems:

  • Incomplete packets (a missing head or tail), which otherwise force you either to discard a lot of data or to take on the complexity of maintaining your own queue
  • Synchronization between receiving data and processing it
  • Multiple packets arriving in a single receive
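The packet-scanning part of the read loop can also be written as a pure function over a ReadOnlySequence<byte>, which makes the incomplete-packet and multiple-packet cases easy to try out. This is a sketch rather than the article's code: FrameScanner, StartsWithHead and ExtractPackets are hypothetical names, the head is compared without ToArray, the packet size is a parameter (8 bytes in the demo instead of 2400), and, unlike the listing above, bytes that cannot belong to any packet are discarded.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;

public static class FrameScanner
{
    private static readonly byte[] Head = { 0x75, 0xbd, 0x7e, 0x97 };

    // Compares the first four bytes with the head without allocating an array.
    public static bool StartsWithHead(ReadOnlySequence<byte> sequence)
    {
        if (sequence.Length < Head.Length) return false;
        ReadOnlySequence<byte> head = sequence.Slice(0, Head.Length);
        if (head.IsSingleSegment) return head.First.Span.SequenceEqual(Head);
        // The four bytes straddle segments: copy them to the stack, not the heap.
        Span<byte> scratch = stackalloc byte[4];
        head.CopyTo(scratch);
        return scratch.SequenceEqual(Head);
    }

    // Returns every complete packet and trims buffer to the unprocessed remainder.
    public static List<byte[]> ExtractPackets(ref ReadOnlySequence<byte> buffer, int packetSize)
    {
        var packets = new List<byte[]>();
        while (true)
        {
            SequencePosition? position = buffer.PositionOf((byte)0x75);
            if (position == null)
            {
                // No candidate head at all: nothing here can start a packet.
                buffer = buffer.Slice(buffer.End);
                break;
            }
            // Drop the bytes in front of the candidate head.
            buffer = buffer.Slice(position.Value);
            if (buffer.Length < Head.Length) break;   // partial head, wait for more data
            if (!StartsWithHead(buffer))
            {
                buffer = buffer.Slice(1);             // false alarm, keep searching
                continue;
            }
            if (buffer.Length < packetSize) break;    // incomplete packet, wait for more data
            // The handler still receives a copy; only the head check is copy-free.
            packets.Add(buffer.Slice(0, packetSize).ToArray());
            buffer = buffer.Slice(packetSize);
        }
        return packets;
    }

    public static void Main()
    {
        // One noise byte, two complete 8-byte packets, then half a head.
        var data = new List<byte> { 0x00 };
        data.AddRange(Head); data.AddRange(new byte[] { 1, 2, 3, 4 });
        data.AddRange(Head); data.AddRange(new byte[] { 5, 6, 7, 8 });
        data.AddRange(new byte[] { 0x75, 0xbd });

        var buffer = new ReadOnlySequence<byte>(data.ToArray());
        List<byte[]> packets = ExtractPackets(ref buffer, 8);
        Console.WriteLine($"{packets.Count} {buffer.Length}"); // prints 2 2
    }
}
```

In a read loop, the trimmed buffer would then be passed to reader.AdvanceTo(buffer.Start, buffer.End), so the two leftover bytes stay in the pipe until the rest of the packet arrives.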

Epilogue

This article only explains the processing pattern based on Pipelines. The many ToArray calls can be optimized with Span-based operations (I will fill in this gap when time permits). In addition, if await ProcessMessage(mes.ToArray()); is replaced with Task.Run(() => ProcessMessage(mes)), mysterious problems appear in practice. Most likely the pipe runs ahead and has already released the memory before the system schedules the Task. If you want to optimize this part, take extra care.
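The explanation above is the author's guess, but it is consistent with the documented rule that a buffer obtained from ReadAsync must not be used after AdvanceTo has been called. One cautious way to hand work to Task.Run is to copy the bytes first, as in the sketch below. The names DeferredProcessing and ReadOnceAndProcessAsync are hypothetical, and ProcessMessage here is only a stand-in that returns the message length.

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

public static class DeferredProcessing
{
    // Hypothetical stand-in for the article's ProcessMessage.
    public static Task<int> ProcessMessage(byte[] message)
    {
        return Task.FromResult(message.Length);
    }

    public static async Task<int> ReadOnceAndProcessAsync(PipeReader reader)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;

        // Copy while the pipe still owns the memory. The background task
        // captures only the copy, never the ReadOnlySequence itself.
        byte[] copy = buffer.ToArray();
        Task<int> work = Task.Run(() => ProcessMessage(copy));

        // After AdvanceTo the pipe may reuse the underlying memory;
        // the copy is unaffected by that.
        reader.AdvanceTo(buffer.End);

        return await work;
    }

    public static async Task Main()
    {
        var pipe = new Pipe();
        await pipe.Writer.WriteAsync(new byte[] { 1, 2, 3 });
        pipe.Writer.Complete();

        Console.WriteLine(await ReadOnceAndProcessAsync(pipe.Reader)); // prints 3
        pipe.Reader.Complete();
    }
}
```

This keeps the copy that the Span-based optimization tries to remove, so it trades some throughput for safety; avoiding both the copy and the lifetime problem needs a design in which AdvanceTo is delayed until processing has finished.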

Posted by habib009pk on Fri, 10 May 2019 10:47:58 -0700