Redis serialization protocol analysis

Introduction to RESP

The Redis client and the Redis server communicate using a protocol called RESP, short for Redis Serialization Protocol. Although RESP was designed for Redis, it can also be used in other client-server software projects. RESP was designed with the following goals in mind:

Easy to implement.
Fast parsing.
High readability.

RESP can serialize different data types, such as integers, strings, arrays, and a special Error type. A Redis command to be executed is encapsulated as a request resembling an array of strings and sent to the Redis server by the Redis client. The Redis server then chooses a reply data type that depends on the specific command.

RESP is binary safe, and bulk data transferred from one process to another needs no escaping or other processing, because RESP transmits bulk data with a length prefix: the length of each data block is declared in a prefix ahead of the block, similar to length-based encoding and decoding in Netty. This is analyzed in detail later.

Note: the protocol outlined here is used only for client-server communication. Redis Cluster uses a different binary protocol to exchange messages between nodes (that is, RESP is not used for communication between nodes in Redis Cluster).

Network layer

The Redis client connects to the Redis server by creating a TCP connection to port 6379 (the default port). Although RESP itself is not technically tied to TCP, in the context of Redis it is only used over TCP connections (or similar stream-oriented connections, such as Unix sockets).

Request response model

The Redis server receives commands composed of different parameters. After receiving and processing the commands, it will send the reply back to the Redis client. This is the simplest model, but there are two exceptions:

Redis supports pipelining. When a pipeline is used, the Redis client can send multiple commands at once and then wait for the replies, which the Redis server returns in a batch.
When the Redis client subscribes to a Pub/Sub channel, the protocol changes its semantics and becomes a push protocol: the client no longer needs to send commands, because the Redis server automatically pushes new messages to the clients subscribed to that channel as soon as they arrive.

Apart from these two exceptions, the Redis protocol is a simple request-response protocol.

Data types supported by RESP

RESP was introduced in Redis 1.2. In Redis 2.0, RESP officially became the standard way of communicating with the Redis server. That is, if you need to write a Redis client, you must implement this protocol in the client. RESP is essentially a serialization protocol that supports the following data types: single line strings, error messages, integer numbers, fixed length strings and RESP arrays. Redis uses RESP as a request-response protocol as follows:

The Redis client encapsulates the command as an array type of RESP (array elements are fixed length string types, which is important to note) and sends it to the Redis server.
The Redis server selects one of the corresponding RESP data types to reply according to the command implementation.

In RESP, the data type depends on the first byte of the datagram:

The first byte of a single line string is +.
The first byte of the error message is -.
The first byte of an integer number is :.
The first byte of a fixed length string is $.
The first byte of the RESP array is *.
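
The mapping above can be sketched as a small lookup on the first byte. This is only an illustration: the class name RespTypeSketch and the method typeOf are hypothetical and not part of any Redis client library.

```java
final class RespTypeSketch {

    /** Maps the first byte of a RESP datagram to the name of its data type. */
    static String typeOf(byte firstByte) {
        switch (firstByte) {
            case '+':
                return "Simple String";
            case '-':
                return "Error";
            case ':':
                return "Integer";
            case '$':
                return "Bulk String";
            case '*':
                return "Array";
            default:
                // Any other byte is not one of the five type markers listed above
                throw new IllegalArgumentException("unknown RESP type byte: " + firstByte);
        }
    }
}
```

For example, RespTypeSketch.typeOf((byte) '$') yields "Bulk String".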

In addition, RESP can represent Null values using a special variant of the fixed length string or of the array, as described later. In RESP, every part of the protocol is terminated with \r\n (CRLF).

A summary of the five data types in the current RESP is as follows:

Data type	Translated name	Basic features	Example
Simple String	Single line string	The first byte is +, the last two bytes are \r\n, and the bytes in between are the string content	+OK\r\n
Error	Error message	The first byte is -, the last two bytes are \r\n, and the bytes in between are the text of the error message	-ERR\r\n
Integer	Integer number	The first byte is :, the last two bytes are \r\n, and the bytes in between are the number written as text	:100\r\n
Bulk String	Fixed length string	The first byte is $, followed by the byte length of the content and \r\n, then the content itself, and finally \r\n	$4\r\ndoge\r\n
Array	RESP array	The first byte is *, followed by the number of elements and \r\n, then the encoding of each element in order. Each element can be any data type	*2\r\n:100\r\n$4\r\ndoge\r\n

The following sections provide a more detailed analysis of each data type.

RESP Simple String - Simple String

Simple strings are encoded as follows:

(1) The first byte is +.
(2) Followed by a string that cannot contain CR or LF characters.
(3) Terminate with CRLF.

Simple strings transmit non-binary-safe strings with minimal overhead. For example, many Redis commands reply with just OK on success; encoded as a simple string, that reply is the following 5-byte datagram:

+OK\r\n

If you need to send a binary-safe string, use a fixed length string instead.

When the Redis server responds with a simple string, the Redis client library should return to the caller a string consisting of the characters after the leading + up to the end of the string (that is, the content of part (2) above), excluding the final CRLF bytes.
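
As a minimal sketch of the three encoding rules and of what the client hands back to the caller, the following uses only the JDK. The class SimpleStringSketch and its methods are hypothetical names chosen for this illustration, and decode assumes it receives exactly one complete frame.

```java
import java.nio.charset.StandardCharsets;

final class SimpleStringSketch {

    /** Encodes text as '+' + text + CRLF; CR and LF are rejected, as rule (2) requires. */
    static byte[] encode(String text) {
        if (text.indexOf('\r') >= 0 || text.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("simple strings cannot contain CR or LF");
        }
        return ("+" + text + "\r\n").getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the content between the leading '+' and the trailing CRLF. */
    static String decode(byte[] frame) {
        int n = frame.length;
        if (n < 3 || frame[0] != '+' || frame[n - 2] != '\r' || frame[n - 1] != '\n') {
            throw new IllegalArgumentException("not a complete simple string frame");
        }
        // Skip 1 marker byte at the front and 2 CRLF bytes at the end
        return new String(frame, 1, n - 3, StandardCharsets.UTF_8);
    }
}
```

Encoding "OK" this way produces the 5 bytes shown above, and decoding them gives back "OK" without the marker or the CRLF.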

RESP Error message - Error

The error message type is a RESP-specific data type. It is basically the same as the simple string type, except that the first byte is -. The real difference lies in how the client treats it: when the Redis server replies with an error message, the client should treat the reply as an exception, and the string content of the error message is the error information returned by the Redis server. The error message is encoded as follows:

(1) The first byte is -.
(2) Followed by a string that cannot contain CR or LF characters.
(3) Terminate with CRLF.

A simple example is as follows:

-Error message\r\n

The Redis server replies with an error message only when something actually goes wrong, such as an operation attempted against the wrong data type or a command that does not exist. When the Redis client receives an error message, it should raise an exception (generally it throws one directly, and the exception can be classified according to the content of the error message). Here are some examples of error message responses:

-ERR unknown command 'foobar'
-WRONGTYPE Operation against a key holding the wrong kind of value

The content from the first word after - up to the first space or newline represents the kind of error returned. This is only a convention used by Redis, not part of the RESP error message format.

For example, ERR is a general error and WRONGTYPE is a more specific error, indicating that the client tried to perform an operation against the wrong data type. This convention is called the error prefix. It lets the client understand the kind of error returned by the server without relying on the exact message text, which may change over time.

The client implementation can return different kinds of exceptions for different error types, or provide a general method to catch errors by directly providing the name of the error type as a string to the caller.

However, classifying error messages should not be regarded as a vital feature, because it is rarely useful, and a limited client implementation may simply return a generic error value, such as false, instead of the error message.
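
The error prefix convention described above can be sketched as follows. The exception class RedisReplyException and the helper names are hypothetical, invented for this illustration; real client libraries define their own exception hierarchies. The input is assumed to be the error text after the leading - marker.

```java
final class ErrorPrefixSketch {

    /** Hypothetical exception type that keeps the error prefix alongside the full message. */
    static final class RedisReplyException extends RuntimeException {
        final String prefix;

        RedisReplyException(String prefix, String message) {
            super(message);
            this.prefix = prefix;
        }
    }

    /** Returns the first word of the error text: everything up to the first space, CR or LF. */
    static String prefixOf(String errorText) {
        int end = 0;
        while (end < errorText.length()
                && errorText.charAt(end) != ' '
                && errorText.charAt(end) != '\r'
                && errorText.charAt(end) != '\n') {
            end++;
        }
        return errorText.substring(0, end);
    }

    /** Wraps the error text in an exception that callers can classify by prefix. */
    static RedisReplyException toException(String errorText) {
        return new RedisReplyException(prefixOf(errorText), errorText);
    }
}
```

A caller could then branch on the prefix field (for example "WRONGTYPE") instead of matching the full message text.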

RESP Integer number - Integer

Integer numbers are encoded as follows:

(1) The first byte is :.
(2) Followed by the number written as a sequence of decimal characters (preceded by - for negative values), with no CR or LF characters.
(3) Terminate with CRLF.

For example:

:0\r\n
:1000\r\n

Many Redis commands return integer numbers, such as INCR, LLEN and LASTSAVE. The returned integer has no special meaning of its own: for INCR it is the value after the increment, while for LASTSAVE it is a UNIX timestamp. However, the Redis server guarantees that the returned integer is within the signed 64-bit integer range. In some cases, the returned integer stands for true or false. For example, EXISTS and SISMEMBER return 1 for true and 0 for false. In other cases, the returned integer indicates whether the command actually had an effect. For example, SADD, SREM and SETNX return 1 if the command took effect and 0 if it did not (the operation was effectively not performed). The following commands return integer numbers: SETNX, DEL, EXISTS, INCR, INCRBY, DECR, DECRBY, DBSIZE, LASTSAVE, RENAMENX, MOVE, LLEN, SADD, SREM, SISMEMBER, and SCARD.
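
To make the integer format and its boolean interpretation concrete, here is a small sketch that relies on Long.parseLong, which already rejects values outside the signed 64-bit range. The class IntegerReplySketch and its methods are hypothetical names, and parse assumes a complete frame given as text.

```java
final class IntegerReplySketch {

    /** Parses a complete integer frame such as ":1000\r\n" into a long. */
    static long parse(String frame) {
        if (!frame.startsWith(":") || !frame.endsWith("\r\n")) {
            throw new IllegalArgumentException("not a complete integer frame");
        }
        // Drop the ':' marker and the trailing CRLF, then parse the decimal text
        return Long.parseLong(frame.substring(1, frame.length() - 2));
    }

    /** Interprets a reply from commands such as EXISTS or SISMEMBER: 1 is true, 0 is false. */
    static boolean asBoolean(long reply) {
        return reply == 1L;
    }
}
```

For instance, a client could map the reply of SISMEMBER to a boolean with asBoolean(parse(frame)).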

RESP fixed length string - Bulk String

A fixed length string represents a single binary-safe string with a maximum length of 512MB (Bulk itself carries the meaning of large volume). A fixed length string is encoded as follows:

(1) The first byte is $.
(2) Next is the byte length of the string content, written as a decimal number (called the prefixed length). The prefix length is terminated by CRLF.
(3) Then comes the string content itself, which is exactly that many bytes long and, being binary safe, may contain any bytes, including CR and LF.
(4) Terminate with CRLF.

For example, doge uses fixed length string encoding as follows:

First byte	Prefix Length	CRLF	String content	CRLF		Fixed length string
$	4	\r\n	doge	\r\n	===>	$4\r\ndoge\r\n

foobar uses fixed length string encoding as follows:

First byte	Prefix Length	CRLF	String content	CRLF		Fixed length string
$	6	\r\n	foobar	\r\n	===>	$6\r\nfoobar\r\n

An Empty String (corresponding to "" in Java) is encoded as a fixed length string as follows:

First byte	Prefix Length	CRLF	CRLF		Fixed length string
$	0	\r\n	\r\n	===>	$0\r\n\r\n

A fixed length string can also use a special format to represent a Null value, indicating that the value does not exist. In this special format, the prefix length is -1 and there is no data, so the Null value is encoded as a fixed length string as follows:

First byte	Prefix Length	CRLF		Fixed length string
$	-1	\r\n	===>	$-1\r\n

When the Redis server returns a Null value encoded as a fixed length string, the client should not return an empty string, but the null object of the corresponding programming language: for example, nil in Ruby, NULL in C, null in Java, and so on.
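
The cases above (regular content, the empty string and the Null value) can be sketched with one small encoder. The class BulkStringSketch is a hypothetical name for this illustration; it takes raw bytes so that the length prefix is always the byte length, and a null argument is assumed to stand for the Null value.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class BulkStringSketch {

    /** Encodes raw bytes as a fixed length string; null is encoded as the Null value. */
    static byte[] encode(byte[] content) {
        if (content == null) {
            return "$-1\r\n".getBytes(StandardCharsets.US_ASCII);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // '$' + byte length + CRLF
        byte[] header = ("$" + content.length + "\r\n").getBytes(StandardCharsets.US_ASCII);
        out.write(header, 0, header.length);
        // The content is copied verbatim, so it may itself contain CR or LF bytes
        out.write(content, 0, content.length);
        out.write('\r');
        out.write('\n');
        return out.toByteArray();
    }
}
```

Because the reader relies on the length prefix rather than on scanning for CRLF, a payload such as "a\r\nb" survives the round trip unchanged.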

RESP Array

The Redis client uses the RESP array to send commands to the Redis server. Similarly, some Redis commands need to return a collection of elements to the client and use the RESP array type to do so, such as the LRANGE command, which returns a list of elements. A RESP array is not quite the same as an array in most programming languages. Its encoding format is as follows:

(1) The first byte is *.
(2) Next is the number of elements in the RESP array, written as a decimal number and converted into a byte sequence (for example, 10 becomes the two adjacent bytes 1 and 0). The element count is terminated by CRLF.
(3) The content of each element of the RESP array. Each element can be any RESP data type.

The encoding of an empty RESP array is as follows:

*0\r\n

A RESP array containing two fixed length string elements with contents of foo and bar is encoded as follows:

*2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n

The general format is: *<count>CRLF serves as the prefix of the RESP array, and the elements, each encoded according to its own data type, are simply concatenated after it one by one. For example, the encoding of a RESP array containing three integer type elements is as follows:

*3\r\n:1\r\n:2\r\n:3\r\n

The elements of a RESP array do not have to be of the same data type; an array can contain elements of mixed types. For example, the following is the encoding of a RESP array containing 4 integer type elements and 1 fixed length string type element, 5 elements in total (split across multiple lines with comments for readability; the actual encoding contains neither):

# Number of elements
*5\r\n
# Element of the first integer type
:1\r\n
# Element of the second integer type
:2\r\n
# Element of the 3rd integer type
:3\r\n
# Element of the 4th integer type
:4\r\n
# Fixed length string type element
$6\r\n
foobar\r\n

The first line of the Redis server response, *5\r\n, declares that five replies follow; each of them then becomes one item of a Multi Bulk Reply (that is, every reply is one item of the whole reply message). This can be compared to an untyped ArrayList in Java, roughly like the following pseudo code:

List<Object> encode = new ArrayList<>();
// Array marker and number of elements
encode.add('*');
encode.add(5);
encode.add(CRLF);
// Add element of the first integer type - 1
encode.add(':');
encode.add(1);
encode.add(CRLF);
// Add element of 2nd integer type - 2
encode.add(':');
encode.add(2);
encode.add(CRLF);
// Add element of 3rd integer type - 3
encode.add(':');
encode.add(3);
encode.add(CRLF);
// Add element of 4th integer type - 4
encode.add(':');
encode.add(4);
encode.add(CRLF);
// Add elements of fixed length string type
encode.add('$');
// Prefix Length 
encode.add(6);
// String content
encode.add("foobar");
encode.add(CRLF);

The concept of a Null value also exists for RESP arrays, called RESP Null Array below. For historical reasons, RESP arrays use their own special encoding for the Null value, distinct from the Null encoding of fixed length strings. For example, when the BLPOP command times out, it returns a RESP Null Array. The encoding of RESP Null Array is as follows:

*-1\r\n

When the Redis server replies with a RESP Null Array, the client should return a Null object rather than an empty array or empty list. This matters because it is how the caller distinguishes an empty array (the command executed correctly and returned a normal, empty result) from other outcomes (such as the BLPOP command timing out).

The elements of a RESP array can themselves be RESP arrays. The following is a RESP array containing two RESP array elements (split across multiple lines for readability; the actual encoding contains no such line breaks):

# Number of elements
*2\r\n
# 1st RESP array element
*3\r\n
:1\r\n
:2\r\n
:3\r\n
# 2nd RESP array element
*2\r\n
+Foo\r\n
-Bar\r\n

The above RESP array contains 2 RESP array type elements, the first RESP array element contains 3 integer type elements, and the second RESP array element contains 1 simple string type element and 1 error message type element.
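
Since arrays are just a count followed by the concatenated encodings of their elements, a recursive encoder is a natural fit. The following is a sketch under stated assumptions: the class RespArrayEncodeSketch is a hypothetical name, Java values are mapped to RESP types by an arbitrary convention chosen for this example (Long or Integer to Integer, String to fixed length string, List to array, null to the Null fixed length string), and simple strings and errors are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

final class RespArrayEncodeSketch {

    /** Encodes a Java value as RESP text; nested lists become nested RESP arrays. */
    static String encode(Object value) {
        StringBuilder sb = new StringBuilder();
        append(sb, value);
        return sb.toString();
    }

    private static void append(StringBuilder sb, Object value) {
        if (value == null) {
            sb.append("$-1\r\n");
        } else if (value instanceof Long || value instanceof Integer) {
            sb.append(':').append(value).append("\r\n");
        } else if (value instanceof String) {
            String s = (String) value;
            // The prefix is the length in bytes, not in characters
            int byteLength = s.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n").append(s).append("\r\n");
        } else if (value instanceof List) {
            List<?> list = (List<?>) value;
            sb.append('*').append(list.size()).append("\r\n");
            for (Object element : list) {
                append(sb, element); // each element is encoded according to its own type
            }
        } else {
            throw new IllegalArgumentException("unsupported type: " + value.getClass());
        }
    }
}
```

Encoding a list holding 1, 2, 3, 4 and "foobar" reproduces the 5-element mixed array shown earlier, written as one continuous string.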

Null element in RESP array

A single element of a RESP array can also be Null, called a Null element below. If the Redis server replies with a RESP array that contains a Null element, the element is missing and must not be replaced with an empty string. This can happen with the SORT command when it is used with the GET pattern option and the specified key is missing.

The following is an example of a RESP array containing Null elements (in order to see more clearly, it is encoded in multiple lines, which can not be done in practice):

*3\r\n
$3\r\n
foo\r\n
$-1\r\n
$3\r\n
bar\r\n

The second element in the RESP array is a Null element. The final content returned by the client API should be:

# Ruby
["foo",nil,"bar"]
# Java
["foo",null,"bar"]
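
To show how a Null element ends up as null in the returned list, here is a compact recursive decoder built only on the JDK's java.nio.ByteBuffer. It is a sketch, not the Netty-based decoder developed later in this article: the class RespDecodeSketch is a hypothetical name, the input is assumed to hold one complete reply, and an error reply is simply surfaced as an IllegalStateException.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class RespDecodeSketch {

    /** Decodes one complete RESP value starting at the buffer's current position. */
    static Object decode(ByteBuffer in) {
        byte type = in.get();
        switch (type) {
            case '+':
                return readLine(in);
            case '-':
                throw new IllegalStateException(readLine(in));
            case ':':
                return Long.parseLong(readLine(in));
            case '$': {
                int length = Integer.parseInt(readLine(in));
                if (length == -1) {
                    return null; // Null value, not an empty string
                }
                byte[] content = new byte[length];
                in.get(content);
                in.position(in.position() + 2); // skip the trailing CRLF
                return new String(content, StandardCharsets.UTF_8);
            }
            case '*': {
                int count = Integer.parseInt(readLine(in));
                if (count == -1) {
                    return null; // Null Array
                }
                List<Object> elements = new ArrayList<>(count);
                for (int i = 0; i < count; i++) {
                    elements.add(decode(in)); // elements may be of any type, including arrays
                }
                return elements;
            }
            default:
                throw new IllegalArgumentException("unknown RESP type byte: " + type);
        }
    }

    /** Reads ASCII characters up to CR, then consumes the LF that follows. */
    private static String readLine(ByteBuffer in) {
        StringBuilder sb = new StringBuilder();
        byte b;
        while ((b = in.get()) != '\r') {
            sb.append((char) b);
        }
        in.get(); // consume LF
        return sb.toString();
    }
}
```

Decoding the three-element example above with this sketch gives a list whose second entry is null, matching the Java result shown.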

RESP other relevant contents

It mainly includes:

An example of sending a command to the Redis server.
Batch command and pipeline.
Inline Commands.

The official document also has a section on writing a high-performance RESP parser in C. It is not translated here, because once you have mastered RESP you can write a parser in any language.

Send the command to Redis server

Once you are familiar with the RESP serialization format, writing a Redis client library becomes easy. We can further specify how the client and the server interact:

The Redis client sends a RESP array containing only fixed length string type elements to the Redis server.
The Redis server can reply to the Redis client with any RESP data type. The specific data type generally depends on the command type.

The following is a typical interaction: the Redis client sends the command LLEN mylist to get the length of the list stored at the key mylist, and the Redis server replies with an integer, as shown in the following example (C is the client and S is the server):

C: *2\r\n
C: $4\r\n
C: LLEN\r\n
C: $6\r\n
C: mylist\r\n

S: :48293\r\n

For simplicity, the different parts of the protocol are shown on separate lines above. In the actual interaction, however, the Redis client sends *2\r\n$4\r\nLLEN\r\n$6\r\nmylist\r\n as a whole.
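
The request side can be sketched in a few lines: every command is a RESP array whose elements are all fixed length strings. The class CommandEncodeSketch and the method encodeCommand are hypothetical names for this illustration, and arguments are assumed to be text encoded as UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class CommandEncodeSketch {

    /** Encodes a command and its arguments as a RESP array of fixed length strings. */
    static byte[] encodeCommand(String... args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeAscii(out, "*" + args.length + "\r\n");
        for (String arg : args) {
            byte[] bytes = arg.getBytes(StandardCharsets.UTF_8);
            writeAscii(out, "$" + bytes.length + "\r\n"); // byte length prefix
            out.write(bytes, 0, bytes.length);
            writeAscii(out, "\r\n");
        }
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }
}
```

Calling encodeCommand("LLEN", "mylist") produces the bytes of the request from the example above, ready to be written to the connection in one piece.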

Batch command and pipeline

Redis clients can send a batch of commands over the same connection. Redis supports pipelining, so the Redis client can send multiple commands with a single write operation, without reading the reply to the previous command before sending the next one. After the batch has been sent, all the replies can be read at the end, in the order the commands were sent. For more information, see Using pipelining to speedup Redis queries.
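
On the wire, a pipeline is nothing more than several encoded commands concatenated into one buffer. The following sketch builds such a buffer without touching the network; the class PipelineSketch and its methods are hypothetical names, and writing the result to a socket and then reading one reply per command is left to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class PipelineSketch {

    /** Concatenates the encodings of all commands so they can be sent with a single write. */
    static byte[] pipeline(List<String[]> commands) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] command : commands) {
            appendCommand(out, command);
        }
        return out.toByteArray();
    }

    /** Appends one command as a RESP array of fixed length strings. */
    private static void appendCommand(ByteArrayOutputStream out, String[] args) {
        write(out, ("*" + args.length + "\r\n").getBytes(StandardCharsets.US_ASCII));
        for (String arg : args) {
            byte[] bytes = arg.getBytes(StandardCharsets.UTF_8);
            write(out, ("$" + bytes.length + "\r\n").getBytes(StandardCharsets.US_ASCII));
            write(out, bytes);
            write(out, "\r\n".getBytes(StandardCharsets.US_ASCII));
        }
    }

    private static void write(ByteArrayOutputStream out, byte[] bytes) {
        out.write(bytes, 0, bytes.length);
    }
}
```

A client that sends N commands this way should expect to read exactly N replies afterwards.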

Inline command

In some scenarios, only the telnet command may be available, and we still need to send commands to the Redis server. Although the Redis protocol is easy to implement, it is not ideal for interactive sessions, and redis-cli may not be available in some cases. For this reason, Redis also accepts a command format designed for humans, called the Inline Command format.

The following is an example of server / client chat using inline commands (S for server and C for client):

C: PING
S: +PONG

The following is another example of using an inline command to return an integer:

C: EXISTS somekey
S: :0

Basically, you only need to write the arguments separated by spaces in the telnet session. Since no command starts with *, which is how requests in the unified request protocol begin, Redis can detect this situation and parse the command you typed.
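
The detection rule can be illustrated with a deliberately simplified sketch: a request whose first byte is not * is treated as inline, and its arguments are split on whitespace. The class InlineCommandSketch and its methods are hypothetical names, and quoting or escaping inside arguments is an aspect this sketch does not attempt to handle.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InlineCommandSketch {

    /** A request is treated as inline when it does not start with the array marker '*'. */
    static boolean isInline(byte firstByte) {
        return firstByte != '*';
    }

    /** Splits an inline command line into its space-separated arguments. */
    static List<String> parseInline(String line) {
        String trimmed = line.trim(); // also drops the trailing CRLF
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }
}
```

With this, the line EXISTS somekey from the example above is split into the two arguments EXISTS and somekey.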

Writing high performance parser based on RESP

Because java.nio.ByteBuffer, the byte buffer provided natively by the JDK, cannot expand automatically and requires switching between read and write modes, Netty is introduced here directly, and the ByteBuf provided by Netty is used for parsing RESP data types. Dependency:

<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-buffer</artifactId>
    <version>4.1.42.Final</version>
</dependency>

Define decoder interface:

public interface RespDecoder<V>{
    
    V decode(ByteBuf buffer);
}

Define constants:

public class RespConstants {

    public static final Charset ASCII = StandardCharsets.US_ASCII;
    public static final Charset UTF_8 = StandardCharsets.UTF_8;

    public static final byte DOLLAR_BYTE = '$';
    public static final byte ASTERISK_BYTE = '*';
    public static final byte PLUS_BYTE = '+';
    public static final byte MINUS_BYTE = '-';
    public static final byte COLON_BYTE = ':';

    public static final String EMPTY_STRING = "";
    public static final Long ZERO = 0L;
    public static final Long NEGATIVE_ONE = -1L;
    public static final byte CR = (byte) '\r';
    public static final byte LF = (byte) '\n';
    public static final byte[] CRLF = "\r\n".getBytes(ASCII);

    public enum ReplyType {

        SIMPLE_STRING,

        ERROR,

        INTEGER,

        BULK_STRING,

        RESP_ARRAY
    }
}

The decoder implementations in the following sections skip the first byte: it only determines the data type, and it is assumed to have been consumed already by the time a type-specific decoder is called.

Parse simple string

The simple String type is a single line String, and its parsing result corresponds to the String type in Java. The decoder is implemented as follows:

// Parse single line string
public class LineStringDecoder implements RespDecoder<String> {

    @Override
    public String decode(ByteBuf buffer) {
        return CodecUtils.X.readLine(buffer);
    }
}

public enum CodecUtils {

    X;

    public int findLineEndIndex(ByteBuf buffer) {
        int index = buffer.forEachByte(ByteProcessor.FIND_LF);
        return (index > 0 && buffer.getByte(index - 1) == '\r') ? index : -1;
    }

    public String readLine(ByteBuf buffer) {
        int lineEndIndex = findLineEndIndex(buffer);
        if (lineEndIndex > -1) {
            int lineStartIndex = buffer.readerIndex();
            // Calculate byte length
            int size = lineEndIndex - lineStartIndex - 1;
            byte[] bytes = new byte[size];
            buffer.readBytes(bytes);
            // Move the read cursor to the first byte after \r\n
            buffer.readerIndex(lineEndIndex + 1);
            buffer.markReaderIndex();
            return new String(bytes, RespConstants.UTF_8);
        }
        return null;
    }
}

public class RespSimpleStringDecoder extends LineStringDecoder {
    
}

Here, a class LineStringDecoder is extracted to parse single line strings, so that it can also be inherited when parsing error messages. Test:

public static void main(String[] args) throws Exception {
    ByteBuf buffer = ByteBufAllocator.DEFAULT.buffer();
    // +OK\r\n
    buffer.writeBytes("+OK".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    String value = RespCodec.X.decode(buffer);
    log.info("Decode result:{}", value);
}
// Decode result:OK

Parse error message

An error message is essentially also a single line string, so it can be decoded the same way as a simple string. The decoder for the error message data type is as follows:

public class RespErrorDecoder extends LineStringDecoder {

}

Test:

public static void main(String[] args) throws Exception {
    ByteBuf buffer = ByteBufAllocator.DEFAULT.buffer();
    // -ERR unknown command 'foobar'\r\n
    buffer.writeBytes("-ERR unknown command 'foobar'".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    String value = RespCodec.X.decode(buffer);
    log.info("Decode result:{}", value);
}
// Decode result:ERR unknown command 'foobar'

Parsing integer numbers

Parsing the integer type essentially means restoring a signed 64-bit long integer from a byte sequence. Because the value is signed, the first byte after the type identifier : must be checked for the minus sign -. The digits are then parsed from left to right: for each new digit, the current value is multiplied by 10 and the digit is added. The decoder is implemented as follows:

public class RespIntegerDecoder implements RespDecoder<Long> {

    @Override
    public Long decode(ByteBuf buffer) {
        int lineEndIndex = CodecUtils.X.findLineEndIndex(buffer);
        // No complete line in the buffer yet: nothing to decode
        if (-1 == lineEndIndex) {
            return null;
        }
        long result = 0L;
        int lineStartIndex = buffer.readerIndex();
        boolean negative = false;
        byte firstByte = buffer.getByte(lineStartIndex);
        // negative
        if (RespConstants.MINUS_BYTE == firstByte) {
            negative = true;
        } else {
            int digit = firstByte - '0';
            result = result * 10 + digit;
        }
        for (int i = lineStartIndex + 1; i < (lineEndIndex - 1); i++) {
            byte value = buffer.getByte(i);
            int digit = value - '0';
            result = result * 10 + digit;
        }
        if (negative) {
            result = -result;
        }
        // Move the read cursor to the first byte after \r\n
        buffer.readerIndex(lineEndIndex + 1);
        return result;
    }
}

Parsing integers is relatively involved; pay particular attention to the handling of negative numbers. Test:

public static void main(String[] args) throws Exception {
    ByteBuf buffer = ByteBufAllocator.DEFAULT.buffer();
    // :-1000\r\n
    buffer.writeBytes(":-1000".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    Long value = RespCodec.X.decode(buffer);
    log.info("Decode result:{}", value);
}
// Decode result:-1000
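
The decoder above trusts its input: it neither checks that every byte is a digit nor detects values outside the 64-bit range. As a hedged alternative, the sketch below validates each digit and uses the JDK's exact arithmetic methods so that overflow raises an ArithmeticException. The class IntegerParseSketch is a hypothetical name, and the input is assumed to be the bytes between the : marker and the CRLF.

```java
final class IntegerParseSketch {

    /** Parses an optionally negative decimal number from ASCII bytes, rejecting bad input. */
    static long parseLong(byte[] digits) {
        if (digits.length == 0) {
            throw new NumberFormatException("empty input");
        }
        boolean negative = digits[0] == '-';
        int i = negative ? 1 : 0;
        if (i == digits.length) {
            throw new NumberFormatException("no digits after sign");
        }
        long result = 0L;
        for (; i < digits.length; i++) {
            int digit = digits[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit: " + (char) digits[i]);
            }
            // Accumulate as a negative number so that Long.MIN_VALUE stays representable;
            // the exact methods throw ArithmeticException instead of silently overflowing
            result = Math.subtractExact(Math.multiplyExact(result, 10L), digit);
        }
        return negative ? result : Math.negateExact(result);
    }
}
```

Accumulating in the negative range is a common trick for this kind of parser, because the magnitude of Long.MIN_VALUE is one larger than Long.MAX_VALUE.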

Parse fixed length string

The key to parsing a fixed length string is to read the first byte sequence after the type identifier $ and parse it as a 64-bit signed integer, which gives the byte length of the string content that follows, and then to read that many bytes. The decoder is implemented as follows:

public class RespBulkStringDecoder implements RespDecoder<String> {

    @Override
    public String decode(ByteBuf buffer) {
        int lineEndIndex = CodecUtils.X.findLineEndIndex(buffer);
        if (-1 == lineEndIndex) {
            return null;
        }
        // Use RespIntegerDecoder to read the length
        Long length = (Long) DefaultRespCodec.DECODERS.get(ReplyType.INTEGER).decode(buffer);
        if (null == length) {
            return null;
        }
        // Bulk Null String
        if (RespConstants.NEGATIVE_ONE.equals(length)) {
            return null;
        }
        // Bulk Empty String: the trailing \r\n still has to be consumed
        if (RespConstants.ZERO.equals(length)) {
            if (buffer.readableBytes() < 2) {
                return null;
            }
            buffer.skipBytes(2);
            return RespConstants.EMPTY_STRING;
        }
        // Length of real byte content
        int readLength = (int) length.longValue();
        // The content and its trailing \r\n must both be fully available
        if (buffer.readableBytes() >= readLength + 2) {
            byte[] bytes = new byte[readLength];
            buffer.readBytes(bytes);
            // Skip the trailing \r\n
            buffer.skipBytes(2);
            return new String(bytes, RespConstants.UTF_8);
        }
        return null;
    }
}

Test:

public static void main(String[] args) throws Exception{
    ByteBuf buffer = ByteBufAllocator.DEFAULT.buffer();
    // $9\r\nthrowable\r\n
    buffer.writeBytes("$9".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    buffer.writeBytes("throwable".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    String value = RespCodec.X.decode(buffer);
    log.info("Decode result:{}", value);
}
// Decode result:throwable
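
One limitation of the decoder above is that null is returned both for a Null value and for a buffer that does not yet contain the whole frame. A possible way to tell the two apart is sketched below with the JDK's java.nio.ByteBuffer; the class BulkParseSketch and the INCOMPLETE marker are hypothetical, and the buffer is assumed to be positioned just after the $ marker.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BulkParseSketch {

    /** Marker returned when the buffer does not yet hold the whole frame. */
    static final Object INCOMPLETE = new Object();

    /** Returns the decoded String, null for the Null value, or INCOMPLETE (position restored). */
    static Object tryDecode(ByteBuffer in) {
        int start = in.position();
        // Look for the CRLF that terminates the length prefix
        int lineEnd = -1;
        for (int i = start; i + 1 < in.limit(); i++) {
            if (in.get(i) == '\r' && in.get(i + 1) == '\n') {
                lineEnd = i;
                break;
            }
        }
        if (lineEnd < 0) {
            return INCOMPLETE;
        }
        byte[] header = new byte[lineEnd - start];
        in.get(header);
        int length = Integer.parseInt(new String(header, StandardCharsets.US_ASCII));
        in.position(lineEnd + 2);
        if (length == -1) {
            return null; // Null value
        }
        if (in.remaining() < length + 2) {
            in.position(start); // wait for more bytes, then retry from the same position
            return INCOMPLETE;
        }
        byte[] content = new byte[length];
        in.get(content);
        in.position(in.position() + 2); // skip the trailing CRLF
        return new String(content, StandardCharsets.UTF_8);
    }
}
```

A caller reading from a socket could keep appending bytes and retrying while INCOMPLETE is returned, and only treat null as a genuine Null reply.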

Parse RESP array

Key to RESP array type resolution:

First read the first byte sequence after the type identifier * and parse it as a 64-bit signed integer to determine the number of elements in the array.
Recursively parse each element.

Many Redis protocol parsers are implemented with a stack or a state machine. Here it is simply implemented with recursion. The decoder code is as follows:

public class RespArrayDecoder implements RespDecoder {

    @Override
    public Object decode(ByteBuf buffer) {
        int lineEndIndex = CodecUtils.X.findLineEndIndex(buffer);
        if (-1 == lineEndIndex) {
            return null;
        }
        // Parse the number of elements
        Long length = (Long) DefaultRespCodec.DECODERS.get(ReplyType.INTEGER).decode(buffer);
        if (null == length) {
            return null;
        }
        // Null Array
        if (RespConstants.NEGATIVE_ONE.equals(length)) {
            return null;
        }
        // Array Empty List
        if (RespConstants.ZERO.equals(length)) {
            return Lists.newArrayList();
        }
        List<Object> result = Lists.newArrayListWithCapacity((int) length.longValue());
        // recursion
        for (int i = 0; i < length; i++) {
            result.add(DefaultRespCodec.X.decode(buffer));
        }
        return result;
    }
}

Test:

public static void main(String[] args) throws Exception {
    ByteBuf buffer = ByteBufAllocator.DEFAULT.buffer();
    // *2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n
    buffer.writeBytes("*2".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    buffer.writeBytes("$3".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    buffer.writeBytes("foo".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    buffer.writeBytes("$3".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    buffer.writeBytes("bar".getBytes(RespConstants.UTF_8));
    buffer.writeBytes(RespConstants.CRLF);
    List value = RespCodec.X.decode(buffer);
    log.info("Decode result:{}", value);
}
// Decode result:[foo, bar]

Summary

With a reasonably deep understanding of RESP and its encoding and decoding process, you can write an encoding and decoding module for Redis based on Netty, which makes a very meaningful introductory example for Netty.

Appendix: all the code in this document:

public class RespConstants {

    public static final Charset ASCII = StandardCharsets.US_ASCII;
    public static final Charset UTF_8 = StandardCharsets.UTF_8;

    public static final byte DOLLAR_BYTE = '$';
    public static final byte ASTERISK_BYTE = '*';
    public static final byte PLUS_BYTE = '+';
    public static final byte MINUS_BYTE = '-';
    public static final byte COLON_BYTE = ':';

    public static final String EMPTY_STRING = "";
    public static final Long ZERO = 0L;
    public static final Long NEGATIVE_ONE = -1L;
    public static final byte CR = (byte) '\r';
    public static final byte LF = (byte) '\n';
    public static final byte[] CRLF = "\r\n".getBytes(ASCII);

    public enum ReplyType {

        SIMPLE_STRING,

        ERROR,

        INTEGER,

        BULK_STRING,

        RESP_ARRAY
    }
}

public enum CodecUtils {

    X;

    public int findLineEndIndex(ByteBuf buffer) {
        int index = buffer.forEachByte(ByteProcessor.FIND_LF);
        return (index > 0 && buffer.getByte(index - 1) == '\r') ? index : -1;
    }

    public String readLine(ByteBuf buffer) {
        int lineEndIndex = findLineEndIndex(buffer);
        if (lineEndIndex > -1) {
            int lineStartIndex = buffer.readerIndex();
            // Calculate byte length
            int size = lineEndIndex - lineStartIndex - 1;
            byte[] bytes = new byte[size];
            buffer.readBytes(bytes);
            // Move the read cursor to the first byte after \r\n
            buffer.readerIndex(lineEndIndex + 1);
            buffer.markReaderIndex();
            return new String(bytes, RespConstants.UTF_8);
        }
        return null;
    }
}

public interface RespCodec {

    RespCodec X = DefaultRespCodec.X;

    <IN, OUT> OUT decode(ByteBuf buffer);

    <IN, OUT> ByteBuf encode(IN in);
}

public enum DefaultRespCodec implements RespCodec {

    X;

    static final Map<ReplyType, RespDecoder<?>> DECODERS = Maps.newConcurrentMap();
    private static final RespDecoder<?> DEFAULT_DECODER = new DefaultRespDecoder();

    static {
        DECODERS.put(ReplyType.SIMPLE_STRING, new RespSimpleStringDecoder());
        DECODERS.put(ReplyType.ERROR, new RespErrorDecoder());
        DECODERS.put(ReplyType.INTEGER, new RespIntegerDecoder());
        DECODERS.put(ReplyType.BULK_STRING, new RespBulkStringDecoder());
        DECODERS.put(ReplyType.RESP_ARRAY, new RespArrayDecoder());
    }

    @SuppressWarnings("unchecked")
    @Override
    public <IN, OUT> OUT decode(ByteBuf buffer) {
        return (OUT) DECODERS.getOrDefault(determineReplyType(buffer), DEFAULT_DECODER).decode(buffer);
    }

    private ReplyType determineReplyType(ByteBuf buffer) {
        byte firstByte = buffer.readByte();
        ReplyType replyType;
        switch (firstByte) {
            case RespConstants.PLUS_BYTE:
                replyType = ReplyType.SIMPLE_STRING;
                break;
            case RespConstants.MINUS_BYTE:
                replyType = ReplyType.ERROR;
                break;
            case RespConstants.COLON_BYTE:
                replyType = ReplyType.INTEGER;
                break;
            case RespConstants.DOLLAR_BYTE:
                replyType = ReplyType.BULK_STRING;
                break;
            case RespConstants.ASTERISK_BYTE:
                replyType = ReplyType.RESP_ARRAY;
                break;
            default: {
                throw new IllegalArgumentException("Unknown RESP type prefix: " + (char) firstByte);
            }
        }
        return replyType;
    }

    @Override
    public <IN, OUT> ByteBuf encode(IN in) {
        // TODO
        throw new UnsupportedOperationException("encode");
    }
}

public interface RespDecoder<V> {

    V decode(ByteBuf buffer);
}

public class DefaultRespDecoder implements RespDecoder<Object> {

    @Override
    public Object decode(ByteBuf buffer) {
        throw new IllegalStateException("decoder");
    }
}

public class LineStringDecoder implements RespDecoder<String> {

    @Override
    public String decode(ByteBuf buffer) {
        return CodecUtils.X.readLine(buffer);
    }
}

public class RespSimpleStringDecoder extends LineStringDecoder {

}

public class RespErrorDecoder extends LineStringDecoder {

}

public class RespIntegerDecoder implements RespDecoder<Long> {

    @Override
    public Long decode(ByteBuf buffer) {
        int lineEndIndex = CodecUtils.X.findLineEndIndex(buffer);
        // No complete line yet: return null so the caller can wait for more data
        if (-1 == lineEndIndex) {
            return null;
        }
        long result = 0L;
        int lineStartIndex = buffer.readerIndex();
        boolean negative = false;
        byte firstByte = buffer.getByte(lineStartIndex);
        // negative
        if (RespConstants.MINUS_BYTE == firstByte) {
            negative = true;
        } else {
            int digit = firstByte - '0';
            result = result * 10 + digit;
        }
        for (int i = lineStartIndex + 1; i < (lineEndIndex - 1); i++) {
            byte value = buffer.getByte(i);
            int digit = value - '0';
            result = result * 10 + digit;
        }
        if (negative) {
            result = -result;
        }
        // Move the reader index to the first byte after \r\n
        buffer.readerIndex(lineEndIndex + 1);
        return result;
    }
}

public class RespBulkStringDecoder implements RespDecoder<String> {

    @Override
    public String decode(ByteBuf buffer) {
        int lineEndIndex = CodecUtils.X.findLineEndIndex(buffer);
        if (-1 == lineEndIndex) {
            return null;
        }
        Long length = (Long) DefaultRespCodec.DECODERS.get(ReplyType.INTEGER).decode(buffer);
        if (null == length) {
            return null;
        }
        // Null bulk string ("$-1\r\n")
        if (RespConstants.NEGATIVE_ONE.equals(length)) {
            return null;
        }
        // Length of the payload in bytes
        int readLength = (int) length.longValue();
        // The payload is followed by \r\n, so readLength + 2 bytes must be readable
        if (buffer.readableBytes() < readLength + 2) {
            return null;
        }
        // Empty bulk string ("$0\r\n\r\n"): only the trailing \r\n is left to consume
        if (readLength == 0) {
            buffer.skipBytes(2);
            return RespConstants.EMPTY_STRING;
        }
        byte[] bytes = new byte[readLength];
        buffer.readBytes(bytes);
        // Move the reader index past the trailing \r\n
        buffer.skipBytes(2);
        return new String(bytes, RespConstants.UTF_8);
    }
}

public class RespArrayDecoder implements RespDecoder<Object> {

    @Override
    public Object decode(ByteBuf buffer) {
        int lineEndIndex = CodecUtils.X.findLineEndIndex(buffer);
        if (-1 == lineEndIndex) {
            return null;
        }
        // Parse the number of elements in the array
        Long length = (Long) DefaultRespCodec.DECODERS.get(ReplyType.INTEGER).decode(buffer);
        if (null == length) {
            return null;
        }
        // Null Array
        if (RespConstants.NEGATIVE_ONE.equals(length)) {
            return null;
        }
        // Empty array ("*0\r\n")
        if (RespConstants.ZERO.equals(length)) {
            return Lists.newArrayList();
        }
        List<Object> result = Lists.newArrayListWithCapacity((int) length.longValue());
        // Decode each element recursively (an element may itself be an array)
        for (int i = 0; i < length; i++) {
            result.add(DefaultRespCodec.X.decode(buffer));
        }
        return result;
    }
}
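
The encode method of DefaultRespCodec above is left as a TODO. As a rough illustration of what the request side could look like, the following sketch serializes a command as a RESP array of bulk strings using only the JDK. The class name RespCommandEncoder and the method encodeCommand are hypothetical and not part of the original code; it returns a plain byte[] rather than a Netty ByteBuf to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of the original code): serializes a command
// as a RESP array of bulk strings, e.g. GET foo -> *2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n
final class RespCommandEncoder {

    private static final byte[] CRLF = {'\r', '\n'};

    private RespCommandEncoder() {
    }

    static byte[] encodeCommand(String... args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Array header: '*' + number of elements + CRLF
        writeHeader(out, '*', args.length);
        for (String arg : args) {
            byte[] payload = arg.getBytes(StandardCharsets.UTF_8);
            // Bulk string header: '$' + payload length in bytes + CRLF
            writeHeader(out, '$', payload.length);
            out.write(payload, 0, payload.length);
            out.write(CRLF, 0, CRLF.length);
        }
        return out.toByteArray();
    }

    private static void writeHeader(ByteArrayOutputStream out, char prefix, int n) {
        out.write(prefix);
        byte[] digits = Integer.toString(n).getBytes(StandardCharsets.US_ASCII);
        out.write(digits, 0, digits.length);
        out.write(CRLF, 0, CRLF.length);
    }

    public static void main(String[] args) {
        byte[] encoded = encodeCommand("SET", "name", "throwable");
        // Print the request with CR and LF made visible
        System.out.println(new String(encoded, StandardCharsets.UTF_8)
                .replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Note that the bulk string length is the number of bytes after UTF-8 encoding, not the number of characters, which is what keeps the format binary safe. In a Netty pipeline the resulting bytes could be wrapped in a ByteBuf before being written to the channel.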

Posted by bharanikumarphp on Fri, 03 Dec 2021 13:36:26 -0800
