Implementing a Redis Protocol Parser in Go

This is the second article in the series "Implementing a Redis with Golang". It introduces the Redis communication protocol first and then the implementation of a protocol parser. If you already understand the protocol, you can skip directly to the protocol parser section.

Redis Communication Protocol

Redis has used a unified protocol, RESP (REdis Serialization Protocol), since version 2.0. The protocol is easy to implement, efficient for programs to parse, and easy for humans to read.

RESP is a binary-safe text protocol that runs over TCP. It is line-oriented: commands and data sent by clients and servers always use \r\n (CRLF) as the line terminator.

Binary safety means that arbitrary bytes may appear in the protocol without causing a failure. For example, a C string is terminated by \0, so \0 cannot appear in the middle of the string, whereas a Go string may contain \0. We therefore say that Go strings are binary-safe and C strings are not.

RESP's binary safety allows keys and values to contain special characters such as \r or \n. This is particularly important when using Redis to store binary data such as protobuf or msgpack.
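To make the contrast concrete, here is a minimal sketch. The helper cStrlen is a hypothetical name introduced only for this illustration; it imitates how a NUL-terminated reader would measure the same bytes that Go treats as ordinary data.

```go
package main

import "fmt"

// cStrlen is a hypothetical helper that mimics C's strlen:
// it counts bytes up to (not including) the first NUL byte.
func cStrlen(b []byte) int {
	for i, c := range b {
		if c == 0 {
			return i
		}
	}
	return len(b)
}

func main() {
	// NUL and CRLF are ordinary bytes inside a Go string.
	s := "a\x00b\r\nc"
	fmt.Println(len(s))             // length as Go sees it: all 6 bytes
	fmt.Println(cStrlen([]byte(s))) // length a NUL-terminated reader sees: 1
}
```

A protocol that relied on a terminator byte would truncate this value, while a length-prefixed format can carry it unchanged.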

RESP defines 5 formats:

Simple String: used by the server to return simple results such as "OK". Not binary-safe; line breaks are not allowed.
Error: used by the server to return simple error messages such as "ERR Invalid Syntax". Not binary-safe; line breaks are not allowed.
Integer: a 64-bit signed integer; the return value of commands such as llen and scard.
Bulk String: a binary-safe string, such as the return value of the get command.
Array (also known as Multi Bulk Strings): an array of Bulk Strings; the format clients use to send commands, and the response format of commands such as lrange.

RESP indicates the format by the first character:

Simple String: starts with "+", for example "+OK\r\n"
Error: starts with "-", for example "-ERR Invalid Syntax\r\n"
Integer: starts with ":", for example ":1\r\n"
Bulk String: starts with "$"
Array: starts with "*"
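The first-character rule maps naturally onto a switch statement. The following is only a sketch; replyKind is a name made up for this example and is not part of the parser discussed later.

```go
package main

import "fmt"

// replyKind (hypothetical helper) names the RESP format
// indicated by the first byte of a line.
func replyKind(line []byte) string {
	if len(line) == 0 {
		return "empty"
	}
	switch line[0] {
	case '+':
		return "simple string"
	case '-':
		return "error"
	case ':':
		return "integer"
	case '$':
		return "bulk string"
	case '*':
		return "array"
	default:
		return "unknown"
	}
}

func main() {
	for _, l := range []string{"+OK\r\n", "-ERR Invalid Syntax\r\n", ":1\r\n", "$3\r\n", "*2\r\n"} {
		fmt.Printf("%q is a %s\n", l, replyKind([]byte(l)))
	}
}
```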

A Bulk String has two lines: the first is "$" followed by the length of the body, and the second is the actual content. For example:

$3\r\nSET\r\n

A Bulk String is binary-safe and can contain any bytes, which means it can contain "\r\n" inside the body (the CRLF at the end of each line is not shown):

$4
a\r\nb

$-1 indicates nil. For example, when the get command queries a non-existent key, the response is $-1.
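As an illustration of the rules above, a Bulk String encoder can be sketched in a few lines. The function name encodeBulk is an assumption made for this example, and treating a nil slice as the nil reply is a choice of this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// encodeBulk (hypothetical helper) serializes b as a RESP Bulk String.
// In this sketch a nil slice is encoded as the nil reply "$-1\r\n".
func encodeBulk(b []byte) []byte {
	if b == nil {
		return []byte("$-1\r\n")
	}
	// Header line: "$" + body length + CRLF.
	out := []byte("$" + strconv.Itoa(len(b)) + "\r\n")
	// Body bytes are copied verbatim, so CRLF inside the body is fine.
	out = append(out, b...)
	return append(out, '\r', '\n')
}

func main() {
	fmt.Printf("%q\n", encodeBulk([]byte("SET")))
	fmt.Printf("%q\n", encodeBulk([]byte("a\r\nb")))
	fmt.Printf("%q\n", encodeBulk(nil))
}
```

Because the length is stated up front, the receiver never has to search the body for a terminator.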

The first line of the Array format is "*" followed by the array length, followed by that number of Bulk Strings. For example, the message for ["foo", "bar"] is:

*2
$3
foo
$3
bar

The client also uses the Array format to send commands to the server, with the command name as the first element. For example, the RESP message for SET key value is:

*3
$3
SET
$3
key
$5
value

With the line breaks written out explicitly:

*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n
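The same message can be produced programmatically. This is a minimal sketch of a client-side encoder; encodeCommand is a hypothetical name and does not come from the project.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// encodeCommand (hypothetical helper) serializes a command and its
// arguments as a RESP Array of Bulk Strings.
func encodeCommand(args ...string) []byte {
	var buf bytes.Buffer
	// Array header: "*" + number of elements + CRLF.
	buf.WriteString("*" + strconv.Itoa(len(args)) + "\r\n")
	for _, a := range args {
		// Each element is a Bulk String: "$" + length + CRLF + body + CRLF.
		buf.WriteString("$" + strconv.Itoa(len(a)) + "\r\n")
		buf.WriteString(a)
		buf.WriteString("\r\n")
	}
	return buf.Bytes()
}

func main() {
	fmt.Printf("%q\n", encodeCommand("SET", "key", "value"))
}
```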

Protocol parser

The article "Implement TCP server" introduced the implementation of the TCP server. The protocol parser implements its Handler interface and acts as the application-layer server.

The protocol parser receives data from the socket and restores it to [][]byte form. For example, "*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n" is restored to ['SET', 'key', 'value'].

The complete code for this article: /hdt3213/godis/redis/parser

Requests from the client are in Array format: the first line gives the number of elements in the message, and CRLF is used as the line terminator.

The bufio standard library buffers data read from a reader until it encounters a delimiter or reading finishes, so we use reader.ReadBytes('\n') to make sure a complete line is read each time.
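A small sketch of line-by-line reading with bufio follows. The helper splitLines is a name invented for this example; it reads from an in-memory string instead of a socket so that it can be run on its own.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// splitLines (hypothetical helper) returns every '\n'-terminated line
// of input, with the trailing CRLF kept as ReadBytes delivers it.
func splitLines(input string) ([]string, error) {
	reader := bufio.NewReader(strings.NewReader(input))
	var lines []string
	for {
		line, err := reader.ReadBytes('\n')
		if len(line) > 0 {
			lines = append(lines, string(line))
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

func main() {
	lines, err := splitLines("*1\r\n$4\r\nPING\r\n")
	fmt.Printf("%q %v\n", lines, err)
}
```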

Note that RESP is a binary-safe protocol that allows CRLF to appear in the body. For example, Redis correctly receives and executes the command SET "a\r\nb" myvalue. The correct message for this command is as follows:

*3  
$3
SET
$4
a\r\nb 
$7
myvalue

When ReadBytes reads the fifth line, "a\r\nb\r\n", it mistakes it for two lines:

*3  
$3
SET
$4
a  // wrong line split
b  // wrong line split
$7
myvalue

So after reading the fourth line, $4, we should not keep using ReadBytes('\n') to read the next line. Instead we should use io.ReadFull(reader, msg) to read content of the specified length.

msg = make([]byte, 4 + 2) // body length 4 + CRLF length 2
_, err = io.ReadFull(reader, msg)
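Putting the two calls together, reading a single Bulk String could look roughly like the sketch below. The names readBulk and bulkFromString are assumptions made for this example, and rejecting negative lengths (including the nil reply $-1) is a simplification of the sketch, not the behavior of the real parser.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// readBulk (hypothetical helper) reads one Bulk String: the "$<len>\r\n"
// header with ReadBytes, then exactly len+2 bytes with io.ReadFull.
func readBulk(r *bufio.Reader) ([]byte, error) {
	header, err := r.ReadBytes('\n')
	if err != nil {
		return nil, err
	}
	if len(header) < 4 || header[0] != '$' || header[len(header)-2] != '\r' {
		return nil, errors.New("protocol error: " + string(header))
	}
	n, err := strconv.ParseInt(string(header[1:len(header)-2]), 10, 64)
	if err != nil || n < 0 { // this sketch does not handle the nil reply "$-1"
		return nil, errors.New("protocol error: " + string(header))
	}
	body := make([]byte, n+2) // body length + CRLF length 2
	if _, err := io.ReadFull(r, body); err != nil {
		return nil, err
	}
	if body[n] != '\r' || body[n+1] != '\n' {
		return nil, errors.New("protocol error: missing CRLF after body")
	}
	return body[:n], nil
}

// bulkFromString runs readBulk over an in-memory string for demonstration.
func bulkFromString(s string) (string, error) {
	b, err := readBulk(bufio.NewReader(strings.NewReader(s)))
	return string(b), err
}

func main() {
	body, err := bulkFromString("$4\r\na\r\nb\r\n")
	fmt.Printf("%q %v\n", body, err)
}
```

The CRLF inside the body survives because the body is read by length rather than by delimiter.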

First, let's define the parser interface:

// Payload stores a redis.Reply or an error
type Payload struct {
	Data redis.Reply
	Err  error
}

// ParseStream reads data from the reader and returns results to the caller through a channel.
// The streaming interface is suitable for client/server use.
func ParseStream(reader io.Reader) <-chan *Payload {
	ch := make(chan *Payload)
	go parse0(reader, ch)
	return ch
}

// ParseOne parses a []byte and returns the first reply.
func ParseOne(data []byte) (redis.Reply, error) {
	ch := make(chan *Payload)
	reader := bytes.NewReader(data)
	go parse0(reader, ch)
	payload := <-ch // parse0 will close the channel
	if payload == nil {
		return nil, errors.New("no reply")
	}
	return payload.Data, payload.Err
}
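To show how a caller might consume such a channel-based interface, here is a self-contained sketch. It does not use the real parser: streamLines is a hypothetical stand-in for ParseStream that emits one payload per CRLF-terminated line, and the lowercase payload type uses a plain string in place of the reply type.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// payload mirrors the shape of Payload, with a string standing in for the reply.
type payload struct {
	Data string
	Err  error
}

// streamLines is a hypothetical stand-in for ParseStream: it emits one
// payload per CRLF-terminated line and closes the channel at end of input.
func streamLines(reader io.Reader) <-chan *payload {
	ch := make(chan *payload)
	go func() {
		defer close(ch)
		br := bufio.NewReader(reader)
		for {
			line, err := br.ReadString('\n')
			if err != nil {
				if err != io.EOF {
					ch <- &payload{Err: err}
				}
				return
			}
			ch <- &payload{Data: strings.TrimSuffix(line, "\r\n")}
		}
	}()
	return ch
}

// collect drains the channel the way a connection handler would loop over replies.
func collect(input string) ([]string, error) {
	var out []string
	for p := range streamLines(strings.NewReader(input)) {
		if p.Err != nil {
			return out, p.Err
		}
		out = append(out, p.Data)
	}
	return out, nil
}

func main() {
	lines, err := collect("+OK\r\n:1\r\n")
	fmt.Println(lines, err)
}
```

The producer goroutine owns the channel and closes it, so the consumer can use a plain range loop and needs no separate end-of-stream signal.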

Next, let's look at pseudo-code for the core flow of the parser; see the repository for the full code:

func parse0(reader io.Reader, ch chan<- *Payload) {
    // Initialize the read state
    readingMultiLine := false
    expectedArgsCount := 0
    var args [][]byte
    var bulkLen int64
    for {
        // As mentioned above, RESP is line-based.
        // Because a line may be a simple line or a binary-safe Bulk String body,
        // we wrap reading in a readLine function that handles both.
        line, err := readLine(reader, bulkLen)
        if err != nil {
            // Handle errors
            return
        }
        // Next we parse the line we just read.
        // We simply divide replies into two categories:
        // Single line: StatusReply, IntReply, ErrorReply
        // Multi-line: BulkReply, MultiBulkReply
        if !readingMultiLine {
            if isMultiBulkHeader(line) {
                // We received the first line of a MultiBulkReply.
                // Get the number of Bulk Strings in the MultiBulkReply.
                expectedArgsCount = parseMultiBulkHeader(line)
                // Wait for the following lines of the MultiBulkReply
                readingMultiLine = true
            } else if isBulkHeader(line) {
                // We received the first line of a BulkReply.
                // Get the length of its second line and pass it to readLine through bulkLen.
                bulkLen = parseBulkHeader(line)
                // This reply contains 1 Bulk String
                expectedArgsCount = 1
                // Wait for the following line of the BulkReply
                readingMultiLine = true
            } else {
                // Handle single-line replies such as StatusReply, IntReply, ErrorReply
                reply := parseSingleLineReply(line)
                // Return the result via ch
                emitReply(ch)
            }
        } else {
            // Entering this branch means we are waiting for a following line of a MultiBulkReply or BulkReply.
            // A following line is either a Bulk String header or a Bulk String body.
            if isBulkHeader(line) {
                bulkLen = parseBulkHeader(line)
            } else {
                // We are reading a Bulk String body, which may belong to a MultiBulkReply or a BulkReply
                args = append(args, line)
                // The body has been consumed, so the next line is read as a normal line
                bulkLen = 0
            }
            if len(args) == expectedArgsCount { // We have read all following lines
                // Return the result via ch
                emitReply(ch)
                // Reset the state and prepare to parse the next reply
                readingMultiLine = false
                expectedArgsCount = 0
                args = nil
                bulkLen = 0
            }
        }
    }
}
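For readers who want something runnable before looking at the real helpers, the following is a deliberately simplified sketch of the same idea. It only handles a client command (an Array of Bulk Strings) and reads it in one call instead of keeping state between lines. The names readHeader, parseCommand and parseCommandString are assumptions made for this example, and it places no upper bound on lengths.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// readHeader reads one CRLF-terminated line, checks its first byte,
// and returns the number that follows it.
func readHeader(r *bufio.Reader, prefix byte) (int64, error) {
	line, err := r.ReadBytes('\n')
	if err != nil {
		return 0, err
	}
	if len(line) < 4 || line[0] != prefix || line[len(line)-2] != '\r' {
		return 0, errors.New("protocol error: " + string(line))
	}
	return strconv.ParseInt(string(line[1:len(line)-2]), 10, 64)
}

// parseCommand reads one Array of Bulk Strings and returns its elements.
func parseCommand(r *bufio.Reader) ([][]byte, error) {
	count, err := readHeader(r, '*')
	if err != nil {
		return nil, err
	}
	var args [][]byte
	for i := int64(0); i < count; i++ {
		n, err := readHeader(r, '$')
		if err != nil {
			return nil, err
		}
		if n < 0 {
			return nil, errors.New("protocol error: nil bulk string in command")
		}
		body := make([]byte, n+2) // body plus trailing CRLF
		if _, err := io.ReadFull(r, body); err != nil {
			return nil, err
		}
		if body[n] != '\r' || body[n+1] != '\n' {
			return nil, errors.New("protocol error: missing CRLF after body")
		}
		args = append(args, body[:n])
	}
	return args, nil
}

// parseCommandString runs parseCommand over an in-memory string.
func parseCommandString(s string) ([]string, error) {
	raw, err := parseCommand(bufio.NewReader(strings.NewReader(s)))
	out := make([]string, 0, len(raw))
	for _, a := range raw {
		out = append(out, string(a))
	}
	return out, err
}

func main() {
	args, err := parseCommandString("*3\r\n$3\r\nSET\r\n$4\r\na\r\nb\r\n$7\r\nmyvalue\r\n")
	fmt.Printf("%q %v\n", args, err)
}
```

The stateful version in the pseudo-code does the same work, but it keeps its progress in variables so that it can also emit single-line replies from the same loop.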

Here are the implementations of the helper functions:

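// readLine reads either a CRLF-terminated line or, when state.bulkLen is set,
// a Bulk String body of exactly that length plus its trailing CRLF.
// The bool result reports whether the error is an I/O error rather than a protocol error.
func readLine(bufReader *bufio.Reader, state *readState) ([]byte, bool, error) {
	var msg []byte
	var err error
	if state.bulkLen == 0 { // read simple line
		msg, err = bufReader.ReadBytes('\n')
		if err != nil {
			return nil, true, err
		}
		// A valid line needs at least "\r\n"; checking the length first avoids an out-of-range index.
		if len(msg) < 2 || msg[len(msg)-2] != '\r' {
			return nil, false, errors.New("protocol error: " + string(msg))
		}
	} else { // read bulk line (binary safe)
		msg = make([]byte, state.bulkLen+2)
		_, err = io.ReadFull(bufReader, msg)
		if err != nil {
			return nil, true, err
		}
		if len(msg) == 0 ||
			msg[len(msg)-2] != '\r' ||
			msg[len(msg)-1] != '\n' {
			return nil, false, errors.New("protocol error: " + string(msg))
		}
		state.bulkLen = 0
	}
	return msg, false, nil
}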

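// parseMultiBulkHeader parses a "*<count>\r\n" line and prepares state for the following Bulk Strings.
func parseMultiBulkHeader(msg []byte, state *readState) error {
	var err error
	var expectedLine uint64
	expectedLine, err = strconv.ParseUint(string(msg[1:len(msg)-2]), 10, 32)
	if err != nil {
		return errors.New("protocol error: " + string(msg))
	}
	if expectedLine == 0 {
		state.expectedArgsCount = 0
		return nil
	} else if expectedLine > 0 {
		// first line of multi bulk reply
		state.msgType = msg[0]
		state.readingMultiLine = true
		state.expectedArgsCount = int(expectedLine)
		state.args = make([][]byte, 0, expectedLine)
		return nil
	} else {
		return errors.New("protocol error: " + string(msg))
	}
}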

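// parseBulkHeader parses a "$<length>\r\n" line and records the body length in state.bulkLen.
func parseBulkHeader(msg []byte, state *readState) error {
	var err error
	state.bulkLen, err = strconv.ParseInt(string(msg[1:len(msg)-2]), 10, 64)
	if err != nil {
		return errors.New("protocol error: " + string(msg))
	}
	if state.bulkLen == -1 { // null bulk
		return nil
	} else if state.bulkLen > 0 {
		state.msgType = msg[0]
		state.readingMultiLine = true
		state.expectedArgsCount = 1
		state.args = make([][]byte, 0, 1)
		return nil
	} else {
		return errors.New("protocol error: " + string(msg))
	}
}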

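// parseSingleLineReply parses a status, error, or integer reply.
// Any other line is treated as a space-separated text-protocol command.
func parseSingleLineReply(msg []byte) (redis.Reply, error) {
	str := strings.TrimSuffix(string(msg), "\n")
	str = strings.TrimSuffix(str, "\r")
	var result redis.Reply
	switch msg[0] {
	case '+': // status reply
		result = reply.MakeStatusReply(str[1:])
	case '-': // err reply
		result = reply.MakeErrReply(str[1:])
	case ':': // int reply
		val, err := strconv.ParseInt(str[1:], 10, 64)
		if err != nil {
			return nil, errors.New("protocol error: " + string(msg))
		}
		result = reply.MakeIntReply(val)
	default:
		// parse as text protocol
		strs := strings.Split(str, " ")
		args := make([][]byte, len(strs))
		for i, s := range strs {
			args[i] = []byte(s)
		}
		result = reply.MakeMultiBulkReply(args)
	}
	return result, nil
}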
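The helper functions take a *readState that this article does not show. A plausible shape, assuming its fields simply mirror the local variables of the pseudo-code plus the msgType byte set by the header parsers, is sketched below. The finished method is an assumption added for illustration; it corresponds to the "have we read all following lines" check in the pseudo-code.

```go
package main

import "fmt"

// readState is an assumed definition: the parser's progress between lines.
type readState struct {
	readingMultiLine  bool     // true while inside a BulkReply or MultiBulkReply
	expectedArgsCount int      // number of Bulk Strings announced by the header
	msgType           byte     // first byte of the header line: '*' or '$'
	args              [][]byte // Bulk String bodies collected so far
	bulkLen           int64    // length of the next body; 0 means "read a normal line"
}

// finished reports whether every expected argument has been collected.
func (s *readState) finished() bool {
	return s.expectedArgsCount > 0 && len(s.args) == s.expectedArgsCount
}

func main() {
	s := &readState{readingMultiLine: true, expectedArgsCount: 2, msgType: '*'}
	s.args = append(s.args, []byte("foo"))
	fmt.Println(s.finished()) // one of two arguments read
	s.args = append(s.args, []byte("bar"))
	fmt.Println(s.finished()) // both arguments read
}
```

Passing a pointer to one struct keeps the helpers' signatures short and lets each helper update several fields at once.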
