This is the second article in the series "Implementing a Redis with Golang". It introduces the Redis communication protocol and then the implementation of a protocol parser. If you already understand the protocol, you can skip directly to the protocol parser section.
Redis Communication Protocol
Since version 2.0, Redis has used a unified protocol, RESP (REdis Serialization Protocol). The protocol is easy to implement, efficient for computers to parse, and easy for humans to read.
RESP is a binary-safe text protocol that runs over TCP. RESP is line-oriented: commands and data sent by clients and servers always use \r\n (CRLF) as the line terminator.
Binary safety means that arbitrary bytes may appear in the protocol without causing a failure. For example, a C string uses \0 as its terminator, so \0 cannot appear in the middle of the string, whereas a Go string may contain \0. We therefore say that Go strings are binary-safe and C strings are not.
RESP's binary safety allows a key or value to contain special characters such as \r or \n. Binary safety is particularly important when using Redis to store binary data such as protobuf or msgpack.
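The difference can be illustrated with a minimal sketch. A Go []byte carries its own length, so NUL and CRLF bytes are ordinary content. The helper cStringLen is hypothetical and only mimics how a C-style reader would measure the same bytes.

```go
package main

import (
	"bytes"
	"fmt"
)

// cStringLen mimics C's strlen: it counts bytes up to the first NUL byte.
// Hypothetical helper, for illustration only.
func cStringLen(b []byte) int {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return i
	}
	return len(b)
}

func main() {
	value := []byte("a\x00b\r\nc")
	fmt.Println(len(value))        // Go sees all 6 bytes
	fmt.Println(cStringLen(value)) // a C-style reader stops after 1 byte
}
```

A protocol that relies on a terminator byte would lose everything after the NUL, which is why RESP records an explicit length for binary-safe data.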
RESP defines 5 formats:
- Simple String: Used by the server to return simple results such as "OK". Not binary-safe; line breaks are not allowed.
- Error: Used by the server to return simple error messages such as "ERR Invalid Syntax". Not binary-safe; line breaks are not allowed.
- Integer: The return value of commands such as llen and scard; a 64-bit signed integer.
- Bulk String: A binary-safe string, such as the return value of the get command.
- Array (also known as Multi Bulk Strings): An array of Bulk Strings; the format used for commands sent by the client and for responses to commands such as lrange.
RESP indicates the format with the first character:
- Simple String: starts with "+", for example "+OK\r\n"
- Error: starts with "-", for example "-ERR Invalid Syntax\r\n"
- Integer: starts with ":", for example ":1\r\n"
- Bulk String: starts with "$"
- Array: starts with "*"
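As a rough sketch, dispatching on the first byte can be written as a single switch. The function name replyKind and the labels it returns are illustrative and are not taken from godis.

```go
package main

import "fmt"

// replyKind reports which RESP format a line belongs to, based on its first byte.
// The name and the returned labels are illustrative only.
func replyKind(line []byte) string {
	if len(line) == 0 {
		return "empty"
	}
	switch line[0] {
	case '+':
		return "simple string"
	case '-':
		return "error"
	case ':':
		return "integer"
	case '$':
		return "bulk string"
	case '*':
		return "array"
	default:
		return "unknown"
	}
}

func main() {
	for _, s := range []string{"+OK\r\n", ":1\r\n", "$3\r\n", "*2\r\n"} {
		fmt.Printf("%q -> %s\n", s, replyKind([]byte(s)))
	}
}
```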
A Bulk String has two lines: the first is "$" followed by the length of the body, and the second is the actual content. For example:
$3\r\nSET\r\n
A Bulk String is binary-safe and can contain any bytes, which means it can contain "\r\n" inside its body (the CRLF at the end of each line is not shown):
$4
a\r\nb
$-1 indicates nil. For example, when the get command queries a non-existent key, the response is $-1.
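The Bulk String rules above can be sketched as a small encoder. The function name encodeBulk is an assumption for illustration; using a nil slice to stand for the nil reply is likewise a choice made only for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// encodeBulk serializes b as a RESP Bulk String: "$<length>\r\n<body>\r\n".
// A nil slice is encoded as the nil marker "$-1\r\n" (a convention of this sketch).
func encodeBulk(b []byte) []byte {
	if b == nil {
		return []byte("$-1\r\n")
	}
	out := []byte("$" + strconv.Itoa(len(b)) + "\r\n")
	out = append(out, b...)
	return append(out, '\r', '\n')
}

func main() {
	fmt.Printf("%q\n", encodeBulk([]byte("SET")))    // "$3\r\nSET\r\n"
	fmt.Printf("%q\n", encodeBulk([]byte("a\r\nb"))) // "$4\r\na\r\nb\r\n"
	fmt.Printf("%q\n", encodeBulk(nil))              // "$-1\r\n"
}
```

Because the length is written first, the body needs no escaping, even when it contains CRLF.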
The first line of the Array format is "*" followed by the array length, and it is followed by the corresponding number of Bulk Strings. For example, the message for ["foo", "bar"] is:
*2
$3
foo
$3
bar
The client also uses the Array format to send commands to the server, with the command name itself as the first element. For example, the RESP message for the command SET key value is:
*3
$3
SET
$3
key
$5
value
With the line breaks written out:
*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n
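A client-side encoder for this format might look like the following sketch. The name encodeCommand is hypothetical; it only illustrates how the Array header and the Bulk Strings are concatenated.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// encodeCommand builds the RESP Array for a command such as SET key value.
// Each argument becomes one Bulk String; the command name is the first element.
func encodeCommand(args ...string) []byte {
	var buf bytes.Buffer
	buf.WriteString("*" + strconv.Itoa(len(args)) + "\r\n")
	for _, arg := range args {
		buf.WriteString("$" + strconv.Itoa(len(arg)) + "\r\n")
		buf.WriteString(arg)
		buf.WriteString("\r\n")
	}
	return buf.Bytes()
}

func main() {
	fmt.Printf("%q\n", encodeCommand("SET", "key", "value"))
}
```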
Protocol parser
The previous article, "Implement TCP server", introduced the implementation of the TCP server. The protocol parser implements its Handler interface and acts as the application-layer server.
The protocol parser receives data from the socket and restores it to [][]byte format. For example, "*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n" is restored to ['SET', 'key', 'value'].
The complete code of this article:/hdt3213/godis/redis/parser
Requests from the client are in Array format: the first line states how many Bulk Strings follow, and every line ends with CRLF.
The bufio package in the standard library buffers data read from the reader and returns once it encounters the delimiter or the input ends, so we use ReadBytes('\n') to make sure a complete line is read each time.
Note that RESP is a binary-safe protocol that allows CRLF characters in the body. For example, Redis can correctly receive and execute the command SET "a\r\nb" myvalue. The correct message for this command is:
*3
$3
SET
$4
a\r\nb
$7
myvalue
When ReadBytes reads the fifth line, "a\r\nb\r\n", it mistakes it for two lines:
*3
$3
SET
$4
a // wrong line split
b // wrong line split
$7
myvalue
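The wrong split is easy to reproduce. The sketch below reads the message with ReadBytes('\n') only; splitByNewline is a hypothetical helper written for this demonstration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// splitByNewline reads the whole input using ReadBytes('\n') only.
// This is the naive approach that breaks on binary-safe bodies.
func splitByNewline(input string) []string {
	reader := bufio.NewReader(strings.NewReader(input))
	var lines []string
	for {
		line, err := reader.ReadBytes('\n')
		if len(line) > 0 {
			lines = append(lines, string(line))
		}
		if err != nil { // io.EOF (or any other error) ends the loop
			return lines
		}
	}
}

func main() {
	msg := "*3\r\n$3\r\nSET\r\n$4\r\na\r\nb\r\n$7\r\nmyvalue\r\n"
	for i, line := range splitByNewline(msg) {
		fmt.Printf("%d: %q\n", i+1, line)
	}
}
```

The message has seven logical lines, but this loop reports eight, because the key "a\r\nb" is cut in the middle.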
So after reading the fourth line, $4, we should not keep using ReadBytes('\n') to read the next line. Instead, we should use io.ReadFull(reader, msg) to read content of the specified length.
msg = make([]byte, 4 + 2) // body length 4 + line break length 2
_, err = io.ReadFull(reader, msg)
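Putting the two reads together, a self-contained sketch of reading one Bulk String could look like this. The names readBulk and readAllBulks are hypothetical, and nil Bulk Strings ($-1) are deliberately not handled here; this is not the godis implementation.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// readBulk reads one Bulk String: a "$<len>\r\n" header, then exactly len bytes plus CRLF.
// Sketch only: nil Bulk Strings ($-1) are rejected as a protocol error.
func readBulk(reader *bufio.Reader) ([]byte, error) {
	header, err := reader.ReadBytes('\n') // the header never contains CRLF, so this is safe
	if err != nil {
		return nil, err
	}
	if len(header) < 4 || header[0] != '$' || header[len(header)-2] != '\r' {
		return nil, errors.New("protocol error: " + string(header))
	}
	n, err := strconv.ParseInt(string(header[1:len(header)-2]), 10, 64)
	if err != nil || n < 0 {
		return nil, errors.New("protocol error: " + string(header))
	}
	body := make([]byte, n+2) // body length + CRLF
	if _, err = io.ReadFull(reader, body); err != nil {
		return nil, err
	}
	if body[n] != '\r' || body[n+1] != '\n' {
		return nil, errors.New("protocol error: missing CRLF after body")
	}
	return body[:n], nil
}

// readAllBulks is a small driver for the demo: it reads Bulk Strings until EOF.
func readAllBulks(input string) ([]string, error) {
	reader := bufio.NewReader(strings.NewReader(input))
	var out []string
	for {
		body, err := readBulk(reader)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, string(body))
	}
}

func main() {
	bulks, err := readAllBulks("$4\r\na\r\nb\r\n$7\r\nmyvalue\r\n")
	fmt.Printf("%q %v\n", bulks, err)
}
```

Reading by length rather than by delimiter is what keeps the CRLF inside "a\r\nb" intact.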
First, let's define the parser interface:
// Uses bytes, errors and io from the standard library;
// redis.Reply is the reply interface defined in the project.

// Payload stores a redis.Reply or an error
type Payload struct {
	Data redis.Reply
	Err  error
}

// ParseStream reads data from an io.Reader and returns the results to the caller through a channel.
// The streaming interface is suitable for both client and server use.
func ParseStream(reader io.Reader) <-chan *Payload {
	ch := make(chan *Payload)
	go parse0(reader, ch)
	return ch
}

// ParseOne parses a []byte and returns a single redis.Reply
func ParseOne(data []byte) (redis.Reply, error) {
	ch := make(chan *Payload)
	reader := bytes.NewReader(data)
	go parse0(reader, ch)
	payload := <-ch // parse0 will close the channel
	if payload == nil {
		return nil, errors.New("no reply")
	}
	return payload.Data, payload.Err
}
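The channel-based shape of ParseStream can be tried out in isolation. The sketch below is a simplified stand-in that emits raw lines instead of parsed replies; linePayload, streamLines and collect are hypothetical names, and the real parse0 does considerably more work.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// linePayload is a simplified stand-in for Payload: a raw line or an error.
type linePayload struct {
	Line string
	Err  error
}

// streamLines mirrors the shape of ParseStream: reading runs in its own goroutine,
// and results are delivered through a channel that is closed when the input ends.
func streamLines(reader io.Reader) <-chan *linePayload {
	ch := make(chan *linePayload)
	go func() {
		defer close(ch)
		br := bufio.NewReader(reader)
		for {
			line, err := br.ReadString('\n')
			if err != nil {
				if err != io.EOF {
					ch <- &linePayload{Err: err}
				}
				return
			}
			ch <- &linePayload{Line: strings.TrimSuffix(line, "\r\n")}
		}
	}()
	return ch
}

// collect drains the channel, as a connection handler's read loop would.
func collect(input string) []string {
	var lines []string
	for p := range streamLines(strings.NewReader(input)) {
		if p.Err != nil {
			break
		}
		lines = append(lines, p.Line)
	}
	return lines
}

func main() {
	fmt.Println(collect("+OK\r\n:1\r\n"))
}
```

The caller only ranges over the channel, so the same consumer code works whether the reader is a network connection or an in-memory buffer.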
Next, let's look at pseudo-code for the core flow of the parser. See the repository for the full code:
func parse0(reader io.Reader, ch chan<- *Payload) {
	// Initialize the read state
	readingMultiLine := false
	expectedArgsCount := 0
	var args [][]byte
	var bulkLen int64
	for {
		// As mentioned above, RESP is line-based.
		// A line is either a simple line or a binary-safe Bulk String body,
		// so we wrap the reading logic in a readLine function that handles both.
		line, err := readLine(reader, bulkLen)
		if err != nil {
			// Handle the error
			return
		}
		// Next, parse the line we just read.
		// We simply divide replies into two categories:
		// Single line: StatusReply, IntReply, ErrorReply
		// Multi-line: BulkReply, MultiBulkReply
		if !readingMultiLine {
			if isMultiBulkHeader(line) {
				// We received the first line of a MultiBulkReply.
				// Get the number of Bulk Strings in the MultiBulkReply
				expectedArgsCount = parseMultiBulkHeader(line)
				// Wait for the following lines of the MultiBulkReply
				readingMultiLine = true
			} else if isBulkHeader(line) {
				// We received the first line of a BulkReply.
				// Get the length of its second line, and pass it to readLine
				// through bulkLen so it knows how long the next line is
				bulkLen = parseBulkHeader(line)
				// This reply contains exactly one Bulk String
				expectedArgsCount = 1
				// Wait for the following line of the BulkReply
				readingMultiLine = true
			} else {
				// Handle single-line replies such as StatusReply, IntReply, ErrorReply
				reply := parseSingleLineReply(line)
				// Return the result via ch
				emitReply(ch, reply)
			}
		} else {
			// Entering this branch means we are waiting for a follow-up line
			// of a MultiBulkReply or BulkReply.
			// A follow-up line is either a Bulk String header or a Bulk String body
			if isBulkHeader(line) {
				bulkLen = parseBulkHeader(line)
			} else {
				// We are reading a Bulk String body, which may belong to
				// a MultiBulkReply or a BulkReply
				args = append(args, line)
				// The next line is a header again, so read it as a simple line
				bulkLen = 0
			}
			if len(args) == expectedArgsCount {
				// We have read all follow-up lines.
				// Return the result via ch
				emitReply(ch, args)
				// Reset the state and prepare to parse the next reply
				readingMultiLine = false
				expectedArgsCount = 0
				args = nil
				bulkLen = 0
			}
		}
	}
}
Here is the implementation of the helper functions:
// Uses bufio, errors, io, strconv and strings from the standard library;
// redis.Reply and the reply.Make* constructors come from the project.

// readState holds the parser state shared by the helpers below.
type readState struct {
	readingMultiLine  bool
	expectedArgsCount int
	msgType           byte
	args              [][]byte
	bulkLen           int64
}

// readLine returns the next line, a flag telling whether the error is an I/O error, and the error.
func readLine(bufReader *bufio.Reader, state *readState) ([]byte, bool, error) {
	var msg []byte
	var err error
	if state.bulkLen == 0 { // read simple line
		msg, err = bufReader.ReadBytes('\n')
		if err != nil {
			return nil, true, err
		}
		// a valid line holds at least "\r\n"; checking len < 2 avoids an out-of-range index
		if len(msg) < 2 || msg[len(msg)-2] != '\r' {
			return nil, false, errors.New("protocol error: " + string(msg))
		}
	} else { // read bulk line (binary safe)
		msg = make([]byte, state.bulkLen+2)
		_, err = io.ReadFull(bufReader, msg)
		if err != nil {
			return nil, true, err
		}
		if len(msg) < 2 || msg[len(msg)-2] != '\r' || msg[len(msg)-1] != '\n' {
			return nil, false, errors.New("protocol error: " + string(msg))
		}
		state.bulkLen = 0
	}
	return msg, false, nil
}

func parseMultiBulkHeader(msg []byte, state *readState) error {
	expectedLine, err := strconv.ParseUint(string(msg[1:len(msg)-2]), 10, 32)
	if err != nil {
		return errors.New("protocol error: " + string(msg))
	}
	if expectedLine == 0 {
		state.expectedArgsCount = 0
		return nil
	}
	// first line of multi bulk reply
	state.msgType = msg[0]
	state.readingMultiLine = true
	state.expectedArgsCount = int(expectedLine)
	state.args = make([][]byte, 0, expectedLine)
	return nil
}

func parseBulkHeader(msg []byte, state *readState) error {
	var err error
	state.bulkLen, err = strconv.ParseInt(string(msg[1:len(msg)-2]), 10, 64)
	if err != nil {
		return errors.New("protocol error: " + string(msg))
	}
	if state.bulkLen == -1 { // null bulk
		return nil
	} else if state.bulkLen > 0 {
		state.msgType = msg[0]
		state.readingMultiLine = true
		state.expectedArgsCount = 1
		state.args = make([][]byte, 0, 1)
		return nil
	} else {
		return errors.New("protocol error: " + string(msg))
	}
}

func parseSingleLineReply(msg []byte) (redis.Reply, error) {
	str := strings.TrimSuffix(string(msg), "\n")
	str = strings.TrimSuffix(str, "\r")
	var result redis.Reply
	switch msg[0] {
	case '+': // status reply
		result = reply.MakeStatusReply(str[1:])
	case '-': // err reply
		result = reply.MakeErrReply(str[1:])
	case ':': // int reply
		val, err := strconv.ParseInt(str[1:], 10, 64)
		if err != nil {
			return nil, errors.New("protocol error: " + string(msg))
		}
		result = reply.MakeIntReply(val)
	default: // parse as text protocol
		strs := strings.Split(str, " ")
		args := make([][]byte, len(strs))
		for i, s := range strs {
			args[i] = []byte(s)
		}
		result = reply.MakeMultiBulkReply(args)
	}
	return result, nil
}