Updated on 2025-03-05

How RawMessage works in the Golang json library


As a general-purpose encoding format, JSON is more readable than protocols such as Thrift and Protobuf, and its encoded size is smaller than that of formats such as XML, so it is very widely used. In many services, JSON serialization and deserialization even account for the largest share of resource consumption on online instances, which is why many Gophers work hard on optimizing this process.

Today we will look at a classic capability provided by Go's official json library: RawMessage.

What is serialization

First, let's think about what serialization actually means.

Refer to the two interface definitions of Marshaler and Unmarshaler in the json package:

// Marshaler is the interface implemented by types that
// can marshal themselves into valid JSON.
type Marshaler interface {
    MarshalJSON() ([]byte, error)
}

Serialization, that is, Marshal, converts a value of some type into a byte slice, which is the []byte returned by the interface method here.

// Unmarshaler is the interface implemented by types
// that can unmarshal a JSON description of themselves.
// The input can be assumed to be a valid encoding of
// a JSON value. UnmarshalJSON must copy the JSON data
// if it wishes to retain the data after returning.
//
// By convention, to approximate the behavior of Unmarshal itself,
// Unmarshalers implement UnmarshalJSON([]byte("null")) as a no-op.
type Unmarshaler interface {
    UnmarshalJSON([]byte) error
}

Deserialization is the inverse of serialization: it receives a byte slice and converts it into a value of the target type.

In fact, if you implement these two interfaces for a custom type, your implementation is what runs when the json package's Marshal and Unmarshal functions are called on a value of that type.

In short, serialization is essentially the process of converting an object into a byte slice, that is, []byte.

RawMessage

RawMessage is a raw encoded JSON value. It implements Marshaler and Unmarshaler and can be used to delay JSON decoding or precompute a JSON encoding.

RawMessage is a concrete type defined in the json library. It implements both the Marshaler and Unmarshaler interfaces, so it supports serialization and deserialization. The description quoted above comes from the official documentation. Let's look at the implementation in the source code:

// RawMessage is a raw encoded JSON value.
// It implements Marshaler and Unmarshaler and can
// be used to delay JSON decoding or precompute a JSON encoding.
type RawMessage []byte
// MarshalJSON returns m as the JSON encoding of m.
func (m RawMessage) MarshalJSON() ([]byte, error) {
    if m == nil {
        return []byte("null"), nil
    }
    return m, nil
}
// UnmarshalJSON sets *m to a copy of data.
func (m *RawMessage) UnmarshalJSON(data []byte) error {
    if m == nil {
        return errors.New("json.RawMessage: UnmarshalJSON on nil pointer")
    }
    *m = append((*m)[0:0], data...)
    return nil
}
var _ Marshaler = (*RawMessage)(nil)
var _ Unmarshaler = (*RawMessage)(nil)

Very direct: under the hood, RawMessage is just a []byte. When serializing, it returns itself (or the literal null if it is nil). When deserializing, it copies the input []byte into its own backing storage.

Interesting. As mentioned in the previous section, the output of serialization is already a []byte, so why define a separate RawMessage type? What is it for?

As its name suggests, RawMessage represents a final state. It is already a byte slice, so serializing it costs almost nothing: the encoder just takes its bytes. Deserializing into it is just as simple: it keeps the original bytes it was given.

That is what Raw means: it stays what it was. Just take it as it is.

Here we refer to Using Go’s classic explanation.

We can think of the raw message as a piece of information that we decide to ignore at the moment. The information is still there but we choose to keep it in its raw form — a byte array.

In other words, RawMessage holds a part of the message that we can ignore for now and parse later, when it is actually needed. Until then, it keeps its original form as a byte slice.

Usage scenarios

In software development, we often say that we should not over-design. Good code should have clear usage scenarios and solve a class of problems effectively, rather than being a castle in the air built on untested ideas and concepts.
So is RawMessage such a castle in the air? Not at all.
We can treat it as a placeholder. Imagine we define a common model for some business scenario, in which part of the data must map to different structures in different scenarios. How do we Marshal it into bytes, store it in a database, read it back, and restore the model?
We can declare this variable field as json.RawMessage and rely on its ability to hold any JSON value for both reading and writing.
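A rough sketch of such a common model is shown below. The Event, LoginDetail and OrderDetail types and the helper functions are assumptions made for illustration, and the "database" is simulated by a plain byte slice.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical common model: Kind is shared by every scenario,
// while Detail holds scenario-specific JSON that this layer never inspects.
type Event struct {
	Kind   string          `json:"kind"`
	Detail json.RawMessage `json:"detail"`
}

// LoginDetail and OrderDetail are two illustrative scenario-specific shapes.
type LoginDetail struct {
	User string `json:"user"`
}

type OrderDetail struct {
	OrderID int `json:"order_id"`
}

// newEvent encodes detail first, then wraps the resulting bytes in an Event.
func newEvent(kind string, detail any) (Event, error) {
	raw, err := json.Marshal(detail)
	if err != nil {
		return Event{}, err
	}
	return Event{Kind: kind, Detail: raw}, nil
}

// roundTrip simulates "store, then read back": the event is marshaled to
// bytes and unmarshaled again without knowing the concrete Detail type.
func roundTrip(e Event) (Event, []byte, error) {
	stored, err := json.Marshal(e)
	if err != nil {
		return Event{}, nil, err
	}
	var loaded Event
	err = json.Unmarshal(stored, &loaded)
	return loaded, stored, err
}

func main() {
	login, _ := newEvent("login", LoginDetail{User: "gopher"})
	order, _ := newEvent("order", OrderDetail{OrderID: 42})
	for _, e := range []Event{login, order} {
		loaded, stored, err := roundTrip(e)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(string(stored), "->", loaded.Kind, string(loaded.Detail))
	}
}
```

The storage layer only ever deals with Event; whoever reads an event back can decide later how to decode Detail based on Kind.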

Reusing precomputed JSON values

package main
import (
    "encoding/json"
    "fmt"
    "os"
)
func main() {
    h := json.RawMessage(`{"precomputed": true}`)
    c := struct {
        Header *json.RawMessage `json:"header"`
        Body   string           `json:"body"`
    }{Header: &h, Body: "Hello Gophers!"}
    b, err := json.MarshalIndent(&c, "", "\t")
    if err != nil {
        fmt.Println("error:", err)
    }
    os.Stdout.Write(b)
}

Here c is an anonymous struct defined on the spot: Body is a plain string, while Header can hold any JSON value.

Remember that RawMessage is essentially a []byte, so we can use the conversion

json.RawMessage(`{"precomputed": true}`)

to turn a string literal into a RawMessage. Marshaling the struct then produces the following output:

{
    "header": {
        "precomputed": true
    },
    "body": "Hello Gophers!"
}

Did you notice?

Apart from indentation, "precomputed": true appears exactly as we wrote it in the RawMessage. This is the first capability: using a precomputed JSON value during serialization.

Delayed parsing of JSON structures

package main
import (
    "encoding/json"
    "fmt"
    "log"
)
func main() {
    type Color struct {
        Space string
        Point json.RawMessage // delay parsing until we know the color space
    }
    type RGB struct {
        R uint8
        G uint8
        B uint8
    }
    type YCbCr struct {
        Y  uint8
        Cb int8
        Cr int8
    }
    var j = []byte(`[
    {"Space": "YCbCr", "Point": {"Y": 255, "Cb": 0, "Cr": -10}},
    {"Space": "RGB",   "Point": {"R": 98, "G": 218, "B": 255}}
]`)
    var colors []Color
    err := json.Unmarshal(j, &colors)
    if err != nil {
        log.Fatalln("error:", err)
    }
    for _, c := range colors {
        var dst any
        switch c.Space {
        case "RGB":
            dst = new(RGB)
        case "YCbCr":
            dst = new(YCbCr)
        }
        err := json.Unmarshal(c.Point, dst)
        if err != nil {
            log.Fatalln("error:", err)
        }
        fmt.Println(c.Space, dst)
    }
}

This example is quite typical. Point in Color can have one of two structures, RGB or YCbCr, and we want both to share the same underlying storage format, which is a very common requirement.
Therefore, a two-level deserialization strategy is used here:

At the first level, the common fields are parsed, and parsing of the parts that differ is delayed.

At the second level, a previously parsed field (usually one with type-like semantics) determines which structure to use, and Unmarshal is called again to obtain the final data.
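The two levels can also be wrapped in a small helper. The sketch below is one possible arrangement, not the approach of the official example: it replaces the switch with a lookup table and reports unknown type tags explicitly. The Message, Text and Image types and the factories map are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is the first-level shape: Type is parsed eagerly, Data is kept raw.
type Message struct {
	Type string          `json:"type"`
	Data json.RawMessage `json:"data"`
}

type Text struct {
	Content string `json:"content"`
}

type Image struct {
	URL   string `json:"url"`
	Width int    `json:"width"`
}

// factories maps a type tag to a constructor for its concrete structure.
var factories = map[string]func() any{
	"text":  func() any { return new(Text) },
	"image": func() any { return new(Image) },
}

// decode performs both levels and returns a pointer to the concrete value.
func decode(input []byte) (any, error) {
	var msg Message
	if err := json.Unmarshal(input, &msg); err != nil { // level 1: common fields
		return nil, err
	}
	newValue, ok := factories[msg.Type]
	if !ok {
		return nil, fmt.Errorf("unknown message type %q", msg.Type)
	}
	dst := newValue()
	if err := json.Unmarshal(msg.Data, dst); err != nil { // level 2: raw part
		return nil, err
	}
	return dst, nil
}

func main() {
	inputs := []string{
		`{"type":"text","data":{"content":"hi"}}`,
		`{"type":"image","data":{"url":"a.png","width":64}}`,
		`{"type":"video","data":{}}`,
	}
	for _, in := range inputs {
		v, err := decode([]byte(in))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%T %+v\n", v, v)
	}
}
```

Checking the tag before the second Unmarshal matters: with a plain switch and no default case, an unknown tag would leave the destination nil and the second call would fail with a less helpful error.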

The above example output is as follows:

YCbCr &{255 0 -10}
RGB &{98 218 255}

Summary

The RawMessage type provided by the json package simply exposes the underlying []byte and can be embedded in any struct. It works as a placeholder for a field whose structure is not fixed, lets you delay parsing, and is more efficient than using a string field for the same purpose. Its implementation is very simple, just a thin wrapper around a byte slice, so you can use it with confidence.
