1. Go language memory layout
Imagine you have a structure like this.
type MyData struct {
	aByte   byte
	aShort  int16
	anInt32 int32
	aSlice  []byte
}
So what exactly is this structure? Fundamentally, it describes how data is laid out in memory. What does that mean, and how does the compiler see it? Let's take a look. First, let's use reflection to inspect the fields of the structure.
2. Using reflection
Here is some code that uses reflection to find out the size of each field and its offset (where it sits in memory relative to the beginning of the structure). Reflection can tell us how the compiler views types (including structures).
// First ask Go to give us some information about the MyData type
typ := reflect.TypeOf(MyData{})
fmt.Printf("Struct is %d bytes long\n", typ.Size())
// We can run through the fields in the structure in order
n := typ.NumField()
for i := 0; i < n; i++ {
	field := typ.Field(i)
	fmt.Printf("%s at offset %v, size=%d, align=%d\n",
		field.Name, field.Offset, field.Type.Size(),
		field.Type.Align())
}
In addition to the offset and size of each field, I also printed the alignment of each field, which I will explain later. The results are as follows:
Struct is 32 bytes long
aByte at offset 0, size=1, align=1
aShort at offset 2, size=2, align=2
anInt32 at offset 4, size=4, align=4
aSlice at offset 8, size=24, align=8
aByte is the first field in our structure with an offset of 0. It uses 1 byte of memory.
aShort is the second field. It uses 2 bytes of memory. Strangely, its offset is 2. Why is this? The answer is alignment: the CPU accesses a 2-byte value more efficiently at an address that is a multiple of 2 (a "2-byte boundary"), a 4-byte value on a 4-byte boundary, and so on up to the natural integer size of the CPU, which is 8 bytes (64 bits) on modern CPUs.
On some older RISC CPUs, accessing a misaligned value causes a fault: on some UNIX systems this is a SIGBUS, which stops your program (or the kernel). Some systems can trap these faults and fix them up: your code will run, but slowly, because the operating system runs extra code to complete the access. I believe Intel and ARM CPUs simply handle any misalignment on the chip; maybe we'll test this, and any performance impact, in a future post.
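The offsets reported by reflection can be reproduced by hand with a little arithmetic. The following is a minimal sketch, not the compiler's actual implementation: it assumes each field is placed at the next multiple of its alignment and that the total size is rounded up to the largest field alignment. The names alignUp, fieldSpec and layout are hypothetical helpers introduced only for this illustration, and the sizes and alignments are the ones printed above for a 64-bit platform.

```go
package main

import "fmt"

// alignUp rounds offset up to the next multiple of align.
// align must be a power of two.
func alignUp(offset, align uintptr) uintptr {
	return (offset + align - 1) &^ (align - 1)
}

// fieldSpec is a hypothetical description of one struct field.
type fieldSpec struct {
	name  string
	size  uintptr
	align uintptr
}

// layout places each field at the next suitably aligned offset and
// returns the offsets plus the padded total size.
func layout(fields []fieldSpec) (offsets []uintptr, total uintptr) {
	var off, maxAlign uintptr = 0, 1
	for _, f := range fields {
		off = alignUp(off, f.align)
		offsets = append(offsets, off)
		off += f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return offsets, alignUp(off, maxAlign)
}

func main() {
	// Sizes and alignments as reported by reflection on a 64-bit platform.
	fields := []fieldSpec{
		{"aByte", 1, 1},
		{"aShort", 2, 2},
		{"anInt32", 4, 4},
		{"aSlice", 24, 8},
	}
	offsets, total := layout(fields)
	for i, f := range fields {
		fmt.Printf("%s at offset %d\n", f.name, offsets[i])
	}
	fmt.Printf("Struct is %d bytes long\n", total)
}
```

With these inputs the sketch arrives at the same offsets (0, 2, 4, 8) and the same 32-byte total as the reflection output.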
Anyway, alignment is why the Go compiler skips a byte before aShort, so that the field lies on a 2-byte boundary. Because of this, we can squeeze another field into the structure without making it take up more memory. Here is a new version of our structure, with a new field anotherByte immediately after aByte.
type MyData struct {
	aByte       byte
	anotherByte byte
	aShort      int16
	anInt32     int32
	aSlice      []byte
}
We run the reflection code again and see that anotherByte fits into the free space between aByte and aShort. It sits at offset 1, and aShort is still at offset 2. Now is a good time to look at that mysterious align value I mentioned earlier. It tells us, and the Go compiler, how each field needs to be aligned.
Struct is 32 bytes long
aByte at offset 0, size=1, align=1
anotherByte at offset 1, size=1, align=1
aShort at offset 2, size=2, align=2
anInt32 at offset 4, size=4, align=4
aSlice at offset 8, size=24, align=8
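As a cross-check that does not go through reflection, the same numbers can be read with unsafe.Offsetof, unsafe.Sizeof and unsafe.Alignof. This is only a sketch: smallFieldOffsets is a hypothetical helper name, and the values printed for aSlice and for the whole struct depend on the platform (the 8 and 32 above assume a 64-bit target).

```go
package main

import (
	"fmt"
	"unsafe"
)

type MyData struct {
	aByte       byte
	anotherByte byte
	aShort      int16
	anInt32     int32
	aSlice      []byte
}

// smallFieldOffsets returns the offsets of the four fixed-size fields.
func smallFieldOffsets() [4]uintptr {
	var d MyData
	return [4]uintptr{
		unsafe.Offsetof(d.aByte),
		unsafe.Offsetof(d.anotherByte),
		unsafe.Offsetof(d.aShort),
		unsafe.Offsetof(d.anInt32),
	}
}

func main() {
	var d MyData
	fmt.Println("small field offsets:", smallFieldOffsets())
	// The next two lines depend on pointer size.
	fmt.Println("aSlice offset:", unsafe.Offsetof(d.aSlice))
	fmt.Println("struct size:", unsafe.Sizeof(d), "align:", unsafe.Alignof(d))
}
```

These three functions are evaluated at compile time, so they are a cheap way to assert layout assumptions in code that depends on them.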
3. Look at the memory
But what exactly does our structure look like in memory? Let's see if we can find out. First, let's build a MyData instance and populate it with some values. I chose values that should be easy to spot in memory.
data := MyData{
	aByte:   0x1,
	aShort:  0x0203,
	anInt32: 0x04050607,
	aSlice: []byte{
		0x08, 0x09, 0x0a,
	},
}
Now for some code that accesses the bytes that make up this structure. We want to take an instance of the structure, find its address in memory, and print out the bytes at that address.
We use the unsafe package to help us do this. It lets us bypass the Go type system and convert a pointer to our structure into a pointer to a 32-byte array, which is the memory that makes up our structure.
dataBytes := (*[32]byte)(unsafe.Pointer(&data))
fmt.Printf("Bytes are %#v\n", dataBytes)
We run the above code. Here is the result, with the first field of our structure, aByte, shown in bold. This is what you would expect: the single byte aByte = 0x01 at offset 0.
Bytes are &[32]uint8{**0x1**, 0x0, 0x3, 0x2, 0x7, 0x6, 0x5, 0x4, 0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
Next, let's take a look at aShort. It sits at offset 2 and is 2 bytes long. Recall that aShort = 0x0203, yet the bytes are displayed in reverse order. This is because most modern CPUs are little-endian: the least significant byte of the value appears first in memory.
Bytes are &[32]uint8{0x1, 0x0, **0x3, 0x2**, 0x7, 0x6, 0x5, 0x4, 0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
The same thing happens with anInt32 = 0x04050607. The least significant byte appears in memory first.
Bytes are &[32]uint8{0x1, 0x0, 0x3, 0x2, **0x7, 0x6, 0x5, 0x4**, 0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
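The byte order seen in the dump can be reproduced without touching unsafe at all. The sketch below uses the standard encoding/binary package to encode the same two values explicitly as little-endian; littleEndianBytes is a hypothetical helper name used only here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// littleEndianBytes encodes a 16-bit and a 32-bit value in
// little-endian order, least significant byte first.
func littleEndianBytes(short uint16, word uint32) ([]byte, []byte) {
	b16 := make([]byte, 2)
	b32 := make([]byte, 4)
	binary.LittleEndian.PutUint16(b16, short)
	binary.LittleEndian.PutUint32(b32, word)
	return b16, b32
}

func main() {
	b16, b32 := littleEndianBytes(0x0203, 0x04050607)
	fmt.Printf("%#v\n", b16) // bytes 0x3, 0x2
	fmt.Printf("%#v\n", b32) // bytes 0x7, 0x6, 0x5, 0x4
}
```

Because the byte order is chosen explicitly here, this gives the same result on any CPU, whereas the raw memory dump reflects whatever order the machine uses.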
4. A mysterious interlude
What do we see now? This is aSlice = []byte{0x08, 0x09, 0x0a}, 24 bytes at offset 8. I don't see my sequence 0x08, 0x09, 0x0a anywhere in it. What's going on?
Bytes are &[32]uint8{0x1, 0x0, 0x3, 0x2, 0x7, 0x6, 0x5, 0x4, **0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0**}
The answer is in Go's reflect package. A slice is represented in Go by the following structure. It starts with a pointer, Data, to the memory that holds the slice's contents; then comes the length of the useful data in that memory, Len; and finally the size of that memory, Cap.
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}
If we feed this type to our reflection code, we get the following offsets and sizes. The Data pointer and the two lengths are 8 bytes each, with 8-byte alignment.
Struct is 24 bytes long
Data at offset 0, size=8, align=8
Len at offset 8, size=8, align=8
Cap at offset 16, size=8, align=8
If we look at the memory again, we can see that Data holds the address 0x000000c42001055a. After it, we see that both Len and Cap are 3, which is the length of our data.
Bytes are &[32]uint8{0x1, 0x0, 0x3, 0x2, 0x7, 0x6, 0x5, 0x4, **0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0**, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
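To make that reading concrete, the 24 bytes at offset 8 of the dump can be decoded as three little-endian 64-bit words. This is a sketch under the assumption that the dump came from a 64-bit little-endian machine, as the output suggests; decodeSliceHeader is a hypothetical helper, and the address itself is only meaningful for the run that produced the dump.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeSliceHeader interprets 24 raw bytes as three little-endian
// 64-bit words: Data, Len and Cap, in that order.
func decodeSliceHeader(raw []byte) (data, length, capacity uint64) {
	data = binary.LittleEndian.Uint64(raw[0:8])
	length = binary.LittleEndian.Uint64(raw[8:16])
	capacity = binary.LittleEndian.Uint64(raw[16:24])
	return data, length, capacity
}

func main() {
	// Bytes 8..31 copied from the dump above.
	raw := []byte{
		0x5a, 0x05, 0x01, 0x20, 0xc4, 0x00, 0x00, 0x00, // Data
		0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // Len
		0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // Cap
	}
	data, length, capacity := decodeSliceHeader(raw)
	fmt.Printf("Data=0x%016x Len=%d Cap=%d\n", data, length, capacity)
	// Prints: Data=0x000000c42001055a Len=3 Cap=3
}
```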
We can directly access these data bytes with the following code. First let's access the slice header directly, and then print out the memory pointed to by the data.
dataslice := *(*reflect.SliceHeader)(unsafe.Pointer(&data.aSlice))
fmt.Printf("Slice data is %#v\n",
	(*[3]byte)(unsafe.Pointer(dataslice.Data)))
Here is the output:
Slice data is &[3]uint8{0x8, 0x9, 0xa}
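On Go 1.20 and later, reflect.SliceHeader is marked deprecated in favour of unsafe.SliceData and unsafe.Slice. The sketch below shows one possible equivalent, assuming such a toolchain. headerLenCap and viewData are hypothetical helper names, and headerLenCap relies on the three-word header layout described above, which is an implementation detail of the current compiler rather than a language guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerLenCap reinterprets the slice header as three machine words and
// returns the second and third (Len and Cap in the layout shown above).
// Assumption: the header is laid out as pointer, length, capacity.
func headerLenCap(s []byte) (uintptr, uintptr) {
	words := *(*[3]uintptr)(unsafe.Pointer(&s))
	return words[1], words[2]
}

// viewData rebuilds a slice over the same backing memory starting from
// the raw data pointer, without going through reflect.SliceHeader.
func viewData(s []byte) []byte {
	return unsafe.Slice(unsafe.SliceData(s), len(s))
}

func main() {
	s := []byte{0x08, 0x09, 0x0a}
	l, c := headerLenCap(s)
	fmt.Printf("len word=%d, cap word=%d\n", l, c)
	fmt.Printf("Slice data is %#v\n", viewData(s))
}
```

Keeping the pointer typed, rather than storing it in a uintptr field, means the garbage collector continues to see the backing array as referenced.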
Summary
That is all for this look at the memory layout of Go. I hope this article is helpful to everyone learning or using Go. If you have any questions, feel free to leave a message.