Detailed explanation of the precautions when using maps in Golang

1. Define value as struct to save memory

1. Eliminate pointer references

When the value of a map is a struct type, the data is stored directly in the map instead of being referenced through a pointer. This reduces the overhead of memory allocation and the burden on the GC (garbage collector).

type User struct {
    ID   int
    Name string
}

m := make(map[string]User)
m["user1"] = User{ID: 1, Name: "John"}

// Example with pointer to struct
m2 := make(map[string]*User)
m2["user1"] = &User{ID: 1, Name: "John"}

In the second example, the map stores a pointer to a User struct. In addition to the pointer itself, extra heap memory is needed for the User struct it points to, which increases the burden on the GC.

2. Avoid memory fragmentation

When storing pointers, the pointed-to values may sit at scattered locations in the heap, which can lead to memory fragmentation and less predictable memory usage. Storing structs directly keeps the data more compact and reduces fragmentation.

3. Higher cache hit rate

Because struct values are stored compactly inside the map, they are more likely to sit in adjacent memory locations than values reached through pointers. This tends to raise the CPU cache hit rate and so improve performance.

Example: Memory saving

Here is an example showing how to save memory by using a struct type, rather than a pointer, as the map value:

package main

import (
	"fmt"
	"runtime"
)

type User struct {
	ID   int
	Name string
}

func main() {
	// Use struct as value
	users := make(map[string]User)
	for i := 0; i < 1000000; i++ {
		users[fmt.Sprintf("user%d", i)] = User{ID: i, Name: fmt.Sprintf("Name%d", i)}
	}

	printMemUsage("With struct values")
	runtime.KeepAlive(users) // keep the map live until it has been measured

	// Use pointer as value
	userPtrs := make(map[string]*User)
	for i := 0; i < 1000000; i++ {
		userPtrs[fmt.Sprintf("user%d", i)] = &User{ID: i, Name: fmt.Sprintf("Name%d", i)}
	}

	printMemUsage("With pointer values")
	runtime.KeepAlive(userPtrs)
}

// printMemUsage runs a GC cycle and then prints the size of the live heap.
func printMemUsage(label string) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("%s: Alloc = %v MiB\n", label, bToMb(m.Alloc))
}

func bToMb(b uint64) uint64 {
	return b / 1024 / 1024
}

4. Set implementation comparison

map[int]bool{}

In this case, the value type of the map is bool. Each key carries a bool value, which usually takes up one byte.

set := make(map[int]bool)
set[1] = true
set[2] = true

map[int]struct{}{}

In this case, the value type of map is an empty struct. An empty struct does not occupy any memory, so each key only occupies the memory of the key itself.

set := make(map[int]struct{})
set[1] = struct{}{}
set[2] = struct{}{}

Memory usage comparison

map[int]bool will use more memory than map[int]struct{}, because each bool value needs one byte of storage (real applications may see additional alignment and bookkeeping overhead), while struct{} is empty and adds no per-entry value storage.
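
As a quick check of the per-value sizes mentioned above, the following sketch reports the size of a bool and of an empty struct with unsafe.Sizeof, and shows the usual membership test on a struct{}-valued set. The helper names valueSizes and contains are made up for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// valueSizes returns the size in bytes of a bool value and of an empty struct value.
func valueSizes() (boolSize, emptySize uintptr) {
	var b bool
	var e struct{}
	return unsafe.Sizeof(b), unsafe.Sizeof(e)
}

// contains reports whether k is in the set (hypothetical helper for illustration).
func contains(set map[int]struct{}, k int) bool {
	_, ok := set[k]
	return ok
}

func main() {
	boolSize, emptySize := valueSizes()
	fmt.Println("bool:", boolSize, "struct{}:", emptySize)

	set := map[int]struct{}{1: {}, 2: {}}
	fmt.Println(contains(set, 1), contains(set, 3))
}
```

The comma-ok form in contains is what makes struct{} practical as a set value: the presence of the key carries all the information, so the value never needs to be read.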

Sample code to compare memory usage

Here is a sample code that compares the memory usage of these two map types:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Use bool as value
	boolMap := make(map[int]bool)
	for i := 0; i < 1000000; i++ {
		boolMap[i] = true
	}

	printMemUsage("With bool values")
	runtime.KeepAlive(boolMap) // keep the map live until it has been measured

	// Use struct as value
	structMap := make(map[int]struct{})
	for i := 0; i < 1000000; i++ {
		structMap[i] = struct{}{}
	}

	printMemUsage("With struct values")
	runtime.KeepAlive(structMap)
}

// printMemUsage runs a GC cycle and then prints the size of the live heap.
func printMemUsage(label string) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("%s: Alloc = %v MiB\n", label, bToMb(m.Alloc))
}

func bToMb(b uint64) uint64 {
	return b / 1024 / 1024
}

Result

Run the struct-versus-pointer example above and you should find that the map with struct values uses noticeably less memory than the map with pointer values. This is because storing structs directly:

Reduces pointer storage overhead.
Reduces extra heap memory allocation.
Reduces the burden on the GC, because there is no separately allocated object per entry for the collector to track and reclaim.

2. The structure of hash buckets

1. Hash calculation

When we insert a key-value pair into a map, the key is hashed first. Go has a built-in hash function for computing the hash value of a key. The hash value is a 64-bit integer (on 64-bit platforms).

2. Bucketing basis

Maps in Go are stored in multiple buckets. The number of buckets is a power of 2 (2^B), which makes it easy to locate a specific bucket through bit operations. In the classic bucket-based implementation described here (the one used before Go 1.24 moved the built-in map to a Swiss-table design), different bits of the hash value are used for bucket selection and for in-bucket positioning:

Low B bits: used to determine the bucket position in the hash table.
Top 8 bits: stored as tophash and used for in-bucket search.
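
A minimal sketch of how a hash can be split for the classic bucket layout, assuming a 64-bit hash: the low B bits pick the bucket and the top 8 bits form the tophash byte. It uses the standard hash/maphash package as a stand-in for the runtime's internal hash function, and the helper names bucketIndex and topHash are invented for this illustration. The real runtime also reserves a few small tophash values as slot markers, which is omitted here.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// bucketIndex returns the bucket number selected by the low B bits of hash.
func bucketIndex(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

// topHash returns the top 8 bits of a 64-bit hash.
func topHash(hash uint64) uint8 {
	return uint8(hash >> 56)
}

func main() {
	seed := maphash.MakeSeed()
	hash := maphash.String(seed, "user1") // stand-in for the runtime's hash function
	const B = 3                           // 2^3 = 8 buckets
	fmt.Printf("hash=%#016x bucket=%d tophash=%#02x\n", hash, bucketIndex(hash, B), topHash(hash))
}
```

Because the seed is random, the printed values differ from run to run; only the relationship between the hash and the two derived values is fixed.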

3. Bucket structure

Each bucket can store 8 key-value pairs. When more than 8 elements land in the same bucket, Go uses an overflow bucket to store the additional key-value pairs. Conceptually, the structure of a bucket is as follows:

type bmap struct {
    tophash [bucketCnt]uint8
    keys    [bucketCnt]keyType
    values  [bucketCnt]valueType
    overflow *bmap
}

tophash: the top eight bits of the hash value of each stored key.

keys: the stored keys.

values: the corresponding values.

overflow: pointer to the overflow bucket.

4. Insert process

When inserting a key-value pair, the process is as follows:

Calculate the hash value: hash the key to get the hash value hash.
Locate the bucket: compute hash & (1<<B - 1) (B is the base-2 logarithm of the number of buckets) to get the bucket index index.
Search in the bucket: take the top 8 bits of hash and compare them with the entries of the tophash array. A matching entry whose key is also equal is the slot of an existing key; otherwise the first empty slot is used.
Store the key-value pair: store the key and value in that slot. If the current bucket is full, a new overflow bucket is allocated to hold the additional key-value pair.

5. Search process

The search process is similar to insertion:

Calculate the hash value: hash the key to get the hash value hash.
Locate the bucket: compute hash & (1<<B - 1) to get the bucket index index.
Search in the bucket: take the top 8 bits of hash, look for a matching entry in the tophash array of the corresponding bmap, and confirm the match against the keys array. If the key is not found in the current bucket, continue with its overflow buckets.
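
The insert and search steps can be sketched with a toy map. This is not the runtime's code: it is fixed to string keys and int values, uses hash/maphash in place of the internal hash function, tracks empty slots with a separate used array, and never grows. The names toyMap, bucket, Put, and Get are assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

const bucketCnt = 8 // slots per bucket

// bucket is a simplified stand-in for bmap.
type bucket struct {
	tophash  [bucketCnt]uint8
	used     [bucketCnt]bool // simplification: marks occupied slots
	keys     [bucketCnt]string
	values   [bucketCnt]int
	overflow *bucket
}

type toyMap struct {
	B       uint8 // log2 of the number of buckets
	seed    maphash.Seed
	buckets []bucket
}

func newToyMap(B uint8) *toyMap {
	return &toyMap{B: B, seed: maphash.MakeSeed(), buckets: make([]bucket, 1<<B)}
}

// locate hashes the key and returns its home bucket and tophash byte.
func (m *toyMap) locate(key string) (*bucket, uint8) {
	hash := maphash.String(m.seed, key)
	return &m.buckets[hash&(uint64(1)<<m.B-1)], uint8(hash >> 56)
}

// Get walks the bucket and its overflow chain looking for key.
func (m *toyMap) Get(key string) (int, bool) {
	b, top := m.locate(key)
	for ; b != nil; b = b.overflow {
		for i := 0; i < bucketCnt; i++ {
			// Cheap one-byte comparison first, full key comparison second.
			if b.used[i] && b.tophash[i] == top && b.keys[i] == key {
				return b.values[i], true
			}
		}
	}
	return 0, false
}

// Put overwrites an existing key or fills the first free slot in the chain.
func (m *toyMap) Put(key string, value int) {
	b, top := m.locate(key)
	var free *bucket
	freeIdx := -1
	for {
		for i := 0; i < bucketCnt; i++ {
			if !b.used[i] {
				if free == nil {
					free, freeIdx = b, i // remember the first free slot
				}
				continue
			}
			if b.tophash[i] == top && b.keys[i] == key {
				b.values[i] = value // key already present: overwrite
				return
			}
		}
		if b.overflow == nil {
			break
		}
		b = b.overflow
	}
	if free == nil { // the whole chain is full: allocate an overflow bucket
		b.overflow = &bucket{}
		free, freeIdx = b.overflow, 0
	}
	free.used[freeIdx], free.tophash[freeIdx] = true, top
	free.keys[freeIdx], free.values[freeIdx] = key, value
}

func main() {
	m := newToyMap(0) // a single bucket, so overflow buckets are needed after 8 keys
	for i := 0; i < 20; i++ {
		m.Put(fmt.Sprintf("user%d", i), i)
	}
	v, ok := m.Get("user15")
	fmt.Println(v, ok) // prints "15 true"
}
```

Put scans the whole chain before inserting so that a key stored in a later overflow bucket is overwritten rather than duplicated.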

3. Map expansion process

1. Expansion trigger conditions

Capacity expansion is usually triggered in two cases:

Load factor too high: the load factor is the ratio of the number of elements to the number of buckets in the map. The load factor threshold in Go is 6.5, and growth is triggered when the load factor exceeds this value.
Too many overflow buckets: when there are too many overflow buckets, growth is also triggered. In this case the runtime reorganizes the entries into a bucket array of the same size instead of doubling it.
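
The load-factor condition can be written as a small predicate. This is a simplified floating-point sketch of the check, not the runtime's exact code; the constant and function names are chosen for illustration.

```go
package main

import "fmt"

const (
	bucketCnt  = 8   // slots per bucket
	loadFactor = 6.5 // average entries per bucket above which the map grows
)

// overLoadFactor reports whether count entries spread over 2^B buckets
// exceed the load factor. A map that fits in one bucket is never overloaded.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && float64(count) > loadFactor*float64(uint64(1)<<B)
}

func main() {
	fmt.Println(overLoadFactor(52, 3)) // 52 / 8 = 6.5, not above the threshold: false
	fmt.Println(overLoadFactor(53, 3)) // 53 / 8 > 6.5: true
}
```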

2. Specific steps of the expansion process

Initialize a new bucket array: when growth is required, Go allocates a new bucket array, usually twice the size of the old one, and sets the relevant metadata to indicate that the map is growing.
Mark migration status: the internal structure of the map keeps a progress counter (a rehash index, called nevacuate in the runtime) indicating how far the migration has progressed. The initial value is 0.
Migrate some data: each time the map is written to (an insert or a delete) while it is growing, some data from the old buckets is migrated to the new buckets. Only a small number of buckets is migrated per operation, so the cost is spread over many operations.
Update the rehash index: after a bucket has been migrated, the rehash index is advanced so that the next operation continues with the next bucket.
Complete the expansion: when all data in the old buckets has been migrated to the new buckets, the map's metadata is updated to point only at the new bucket array, and the growing flag is cleared.
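
The steps above can be modelled with a toy structure that migrates one old bucket per write. This is a deliberately simplified sketch: buckets are plain Go maps, the key itself serves as its hash, and the type and method names (growingMap, startGrow, migrateOne) are invented for the example. Only the field name nevacuate is borrowed from the runtime.

```go
package main

import "fmt"

type growingMap struct {
	buckets    []map[int]int // current bucket array
	oldBuckets []map[int]int // non-nil only while growth is in progress
	nevacuate  int           // old buckets below this index are already migrated
}

func newBuckets(n int) []map[int]int {
	bs := make([]map[int]int, n)
	for i := range bs {
		bs[i] = map[int]int{}
	}
	return bs
}

// idx maps a key to a bucket index (the key doubles as its own hash here).
func idx(k, n int) int { return int(uint(k) % uint(n)) }

// startGrow allocates an array twice as large and marks the map as growing.
func (m *growingMap) startGrow() {
	m.oldBuckets = m.buckets
	m.buckets = newBuckets(2 * len(m.oldBuckets))
	m.nevacuate = 0
}

// migrateOne moves the next old bucket to the new array and advances the index.
func (m *growingMap) migrateOne() {
	if m.oldBuckets == nil {
		return
	}
	for k, v := range m.oldBuckets[m.nevacuate] {
		m.buckets[idx(k, len(m.buckets))][k] = v
	}
	m.oldBuckets[m.nevacuate] = nil
	m.nevacuate++
	if m.nevacuate == len(m.oldBuckets) {
		m.oldBuckets = nil // all old buckets migrated: growth is complete
	}
}

func (m *growingMap) Put(k, v int) {
	m.migrateOne() // every write does a little migration work
	if m.oldBuckets != nil {
		// Drop any stale copy still waiting in the old array (no-op on a nil bucket).
		delete(m.oldBuckets[idx(k, len(m.oldBuckets))], k)
	}
	m.buckets[idx(k, len(m.buckets))][k] = v
}

func (m *growingMap) Get(k int) (int, bool) {
	if m.oldBuckets != nil {
		// Not yet migrated entries are still served from the old array.
		if v, ok := m.oldBuckets[idx(k, len(m.oldBuckets))][k]; ok {
			return v, true
		}
	}
	v, ok := m.buckets[idx(k, len(m.buckets))][k]
	return v, ok
}

func main() {
	m := &growingMap{buckets: newBuckets(4)}
	for k := 0; k < 8; k++ {
		m.Put(k, k*10)
	}
	m.startGrow()
	for k := 100; k < 104; k++ {
		m.Put(k, k) // each write migrates one of the four old buckets
		fmt.Println("growing:", m.oldBuckets != nil, "migrated:", m.nevacuate)
	}
	fmt.Println(m.Get(5))
}
```

Lookups stay correct in the middle of a migration because Get consults the old array first and Put removes the stale old copy of any key it rewrites.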

4. Recover map's panic

In Go, panic and recover are the two mechanisms used to deal with exceptional conditions and error recovery. Understanding how they work is very important for writing robust Go code. This section explains the panic and recover mechanism and how it applies to map.

Working mechanism of panic and recover

panic:
- panic is used to raise a panic, usually when encountering a severe error that cannot be handled normally.
- When panic is called, the normal execution of the program is interrupted and the call stack begins to unwind, running the deferred functions of each function on the stack in turn, until a recover is encountered or the program crashes.
recover:
- recover is used to restore normal execution of the program and is usually called inside a deferred function.
- If recover is called in a deferred function while the goroutine is panicking, recover catches the panic, stops the unwinding of the stack, and returns the value that was passed to panic.
- If recover is called when the goroutine is not panicking, it returns nil and does nothing else.

Use panic and recover in map

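Some operations on a Go map trigger a panic, for example writing to a nil map. Such a panic can be caught and handled with recover to prevent the program from crashing. Unsynchronized concurrent reads and writes are a different case: the runtime reports them as a fatal error (for example "fatal error: concurrent map writes"), which recover cannot catch, so the example below only simulates a panic.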

package main

import (
    "fmt"
)

func main() {
    // Create a map
    m := make(map[string]string)

    // Run the panicking operation behind a recover
    safeCall(m)

    fmt.Println("This line will be executed because panic was recovered.")
}

// safeCall recovers from any panic raised by causePanic, so that main can continue.
func safeCall(m map[string]string) {
    defer func() {
        if r := recover(); r != nil {
            fmt.Println("Recovered from panic:", r)
        }
    }()
    causePanic(m)
}

func causePanic(m map[string]string) {
    // A real concurrent map access is a fatal error that cannot be recovered,
    // so this simulates a problem by calling panic directly.
    panic("simulated map access panic")
}
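
One map panic that recover can catch is a write to a nil map. The sketch below wraps such a write in a helper that turns the panic into an error; the name safeSet is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// safeSet writes to m and converts a runtime panic into an error value.
func safeSet(m map[string]int, key string, value int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	m[key] = value // panics if m is nil
	return nil
}

func main() {
	var nilMap map[string]int // declared but never initialized with make
	fmt.Println(safeSet(nilMap, "a", 1))

	ready := make(map[string]int)
	fmt.Println(safeSet(ready, "a", 1), ready["a"])
}
```

The named return value err is what allows the deferred function to report the recovered panic to the caller. Reading from a nil map does not panic, only writing does.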

5. How does a map detect that it is in a race state

In Go, concurrent access to a map means multiple goroutines reading and writing the same map at the same time without proper synchronization. Go's built-in map type aborts the program with a fatal error when it detects a concurrent read and write, so that a data race does not silently corrupt the map. This detection is implemented by the Go compiler and runtime rather than by features of the underlying hardware.

Race detection mechanism

Compiler instrumentation:
- At compile time, the Go compiler turns each read and write of a map into a call to a runtime map routine. These routines contain the detection code that checks at run time whether the map is being accessed concurrently.
Runtime check:
- The runtime detection code tracks the access state of each map: a write marks the map as "being written" for its duration, and other accesses check that mark. If a goroutine finds the mark set, meaning the map is being accessed concurrently without going through a synchronization mechanism (such as a mutex), a fatal error is raised. The check is best effort, so it is not guaranteed to catch every race.
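
A rough single-threaded model of the "being written" mark is sketched below. It is an illustration only: the type flaggedMap and its methods are invented, the flag value is arbitrary, and where this sketch returns an error the real runtime aborts the whole program.

```go
package main

import (
	"errors"
	"fmt"
)

const hashWriting = 4 // flag bit meaning "a write is in progress" (value chosen for illustration)

type flaggedMap struct {
	flags uint8
	data  map[int]int
}

// beginWrite marks the map as being written, or reports a conflicting write.
func (m *flaggedMap) beginWrite() error {
	if m.flags&hashWriting != 0 {
		return errors.New("concurrent map writes")
	}
	m.flags |= hashWriting
	return nil
}

// endWrite clears the mark once the write has finished.
func (m *flaggedMap) endWrite() { m.flags &^= hashWriting }

// Get refuses to read while a write is marked as in progress.
func (m *flaggedMap) Get(k int) (int, error) {
	if m.flags&hashWriting != 0 {
		return 0, errors.New("concurrent map read and map write")
	}
	return m.data[k], nil
}

func main() {
	m := &flaggedMap{data: map[int]int{1: 10}}
	fmt.Println(m.Get(1)) // no write in progress

	_ = m.beginWrite()    // pretend another goroutine is in the middle of a write
	fmt.Println(m.Get(1)) // the read sees the mark and reports the conflict
	m.endWrite()
}
```

Because the mark is only inspected when an access happens to overlap a write, a check of this kind can miss races, which is why the Go race detector (the -race flag) is the more thorough tool during testing.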

package main

import (
    "fmt"
    "sync"
)

func main() {
    m := make(map[int]int)
    var wg sync.WaitGroup
    var mu sync.Mutex

    // Start multiple goroutines that write to the map concurrently;
    // without lock protection this triggers a fatal error
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            // Uncomment the following line to see an unlocked concurrent write
            // m[i] = i

            // Use a mutex to protect the concurrent write
            mu.Lock()
            m[i] = i
            mu.Unlock()
        }(i)
    }

    wg.Wait()

    // Print the map content
    mu.Lock()
    for k, v := range m {
        fmt.Printf("key: %d, value: %d\n", k, v)
    }
    mu.Unlock()
}

6. The difference between sync.Map and map locking

- Use scenarios:
  - sync.Map is suited to concurrent scenarios with many reads and few writes, where it is simple and efficient.
  - Using sync.Mutex or sync.RWMutex to protect an ordinary map is suited to scenarios that need more complex concurrency control or have more write operations.
- Performance:
  - sync.Map performs well when reads far outnumber writes, but when writes are frequent it may not perform as well as an ordinary map protected by a mutex.
  - Using sync.Mutex or sync.RWMutex can give a better performance balance between reads and writes, especially when there are many writes.
- Complexity:
  - sync.Map encapsulates the concurrency control, is easy to use, and does not require manual locking.
  - Using sync.Mutex or sync.RWMutex requires manual locking and unlocking, so the code is somewhat more complex but also more flexible.
- Method support:
  - sync.Map provides some special methods (such as LoadOrStore and Range) that are convenient in specific scenarios.
  - An ordinary map protected by sync.Mutex or sync.RWMutex can freely define its own methods, which is more flexible but requires more code.
