introduction
Ordinary maps in Go are non-thread-safe. If you want to access a map thread-safely, there are two ways: map+mutex and the other is native. This article will introduce in detail how the underlying layer is implemented and some commonly used scenarios.
How to ensure thread safety?
The data structure is as follows:
type Map struct { mu Mutex // Lock to ensure thread safety of writing operations and dirty promotion to read read // readOnly read-only map dirty map[any]*entry // Dirty map, when there is data inside, it must contain the data in the read misses int // Read misses, when a certain number of times is reached, the protective gear in dirty will be promoted to read}
If we only look at this structure, we may have the following questions:
- I also used mutex. Isn’t the implementation method of map+mutex the same?
- What are misses used for?
- The type of read is a
And dirty is
map[any]*entry
, Why is it different?
I also used mutex. Isn’t the implementation method of map+mutex the same?
- In essence, it is implemented through map+mutex implementation
- By increasing the read map, the lock probability is reduced when performing a read operation and the read performance is increased.
What are misses used for?
- Misses is used to mark the number of misses in read
- When misses reaches a certain value, the promotion of dirty will be triggered (promoted to read)
The specific source code is as follows:
// When the key is not hit in the read while performing the Load operation, a miss record is performedfunc (m *Map) missLocked() { // Add 1 to count ++ // 2. Determine whether dirty meets the promotion conditions if < len() { // 2.1 returns directly if it is not satisfied return } // 3. Transfer the data in dirty to the m of the read, and the data in the old read is discarded (readOnly{m: }) // 4. Clear dirty = nil // 5. Reset the miss count = 0 }
From the code you can see:
- When the misses value is greater than or equal to the number of data in dirty, the promotion of dirty will be triggered
- When the dirty promotion is promoted, the read is reset to a newly generated readOnly, where m is the new dirty, and amended is the default value false, ensuring that the amended is automatically set to false every time the promotion is triggered.
- When the dirty promotion, no copy of the data was triggered
The type of read is a
And dirty ismap[any]*entry
, Why is it different?
type readOnly struct { m map[any]*entry // Data in read map amended bool // Mark whether there is a key in the dirty map that is not in the read. If so, this value is true} type entry struct { p // *interface{} A pointer to specific data}
- The underlying layer of read type is the stored readOnly type, and the readOnly type is just
map[any]*entry
Added aamended
mark - If amended is false, it means that there is no data in the dirty. At this time, a dirty operation can be avoided (locking will be added), thereby reducing meaningless locking.
- read is declared as
Type is to satisfy the data consistency when multiple Goroutines read read simultaneously in the case of lock-free
Which scenarios are suitable for?
It is more suitable for scenarios where more reads, writes less, and when maps need to be frequently written, the map+mutex solution can achieve better performance by controlling the lock force.
Traversal operations are not supported because the design of read and write separation causes some unfinished modification operations during the traversal process, resulting in uncertain traversal results.
Why is it suitable for scenarios where more reads, less writes?
The reading method isLoad
Method, the specific source code implementation is as follows:
func (m *Map) Load(key any) (value any, ok bool) { // 1. Force the data in read to readOnly read, _ := ().(readOnly) // 2. Query the key from the read to check whether the data exists e, ok := [key] // 3. If the read does not exist and the amended mark shows that there is a key that is not in the read in dirty, go to dirty to query if !ok && { // 3.1 starts operating dirty, lock is required to ensure thread safety () // 3.2 Check again from read to avoid the data in dirty before Lock execution triggering the promotion to read operation read, _ = ().(readOnly) e, ok = [key] // 3.3 Same as 3 if !ok && { // 3.4 Query from dirty e, ok = [key] // 3.5 Whether or not the data is queried from dirty, it is equivalent to missing from read. The miss count needs to be updated (updating the count may trigger the promotion of dirty data) () } // 3.5 Operation is completed and unlocked () } // 4. The detection result, ok is false means that no data was found // ok is true in two situations: 1. Query the data from read, and the read hits; 2. Query the data from dirty // ok is false is divided into two situations: // No hit, but false // No hit is true, but it does not exist in dirty either if !ok { return nil, false } // Return the query data (this value does not necessarily exist, you need to determine whether it is a normal value or a nil based on p) return () } // When you get a pointer to the value of a key from read or dirty, you need to load the value of the corresponding pointerfunc (e *entry) load() (value any, ok bool) { // 1. Get the value of the corresponding address in the yard operation p := (&) // 2. If the value already does not exist or is marked as deleted, return nil, false if p == nil || p == expunged { return nil, false } // 3. Return the specific value, true return *(*any)(p), true }
Each time data is read, read from the read first, and the reading of data in the read does not require locking; when the read missed in the read and the amended mark shows that there is data not in the read, a dirty query is performed and locking is added. When there are more reads and fewer writes, most of the time the data is in the read, so locking can be avoided to improve the performance of concurrent reads.
The writing operation method isStore
method
func (m *Map) Store(key, value any) { // 1. Query from read whether the data already exists, and try to modify it if it exists read, _ := ().(readOnly) if e, ok := [key]; ok && (&value) { // 1.1 The data exists in read and the update is completed. Return directly return } //Not in it, prepare to operate dirty to add locks to ensure thread safety () read, _ = ().(readOnly) if e, ok := [key]; ok { // 3. If the data is queried in the read, check whether it has been marked as deleted. if () { // 3.1 If marked as deleted, you need to clear the mark and add it to dirty [key] = e } // 3.2 Update the entry's value (&value) } else if e, ok := [key]; ok { // 4 If it exists in dirty, the value of entry is directly updated (&value) } else { // 5 If the previous key does not exist, add the new key-value to dirty if ! { // 5.1 If the tag shows that there is no key in read before dirty, reset dirty and mark amended to true // 5.1.1 Re-add all entry in the read that is not marked as deleted to dirty () // 5.1.2 Update tags (readOnly{m: , amended: true}) } // 5.2 Add new key-value to dirty [key] = newEntry(value) } () } // Determine whether entry is marked as deleted, if it is modified to nilfunc (e *entry) unexpungeLocked() (wasExpunged bool) { return (&, expunged, nil) }
Write operations are divided into the following situations:
- Data in read
- Data is in read but has been marked as deleted
- Data in dirty
- A brand new data
When the data is in read, the Store will try to modify the data through atomic operations. If the atomic operation is successful, it is equivalent to the completion of data update;
The specific code is as follows:
func (e *entry) tryStore(i *any) bool { for { // 1. Get the specific value of entry p := (&) // 2. If the data has been deleted by mark, return false if p == expunged { return false } // 3. Update the current value and return true if (&, p, (i)) { return true } } }
When the data has been marked as deleted in the read, the entry needs to be added to the dirty and updated the value (here, the new entry is essentially added, just taking the address space of the previous entry)
When the data is in dirty, the pointer of the entry is directly updated through atomic operations;
When the data is a new data, a new entry will be created and added to the dirty, and if it is the first data in the dirty, it will be executeddirtyLocked
Method, add the current data (unmarked deleted) in the read to dirty.
The specific implementation of dirtyLocked is as follows:
func (m *Map) dirtyLocked() { //1 Reset operation can only be performed if dirty is nil before if != nil { return } // 2 Get the data in the read read, _ := ().(readOnly) // 3 Initialize dirty = make(map[any]*entry, len()) // 4 traversal read for k, e := range { // 4.1 Add objects that are not nil and are not marked as deleted to dirty if !() { [k] = e } } } // Determine whether the entry is marked as deleted. If the entry value is nil, mark it as deletedfunc (e *entry) tryExpungeLocked() (isExpunged bool) { // 1 Get the value of entry p := (&) for p == nil { // 2 If the value of the current entry is empty, try to mark this key as delete if (&, nil, expunged) { return true } p = (&) } // 3 Determine whether p is marked as deleted return p == expunged }
Through the above analysis, we can find that when writing operations are frequent, there are the following problems
- There is a large amount of data in dirty, and the read query will be likely to be unable to hit, which will cause the query to query the two maps of read and dirty and there are additional redundant operations, so the read performance is greatly reduced.
- Frequent unhooks lead to the promotion of dirty data. Although the promotion is only pointer switching and dirty clearing, the first write after each promotion will cause dirty to copy the read, greatly reducing performance.
- Each write operation requires additional checks and operations in order to be uncertain whether the data is on read or dirty or new data.
- In some cases, data duplication exists in dirty and read, and the memory usage will be higher.
In summary, when writing operations are frequent, the performance in all aspects is greatly reduced; and for some data that only rarely write operations (such as table data that is loaded once at the server startup), the performance of concurrent operations can be improved.
How to delete data
In the above dirtyLoacked method we see that after initializing dirty, the data in the read will be traversed and objects that are not nil and are not marked as deleted are added to the dirty. From this we can see that the data in the read will not be deleted immediately when deleted, just mark the object as nil or expunged.
The specific code is as follows: (The Delete method is essentially the execution of LoadAndDelete)
func (m *Map) LoadAndDelete(key any) (value any, loaded bool) { read, _ := ().(readOnly) // 1 Query data from read e, ok := [key] // 2 If the read does not exist, and the amended mark shows that there is a key that is not in the read in dirty, go to dirty to query if !ok && { // 3.1 starts operating dirty, lock is required to ensure thread safety () read, _ = ().(readOnly) // 3.2 Check again from read to avoid the data in dirty before Lock execution triggering the promotion to read operation e, ok = [key] if !ok && { // 3.3 Query from dirty e, ok = [key] // 3.4 Remove from dirty delete(, key) // 3.5 Whether or not the data is queried from dirty, it is equivalent to missing from read. The miss count needs to be updated (the dirty promotion will be triggered after the condition is met) () } () } // 4 Like load, here ok is true, which may be to read data in read or dirty. // Although it has been deleted in dirty, you need to clear the pointer p in the entry if ok { // 5 Tag Delete return () } return nil, false }
As shown in the above code5
As shown in the section, no matter where the entry exists, it is ultimately necessary to mark the entry as deleted. If there is a read, it will not be added to the dirty when the dirty is initialized, and the data in the read will be discarded when the dirty is promoted again. If there is a dirty, the data is directly cleared and the entry is marked to be deleted.
Range method
It does not support traversal, but provides a Range method, which is not the samerange
The same keyword traversal of map.
The specific functions of the Range method:
- Iterate through all elements in read and execute function f on each element in it
- If any element executes function f as an argument, the traversal will be interrupted immediately
Although Range will promote dirty data once in the initial stage of execution, it still cannot guarantee that there will be no new data during the execution process, so Range only traverses the data in the latest read, not all the data.
//Travelfunc (m *Map) Range(f func(key, value any) bool) { read, _ := ().(readOnly) // 1 If there is unpromoted data, then perform a dirty data promotion first. if { () read, _ = ().(readOnly) if { read = readOnly{m: } (read) = nil = 0 } () } // 2 traverse all entry in read and execute f functions separately for k, e := range { v, ok := () if !ok { continue } // 3 When an entry executes the f function and returns false, the traversal will be interrupted if !f(k, v) { break } } }
other
Q: When to clear the value that is markedly deleted
Answer: When data is stored in dirty for the first time, dirty will be triggered to copy the contents in the read. When copying again, only non-nil and unmarked deleted entry is copied. When dirty is promoted again, the data in the read is overwritten, realizing the deletion of the tagged deleted entry.
The above is the detailed explanation of the example of Golang's underlying implementation scenario. For more information about Golang's underlying implementation, please pay attention to my other related articles!