Today, let’s talk about how Go uses set. This article will cover two data structures: set and bitset.
Go's data structure
There are not many built-in data structures in Go. In work, the two most commonly used data structures are slice and map, namely slice and map. In fact, there are arrays in Go, and the bottom layer of the slice is an array, but because of the existence of slices, we rarely use them.
In addition to Go's built-in data structures, some data structures are provided by Go's official container packages, such as heap heap, list bidirectional linked lists and ring loopback linked lists. But today we won’t talk about them. For experienced people, these data structures will be used after reading the document.
What we will talk about today is set and bitset. As far as I know, some other languages, such as Java, have these two data structures. But Go is not available in any form yet.
Implementation ideas
Let’s first look at an article, and read it by accessing the address 2 basic set implementations[1]. The article introduces two ideas for implementing set in go, namely map and bitset.
If you are interested, you can read this article, let’s introduce it in detail next.
map
We know that the map's key is definitely unique, and this happens to be consistent with the characteristics of set, which naturally guarantees the uniqueness of members in set. Moreover, by implementing set through map, you can directly use the syntax of _, ok := m[key] when checking whether there is a certain element, which is highly efficient.
Let’s first look at a simple implementation, as follows:
set := make(map[string]bool) // New empty set set["Foo"] = true // Add for k := range set { // Loop (k) } delete(set, "Foo") // Delete size := len(set) // Size exists := set["Foo"] // Membership
It is easier to understand by creating map[string]bool to store a collection of strings. But there is another problem here. The value of map is of boolean type, which will cause set to occupy a certain amount of memory space, and set should not have this problem.
How to solve this problem?
Set value to an empty structure. In Go, the empty structure does not occupy any memory. Of course, if you are not sure, you can also prove this conclusion.
(struct{}{}) // The result is 0
The optimized code is as follows:
type void struct{} var member void set := make(map[string]void) // New empty set set["Foo"] = member // Add for k := range set { // Loop (k) } delete(set, "Foo") // Delete size := len(set) // Size _, exists := set["Foo"] // Membership
I saw someone on the Internet that encapsulated according to this idea and wrote an article [2], you can read it.
In fact, there is already a mature package on github called golang-set, which is also implemented using this idea. The access address golang-set[3], which says that Docker uses it. There are two set implementations in the package, thread-safe set and non-thread-safe set.
Demonstrate a simple case.
package main import ( "fmt" mapset "/deckarep/golang-set" ) func main() { // The default created thread-safe, if no thread-safe is required // It can be created using NewThreadUnsafeSet, and the usage methods are the same. s1 := (1, 2, 3, 4) ("s1 contains 3: ", (3)) ("s1 contains 5: ", (5)) // interface parameter, can pass any type ("poloxue") ("s1 contains poloxue: ", ("poloxue")) (3) ("s1 contains 3: ", (3)) s2 := (1, 3, 4, 5) // Converge ((s2)) }
The output is as follows:
s1 contains 3: true
s1 contains 5: false
s1 contains poloxue: true
s1 contains 3: false
Set{4, polxue, 1, 2, 3, 5}
The example demonstrates a simple way of using it. If you don’t understand it, look at the source code. The operation method names of these data structures are very common, such as Intersect, Difference, etc., you can understand it at first glance.
bitset
Let’s continue to talk about bitset. Each number in bitset can be represented by a bit. For an int8 number, we can use it to represent 8 numbers, which can help us greatly save data storage space.
The most common applications of bitset are bitmap and flag, namely bitmap and flag bits. Here, we first try to use it to represent the flag bits of some operations. For example, in a certain scenario, we need three flags to represent permissions 1, permissions 2 and permissions 3 respectively, and several permissions can coexist. We can use three constants F1, F2, and F3 to represent bit Mask respectively.
The sample code is as follows (cited from the article Bitmasks, bitsets and flags[4]):
type Bits uint8 const ( F0 Bits = 1 << iota F1 F2 ) func Set(b, flag Bits) Bits { return b | flag } func Clear(b, flag Bits) Bits { return b &^ flag } func Toggle(b, flag Bits) Bits { return b ^ flag } func Has(b, flag Bits) bool { return b&flag != 0 } func main() { var b Bits b = Set(b, F0) b = Toggle(b, F2) for i, flag := range []Bits{F0, F1, F2} { (i, Has(b, flag)) } }
In the example, we originally needed three numbers to represent these three flags, but now it can be done with a uint8. Some operations of bitset, such as setting Set, clear Clear, switching Toggle, and checking Has can be achieved through bit operations, and it is very efficient.
bitset has a natural advantage for collection operations and can be achieved directly through bit operators. For example, intersection, union, and difference sets, the examples are as follows:
- Intersection: a & b
- Unity: a | b
- Difference set: a & (~b)
The underlying languages, libraries, and frameworks often use this method to set flag bits.
In the above example, only a small amount of data is processed. Uint8 occupies 8 bits of space and can only represent 8 numbers. So can this idea be used in big data scenarios?
We can combine bitset and slices in Go and redefine the Bits type as follows:
type Bitset struct { data []int64 }
But this will also cause some problems. Setting up bit, how do we know where it is? If you think about it carefully, this position information contains two parts, namely, the number that holds the bit at the slice index position and which bit in the number, name them index and position respectively. Then how to get it?
The index can be obtained by dividing the entire index. For example, if we want to know which index of the bit representing 65 is in the slice, it can be obtained through 65 / 64. If it is for efficiency, it can also be implemented using bit operations, that is, substituting the division with shift, such as 65 >> 6, 6 represents the shift offset, that is, n of 2^n = 64.
position is the remainder of division, which we can obtain through modulus operations, such as 65 % 64 = 1. Also for efficiency, there are corresponding bit operations implementations, such as 65 & 0b00111111, that is, 65 & 63.
A simple example, as follows:
package main import ( "fmt" ) const ( shift = 6 mask = 0x3f // i.e. 0b00111111) type Bitset struct { data []int64 } func NewBitSet(n int) *Bitset { // Get location information index := n >> shift set := &Bitset{ data: make([]int64, index+1), } // Set bitset according to n [index] |= 1 << uint(n&mask) return set } func (set *Bitset) Contains(n int) bool { // Get location information index := n >> shift return [index]&(1<<uint(n&mask)) != 0 } func main() { set := NewBitSet(65) ("set contains 65", (65)) ("set contains 64", (64)) }
Output result
set contains 65 true
set contains 64 false
The above example functions are very simple. For demonstration, only two functions are created: bitset and contains. Others such as adding, deleting, intersecting, converging, and difference between different bitsets have not been implemented. Interested friends can continue to try.
In fact, someone has implemented the bitset package, github address bit[5]. You can read its source code, and the implementation idea is similar to the above introduction.
Here is a use case.
package main import ( "fmt" "/yourbasic/bit" ) func main() { s := (2, 3, 4, 65, 128) ("s contains 65", (65)) ("s contains 15", (15)) (15) ("s contains 15", (15)) ("next 20 is ", (20)) ("prev 20 is ", (20)) s2 := (10, 22, 30) s3 := (s2) ("next 20 is ", (20)) (func(n int) bool { (n) return false // Return true to terminate the traversal }) }
Execution results:
s contains 65 true
s contains 15 false
s contains 15 true
next 20 is 65
prev 20 is 15
next 20 is 22
2
3
4
10
15
22
30
65
128
The meaning of the code is easy to understand, which is some operations such as adding, deleting, retrieval and collection. It should be noted that the difference between bitset and the previous set is that the members of bitset can only be int integers, and there is no set flexibility. There are also fewer usage scenarios in normal times, mainly used in scenarios with high requirements for efficiency and storage space.
Summarize
This article introduces the implementation principles of two sets in Go, and on this basis, introduces the simple use of two packages corresponding to them. I think that through this article, the use of set in Go can basically be done.
In addition to these two packages, add two, zoumo/goset[6] and /willf/bitset[7].
References
[1]2 basic set implementations: /golang/implement-set/
[2] One article:https:///article/
[3]golang-set: /deckarep/golang-set
[4]Bitmasks, bitsets and flags: /golang/bitmask-flag-set-clear/
[5]bit: /yourbasic/bit
[6]zoumo/goset: /zoumo/goset
[7]/willf/bitset: /willf/bitset
The above is all the content of this article. I hope it will be helpful to everyone's study and I hope everyone will support me more.