SoFunction
Updated on 2025-03-05

Go language custom collection Set

1. Go language practical combat—Custom collection Set

It works in Go languageHash TableImplemented dictionary (Map) type, but there is no set in the standard data type (Set) This data type. CompareSet andMap The main characteristics of the following are similar characteristics:

All elements in them are non-repeatable.

They can only take out all the elements in iteratively.

The order in which the elements in them is iterated has nothing to do with the order in which the element is inserted, and there is no guarantee of any order.

However, there are some differences between them, as follows:

Set The element ofMap The element of , is a key-value pair.

Set The element of the non-repeatable means that no two single values ​​can be equal.MapThe element of the non-repeatable means that the values ​​of the keys in any two key-value pairs cannot be equal.

From the above features, we can see that the set type can be (Set) as dictionary type (MapA simplified version of ) That is, it can be usedMap Let's write oneSet Type implementation. In fact, in Java, Class is used Classes are supported as underlying. So here is fromHashSetStart and gradually abstract the collectionSet

1. Define HashSet

First, in the workspacesrc Directory code package basic/set(Can be defined by yourself, but it should be consistent later) Create a name calledhash_set.go source code file.

According to the code packagebasic/set It can be seen that the source code filehash_set.go The package declaration statement (for some rules, you can see the previous series of blog posts) as follows:

package set

It is mentioned above that the collection type can be used as a simplified version of the dictionary type. Now oursHashSet It uses dictionary type as its underlying implementation.HashSet The statement is as follows:

type HashSet struct {
  m map[interface{}]bool
}

As stated aboveHashSet The only field in the type ismap[interface{}]bool. Such a dictionary type is selected because the key type of dictionary m is set tointerface{},letHashSet The element of the um can be of any type, because here it needs to be stored using the key in the value of mHashSet Element value of type. The benefits of using the bool type as the element type of m are as follows:

From the perspective of the storage form of values,bool The type value only takes up one byte.

From the perspective of the representation of the value,bool There are only two values ​​of type -true and false. Moreover, both values ​​are predefined constants.

Bundlebool A type as a value type is more conducive to determining whether there is a certain key in the dictionary type value. For example: If true is always used as the value of the element in it when adding a key-value pair to the value of m, the resultant value of the index expression m["a"] can always reflect whether the value of m contains a key-value pair with a key-value pair with a key "a". formap[interface{}]boolFor values ​​of types, as follows:

if m["a"] {// Determine whether m contains key-value pairs with keys of "a"  //Omit other statements}

As aboveHashSet The basic structure of the type has been determined, now consider how to initialize itHashSet Type value. Since the zero value of the dictionary type isnil, and usenew Function to create aHashSet Type value, that isnew(HashSet).mThe evaluation result will be anil (aboutnew You can check another blog post by Go Language Learning Notes 5). Therefore, here we need to write a dedicated creation and initializationHashSet A function of type value, the function declared as follows:

func NewHashSet() *HashSet {
  return &HashSet{m: make(map[interface{}]bool)}
}

As can be seen above, usemakeThe function initializes the field m. At the same time, pay attention to the observation functionNewHashSet The type of result declaration is *HashSet InsteadHashSet, the purpose is to make this result value method set contains the call receiver typeHashSet or *HashSet all methods. The benefits of doing this will be written laterSet It will be explained when the interface type is.

2. Implement the basic functions of HashSet

According to other programming languagesHashSet As you can see, most of them should provide basic functions as follows:

Add element value.

Delete element values.

Clears all element values.

Determines whether an element value is included.

Gets the number of element values.

Determine whether the value is the same as other HashSet types.

Get all element values, i.e. generate an iterable snapshot.

Gets its own string representation.

Now these functions are implemented one by one, and readers can implement them by themselves. The following is for reference only.

(1).Add element values

//Method Add will return a result value of bool type to indicate whether the operation of adding element values ​​is successful.//The receiver type in the declaration of method Add is *HashSet.func (set *HashSet) Add(e interface{}) bool {
  if ![e] {//The current value of m has not yet contained a key-value pair with the value of e as the key.    [e] = true//Add key-value pairs with keys as e (represented by the value of m) and element as true    return true //Added successfully  }
  return false //Add failed}

Used here*HashSet InsteadHashSet, mainly from the perspective of saving memory space, the analysis is as follows:

whenAdd The receiver type of the method isHashSet When every call to it needs to be the current oneHashSet The type value is copied once. AlthoughHashSet There is only one field in the type that references the type, but this is also an overhead. And there is no consideration hereHashSet Fields in types may become more common.

whenAdd The receiver type of the method is*HashSet When it is called, the current copy is*HashSet The type value is just a pointer value. In most cases, the memory space occupied by a pointer value will always be small by the other type of value it points to. No matter how much memory space is required for the other type of value pointed to by a pointer value, the memory space it consumes is always unchanged.

(2). Delete element values

//Calling delete built-in function to delete the dictionary value supported by HashSet internallyfunc (set *HashSet) Remove(e interface{}) {
  delete(, e)//The first parameter is the target dictionary type, and the second parameter is the key of the key-value pair to be deleted}

(3). Clear all elements

//Reassign the field m in HashSetfunc (set *HashSet) Clear() {
   = make(map[interface{}]bool)
}

If the receiver type isHashSet, The assignment statement in this method only serves assigning a value to a field m in a copy of the current value, while the field m in the current value will not be reassigned. methodClear After the assignment statement in the currentHashSet Elements in type values ​​are equivalent to being emptied. The old dictionary value that has been unbound to field m becomes useless data because it no longer has a binding relationship with any program entity. It will be discovered and recycled by the Go garbage collector at some point later.

(4). Determine whether an element value is included.

//Method Contains is used to determine whether its value contains a certain element value.//The result of this judgment is due to the field m with element type boolfunc (set *HashSet) Contains(e interface{}) bool {
  return [e]
}

When put oneinterface{} When a type value is added as a key to a dictionary value, Go will get this firstinterface{} The actual type of the type value (i.e., dynamic type), and then use the corresponding typehash The function performs the valuehash Operation, that is,interface{} Type values ​​can always be calculated correctlyhash value. However, the keys of dictionary type cannot be a function type, dictionary type or slice type, otherwise it will cause a runtime panic and prompt as follows:
panic: runtime error: hash of unhashable type <The name of a function type, dictionary type, or slice type>

(5). Get the number of element values.

//Method Len is used to obtain the number of HashSet element valuesfunc (set *HashSet) Len() int {
  return len()
}

(6). Determine whether the values ​​are the same as other HashSet types.

//Method Same is used to determine whether the two HashSet types are the same.func (set *HashSet) Same(other *HashSet) bool {
  if other == nil {
    return false
  }
  if () != () {
    return false
  }
  for key := range  {
    if !(key) {
      return false
    }
  }
  return true
}

twoHashSet A necessary condition for the same type value is that the elements they contain should be exactly the same. becauseHashSet The iteration order of elements in type values ​​is always uncertain, so there is no need to worry about whether the two values ​​are consistent in this regard. If you want to judge twoHashSet Whether the type value is the same value needs to be compared with the memory address using pointer operation.

(7). Get all element values, that is, generate an iterable snapshot.

The so-called snapshot is the image of the target value at a certain moment. For aHashSet For type values, the iteration order of elements in its snapshot can always be determined, and the snapshot only reflects theHashSet The state of the type value at a certain moment. In addition, it is also necessary to select a type as a snapshot from the data types that are iterable and sequentially determined. This type must have a single value as the element, so the dictionary type should not be excluded first. BecauseHashSet The number of elements in a type value is always not fixed, so it cannot be represented by an array type value. As shown in the above analysis, the type of snapshot that can be used in Go language should be a slice type or channel type.

//Method Elements is used to generate snapshotsfunc (set *HashSet) Elements() []interface{} {
  initialLen := len()//Get the length of field m in HashSet, that is, the number of elements contained in m  //Initialize a variable snapshot of type []interface{} to store the element value in m's value  snapshot := make([]interface{}, initialLen)
  actualLen := 0
  //Set the iteration value to the specified element position of the snapshot value (the value of the variable snapshot) in the established order, and this process does not create any new values.  for key := range  {
    if actualLen &lt; initialLen {
      snapshot[actualLen] = key
    } else {//The number of elements in the value of m increases, so that the actual number of iterations is greater than the length of the previously initialized snapshot value      snapshot = append(snapshot, key)//Use the append function to append element value to the snapshot value.    }
    actualLen++//The actual number of iterations  }
  //For the slice value of type []interface{} that has been initialized, the values ​​at the location of the element that has not been displayed are nil.  //The number of elements in the value of m is reduced, so that the actual number of iterations is less than the length of the previously initialized snapshot value.  //There are several elements with no meaning at the end of the snapshot value,  //You can remove useless element values ​​from the snapshot value by snapshot = snapshot[:actualLen].  if actualLen &lt; initialLen {
    snapshot = snapshot[:actualLen]
  }
  return snapshot
}

Notice:existElements Some measures are taken in the method to concurrently access and modify the value of m. However, since the value of m itself is not concurrently safe, it cannot be guaranteedElements The execution of the method will always be accurate. To achieve true concurrency security, some auxiliary means are also needed, such as read and write mutex.

(8). Get your own string representation.

//The signature of this String method is considered an idiom.  //The print function in the code package fmt always uses the result value of the String method with such signature that comes with the parameter value as the string representation of the parameter value.func (set *HashSet) String() string {
  var buf //Buffer as result value  ("HashSet{")
  first := true
  for key := range  {
    if first {
      first = false
    } else {
      (",")
    }
    (("%v", key))
  }
  //n := 1
  //for key := range  {
  // (("%v", key))
  // if n == len() {//No commas are added after the last element  // break;
  // } else {
  // (",")
  // }
  // n++;
  //}
  ("}")
  return () 
}

As mentioned above, the implementation type of Set with common functions has been fully written, and more advanced functions will be explained later to improve it.

3. Advanced features

The true judgment function contained in the collection Set. According to the description in set algebra, if set A really contains set B, then it can be said that set A is a superset of set B.

// Determine whether the set is a superset of the set otherfunc (set *HashSet) IsSuperset(other *HashSet) bool {
  if other == nil {//If other is nil, other is not a subset of set    return false
  }
  setLen := ()//Get the number of element values ​​of set  otherLen := ()//Get the number of other element values  if setLen == 0 || setLen == otherLen {//The number of element values ​​of set is equal to 0 or equal to the number of elements other    return false
  }
  if setLen &gt; 0 &amp;&amp; otherLen == 0 {//Other is the number of elements 0, and the number of set elements is greater than 0, then set is also the superset of other    return true
  }
  for _, v := range () {
    if !(v) {//Return false as long as there is a data in the set containing other data      return false
    }
  }
  return true
}

The operation of the set includesunion, intersection, differenceandSymmetrical difference set

Unity operationIt means combining all elements in two sets and combining them into one set.

Intersection operationIt refers to finding common elements in two sets and forming them into a set.

Collection ArightCollection BconductDifferential set operationThe meaning of finding only exists inCollection Abut does not exist inCollection Band make them into a collection.

Symmetrical difference set operation is similar but different. Symmetrical difference set operation refers to finding only existing inCollection Abut does not exist inCollection BThe elements in the group are found, and then the elements that only exist in set B but not in set A are found, and finally merge them and form a set.

Implement union operation

// Generate union of set set and set otherfunc (set *HashSet) Union(other *HashSet) *HashSet {
  if set == nil || other == nil {// Both set and other are nil, then their union is nil    return nil
  }
  unionedSet := NewHashSet()// Create a new HashSet type value, its length is 0, that is, the number of elements is 0  for _, v := range () {//Add elements in set to unionedSet    (v)
  }
  if () == 0 {
    return unionedSet
  }
  for _, v := range () {//Add elements from other to unionedSet, if the same is encountered, it will not be added (expressed in the Add method logic)    (v)
  }
  return unionedSet
}

Implement intersection operation

// Generate the intersection of the set set and the otherfunc (set *HashSet) Intersect(other *HashSet) *HashSet {
  if set == nil || other == nil {// Both set and other are nil, then their intersection is nil    return nil
  }
  intersectedSet := NewHashSet()// Create a new HashSet type value, its length is 0, that is, the number of elements is 0  if () == 0 {//The number of other elements is 0, and directly return to intersectedSet    return intersectedSet
  }
  if () &lt; () {//The number of elements of set is less than the number of elements of other    for _, v := range () {//Travel the set      if (v) {//Just add the shared common to set and other to the intersectedSet        (v)
      }
    }
  } else {//The number of elements of set is greater than the number of elements of other    for _, v := range () {//Travel over other      if (v) {//Just add the shared common to set and other to the intersectedSet        (v)
      }
    }
  }
  return intersectedSet
}

Difference set

// Generate set set to set other setfunc (set *HashSet) Difference(other *HashSet) *HashSet {
  if set == nil || other == nil {// Both set and other are nil, then their difference is nil    return nil
  }
  differencedSet := NewHashSet()// Create a new HashSet type value, its length is 0, that is, the number of elements is 0  if () == 0 { // If the number of other elements is 0    for _, v := range () {//Transf the set and add the element v in the set to the differentSet      (v)
    }
    return differencedSet//Return differentSet directly  }
  for _, v := range () {//The number of other elements is not 0, traverse set    if !(v) {//If other does not contain v, add v to differentSet      (v)
    }
  }
  return differencedSet
}

Symmetrical difference set

// Generate symmetric difference sets of sets one and set otherfunc (set *HashSet) SymmetricDifference(other *HashSet) *HashSet {
  if set == nil || other == nil {// Both set and other are nil, then their symmetric difference set is nil    return nil
  }
  diffA := (other)//Generate set set for the difference set for the other set  if () == 0 {//If the number of other elements is equal to 0, then the difference set of other for set set is empty, then diffA is directly returned    return diffA
  }
  diffB := (set)//Generate the difference set of other sets for set set  return (diffB)//Return the union of the set diffA and the set diffB}

4. Further reconstruct
Currently implementedHashSet Types provide some necessary collection operation functions, but in different application scenarios, more functional collection types may be required. When there are multiple collection types, an interface type should be extracted on them to identify their common behavior. in accordance withHashSet For a type declaration, you can declare the Set interface type as follows:

type Set interface {
  Add(e interface{}) bool
  Remove(e interface{})
  Clear()
  Contains(e interface{}) bool
  Len() int
  Same(other Set) bool
  Elements() []interface{}
  String() string
}

Notice: Set In-houseSame The signature and attachment of the methodHashSetType ofSame The methods vary. Here, the interface type method's signature cannot contain its implementation type. Therefore, the changes here are as follows:

func (set *HashSet) Same(other Set) bool {
  //Omit several statements}

ModifiedSame The purpose of the method is to*HashSet Type becomes an implementation type of Set interface type.

Advanced functions should be suitable for all implementation types and can be completely extracted and independent functions. And, these advanced methods should not be implemented repeatedly in each implementation type. The following is the modifiedIsSuperset Method declaration:

// Determine whether a collection one is a superset of a collection other// Readers should focus on the difference between IsSuperset and IsSuperset methods attached to HashSet typefunc IsSuperset(one Set, other Set) bool {
  if one == nil || other == nil {
    return false
  }
  oneLen := ()
  otherLen := ()
  if oneLen == 0 || oneLen == otherLen {
    return false
  }
  if oneLen &gt; 0 &amp;&amp; otherLen == 0 {
    return true
  }
  for _, v := range () {
    if !(v) {
      return false
    }
  }
  return true
}

The above is the entire content of the custom collection Set of Go language. I hope it will be helpful for everyone to learn Go language.