Python's random module and weighted random algorithm python implementation method

random is used to generate random numbers, which we can use to generate numbers randomly or select strings.

•(x) Change the seed of the random number generator.

Generally, there is no need to set seed specifically, Python will automatically select seed.

•() Used to generate a random floating point number n,0 <= n < 1

•(a,b) Used to generate a random floating point number within a specified range, and the generated random integer a<=n<=b;

•(a,b) Used to generate an integer within a specified range, a is the lower limit and b is the upper limit, and the generated random integer a<=n<=b; if a=b, n=a; if a>b, an error is reported

•([start], stop [,step]) From the specified range [start,stop), get a random number from the set incremented by the specified cardinality. The default value of the cardinality is 1

•(sequence) Get a random element from the sequence. The parameter sequence represents an ordered type, not a specific type, generally refers to list, tuple, string, etc.

•(x[,random]) Used to disrupt (shuffle) elements in a list, which will change the original list

•(sequence,k) Randomly obtain k elements from the specified sequence as a fragment and returns it without changing the original sequence

Now that we have the basic knowledge, let’s implement a weighted random algorithm:

Weighted random algorithms are generally used in the following scenarios: there is a set S, such as A, B, C, and D. At this time, we want to randomly draw one from it, but the probability of drawing is different. For example, we hope that the probability of drawing A is 50%, the probability of drawing B and C is 20%, and the probability of D is 10%. Generally speaking, we can attach a weight to each item, and the probability of extraction is proportional to this weight. Then the above set becomes:

{A:5，B:2，C:2，D:1}

Method 1:

The easiest way is this:

Expand the sequence according to the weight value to: lists=[A,A,A,A,A,A,B,B,C,C,D], and then (lists) randomly select one. Although the time complexity of the selection in this way is O(1), once the data volume is large, the space consumption will be too high.

# coding:utf-8
import random


def weight_choice(list, weight):
  """
   :param list: Sequence to be selected
   :param weight: list weight sequence
   :return: The selected value
   """
  new_list = []
  for i, val in enumerate(list):
    new_list.extend(val * weight[i])
  return (new_list)


if __name__ == "__main__":
  print(weight_choice(['A', 'B', 'C', 'D'], [5, 2, 2, 1]))

Method 2:

The most commonly used method is:

Calculate the sum of the weights, and then randomly select a number R between 1 and sum, then traverse the entire set, count the sum of the weights of the traversed items. If it is greater than or equal to R, stop traversing and select the item you encountered.

Take the above set as an example. Sum equals 10. If it goes to 1-5 randomly, it will exit the traversal when it traversing the first number. Meets the selected probability.

When selecting, you need to traverse the set, and its time complexity is O(n).

# coding:utf-8
import random

list = ['A', 'B', 'C', 'D']


def weight_choice(weight):
  """
   :param weight: list weight sequence
   :return: The index of the selected value in the original list
   """
  t = (0, sum(weight) - 1)
  for i, val in enumerate(weight):
    t -= val
    if t &lt; 0:
      return i


if __name__ == "__main__":
  print(list[weight_choice([5, 2, 2, 1])])

Method 3:

The original sequence can be sorted by weight first. When traversing in this way, terms with high probability can be encountered quickly, reducing the traversed terms. (Because rnd decrements the fastest (subtract the maximum number first))

Compare {A:5, B:2, C:2, D:1} and {B:2, C:2, A:5, D:1}

The expectation of the traversal step number in the former is 5/10*1+2/10*2+2/10*3+1/10*4=19/10 while the latter is 2/10*1+2/10*2+5/10*3+1/10*4=25/10.

This improves the average selection speed, but the original sequence sorting also takes time.

First, create a prefix and sequence of weight values, and then after generating a random number t, you can use dichotomy to find it from this prefix and sequence. Then the selected time complexity is O(logn).

# coding:utf-8
import random
import bisect

list = ['A', 'B', 'C', 'D']


def weight_choice(weight):
  """
   :param weight: list weight sequence
   :return: The index of the selected value in the original list
   """
  weight_sum = []
  sum = 0
  for a in weight:
    sum += a
    weight_sum.append(sum)
  t = (0, sum - 1)
  return bisect.bisect_right(weight_sum, t)


if __name__ == "__main__":
  print(list[weight_choice([5, 2, 2, 1])])

The above python implementation method of python's random module and weighted random algorithm is all the content I have shared with you. I hope you can give you a reference and I hope you can support me more.