On weighted random numbers
To help with understanding, let's first look at a comparison of the three types of randomized problems:
1. There are n records, from which m records are selected, and the order of the selected records does not matter.
Idea: Iterate through all the records by rows, and take one data about n/m apart.
2. In the case of class 1, it is also required that the m records selected are randomly ordered
Idea for implementation: Given n records, respectively, add a column of markers, the value of a randomly selected between 1 and n non-repeating data.
3. In contrast to questions 1 and 2, if the records are weighted, how to combine the weights to randomly select them. For example, if the weight of A is 10, the weight of B is 5, and the weight of C is 1, then AABB should probably appear when 4 are randomly selected.
The third category of questions is the focus of this paper.
Ideas for implementation: A:10, B:5, C:1 three records on the random selection of four, for example, (whether or not to weight sorting this does not matter)
with regards to
A 10
B 5
C 1
First, assign the value of line n to line n plus line n-1, and execute recursively as follows:
A 10
B 15
C 16
Then one number at a time is randomly selected from [1,16], A is selected if it falls between [1,10], B is selected if it falls between (10,15], and C is selected if it falls between (16,16]. The graphs are as follows, and whoever occupies a large interval (with a high weight) has a higher probability of being selected.
Use in sweepstakes and game popping equipment
Randomization with weights is heavily used in game development, with all sorts of raffles and equipment explosions.
The operation configures the probability of each item appearing as needed.
The idea behind the weighted randomized algorithm we're talking about today is simple: "Take all the items and form them into intervals according to their weights, and the intervals with the highest weights are the largest. Think of it as a pie chart. Then, throw the dice and see which interval it lands on."
For example, there's a year-end raffle for an iphone/ipad/itouch.
The weights configured by the organizer are [('iphone', 10), ('ipad', 40), ('itouch', 50)].
The idea can be illustrated with one line of code, (['iphone']*10 + ['ipad']*40 + ['itouch']*50).
In the following, we write it as a generic function.
#coding=utf-8 import random def weighted_random(items): total = sum(w for _,w in items) n = (0, total)#Throwing the dice in the pie chart for x, w in items:# Iterate to find the interval where the dice are located # if n<w: break n -= w return x print weighted_random([('iphone', 10), ('ipad', 40), ('itouch', 50)])
The above code is intuitive enough, but careful attention will find that each time the total will be calculated, each time a linear traversal of the interval for the subtraction operation. In fact, we can first store, check the table on the line. Use accumulate + bisect bisect lookup.
The more items there are, the more significant the improvement in binary search performance.
#coding=utf-8 class WeightRandom: def __init__(self, items): weights = [w for _,w in items] = [x for x,_ in items] = sum(weights) = list((weights)) def accumulate(self, weights):#Accumulate. E.g. accumulate([10,40,50])->[10,50,100] cur = 0 for w in weights: cur = cur+w yield cur def __call__(self): return [bisect.bisect_right( , (0, ))] wr = WeightRandom([('iphone', 10), ('ipad', 40), ('itouch', 50)]) print wr()