Q: How to generate fixed-length random strings simply and quickly in Go language?
A:
The problem is "the fastest and easiest way". Next we will iterate step by step to finally achieve the fastest way. The benchmark code for each iteration is placed at the end of the answer.
All solutions and benchmark codes are available inGo PlaygroundFound on. The code on Playground is a test file, not an executable file. You need to save it to the file and name itXX_test.go
Then run
go test -bench . -benchmem
Preface
If you only need a random string, the fastest solution is not the preferred solution. Paul's solution (the first method below) is good. If performance is very much concerned, the first two approaches may be an acceptable compromise: they increase performance by 50% without significantly increasing complexity.
That being said, even if you don't need the fastest way to generate random strings, you should be rewarded by reading this answer.
Improvements
1. Genesis (Runes)
As a reminder, the following method is the original universal solution we used to improve:
func init() { (().UnixNano()) } var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ") func RandStringRunes(n int) string { b := make([]rune, n) for i := range b { b[i] = letterRunes[(len(letterRunes))] } return string(b) }
2. Bytes
If the random string to be generated contains only upper and lower case English letters, then we can only use English letter bytes, because the English letters and UTF8-encoded bytes correspond one by one (this is how Go stores strings)
So you can write this:
// Replace rune with bytevar letters = []byte("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
Or better way to write it:
const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
Writing it as a constant here is a big improvement (there are string constants butNo slice constant), Another point is expressionlen(letters)
will also be a constant (if s is a string constant, thenlen(s)
That's a constant)
So our second method is this:
const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" func RandStringBytes(n int) string { b := make([]byte, n) for i := range b { b[i] = letterBytes[(len(letterBytes))] } return string(b) }
3. Remainder
The previous methods are all called()
to generate a random number and select a random letter.()
From()
, and the latter comes fromRand.Int31n()
。
and generate a random number with 63 random bitsrand.Int63()
It is much slower than the above method of generating random numbers.
So we can call it directlyrand.Int63()
ThenlenletterBytes)
Take the remaining amount.
func RandStringBytesRmndr(n int) string { b := make([]byte, n) for i := range b { b[i] = letterBytes[rand.Int63() % int64(len(letterBytes))] } return string(b) }
This is OK and is much faster than the above method, but one disadvantage is that the probability of all letters appearing is not completely equal (assumingrand.Int63()
generates all 63 digits with equal probability). Because the length of the letter is 52 times1<<63 - 1
It is much smaller, so the probability of the occurrence of each letter is very small, and it is completely fine in actual use.
Explain the phenomenon that the probability of the above letters is unequal: Suppose you want to generate a random number between 0..5. If you use 3 random bits, it will cause the probability of generating the number in the range of 0..1 is twice that of 2..5; if you use 5 random bits, the probability of generating the number in the range of 0..1 is 6/32, and the probability bit in the range of 2..5 is 5/32, which is very close. Increasing the number of bits can make the probability difference smaller and smaller. When it reaches 63 bits, the difference can be ignored.
4. Masking
Based on the previous solution, we can maintain the even distribution of letters by using only the lowest bits of as many random numbers as possible. So we have 52 letters, so we need 6 digits to represent it:52=110100b
. So we only userand.Int63()
The lowest six digits of the returned number are expressed. To keep the letters evenly chalk we only accept falling on0...len(letterBytes)-1
Numbers in range. If the lowest number is greater than this range, then discard it and regenerate a new random number.
Note that the lowest position is greater than or equal tolen(letterBytes)
The probability of the 0.5 (average is 0.25), which means that repeating this situation reduces the probability that we find random numbers that can be used. After n repetitions, the probability that we still haven't found a random number that can be used is much smaller thanpow(0.5, n)
, of course this is a worst-case scenario. In the case of 52 letters, the probability that the lowest 6 bits cannot be used is (64 - 52) / 64 = 0.19, which means that the probability that no number that can be used has not been encountered in 10 repetitions is 1e-8.
So, this solution is as follows:
const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" const ( letterIdxBits = 6 // 6 bits to represent a letter index letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits ) func RandStringBytesMask(n int) string { b := make([]byte, n) for i := 0; i < n; { if idx := int(rand.Int63() & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i++ } } return string(b) }
5. Masking Improved
The previous solution only uses the lowest 6 of the 63 random bits returned by rand.Int63(). This is a waste, because getting random bits is the slowest part of our algorithm.
Because we have 52 letters, we can encode an alphabetical index with 6 digits. So 63 random bits can generate 63/6=10 different indexes:
const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" const ( letterIdxBits = 6 // 6 bits to represent a letter index letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits letterIdxMax = 63 / letterIdxBits // # of letter indices fitting in 63 bits ) func RandStringBytesMaskImpr(n int) string { b := make([]byte, n) // A rand.Int63() generates 63 random bits, enough for letterIdxMax letters! for i, cache, remain := n-1, rand.Int63(), letterIdxMax; i >= 0; { if remain == 0 { cache, remain = rand.Int63(), letterIdxMax } if idx := int(cache & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i-- } cache >>= letterIdxBits remain-- } return string(b) }
6. Source
The aboveMasking ImprovedThe method is already very good, and we can hardly make any more improvements. OK but not necessary.
Now let's find other things that can be improved: the source of random numbers.
crypto/rand
package, it provides aRead(b []byte)
Function, we can use it to get as many bytes as possible by calling it in one call. This doesn't help performance, because crypto/rand implements a crypto-secured pseudo-random number generator, so it's much slower.
So we still usemath/rand
Bag.UsedAs a source of random bits.
Is a designated
Int63() int64
Interface to the method: This is also the only thing we need and use in the latest solutions.
So we don't need it, can be used
:
var src = (().UnixNano()) func RandStringBytesMaskImprSrc(n int) string { b := make([]byte, n) // A src.Int63() generates 63 random bits, enough for letterIdxMax characters! for i, cache, remain := n-1, src.Int63(), letterIdxMax; i >= 0; { if remain == 0 { cache, remain = src.Int63(), letterIdxMax } if idx := int(cache & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i-- } cache >>= letterIdxBits remain-- } return string(b) }
It should also be noted that this method does not require initialization (seed)math/rand
The whole picture in the packageRand
, because it is not used.
Also: math/rand package documentation description
The default Source is safe for concurrent use by multiple goroutines.
So the default source is to compare()
The generated source is slow because the default source ensures that it is safe when used concurrently. and()
This feature is not provided (so it returns a Source more likely to be faster).
7. Utilizing
All the above methods return strings are created first (the first one is to use[]rune
, the ones behind it are[]byte
) and then convert tostring
. This conversion will copy the content of the slice once, because the string is immutable. If the conversion is not copied, it cannot be guaranteed that the content of the string will not be modified by the original cleavage. For details, please see here:How to convert utf8 string to []byte?and golang: []byte(string) vs []byte(*string)。
Go 1.10 introduced. is a new type, we can use it like
Create string content like this. Internally, it uses
[]byte
to build content, then we can use it()
The method obtains the final string value. But the difference is that it won't perform the copy operation we mentioned above. This is possible because the string byte slice it is used to build is not exposed, so it is guaranteed that no one will modify it unintentionally or maliciously.
So what we need to do next is to useInstead of slicing to create strings, we can finally get the desired result without the need for a copy operation. This may help in terms of speed and certainly help in terms of memory usage and allocation.
func RandStringBytesMaskImprSrcSB(n int) string { sb := {} (n) // A src.Int63() generates 63 random bits, enough for letterIdxMax characters! for i, cache, remain := n-1, src.Int63(), letterIdxMax; i >= 0; { if remain == 0 { cache, remain = src.Int63(), letterIdxMax } if idx := int(cache & letterIdxMask); idx < len(letterBytes) { (letterBytes[idx]) i-- } cache >>= letterIdxBits remain-- } return () }
8. "Mimicing" with package unsafe
Essentially, it is done the same as we did above, using slices to create strings, and the only reason we use it is to avoid copying the slices.
passunsafeAvoid copy operations:
// String returns the accumulated string. func (b *Builder) String() string { return *(*string)((&)) }
In fact, we can do this ourselves. Back to the past we used[]byte
to create a random string, but at the end it is not converted tostring
Then return, instead do an unsafe conversion: get a string pointing to our byte slice as string data.
func RandStringBytesMaskImprSrcUnsafe(n int) string { b := make([]byte, n) // A src.Int63() generates 63 random bits, enough for letterIdxMax characters! for i, cache, remain := n-1, src.Int63(), letterIdxMax; i >= 0; { if remain == 0 { cache, remain = src.Int63(), letterIdxMax } if idx := int(cache & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i-- } cache >>= letterIdxBits remain-- } return *(*string)((&b)) }
Benchmark
BenchmarkRunes-4 2000000 723 ns/op 96 B/op 2 allocs/op
BenchmarkBytes-4 3000000 550 ns/op 32 B/op 2 allocs/op
BenchmarkBytesRmndr-4 3000000 438 ns/op 32 B/op 2 allocs/op
BenchmarkBytesMask-4 3000000 534 ns/op 32 B/op 2 allocs/op
BenchmarkBytesMaskImpr-4 10000000 176 ns/op 32 B/op 2 allocs/op
BenchmarkBytesMaskImprSrc-4 10000000 139 ns/op 32 B/op 2 allocs/op
BenchmarkBytesMaskImprSrcSB-4 10000000 134 ns/op 16 B/op 1 allocs/op
BenchmarkBytesMaskImprSrcUnsafe-4 10000000 115 ns/op 16 B/op 1 allocs/op
The above is the detailed content of Go to quickly generate random strings of fixed lengths. For more information about Go to generate random strings of fixed lengths, please pay attention to my other related articles!