Rate limiters are important components in backend services and are widely used in real business scenarios. You will encounter their designs in microservices, gateways, and many backend services. A rate limiter caps the request rate, protecting the backing service from overload that would make it unavailable.
There are many ways to implement a rate limiter, such as the token bucket, the sliding window, and the leaky bucket.
Go's extended standard library provides an official rate limiter in the golang.org/x/time/rate package, designed and implemented around the token bucket algorithm (Token Bucket).
Token bucket algorithm
The token bucket design is fairly simple to picture. Think of a refrigerator that can hold only a fixed number of ice creams. Each request is a person coming to take one, and each person may take at most one at a time. What happens when the ice cream runs out? A worker restocks the refrigerator at a fixed pace, say 10 ice creams per second, and that restocking pace is what bounds the rate at which requests can be served.
Token bucket design concept:
- Token: a request may proceed only after it obtains a token;
- Bucket: the bucket has a fixed capacity and can hold at most that many tokens;
- Refill rate: tokens are added to the bucket at a fixed rate, and the number of tokens in the bucket never exceeds its capacity.
In other words, the token bucket algorithm limits the request rate so that the load on the service stays controllable. This matters especially for bursts of traffic in high-concurrency scenarios: by the time a burst reaches the backend service, it has already been throttled, so the backend can respond comfortably.
Specific design
Rate limiter definition
```go
type Limiter struct {
	mu        sync.Mutex // mutex (exclusive lock)
	limit     Limit      // rate at which tokens are put into the bucket (a float64, tokens per second)
	burst     int        // the size of the bucket
	tokens    float64    // number of tokens currently remaining
	last      time.Time  // the last time a token was taken
	lastEvent time.Time  // time of the latest rate-limit event
}
```
limit, burst and tokens are the core fields of this limiter; together they determine how much request concurrency is allowed.
When tokens are issued, the grant is recorded in a Reservation object:
```go
type Reservation struct {
	ok        bool      // whether the conditions were met and the tokens were assigned
	lim       *Limiter  // the limiter that issued the tokens
	tokens    int       // the number of tokens issued
	timeToAct time.Time // the time at which the token issuance is satisfied
	limit     Limit     // the token issuance rate at reservation time
}
```
Consuming tokens
Limiter provides three families of methods for consuming tokens. A caller can consume one token at a time or several at once, and each family behaves differently when tokens are insufficient.
Wait, WaitN
```go
func (lim *Limiter) Wait(ctx context.Context) (err error)
func (lim *Limiter) WaitN(ctx context.Context, n int) (err error)
```
Here, Wait is shorthand for WaitN(ctx, 1); the same shorthand pattern applies to the method families introduced below.
When consuming tokens with WaitN, if the tokens in the bucket are insufficient (fewer than n), the method blocks for a while until enough tokens are available (or the context is canceled). If tokens are sufficient, it returns immediately.
Allow, AllowN
```go
func (lim *Limiter) Allow() bool
func (lim *Limiter) AllowN(now time.Time, n int) bool
```
AllowN reports whether, as of time now, the bucket holds at least n tokens. If so, it consumes n tokens from the bucket and returns true. Otherwise it consumes nothing and returns false.
This usually suits online scenarios where, if requests arrive too fast, the excess requests are simply dropped rather than queued.
Reserve, ReserveN
So the official limiter offers blocking waits (Wait), immediate yes/no decisions (Allow), and reservations whose waiting the caller manages itself (Reserve); the core of all three is the reserveN method shown below.
```go
func (lim *Limiter) Reserve() *Reservation
func (lim *Limiter) ReserveN(now time.Time, n int) *Reservation
```
The call always returns a *Reservation object, whether or not enough tokens are currently available.
You can then call the reservation's Delay() method, which returns how long you need to wait. If the wait is 0, no waiting is needed; otherwise you must wait that long before proceeding with the next piece of work.
Or, if you decide not to wait, call the Cancel() method, which returns the tokens to the limiter.
```go
func (lim *Limiter) reserveN(now time.Time, n int, maxFutureReserve time.Duration) Reservation {
	lim.mu.Lock()

	// First check whether the rate is infinite.
	// If it is, there is effectively no rate limiting for now.
	if lim.limit == Inf {
		lim.mu.Unlock()
		return Reservation{
			ok:        true,
			lim:       lim,
			tokens:    n,
			timeToAct: now,
		}
	}

	// Advance the limiter's state to now: compute the number of
	// tokens now available and the time the last token was taken.
	now, last, tokens := lim.advance(now)

	// Deduct the requested number of tokens.
	tokens -= float64(n)

	// If tokens is negative, the bucket currently has no (or not
	// enough) tokens, so the caller must wait; compute that duration.
	var waitDuration time.Duration
	if tokens < 0 {
		waitDuration = lim.limit.durationFromTokens(-tokens)
	}

	// Decide whether the allocation conditions are met:
	// 1. the requested amount must not exceed the bucket size;
	// 2. the wait must not exceed the configured maximum wait.
	ok := n <= lim.burst && waitDuration <= maxFutureReserve

	// Prepare the reservation.
	r := Reservation{
		ok:    ok,
		lim:   lim,
		limit: lim.limit,
	}

	// If the allocation conditions are met:
	// 1. record the allocation size;
	// 2. time to act = current time + wait duration.
	if ok {
		r.tokens = n
		r.timeToAct = now.Add(waitDuration)
	}

	// Update the limiter's state and return.
	if ok {
		lim.last = now
		lim.tokens = tokens
		lim.lastEvent = r.timeToAct
	} else {
		lim.last = last
	}

	lim.mu.Unlock()
	return r
}
```
Using the limiter
The rate package exposes the limiter through NewLimiter, which takes only limit (the refill rate of the bucket) and burst (the size of the bucket).
```go
func NewLimiter(r Limit, b int) *Limiter {
	return &Limiter{
		limit: r, // rate at which tokens are put into the bucket
		burst: b, // the size of the bucket
	}
}
```
Here, a simple HTTP API is used to verify the power of time/rate:
```go
package main

import (
	"fmt"
	"net/http"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	r := rate.Every(1 * time.Millisecond)
	limit := rate.NewLimiter(r, 10)
	http.HandleFunc("/", func(writer http.ResponseWriter, request *http.Request) {
		if limit.Allow() {
			fmt.Printf("Request succeeded, current time: %s\n", time.Now().Format("2006-01-02 15:04:05"))
		} else {
			fmt.Printf("Request was rate limited...\n")
		}
	})
	_ = http.ListenAndServe(":8081", nil)
}
```
Here, I set the limiter to put one token into the bucket every millisecond, with a bucket capacity of 10, and an HTTP service simulates the backend API.
Next, run a quick stress test to see how it behaves:
```go
package main

import (
	"fmt"
	"net/http"
	"testing"
)

func GetApi() {
	api := "http://localhost:8081/"
	res, err := http.Get(api)
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()
	if res.StatusCode == http.StatusOK {
		fmt.Printf("get api success\n")
	}
}

func Benchmark_Main(b *testing.B) {
	for i := 0; i < b.N; i++ {
		GetApi()
	}
}
```
The effects are as follows:
```
......
Request succeeded, current time: 2020-08-24 14:26:52
Request was rate limited...
Request was rate limited...
Request was rate limited...
Request was rate limited...
Request was rate limited...
Request succeeded, current time: 2020-08-24 14:26:52
Request was rate limited...
Request was rate limited...
Request was rate limited...
Request was rate limited...
......
```
Here we can see that with the Allow method, a request can proceed only when a token has been produced and consumed; the remaining requests are simply dropped. In real business code, of course, the rejection can be reported back to the front end in a friendlier way.
The first several requests succeed because, when the service starts, the token bucket is initialized with tokens already in it. But as burst traffic keeps arriving while tokens are produced only at the configured rate, an obvious imbalance appears between token supply and demand.
That concludes this detailed look at the design and implementation of the Go rate limiter time/rate. For more on Go's time/rate limiter, please search my previous articles or continue browsing the related articles below. I hope you will keep supporting this blog!