Nginx rate limiting in practice & the principles of the leaky bucket / token bucket algorithms

Mengfei · 2022-06-23 15:34:55

1. Preface

Rate limiting is an important part of any backend service. Especially when handling a large number of concurrent requests, limiting traffic to a range the system can bear is what keeps the system running safely and efficiently. This article starts from nginx configuration: it first introduces several rate-limiting scenarios and the corresponding nginx configuration usage, then, combined with experimental verification, analyzes in detail the principle of the leaky bucket algorithm inside nginx.

2. Rate-limiting scenarios

2.1 Simple rate limiting

The simplest form of rate limiting enforces a fixed rate strictly: requests beyond it are rejected outright.

Configuration example:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=12r/m;
    limit_req_status 429;
    server {
        location / {
            limit_req zone=one;
        }
    }
}
  • limit_req_zone defines the shared memory zone (which stores request counts and other state) and the request rate
  • $binary_remote_addr means requests are limited by client IP address
  • rate=12r/m limits the rate to 12 requests per minute, i.e. one request is released every 5s
  • zone=one:10m sets the zone size to 10MB and names it "one"
  • each location under server can specify limit_req; the configuration above uses zone=one, i.e. the zone named "one", for its limiting state

Experimental verification:

10.99.16.41 - - [18/Feb/2022:17:04:45 +0800.1645175085.926] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:45 +0800.1645175085.926] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:45 +0800.1645175085.928] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.646] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.648] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.648] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.648] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.649] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.649] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:17:04:46 +0800.1645175086.653] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"

The two bracketed fields show the access time and the processing time, respectively. We sent 10 requests within 5s: only one succeeded, and all the others were rejected (with the 429 status code set by limit_req_status).

2.2 Peak shaving and valley filling

In real scenarios there may be bursts of traffic: the request rate exceeds the limit for a short time, yet over a larger window the total rate stays within the limit. In such cases, rejecting the excess burst traffic outright may not be the best strategy. A better one is to accept the burst requests but process them at the limited rate (smoothing the flow): the limited rate is never exceeded, and burst traffic is handled as far as possible. In nginx this is achieved with the burst parameter. For example, burst=5 means that when requests exceed the limited rate, up to 5 excess requests are allowed; these 5 requests block and wait, then pass one by one at the limited rate.

Configuration example:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=12r/m;
    server {
        location / {
            limit_req zone=one burst=10;
        }
    }
}

This configuration only adds burst=10 to the previous one: on top of the 12r/m rate (one request released every 5s), up to 10 requests are allowed to exceed the limited rate.

Experimental verification:

10.99.16.41 - - [18/Feb/2022:18:39:13 +0800.1645180753.117] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:13 +0800.1645180753.789] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:18 +0800.1645180758.124] [5.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:23 +0800.1645180763.119] [9.977s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:28 +0800.1645180768.139] [14.356s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:33 +0800.1645180773.144] [19.360s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:38 +0800.1645180778.131] [24.347s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:43 +0800.1645180783.148] [29.363s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:48 +0800.1645180788.135] [34.349s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:53 +0800.1645180793.130] [39.344s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:39:58 +0800.1645180798.167] [44.381s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:18:40:03 +0800.1645180803.146] [49.358s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"

We sent 12 requests within 5s (11 more than the rate allows): the 1st request passed directly, requests 2~11 were blocked and released one every 5s (the timestamps increase by 5s each), and the 12th was rejected outright (the second log line). Note: nginx writes the access log when a request is actually released, not when it arrives, so the 12th request's log line appears earlier than those of requests 2~11: the 12th was rejected immediately (elapsed time 0), while requests 2~11 were blocked in the waiting queue and took much longer.

Note: this strategy is also commonly called "traffic shaping".

2.3 Fast burst handling

In the previous example we smoothed the peak in order to handle a certain amount of burst traffic, but as we can see, when the request queue is long, the blocked requests near its tail suffer large delays, which clearly hurts the user experience. If the service itself can handle some burst traffic (instantaneous QPS above the average), then we do not need to block the burst at all: as long as it is within the burst allowance, it can be let through directly. This is done by setting the nodelay parameter.

Configuration example:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=12r/m;
    server {
        location / {
            limit_req zone=one burst=10 nodelay;
        }
    }
}

Experimental verification:

10.99.16.41 - - [18/Feb/2022:19:05:00 +0800.1645182300.171] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:00 +0800.1645182300.173] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:00 +0800.1645182300.173] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.002] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.002] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.003] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.004] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.004] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.004] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.005] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.005] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.99.16.41 - - [18/Feb/2022:19:05:01 +0800.1645182301.007] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"

We again sent 12 requests within 5s. This time requests 2~11 were not blocked but processed immediately; request 12 exceeded burst and was rejected outright.

2.4 Peak shaving and valley filling + fast burst handling

In the fast-burst example, requests exceeding the limited rate are handled quickly up to a point, but the system's capacity is finite, so the size of burst is bounded by what the system can bear. Suppose the maximum instantaneous concurrency the system can withstand is 2000, while the short-term peak of external requests is 3000. Must the extra 1000 requests be discarded? Could we instead process 2000 requests quickly and let the remaining 1000 block and wait for later processing? The answer is yes: this is what nginx's delay=... configuration does. It specifies, among the requests exceeding the limited rate, how many may pass without waiting; excess requests beyond that number are deferred. nodelay is in fact a special case of delay in which no request has to wait, equivalent to setting delay equal to burst.

Configuration example:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=12r/m;
    server {
        location / {
            limit_req zone=one burst=10 delay=8;
        }
    }
}

delay=8 means that up to 8 requests exceeding the limited rate are processed quickly; any further excess requests (which must of course stay within burst) have to wait.

Figure 0. Rate limiting with rate=5r/s burst=12 delay=8 (illustration)

Experimental verification:

10.28.2.18 - - [21/Feb/2022:19:12:39 +0800.1645441959.522] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:39 +0800.1645441959.522] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:39 +0800.1645441959.523] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.194] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.195] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.195] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.195] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.195] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.196] [0.000s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:40 +0800.1645441960.197] [0.000s] "GET / HTTP/1.1" 429 169 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:44 +0800.1645441964.535] [4.339s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"
10.28.2.18 - - [21/Feb/2022:19:12:49 +0800.1645441969.539] [9.342s] "GET / HTTP/1.1" 200 4833 "-" "apifox/1.0.0 (https://www.apifox.cn)" "10.40.166.63"

Again we sent 12 requests within 5s. As the logs show, requests 2~9 were also processed immediately, requests 10~11 were blocked and released sequentially at the limited rate (the last two lines), and request 12 (the third line from the end) was rejected outright.

3. The leaky bucket principle

3.1 Algorithm introduction

Nginx's flow control is in fact implemented on the leaky bucket principle. Many descriptions of the leaky bucket algorithm can be found online, and a related algorithm, the token bucket, is often mentioned alongside it; the two are easily confused. Through the analysis below, however, we will see that the two algorithms are essentially equivalent.

Let us first understand the leaky bucket algorithm through a figure:

Figure 1. Using a leaky bucket for traffic policing

  • The leaky bucket has a fixed capacity (T + τ in the figure) and leaks water at a fixed rate
  • If the bucket is empty, the leaking stops
  • When a request arrives, a specific amount of water must be added to the bucket. This amount can be the same for every request, or proportional to the length of the request message
  • If adding the water would overflow the bucket, the request is considered over the limit, and the water level remains unchanged

The little figure in figure 1 is responsible for monitoring the water level in the bucket. When a request arrives that needs to add T of water while the current level already exceeds τ (that is, adding T would overflow the bucket), the figure takes action:

  • In traffic policing (the situation in figure 1), the figure opens the trap door, so the request packet is dropped and no water flows into the bucket;
  • In traffic shaping (figure 2), the figure pushes up the baffle, so the request is blocked until the water level drops to τ or below.

Figure 2. Using a leaky bucket for traffic shaping

It is not hard to see that, in the situation above, traffic shaping with a queue length of 0 is equivalent to traffic policing; policing is thus a special case of shaping. Note also that the core of the leaky bucket algorithm is deciding whether a request conforms to a specific rate. How non-conforming requests are handled is not prescribed: they may be rejected outright, blocked to wait, and so on, as the implementer or user sees fit.

It is important to point out that in the description above, traffic does not flow through the bucket in the form of water; the bucket is merely a meter used to decide whether a request may pass. There is another version of the leaky bucket algorithm in which the water in the bucket directly models the traffic flowing through at a fixed rate. Analysis shows that the latter is a special case of the former; we will not go into detail here. Interested readers can refer to Wikipedia's comparison of the two versions, and questions are welcome.
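To make the bucket-as-a-meter idea concrete, here is a minimal sketch in Go (the type and method names are ours for illustration, not from any library). The bucket carries no traffic; it only decides, by lazily recomputing the water level, whether each arriving request conforms:

package main

import (
    "fmt"
    "time"
)

// LeakyBucket implements the bucket-as-meter version described above.
type LeakyBucket struct {
    capacity float64 // bucket size (T + τ in figure 1)
    rate     float64 // leak rate, units of water per second
    water    float64 // current water level
    last     time.Time
}

// Conforms reports whether a request adding t units of water fits.
// An over-limit request leaves the water level unchanged.
func (lb *LeakyBucket) Conforms(now time.Time, t float64) bool {
    // Leak since the last request (computed lazily, no timer needed).
    lb.water -= now.Sub(lb.last).Seconds() * lb.rate
    if lb.water < 0 {
        lb.water = 0 // an empty bucket stops leaking
    }
    lb.last = now
    if lb.water+t > lb.capacity {
        return false // adding t would overflow: over the limit
    }
    lb.water += t
    return true
}

func main() {
    lb := &LeakyBucket{capacity: 1, rate: 0.2, last: time.Now()} // ≈ 12r/m
    for i := 0; i < 3; i++ {
        fmt.Println(lb.Conforms(time.Now(), 1))
    }
}

With capacity 1 and a leak rate of 0.2/s (one unit every 5s, matching rate=12r/m), three back-to-back requests print true, false, false: exactly the strict limiting of section 2.1.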

3.2 Correspondence with nginx parameters

Based on an analysis of the nginx source code on GitHub, combined with experimental verification, we can map nginx's configuration parameters onto figure 2. Stating the conclusion directly:

Denote the queue length in figure 2 by Q and the bucket capacity by C:

  • rate corresponds to the leak rate of the bucket
  • burst = τ + Q
  • T = 1, i.e. each request adds one unit of water to the bucket, and C = τ + 1
  • delay = τ; configuring nodelay means Q = 0, i.e. burst = τ; if neither delay nor nodelay is configured, then τ = 0, i.e. burst = Q and C = 1

With this correspondence in hand, the meaning of nginx's parameters becomes much clearer; a sketch of the resulting decision logic follows.
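Below is a simplified model of that logic in Go. This is an illustrative sketch, not nginx's actual code: real nginx keeps excess as an integer count of milli-requests in shared memory, and the type and field names here are ours. It classifies each request the way limit_req does:

package main

import (
    "fmt"
    "time"
)

// Decision mirrors the three possible outcomes of limit_req.
type Decision string

const (
    Pass   Decision = "pass"   // forwarded immediately
    Delay  Decision = "delay"  // queued and released at the configured rate
    Reject Decision = "reject" // answered with limit_req_status (e.g. 429)
)

// Zone is a simplified model of one limit_req shared-memory entry.
type Zone struct {
    rate   float64 // leak rate in requests per second
    burst  float64 // burst parameter (τ + Q)
    delay  float64 // delay parameter (τ)
    excess float64 // current water level above the drain
    last   time.Time
}

func (z *Zone) OnRequest(now time.Time) Decision {
    if z.last.IsZero() { // the first request only creates the entry
        z.last = now
        return Pass
    }
    // Leak since the last request, then add one unit of water (T = 1).
    z.excess -= now.Sub(z.last).Seconds() * z.rate
    z.excess++
    if z.excess < 0 {
        z.excess = 0
    }
    z.last = now
    switch {
    case z.excess > z.burst: // the bucket would overflow
        z.excess-- // a rejected request adds no water
        return Reject
    case z.excess > z.delay: // above τ: block until the level drains to τ
        return Delay
    default:
        return Pass
    }
}

func main() {
    z := &Zone{rate: 12.0 / 60, burst: 10, delay: 8} // rate=12r/m burst=10 delay=8
    for i := 1; i <= 12; i++ {
        fmt.Printf("request %2d: %s\n", i, z.OnRequest(time.Now()))
    }
}

Run with 12 back-to-back requests, the sketch reproduces the experiment in section 2.4: requests 1~9 pass, 10~11 are delayed, and 12 is rejected. A real implementation would additionally hold a delayed request for about (excess - delay) / rate seconds before releasing it.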

3.3 Comparison with the token bucket

The token bucket algorithm is described as follows:

  • Every 1/r seconds, a token is added to the bucket (r is the average transmission rate)
  • The bucket holds at most b tokens. If the bucket is full when a token arrives, the token is discarded
  • When a message of n bytes arrives, n tokens are removed from the bucket (if available), and the message is sent to the network
  • If fewer than n tokens are available, no tokens are removed, and the message is considered outside the traffic limit

Comparing this description with the leaky bucket, we find that they are really two descriptions of the same algorithm:

  • The leaky bucket leaks at a fixed rate until it is empty, and water is added when a message arrives (requiring that it not overflow)
  • The token bucket fills at a fixed rate until it is full, and tokens are removed when a message arrives (requiring that enough tokens exist)

So given equivalent parameters, the two algorithms behave identically from the outside; any difference in practice can only come from differences in concrete implementation, since the underlying logic is exactly the same. The symmetry is easiest to see in code.
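Here is the token bucket sketched in the same shape as the leaky bucket above (again with our own illustrative names). Where the leaky bucket adds water and checks for overflow, the token bucket removes tokens and checks for shortage; the two states are mirror images, tokens = capacity - water:

package main

import (
    "fmt"
    "time"
)

// TokenBucket: tokens accumulate at rate r up to capacity b,
// and each arriving message of size n consumes n tokens.
type TokenBucket struct {
    rate     float64 // tokens added per second (r)
    capacity float64 // maximum tokens the bucket holds (b)
    tokens   float64
    last     time.Time
}

func (tb *TokenBucket) Allow(now time.Time, n float64) bool {
    // Lazily credit the tokens accumulated since the last call.
    tb.tokens += now.Sub(tb.last).Seconds() * tb.rate
    if tb.tokens > tb.capacity {
        tb.tokens = tb.capacity // a full bucket discards new tokens
    }
    tb.last = now
    if tb.tokens < n {
        return false // not enough tokens: message is out of profile
    }
    tb.tokens -= n
    return true
}

func main() {
    tb := &TokenBucket{rate: 0.2, capacity: 1, tokens: 1, last: time.Now()}
    for i := 0; i < 3; i++ {
        fmt.Println(tb.Allow(time.Now(), 1)) // true, false, false
    }
}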

3.4 Code implementation

Although descriptions of the leaky bucket and the token bucket speak of a ticking clock, real implementations almost never use a timer. Instead they use lazy computation: when a request arrives, the current amount of water / number of tokens in the bucket is computed from the current time and the preset rate, as in the sketches above. This avoids timers, improving performance and allowing the algorithm to run on systems without timer support. Similarly, the queue in the algorithm is usually not an explicit structure in the code: requests are blocked in memory by sleeping, representing the queue implicitly, and its size often has to be bounded by the caller.

Go's official token-bucket-based rate limiter is the golang.org/x/time/rate package; another widely used third-party token bucket library is juju/ratelimit; a representative leaky bucket implementation is Uber's open-source Go library ratelimit on GitHub. A short usage sketch follows.
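For instance, golang.org/x/time/rate exposes both styles seen in the nginx scenarios above: Allow for policing (reject immediately) and Wait for shaping (block until a token is free). A minimal sketch, with the 0.2 r/s rate and burst of 10 chosen to mirror the earlier configurations:

package main

import (
    "context"
    "fmt"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    // 12 requests per minute (0.2 r/s) with a burst of 10.
    limiter := rate.NewLimiter(rate.Limit(0.2), 10)

    // Traffic-policing style: succeed or fail right away.
    if limiter.Allow() {
        fmt.Println("request passed immediately")
    }

    // Traffic-shaping style: block until a token is available.
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()
    if err := limiter.Wait(ctx); err != nil {
        fmt.Println("gave up waiting:", err)
    }
}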

A detailed walkthrough of these implementations will be added later.

Copyright notice: this article was written by [Mengfei]. Please include a link to the original when reposting: https://qdmana.com/2022/174/202206231334174092.html