https://www.cnblogs.com/crazymakercircle/p/15187184.html
Current limiting
Current limit is a common interview question in the interview.
Why limit current
In short
Current limiting is used in many scenarios to limit concurrency and requests, such as second killing and rush buying, and to protect their own systems and downstream systems from being overwhelmed by huge traffic.
Take the microblog as an example. For example, a star announced a love affair, and the access increased from 500000 to 5 million. The planning capacity of the system can support up to 2 million accesses. Then, the flow restriction rules should be implemented to ensure that it is an available state, so the server will not crash, so the request is unavailable
Reference link
System architecture knowledge map (a 10w system architecture knowledge map)
https://www.processon.com/view/link/60fb9421637689719d246739
Architecture of seckill system
https://www.processon.com/view/link/61148c2b1e08536191d8f92f
Idea of current limiting:
Under the condition of availability, increase the number of people entering as much as possible, and the rest are waiting in line, or return friendly prompts to ensure that the users of the system inside can use normally and prevent the system avalanche.
In daily life, what places need to be restricted?
For example, there is a national scenic spot next to me, which may not be visited at all at ordinary times, but it will be overcrowded when it comes to May day or the Spring Festival. At this time, the scenic spot managers will implement a series of policies to limit the flow of people.
Why limit the current?
If the scenic spot can accommodate 10000 people, and now 30000 people go in, it is bound to be close to each other. If the rectification is not good, there will be accidents. As a result, everyone's experience is not good. As a result, the scenic spot may have to be closed, resulting in the inability to operate externally. As a result, everyone feels that the experience is terrible.
Current limiting algorithm
There are many current limiting algorithms. There are three common types: counter algorithm, leaky bucket algorithm and token bucket algorithm. They are explained one by one below
Current limiting means usually include counter, leakage bucket and token bucket. Pay attention to the difference between current limit and speed limit (all requests will be processed), depending on the business scenario.
(1) Counter
In a period of time interval (time window, time interval), the maximum number of processing requests is fixed, and the excess part will not be processed
(2) Leaky bucket
The size of the leaky bucket is fixed and the processing speed is fixed, but the request entry speed is not fixed (when there are too many requests in an emergency, too many requests will be discarded)
(3) Token bucket
The size of the token bucket is fixed, and the generation rate of the token is fixed, but the consumption rate of the token (i.e. request) is not fixed (it can deal with some situations where there are too many requests at some time); Each request will take the token from the token bucket. If there is no token, the request will be discarded.
Counter algorithm
Definition of counter current limit:
Within a certain time interval (time window, time interval), the maximum number of processing requests is fixed, and the excess part will not be processed.
Simple and crude, such as specifying the thread pool size, specifying the database connection pool size, and the number of nginx connections, which are all equal to the counter algorithm.
Counter algorithm is the simplest and easiest algorithm in current limiting algorithm/
For example, we stipulate that for interface A, we can't access more than 100 times A minute.
Then we can do this:
1. At the beginning, we can set a counter. Every time a request comes, the counter will increase by 1. If the value of the counter is greater than 100 and the interval between the request and the first request is still within 1 minute, it indicates that there are too many requests and access is denied.
2. If the interval between the request and the first request is greater than 1 minute and the counter value is still within the current limit, reset the counter, which is so simple
Implementation of counter current limiting:
package com.crazymaker.springcloud.ratelimit; import lombok.extern.slf4j.Slf4j; import org.junit.Test; import java.util.concurrent.CountDownLatch; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicInteger; import java.util.concurrent.atomic.AtomicLong; // Speedometer speed limit @Slf4j public class CounterLimiter { // Start time private static long startTime = System.currentTimeMillis(); // Time interval of time interval ms private static long interval = 1000; // Limit number per second private static long maxCount = 2; //accumulator private static AtomicLong accumulator = new AtomicLong(); // Count to determine whether the limit is exceeded private static long tryAcquire(long taskId, int turn) { long nowTime = System.currentTimeMillis(); //Within the time interval if (nowTime < startTime + interval) { long count = accumulator.incrementAndGet(); if (count <= maxCount) { return count; } else { return -count; } } else { //Outside the time interval synchronized (CounterLimiter.class) { log.info("The new time zone is here,taskId{}, turn {}..", taskId, turn); // Judge again to prevent repeated initialization if (nowTime > startTime + interval) { accumulator.set(0); startTime = nowTime; } } return 0; } } //Thread pool for multi-threaded simulation test private ExecutorService pool = Executors.newFixedThreadPool(10); @Test public void testLimit() { // Restricted times AtomicInteger limited = new AtomicInteger(0); // Number of threads final int threads = 2; // Number of execution rounds per thread final int turns = 20; // Synchronizer CountDownLatch countDownLatch = new CountDownLatch(threads); long start = System.currentTimeMillis(); for (int i = 0; i < threads; i++) { pool.submit(() -> { try { for (int j = 0; j < turns; j++) { long taskId = Thread.currentThread().getId(); long index = tryAcquire(taskId, j); if (index <= 0) { // Accumulation of restricted times limited.getAndIncrement(); } Thread.sleep(200); } } catch (Exception e) { e.printStackTrace(); } //Wait for all threads to end countDownLatch.countDown(); }); } try { countDownLatch.await(); } catch (InterruptedException e) { e.printStackTrace(); } float time = (System.currentTimeMillis() - start) / 1000F; //Output statistical results log.info("The number of times Limited is:" + limited.get() + ",The number of passes is:" + (threads * turns - limited.get())); log.info("The proportion of restrictions is:" + (float) limited.get() / (float) (threads * turns)); log.info("The running time is:" + time); } }
Serious problem of counter current limiting
Although this algorithm is simple, it is a very fatal problem, that is, the critical problem. Let's see the following figure:
We can see from the above figure that if a malicious user sends 100 requests instantaneously at 0:59 and 100 requests instantaneously at 1:00, the user actually sends 200 requests instantaneously in one second.
What we just specified is a maximum of 100 requests per minute (planned throughput), that is, a maximum of 1.7 requests per second. Users can instantly exceed our rate limit by burst requests at the reset node in the time window.
Users may crush our application in an instant through this loophole in the algorithm.
Leaky bucket algorithm
The basic principle of the leaky bucket algorithm is: water (corresponding to the request) enters the leaky bucket from the water inlet, and the leaky bucket leaves the water at a certain speed (request for release). When the water inflow speed is too large, the total water in the bucket will overflow directly if it is greater than the bucket capacity, as shown in the figure:
The general rules of leakage barrel current limit are as follows:
(1) The water inlet (corresponding to the client request) flows into the water inlet leakage bucket at any rate
(2) The capacity of the leaky bucket is fixed, and the water outlet (release) rate is also fixed.
(3) The capacity of the leaky bucket remains unchanged. If the processing rate is too slow, the water in the bucket will exceed the capacity of the bucket, and the water flowing in later will overflow, indicating that the request is rejected
Principle of leaky bucket algorithm
The idea of leaky bucket algorithm is very simple:
The water (request) enters into the leakage bucket first, and the leakage bucket discharges water at a certain speed. When the water inflow speed exceeds the capacity of the leakage bucket, it will overflow directly/
It can be seen that leaky bucket algorithm can forcibly limit the data transmission rate.
The leaky bucket algorithm is actually very simple. It can be considered as the process of water injection and leakage. Water flows into the bucket at any rate and out at a certain rate. When the water exceeds the capacity of the bucket, it will be discarded, because the bucket capacity is constant to ensure the overall rate
Water flowing out at a certain speed:
Peak shaving: when a large amount of traffic enters, overflow will occur, so that the current limiting protection service is available
Buffering: it does not directly request to the server to buffer the pressure
The consumption speed is fixed because the computing performance is fixed
Implementation of leaky bucket algorithm“
package com.crazymaker.springcloud.ratelimit; import lombok.extern.slf4j.Slf4j; import org.junit.Test; import java.util.concurrent.CountDownLatch; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicInteger; // Leakage barrel current limiting @Slf4j public class LeakBucketLimiter { // Start time of calculation private static long lastOutTime = System.currentTimeMillis(); // Outflow rate: 2 times per second private static int leakRate = 2; // Barrel capacity private static int capacity = 2; //Remaining water private static AtomicInteger water = new AtomicInteger(0); //Return value Description: // false is not restricted to // true current limited public static synchronized boolean isLimit(long taskId, int turn) { // If it is an empty bucket, the current time is used as the leakage time if (water.get() == 0) { lastOutTime = System.currentTimeMillis(); water.addAndGet(1); return false; } // Executive leakage int waterLeaked = ((int) ((System.currentTimeMillis() - lastOutTime) / 1000)) * leakRate; // Calculate the remaining water volume int waterLeft = water.get() - waterLeaked; water.set(Math.max(0, waterLeft)); // Update leakTimeStamp again lastOutTime = System.currentTimeMillis(); // Try adding water, and the water is not full, release if ((water.get()) < capacity) { water.addAndGet(1); return false; } else { // When the water is full, refuse to add water and limit current return true; } } //Thread pool for multi-threaded simulation test private ExecutorService pool = Executors.newFixedThreadPool(10); @Test public void testLimit() { // Restricted times AtomicInteger limited = new AtomicInteger(0); // Number of threads final int threads = 2; // Number of execution rounds per thread final int turns = 20; // Thread synchronizer CountDownLatch countDownLatch = new CountDownLatch(threads); long start = System.currentTimeMillis(); for (int i = 0; i < threads; i++) { pool.submit(() -> { try { for (int j = 0; j < turns; j++) { long taskId = Thread.currentThread().getId(); boolean intercepted = isLimit(taskId, j); if (intercepted) { // Accumulation of restricted times limited.getAndIncrement(); } Thread.sleep(200); } } catch (Exception e) { e.printStackTrace(); } //Wait for all threads to end countDownLatch.countDown(); }); } try { countDownLatch.await(); } catch (InterruptedException e) { e.printStackTrace(); } float time = (System.currentTimeMillis() - start) / 1000F; //Output statistical results log.info("The number of times Limited is:" + limited.get() + ",The number of passes is:" + (threads * turns - limited.get())); log.info("The proportion of restrictions is:" + (float) limited.get() / (float) (threads * turns)); log.info("The running time is:" + time); } }
Problems of leaking barrels:
The water outlet speed of the leaky bucket is fixed, that is, when the release speed is requested.
Copy around on the Internet:
The leaky bucket can not effectively deal with the sudden flow, but it can smooth the sudden flow (rectification)
Practical problems:
The rate at the outlet of the leaky bucket is fixed and can not flexibly cope with the improvement of the back-end capacity. For example, through dynamic capacity expansion, the back-end flow is increased from 1000QPS to 1W QPS, and the leaky bucket has no way
Token bucket current limiting algorithm
The token bucket algorithm generates tokens at a set rate and puts them into the token bucket. Each user request must apply for a token. If the token is insufficient, the request will be rejected/
In the token bucket algorithm, a token will be taken from the bucket when a new request arrives. If there is no token in the bucket, the service will be rejected. Of course, there is an upper limit on the number of tokens.
The number of tokens is strongly related to time and sending rate. The longer the time elapses, the more tokens will be added to the bucket. If the sending speed of the token is faster than the application speed, the token bucket will be full of tokens until the token occupies the whole token bucket, as shown in the figure:
The general flow of token bucket current limiting is as follows:
The water inlet puts tokens into the bucket at a certain speed
The capacity of the token is fixed, but the release speed is not fixed, but there are still remaining tokens in the bucket. Once the request comes, the application can be successful and then released
If the sending speed of the token is slower than the arrival speed of the request, no card is available in the bucket and the request will be rejected.
In short, the sending rate of the token can be set, so that it can effectively deal with the sudden exit traffic.
Token bucket algorithm:
The token bucket algorithm is similar to the leaky bucket algorithm. The difference is that there are some tokens in the token bucket. After the service request arrives, we will get the service only after obtaining the token. For example, we usually queue in front of the window in the canteen when we go to the canteen for dinner. This is like the leaky bucket algorithm. A large number of people gather outside the window in the canteen and enjoy the service at a certain speed, If there are too many people pouring in and the canteen can't fit, some people may stand outside the canteen, so they don't enjoy the service of the canteen, which is called overflow. Overflow can continue to request, that is, continue to queue up. What's the problem?
If there are special circumstances at this time, such as some volunteers in a hurry or the college entrance examination in senior three, this is an emergency. If the leaky bucket algorithm is also used, we have to queue slowly, which does not solve our needs. For many application scenarios, in addition to limiting the average data transmission rate, it is also required to allow some degree of burst transmission. At this time, the leaky bucket algorithm may not be suitable, and the token bucket algorithm is more suitable. As shown in the figure, the principle of token bucket algorithm is that the system will put tokens into the bucket at a constant speed. If the request needs to be processed, you need to obtain a token from the bucket first. When there is no token in the bucket, the service will be denied.
Implementation of token bucket algorithm
package com.crazymaker.springcloud.ratelimit; import lombok.extern.slf4j.Slf4j; import org.junit.Test; import java.util.concurrent.CountDownLatch; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicInteger; // Token bucket speed limit @Slf4j public class TokenBucketLimiter { // Last token issuing time public long lastTime = System.currentTimeMillis(); // Barrel capacity public int capacity = 2; // Token generation speed / s public int rate = 2; // Current number of tokens public AtomicInteger tokens = new AtomicInteger(0); ; //Return value Description: // false is not restricted to // true current limited public synchronized boolean isLimited(long taskId, int applyCount) { long now = System.currentTimeMillis(); //Time interval in ms long gap = now - lastTime; //Count the number of tokens in the time period int reverse_permits = (int) (gap * rate / 1000); int all_permits = tokens.get() + reverse_permits; // Current number of tokens tokens.set(Math.min(capacity, all_permits)); log.info("tokens {} capacity {} gap {} ", tokens, capacity, gap); if (tokens.get() < applyCount) { // If the token is not available, it is rejected // log.info("current limited.." + taskid + ", applycount:" + applycount "); return true; } else { // And a token. Get a token tokens.getAndAdd( - applyCount); lastTime = now; // log.info("remaining tokens.." + tokens); return false; } } //Thread pool for multi-threaded simulation test private ExecutorService pool = Executors.newFixedThreadPool(10); @Test public void testLimit() { // Restricted times AtomicInteger limited = new AtomicInteger(0); // Number of threads final int threads = 2; // Number of execution rounds per thread final int turns = 20; // Synchronizer CountDownLatch countDownLatch = new CountDownLatch(threads); long start = System.currentTimeMillis(); for (int i = 0; i < threads; i++) { pool.submit(() -> { try { for (int j = 0; j < turns; j++) { long taskId = Thread.currentThread().getId(); boolean intercepted = isLimited(taskId, 1); if (intercepted) { // Accumulation of restricted times limited.getAndIncrement(); } Thread.sleep(200); } } catch (Exception e) { e.printStackTrace(); } //Wait for all threads to end countDownLatch.countDown(); }); } try { countDownLatch.await(); } catch (InterruptedException e) { e.printStackTrace(); } float time = (System.currentTimeMillis() - start) / 1000F; //Output statistical results log.info("The number of times Limited is:" + limited.get() + ",The number of passes is:" + (threads * turns - limited.get())); log.info("The proportion of restrictions is:" + (float) limited.get() / (float) (threads * turns)); log.info("The running time is:" + time); } }
Benefits of token bucket
One of the advantages of token bucket is that it can easily deal with sudden exit traffic (improvement of back-end capacity).
For example, the issuing speed of tokens can be changed, and the algorithm can increase the number of tokens according to the new sending rate, so that the outlet burst traffic can be processed.
Guava RateLimiter
Guava is an excellent open source project in the Java field. It contains many very practical functions used by Google in Java projects, including collections, caching, concurrency, common annotations, String operation and I/O operation. Guava's RateLimiter provides the implementation of token bucket algorithm: smooth burst and smooth warming up.
The class diagram of ratelimit is shown above,
Nginx leakage barrel current limit:
Simple demonstration of Nginx current limiting:
Requests are processed only once every six seconds, as follows
limit_req_zone $arg_sku_id zone=skuzone:10m rate=6r/m; limit_req_zone $http_user_id zone=userzone:10m rate=6r/m; limit_req_zone $binary_remote_addr zone=perip:10m rate=6r/m; limit_req_zone $server_name zone=perserver:1m rate=6r/m;
This is from the request parameters, and the current is limited in advance
This is to count the number of current limiting from the request parameters in advance.
Define the current limited memory zone in the http block.
limit_req_zone $arg_sku_id zone=skuzone:10m rate=6r/m; limit_req_zone $http_user_id zone=userzone:10m rate=6r/m; limit_req_zone $binary_remote_addr zone=perip:10m rate=6r/m; limit_req_zone $server_name zone=perserver:1m rate=10r/s;
Use the current limiting zone in the location block. Refer to the following:
# ratelimit by sku id location = /ratelimit/sku { limit_req zone=skuzone; echo "Normal response"; }
test
[root@cdh1 ~]# /vagrant/LuaDemoProject/sh/linux/openresty-restart.sh shell dir is: /vagrant/LuaDemoProject/sh/linux Shutting down openrestry/nginx: pid is 13479 13485 Shutting down succeeded! OPENRESTRY_PATH:/usr/local/openresty PROJECT_PATH:/vagrant/LuaDemoProject/src nginx: [alert] lua_code_cache is off; this will hurt performance in /vagrant/LuaDemoProject/src/conf/nginx-seckill.conf:90 openrestry/nginx starting succeeded! pid is 14197 [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Normal response root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Normal response [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Degraded content after current limiting [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Degraded content after current limiting [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Degraded content after current limiting [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Degraded content after current limiting [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Degraded content after current limiting [root@cdh1 ~]# curl http://cdh1/ratelimit/sku?sku_id=1 Normal response
Advance parameters from Header header
1. Nginx supports reading non nginx standard user-defined headers, but you need to turn on the underline support of headers under http or server:
underscores_in_headers on;
2. For example, we customize the header as X-Real-IP. When obtaining the header through the second nginx, we need to do the following:
$http_x_real_ip; (all in lowercase and preceded by http_)
underscores_in_headers on; limit_req_zone $http_user_id zone=userzone:10m rate=6r/m; server { listen 80 default; server_name nginx.server *.nginx.server; default_type 'text/html'; charset utf-8; # ratelimit by user id location = /ratelimit/demo { limit_req zone=userzone; echo "Normal response"; } location = /50x.html{ echo "Degraded content after current limiting"; } error_page 502 503 =200 /50x.html; }
test
[root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Normal response [root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER-ID:1" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER_ID:2" http://cdh1/ratelimit/demo Normal response [root@cdh1 ~]# curl -H "USER_ID:2" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# [root@cdh1 ~]# curl -H "USER_ID:2" http://cdh1/ratelimit/demo Degraded content after current limiting [root@cdh1 ~]# curl -H "USER-ID:3" http://cdh1/ratelimit/demo Normal response [root@cdh1 ~]# curl -H "USER-ID:3" http://cdh1/ratelimit/demo Degraded content after current limiting
Detailed explanation of three subdivision types of Nginx leaky bucket current limit, namely burst and nodelay parameters
Requests are processed only once every six seconds, as follows
limit_req_zone $arg_user_id zone=limti_req_zone:10m rate=10r/m;
Leaky bucket current limiting without buffer queue
limit_req zone=limti_req_zone;
In strict accordance with limti_req_zone Configured in rate To process the request exceed rate Range of processing capacity, direct drop It shows that there is no delay for the received request
Assuming that 10 requests are submitted within 1 second, you can see that a total of 10 requests have failed, and 503 is returned directly,
Then check / var/log/nginx/access.log to confirm that only one request succeeded, and the others directly returned 503, that is, the server rejected the request.
Leaky bucket current limiting with buffer queue
limit_req zone=limti_req_zone burst=5;
According to in limti_req_zone Configured in rate To process the request At the same time, a buffer queue with a size of 5 is set, and the requests in the buffer queue will wait for slow processing More than burst Buffer queue length and rate Requests for processing power are discarded directly It shows that there is a delay for the received request
Assuming that 10 requests are submitted within one second, it can be found that within one second, after the server receives 10 concurrent requests, it processes one request first, and puts five requests into the burst buffer queue for processing. The number of requests exceeding (burst+1) is directly discarded, that is, four requests are directly discarded. The five requests cached by burst are processed every 6s.
Then check the / var/log/nginx/access.log log
Leaky bucket current limiting with instantaneous processing capacity
limit_req zone=req_zone burst=5 nodelay;
If nodelay is set, it will instantly provide the ability to process (burst + rate) requests. When the number of requests exceeds (burst + rate), it will directly return 503. There is no need to wait for requests within the peak range.
Assuming that 10 requests are submitted within one second, it can be found that the server has processed 6 requests within one second (peak speed: one request in burst + 10s). For the remaining four requests, 503 is returned directly. If you continue to send 10 requests to the server in the next second, the server will directly reject the 10 requests and return 503.
Then check the / var/log/nginx/access.log log
It can be found that within 1s, the server processed 6 requests (peak speed: burst + original processing speed). For the remaining four requests, 503 is returned directly.
However, the total quota is consistent with the speed time, that is, the quota is used up. You need to wait until a time period with quota before receiving new requests. If five requests are processed at one time, it is equivalent to taking up the quota of 30s, 65 = 30. Because 6s is set to process one request, it is not until 30
s before another request can be processed, that is, if 10 requests are sent to the server at this time, 9 503 and one 200 will be returned
Distributed current limiting component
why
- However, the current limiting instruction of Nginx can only be valid in the same memory area. In the production scenario, the second kill external gateway is often deployed in multiple nodes, so the distributed current limiting component is required.
Redis+Lua can be used to develop high-performance distributed current limiting components. JD's rush purchase is to use Redis+Lua to complete current limiting. Redis+Lua current limiting component can be used for both Nginx external gateway and Zuul internal gateway.
Theoretically, the current limit of the access layer has multiple dimensions:
(1) User dimension flow restriction: the user is allowed to submit a request only once in a certain period of time. For example, the client IP or user ID can be used as the flow restriction key.
(2) Flow restriction of commodity dimension: for the same rush purchase commodity, only a certain number of requests are allowed to enter in a certain period of time. You can take the second kill commodity ID as the flow restriction key.
When to limit current with nginx:
User dimension current limiting can be performed on ngix, because using nginx current limiting memory to store user id is more efficient than using redis key to store user id.
When to use redis+lua distributed current limiting:
The current limit of commodity dimension can be carried out on redis without a large number of key s for calculating access times. In addition, it can control the total number of access second kill requests of all access layer nodes.
redis+lua distributed current limiting component
--- Environment for this script: redis Internal, not running nginx inside ---Method: request token --- -1 failed --- 1 success --- @param key key Current limiting keyword --- @param apply Number of tokens requested local function acquire(key, apply) local times = redis.call('TIME'); -- times[1] Seconds -- times[2] Microseconds local curr_mill_second = times[1] * 1000000 + times[2]; curr_mill_second = curr_mill_second / 1000; local cacheInfo = redis.pcall("HMGET", key, "last_mill_second", "curr_permits", "max_permits", "rate") --- Local variable: time of last application local last_mill_second = cacheInfo[1]; --- Local variable: number of previous tokens local curr_permits = tonumber(cacheInfo[2]); --- Local variable: bucket capacity local max_permits = tonumber(cacheInfo[3]); --- Local variable: token issuing rate local rate = cacheInfo[4]; --- Local variable: the number of tokens this time local local_curr_permits = 0; if (type(last_mill_second) ~= 'boolean' and last_mill_second ~= nil) then -- Count the number of tokens in the time period local reverse_permits = math.floor(((curr_mill_second - last_mill_second) / 1000) * rate); -- Total number of tokens local expect_curr_permits = reverse_permits + curr_permits; -- Total number of tokens that can be requested local_curr_permits = math.min(expect_curr_permits, max_permits); else -- Get token for the first time redis.pcall("HSET", key, "last_mill_second", curr_mill_second) local_curr_permits = max_permits; end local result = -1; -- There are enough tokens to apply if (local_curr_permits - apply >= 0) then -- Save remaining tokens redis.pcall("HSET", key, "curr_permits", local_curr_permits - apply); -- Save time for the next token acquisition redis.pcall("HSET", key, "last_mill_second", curr_mill_second) -- Returns that the token was obtained successfully result = 1; else -- Return token acquisition failure result = -1; end return result end --eg -- /usr/local/redis/bin/redis-cli -a 123456 --eval /vagrant/LuaDemoProject/src/luaScript/redis/rate_limiter.lua key , acquire 1 1 -- obtain sha Coded command -- /usr/local/redis/bin/redis-cli -a 123456 script load "$(cat /vagrant/LuaDemoProject/src/luaScript/redis/rate_limiter.lua)" -- /usr/local/redis/bin/redis-cli -a 123456 script exists "cf43613f172388c34a1130a760fc699a5ee6f2a9" -- /usr/local/redis/bin/redis-cli -a 123456 evalsha "cf43613f172388c34a1130a760fc699a5ee6f2a9" 1 "rate_limiter:seckill:1" init 1 1 -- /usr/local/redis/bin/redis-cli -a 123456 evalsha "cf43613f172388c34a1130a760fc699a5ee6f2a9" 1 "rate_limiter:seckill:1" acquire 1 --local rateLimiterSha = "e4e49e4c7b23f0bf7a2bfee73e8a01629e33324b"; ---Method: initialize current limiting Key --- 1 success --- @param key key --- @param max_permits Barrel capacity --- @param rate Token issuance rate local function init(key, max_permits, rate) local rate_limit_info = redis.pcall("HMGET", key, "last_mill_second", "curr_permits", "max_permits", "rate") local org_max_permits = tonumber(rate_limit_info[3]) local org_rate = rate_limit_info[4] if (org_max_permits == nil) or (rate ~= org_rate or max_permits ~= org_max_permits) then redis.pcall("HMSET", key, "max_permits", max_permits, "rate", rate, "curr_permits", max_permits) end return 1; end --eg -- /usr/local/redis/bin/redis-cli -a 123456 --eval /vagrant/LuaDemoProject/src/luaScript/redis/rate_limiter.lua key , init 1 1 -- /usr/local/redis/bin/redis-cli -a 123456 --eval /vagrant/LuaDemoProject/src/luaScript/redis/rate_limiter.lua "rate_limiter:seckill:1" , init 1 1 ---Method: delete current limit Key local function delete(key) redis.pcall("DEL", key) return 1; end --eg -- /usr/local/redis/bin/redis-cli --eval /vagrant/LuaDemoProject/src/luaScript/redis/rate_limiter.lua key , delete local key = KEYS[1] local method = ARGV[1] if method == 'acquire' then return acquire(key, ARGV[2], ARGV[3]) elseif method == 'init' then return init(key, ARGV[2], ARGV[3]) elseif method == 'delete' then return delete(key) else --ignore end
In redis, in order to avoid wasting network resources by repeatedly sending script data, you can use the script load command to cache script data and return a hash code as the call handle of the script,
Each time you call the script, you only need to send a hash code to call it.
Distributed token current limiting practice
redis+lua can be used, and the simple case below the actual combat ticket:
The token is put into the token bucket at the rate of 1 token per second. The bucket can store up to 2 tokens, so the system will only allow continuous processing of 2 requests per second,
Or every two seconds, after two tokens in the bucket are full, deal with the emergencies of two requests at a time to ensure the stability of the system.
Current limit of commodity dimension
When the flow limit of the second kill commodity dimension, when the traffic of the commodity is far greater than the traffic involved, the request is randomly discarded.
Token bucket current limiting script gettoken of Nginx_ access_ Limit.lua is executed in the access phase of the request. However, the script does not implement the core logic of current limiting, and only calls the rate cached in Redis_ The limiter.lua script limits the current.
getToken_access_limit.lua script and rate_ The relationship between the limiter.lua script is shown in Figure 10-17.
Figure 10-17 getToken_access_limit.lua script and rate_limiter.lua script relationship
When will the rate be loaded in Redis_ What about the limiter.lua script?
Like the second kill script, this script completes the loading and caching of goods in Redis when the Java program starts the second kill.
Another important point is that the Java program will encode the sha1 after the script is loaded and cache it in Redis through a custom key (specifically "lua:sha1:rate_limiter") to facilitate the gettoken of Nginx_ access_ The limit.lua script is used to get the and is used when calling the evalsha method.
Note: using redis cluster, each node needs to cache a copy of script data
/** * Because redis cluster is used, each node needs to cache a copy of script data * @param slotKey The slotKey used to locate the corresponding slot */ public void storeScript(String slotKey){ if (StringUtils.isEmpty(unlockSha1) || !jedisCluster.scriptExists(unlockSha1, slotKey)){ //redis supports script caching and returns hash codes, which can be used to call scripts in the future unlockSha1 = jedisCluster.scriptLoad(DISTRIBUTE_LOCK_SCRIPT_UNLOCK_VAL, slotKey); } }
Common current limiting components
Redismission distributed current limiting adopts the idea of token bucket and fixed time window. trySetRate method sets the bucket size. redis key expiration mechanism is used to achieve the purpose of time window and control the number of requests allowed to pass in the fixed time window.
spring cloud gateway integrates redis current limiting, but it belongs to gateway layer current limiting
Source: