Python: rate limiting (throttle)

Preface

After a business service opened its interface to outside callers, it ran into concurrent data scraping, so rate limiting became necessary. I have long insisted that the business API and the OpenAPI should be kept separate, perhaps because the ERP systems of other companies that I integrated with early on exposed fairly standardized OpenAPIs; exposing a system's business API directly has always felt wrong to me.

Window Rate Limiting

The requirement is to add rate limiting to a Django project. If the views are rest_framework Views, this is easy: rest_framework provides throttling out of the box (see "Throttling" in the rest_framework documentation), and it only needs to be configured in the settings.
The built-in settings could not be used directly here, because this Django service is a proxy in front of other services and its only responsibility is forwarding. Limiting at the upstream LB does not help either: if the LB cannot distinguish source IPs, it can only enforce a global limit, so once the limit is hit, legitimate platform traffic is rejected as well. The requirement is therefore clear. Limiting must be done per client, which means first knowing the real source IP, and then counting that IP's requests within a time window, for example 100/min.
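For projects whose views are all rest_framework Views, the built-in throttling is switched on from the settings. A minimal sketch of such a configuration follows; the setting keys and throttle classes are the ones documented by rest_framework, while the rates are placeholder values assumed for illustration.

```python
# settings.py (sketch): rest_framework's built-in throttling.
# The rates below are illustrative assumptions, not values from this project.
REST_FRAMEWORK = {
    "DEFAULT_THROTTLE_CLASSES": [
        "rest_framework.throttling.AnonRateThrottle",
        "rest_framework.throttling.UserRateThrottle",
    ],
    "DEFAULT_THROTTLE_RATES": {
        "anon": "100/min",   # unauthenticated clients, keyed by IP
        "user": "1000/day",  # authenticated clients, keyed by user id
    },
}
```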

rest_framework provides a good idea to borrow: for each client that needs limiting, keep a record of its request timestamps in the cache. Take the 100/min above as an example. In the first minute, if IP1 makes no requests, nothing is stored for it at all; the Redis expiration time takes care of clearing stale records. In the next minute, the number of requests must not exceed 100, so an array of timestamps is maintained: every admitted request records its time in the array, and once the array holds 100 entries, further requests exceed the limit and are rejected. The window slides by evicting old entries: on each request, timestamps more than one minute older than the current time are removed, so the array always holds exactly the requests made within one minute before the latest request.

# throttle setting
THROTTLE_RATES = {
    'resource1': '100/min',
    'resource2': '20/second'
}

# throttle class
import time

from django.conf import settings


class WindowAccessThrottle:

    cache = Cache()  # thin Redis wrapper, see note 3 below
    timer = time.time

    def __init__(self, request, view, scope):
        self.rate = settings.THROTTLE_RATES[scope]
        self.request = request
        self.view = view
        self.key = self.get_cache_key()

    def parse_rate(self):
        num, period = self.rate.split('/')
        num_requests = int(num)
        duration = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400}[period[0]]
        return num_requests, duration

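    def get_cache_key(self):
        # Behind an LB or nginx, X-Forwarded-For may hold "client, proxy1, ...";
        # the first entry is the originating client.
        forwarded = self.request.META.get('HTTP_X_FORWARDED_FOR')
        if forwarded:
            host = forwarded.split(',')[0].strip()
        else:
            host = self.request.META['REMOTE_ADDR']
        return 'throttle:{}:{}'.format(host, self.view.__name__)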

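    def allow_request(self):
        # history holds timestamps of admitted requests, newest first
        history = self.cache.get_value(self.key, [])
        now = self.timer()
        num_requests, duration = self.parse_rate()

        # Slide the window: drop timestamps older than `duration` seconds
        while history and history[-1] <= now - duration:
            history.pop()
        if len(history) >= num_requests:
            return False

        history.insert(0, now)
        # The key expires after `duration`, so idle clients leave no data behind
        self.cache.set(self.key, history, duration)
        return True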
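
To see the sliding behavior without Django or Redis, the same decision logic can be exercised in memory. This is only a sketch: InMemorySlidingWindow is a hypothetical name, the timestamps are passed in by hand instead of coming from time.time, and the limit is shrunk to 3 requests per 60 seconds so the trace stays short.

```python
from collections import deque


class InMemorySlidingWindow:
    """Sliding-window limiter for a single key, kept in process memory."""

    def __init__(self, num_requests, duration):
        self.num_requests = num_requests
        self.duration = duration
        self.history = deque()  # newest timestamp on the left

    def allow(self, now):
        # Drop timestamps that have slid out of the window.
        while self.history and self.history[-1] <= now - self.duration:
            self.history.pop()
        if len(self.history) >= self.num_requests:
            return False
        self.history.appendleft(now)
        return True


limiter = InMemorySlidingWindow(num_requests=3, duration=60)
results = [limiter.allow(t) for t in (0, 10, 20, 30, 61, 62)]
print(results)  # [True, True, True, False, True, False]
```

The request at t=30 is rejected because three requests are already inside the window; at t=61 the entry from t=0 has expired, so one more request fits.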

Notes
1. The example above can be adapted to actual requirements.
2. On the IP granularity: reading request.META['REMOTE_ADDR'] takes the IP directly from the connection, but real deployments mostly sit behind an LB or an nginx reverse proxy, so REMOTE_ADDR is usually the IP of the LB in front. HTTP_X_FORWARDED_FOR is therefore used to obtain the IP of the originating client.
3. cache = Cache() is a thin wrapper around Redis; its cache.get_value(self.key, []) adds support for a default value.
4. Usage is similar to the native throttle: set the scope on the view function.
5. With Django middleware, the throttle is invoked roughly as follows:

from django.http import HttpResponse
from django.urls import resolve
from django.utils.deprecation import MiddlewareMixin

# In practice this middleware has to be customized and debugged against the
# actual requirements. With plain rest_framework Views the native settings
# can be used directly; here the views are forwarding views that the author
# wrapped himself, which amounts to writing a new general-purpose view, so
# rate limiting has to be re-implemented.
class ThrottleMiddleware(MiddlewareMixin):
    def process_request(self, request):
        resolver = resolve(request.path_info)
        throttle_scope = getattr(resolver.func, 'throttle_scope', None)
        if throttle_scope is None:
            return None  # the view does not opt in to throttling
        throttle = WindowAccessThrottle(request, resolver.func, throttle_scope)
        if throttle.allow_request():
            return None
        return HttpResponse('Too Many Requests', status=429)
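
The middleware reads a throttle_scope attribute from the resolved view function, so a view has to carry that attribute to be limited. One possible way to attach it is a small decorator; the sketch below assumes plain function views, and both the decorator and forward_view are hypothetical names introduced only for illustration.

```python
def throttle_scope(scope):
    """Hypothetical helper: tag a view function with a throttle scope."""
    def decorator(view_func):
        # The middleware looks this attribute up with getattr().
        view_func.throttle_scope = scope
        return view_func
    return decorator


@throttle_scope("resource1")
def forward_view(request):
    # Placeholder body; a real view would proxy the request onward.
    return "forwarded"


print(forward_view.throttle_scope)  # resource1
```

Views without the attribute are simply not matched to any rate in THROTTLE_RATES.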

Funnel Rate Limiting

The window limiter above handles traffic surges to some extent, but with a limit such as the 100/min above, a client can still fire all of its allowed requests concurrently within a single instant of that minute. In that scenario the window limiter is basically useless. Suppose we want to cap both the total number of requests within a period and the rate at which they arrive; for that, the funnel (leaky bucket) limiter is a good fit. The basic abstract model:
1. Funnel parameters:
- capacity: the size of the funnel
- rate: the outflow rate of the funnel, computed as total / duration, where total is the number of requests allowed through within duration
2. When the funnel is empty:
- Access rate < rate: no backlog builds up in the funnel, and all requests pass
- Access rate >= rate: a backlog gradually builds up in the funnel, and the funnel drains at rate
3. When the funnel is not empty:
- The outlet drains at the maximum rate
- If the funnel is not full, new requests keep being admitted
- If the funnel is full, new requests overflow and are rejected
The per-IP limiting above, implemented with a funnel, looks like this:

THROTTLE_RATES = {
    'funnel': {
        'capacity': 15,
        'duration': 60,  # seconds
        'total': 30,
    },
}

class FunnelThrottle:

    cache = CusCache()
    timer = time.time

    def __init__(self, request, view, scope):
        config = settings.THROTTLE_RATES[scope]
        self.rate = config['total'] / config['duration']
        self.capacity = config['capacity']
        self.duration = config['duration']
        self.request = request
        self.view = view
        self.key = self.get_cache_key()

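    def get_cache_key(self):
        # Same key scheme as WindowAccessThrottle
        forwarded = self.request.META.get('HTTP_X_FORWARDED_FOR')
        if forwarded:
            host = forwarded.split(',')[0].strip()
        else:
            host = self.request.META['REMOTE_ADDR']
        return 'throttle:{}:{}'.format(host, self.view.__name__)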

    def allow_request(self):
        history = self.cache.get_value(self.key, [])
        now = self.timer()
        if not history:  # empty funnel: admit directly
            history.insert(0, now)
            self.cache.set(self.key, history, self.duration)
            return True

        latest_duration = now - history[0]  # time since the most recent admitted request
        # Number of entries that have drained in that interval at the funnel's rate
        leak_count = int(latest_duration * self.rate)
        for i in range(leak_count):
            if history:
                history.pop()
            else:
                break

        # If the funnel is still full after draining, reject the request.
        if len(history) >= self.capacity:
            return False

        # Otherwise admit the request and record it in the funnel.
        history.insert(0, now)
        self.cache.set(self.key, history, self.duration)
        return True

Notes:
1. The funnel limiter and the window limiter above store essentially the same data structure in the cache; because the decision algorithms differ, the limiting behavior is completely different.
2. In the funnel limiter, every timestamp recorded in the funnel belongs to a request that was admitted. On the next request, the elapsed time and the funnel's rate decide which recorded timestamps have drained out, which is how the limiter achieves both a bounded capacity and a bounded flow rate.
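
The funnel model can also be kept as two numbers, a water level and the time of the last check, instead of a list of timestamps. The sketch below is a variant of the idea, not the implementation above: LeakyBucket is a hypothetical in-memory class, and timestamps are passed in by hand so the behavior is easy to trace. It reuses the capacity 15, total 30, duration 60 configuration.

```python
class LeakyBucket:
    """Leaky-bucket meter storing a water level instead of a timestamp list."""

    def __init__(self, capacity, total, duration):
        self.capacity = capacity
        self.rate = total / duration  # requests drained per second
        self.level = 0.0              # current backlog in the funnel
        self.last = None              # time of the previous check

    def allow(self, now):
        if self.last is not None:
            # Drain what leaked out since the previous check.
            leaked = (now - self.last) * self.rate
            self.level = max(0.0, self.level - leaked)
        self.last = now
        if self.level + 1 > self.capacity:
            return False  # funnel is full: overflow, reject
        self.level += 1
        return True


bucket = LeakyBucket(capacity=15, total=30, duration=60)
burst = [bucket.allow(0) for _ in range(16)]
print(burst.count(True))  # 15: the 16th request of the burst overflows
print(bucket.allow(1))    # False: only half a slot has drained after 1 s
print(bucket.allow(2))    # True: one full slot has drained after 2 s
```

Because the level is a float, partial drainage between closely spaced requests is carried over rather than truncated by int().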

Redis funnel rate limiting (redis-cell)

The funnel algorithm described above is already implemented as a Redis module; for details, see redis-cell on GitHub. I installed it on macOS with essentially no problems:

# Download the macOS release package
https://github.com/brandur/redis-cell/releases
# Extract it
tar -zxf redis-cell-*.tar.gz
# Copy the module library
cp libredis_cell.dylib /your_redis_server_location
# Restart redis-server and load libredis_cell.dylib
redis-server --loadmodule /path/to/modules/libredis_cell.dylib

After the restart, the CL.THROTTLE command is available in Redis:

# CL.THROTTLE user123 15 30 60 1 mirrors the configuration used in the algorithm above. user123 is the rate limiting key; 15: capacity, 30: total, 60: duration, 1: quantity consumed by this call.
127.0.0.1:6379> CL.THROTTLE user123 15 30 60 1
1) (integer) 0  # 0 means allowed, 1 means rejected
2) (integer) 16  # Funnel capacity: max_burst + 1 = 15 + 1 = 16
3) (integer) 15  # Remaining capacity of the funnel
4) (integer) -1  # If rejected, seconds to wait before retrying (-1 when allowed)
5) (integer) 2  # Seconds until the funnel is completely empty

However, I did not find a corresponding SDK for redis-cell.
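
Even without a dedicated SDK, a general Redis client can usually send module commands. The following is a minimal sketch assuming redis-py, whose execute_command() passes arbitrary commands through to the server. cl_throttle and FakeClient are hypothetical names introduced here; FakeClient only stands in for a real connection and replays the reply shown above, so the sketch runs without a server.

```python
def cl_throttle(client, key, max_burst, count, period, quantity=1):
    """Send CL.THROTTLE via any client that exposes execute_command()."""
    reply = client.execute_command(
        "CL.THROTTLE", key, max_burst, count, period, quantity
    )
    limited, limit, remaining, retry_after, reset_after = (int(v) for v in reply)
    return {
        "allowed": limited == 0,     # 0 means allowed, 1 means rejected
        "limit": limit,              # max_burst + 1
        "remaining": remaining,
        "retry_after": retry_after,  # -1 when the request was allowed
        "reset_after": reset_after,
    }


class FakeClient:
    """Stand-in for a Redis connection; replays the reply shown above."""

    def execute_command(self, *args):
        return [0, 16, 15, -1, 2]


result = cl_throttle(FakeClient(), "user123", 15, 30, 60)
print(result["allowed"], result["remaining"])  # True 15
```

Against a real server with the module loaded, the client argument would be a redis.Redis() instance instead of FakeClient.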

Python bound methods

# python 3.x
def func():
	pass

class A:
	@classmethod
	def method_cls(cls):
		pass
	def method_a(self):
		pass
class B(A):
	pass

a, b = A(), B()
print(func)  # <function func at 0x10ee8a1e0>
print(a.method_a)  # <bound method A.method_a of <__main__.A object at 0x10ef11978>>
print(b.method_cls)  # <bound method A.method_cls of <class '__main__.B'>>

func is a plain function object, while method_a and method_cls belong to class A and are therefore bound methods. So how can you find out what a bound method belongs to?
Python 2.x provides three attributes, im_func, im_class and im_self:

im_func is the function object.
im_class is the class the method comes from.
im_self is the self object the method is bound to.

In Python 3.x:

__func__ replaces im_func
__self__ replaces im_self
im_class from 2.x has been removed

# python 3.x
print(a.method_a.__self__)
print(b.method_cls.__self__)
# print(func.__self__)  # error: func has no __self__
print(b.method_cls.__self__.__name__)
# print(a.method_a.__self__.__name__)  # error: a.method_a.__self__ is an instance and has no __name__ attribute
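
Since im_class is gone, the class has to be derived from __self__. A minimal sketch of one way to do that follows; owner_class is a hypothetical helper, not a standard library function, and it returns the class the method was looked up through, which is not necessarily the class that defines it.

```python
import inspect


class A:
    @classmethod
    def method_cls(cls):
        pass

    def method_a(self):
        pass


class B(A):
    pass


def owner_class(method):
    """Hypothetical helper: the class a bound method was looked up through."""
    bound_to = method.__self__
    # A classmethod is bound to the class itself, an instance method to an instance.
    return bound_to if inspect.isclass(bound_to) else type(bound_to)


print(owner_class(B().method_a).__name__)   # B
print(owner_class(B.method_cls).__name__)   # B
print(B().method_a.__func__ is A.method_a)  # True: the function lives on A
```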

For __name__ and __qualname__, see PEP 3155.
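
As a short illustration of the difference, the sketch below uses a nested class (Outer and Inner are made-up names) and assumes the code runs at module level, since __qualname__ also includes any enclosing function scope.

```python
class Outer:
    class Inner:
        def method(self):
            pass


print(Outer.Inner.method.__name__)        # method
print(Outer.Inner.method.__qualname__)    # Outer.Inner.method
# A bound method forwards both attributes to its underlying function.
print(Outer.Inner().method.__qualname__)  # Outer.Inner.method
```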

Posted by davidlenehan on Thu, 22 Aug 2019 23:24:20 -0700

Programmer Group