This blog post is part of an in-depth study of Envoy Proxy and Istio and how they enable a more elegant way to connect and manage microservices.
This is the plan for the series (links will be updated at publication time):
- Circuit Breakers (Part I)
- Retry/Timeout (Part II)
- Distributed Tracing (Part III)
- Prometheus Metrics Collection (Part IV)
- Rate Limiter (Part V)
Part V: Rate Limiter
Envoy ratelimit filters
Envoy integrates with the rate limit service through two kinds of filters:
- Network-level filter: Envoy calls the rate limit service for each new connection on the listener where the filter is installed. This lets you limit the connections per second through the listener.
- HTTP-level filter: Envoy calls the rate limit service for each new request on the listener where the filter is installed and for which the route table specifies that the rate limit service should be called. Much work is underway to extend the functionality of the HTTP filter.
Enabling the HTTP rate limiter in the Envoy configuration
When the requested route or virtual host has one or more rate limit configurations that match the filter's stage setting, the HTTP rate limit filter calls the rate limit service. A route can also opt to include the virtual host's rate limit configuration. Multiple configurations can apply to a single request; each configuration causes a descriptor to be sent to the rate limit service.
If the rate limit service is called and the response for any descriptor is over limit, a 429 response is returned and the rate limit filter also sets the x-envoy-ratelimited header. If there is an error calling the rate limit service, or the service returns an error and failure_mode_deny is set to true, a 500 response is returned.
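Before looking at the configuration, a minimal Go sketch may help make the descriptor mechanism concrete. It shows how a rate limit service could answer per descriptor, using go-control-plane's v2 types (the same ones used by the service implemented later in this post); the package name and the overLimit check are illustrative assumptions, not part of the demo:

```go
package limiter // hypothetical package, for illustration only

import (
	pb "github.com/envoyproxy/go-control-plane/envoy/api/v2/ratelimit"
	rls "github.com/envoyproxy/go-control-plane/envoy/service/ratelimit/v2"
)

// decide builds a per-descriptor reply. If any descriptor (or the overall
// code) is OVER_LIMIT, Envoy answers the client with a 429 and sets the
// x-envoy-ratelimited header.
func decide(req *rls.RateLimitRequest) *rls.RateLimitResponse {
	resp := &rls.RateLimitResponse{OverallCode: rls.RateLimitResponse_OK}
	for _, d := range req.Descriptors {
		status := &rls.RateLimitResponse_DescriptorStatus{Code: rls.RateLimitResponse_OK}
		if overLimit(d) {
			status.Code = rls.RateLimitResponse_OVER_LIMIT
			resp.OverallCode = rls.RateLimitResponse_OVER_LIMIT
		}
		resp.Statuses = append(resp.Statuses, status)
	}
	return resp
}

// overLimit is a hypothetical stand-in for a real limit lookup; here it
// only flags the descriptor produced by the destination_cluster action.
func overLimit(d *pb.RateLimitDescriptor) bool {
	for _, e := range d.Entries {
		if e.Key == "destination_cluster" && e.Value == "cost" {
			return true // pretend this cluster's budget is exhausted
		}
	}
	return false
}
```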
The full configuration is as follows:
```yaml
envoy.yaml: |-
  static_resources:
    listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 8000
      filter_chains:
      - filters:
        - name: envoy.http_connection_manager
          config:
            codec_type: auto
            stat_prefix: ingress_http
            access_log:
            - name: envoy.file_access_log
              config:
                path: "/dev/stdout"
                format: "[ACCESS_LOG][%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\" \"%DOWNSTREAM_REMOTE_ADDRESS_WITHOUT_PORT%\"\n"
            route_config:
              name: local_route
              virtual_hosts:
              - name: gateway
                domains:
                - "*"
                routes:
                - match:
                    prefix: "/cost"
                  route:
                    cluster: cost
                    rate_limits: # enable rate limit checks for the greeter service
                    - actions:
                      - destination_cluster: {}
            http_filters:
            - name: envoy.rate_limit # enable the Rate Limit filter
              config:
                domain: envoy
            - name: envoy.router
              config: {}
    clusters:
    - name: cost
      connect_timeout: 0.25s
      type: strict_dns
      lb_policy: round_robin
      hosts:
      - socket_address:
          address: cost.sgt
          port_value: 80
    - name: rate_limit_cluster
      type: strict_dns
      connect_timeout: 0.25s
      lb_policy: round_robin
      http2_protocol_options: {}
      hosts:
      - socket_address:
          address: limiter.sgt
          port_value: 80
  rate_limit_service:
    grpc_service:
      envoy_grpc:
        cluster_name: rate_limit_cluster
      timeout: 0.25s
  admin:
    access_log_path: "/dev/null"
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 9000
```
As can be seen from the configuration file, this demo sets up a global HTTP rate limit filter.
Although distributed circuit breaking is usually very effective at controlling throughput in distributed systems, there are times when it is not, and a global rate limit is required. The most common case is when a large number of hosts forward to a small number of hosts with low average request latency (for example, connections/requests to a database server). If the target hosts become backed up, the downstream hosts will overwhelm the upstream cluster. In this scenario it is extremely difficult to configure circuit-breaking limits on each downstream host that are strict enough for the system to run normally under typical request patterns yet still prevent cascading failure when the system starts to fail. Global rate limiting is a good solution for this situation.
Writing a rate limit service
Envoy integrates with the rate limit service directly over gRPC. Envoy requires the rate limit service to support the gRPC IDL specified in rls.proto. For more information on how the API works, see the IDL file.
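As a quick orientation, the server-side contract that go-control-plane generates from rls.proto (v2 API) boils down to a single method; the snippet below is abbreviated from the generated bindings:

```go
// Abbreviated from go-control-plane's generated v2 bindings for rls.proto.
type RateLimitServiceServer interface {
	// Envoy calls ShouldRateLimit once per new connection (network-level
	// filter) or once per request (HTTP-level filter) and enforces the
	// returned code.
	ShouldRateLimit(context.Context, *RateLimitRequest) (*RateLimitResponse, error)
}
```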
Envoy itself only defines the rate limit interface; it ships no concrete implementation, so you must implement the rate limiter yourself. What follows is just a simple implementation meant to give you the idea.
The specific code is as follows:
```go
package main

import (
	"log"
	"net"
	"time"

	rls "github.com/envoyproxy/go-control-plane/envoy/service/ratelimit/v2"
	"github.com/juju/ratelimit"
	"golang.org/x/net/context"
	"google.golang.org/grpc"
	"google.golang.org/grpc/reflection"
)

// server is used to implement rls.RateLimitServiceServer
type server struct {
	bucket *ratelimit.Bucket
}

func (s *server) ShouldRateLimit(ctx context.Context, request *rls.RateLimitRequest) (*rls.RateLimitResponse, error) {
	// take one token from the bucket; if none is available, the request
	// is over limit and Envoy will answer the client with a 429
	var overallCode rls.RateLimitResponse_Code
	if s.bucket.TakeAvailable(1) == 0 {
		overallCode = rls.RateLimitResponse_OVER_LIMIT
	} else {
		overallCode = rls.RateLimitResponse_OK
	}
	response := &rls.RateLimitResponse{OverallCode: overallCode}
	return response, nil
}

func main() {
	// create a TCP listener on port 8089
	lis, err := net.Listen("tcp", ":8089")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	log.Printf("listening on %s", lis.Addr())

	// create a gRPC server and register the RateLimitService server;
	// the bucket refills one token every 100µs up to a capacity of 100
	s := grpc.NewServer()
	rls.RegisterRateLimitServiceServer(s, &server{
		bucket: ratelimit.NewBucket(100*time.Microsecond, 100),
	})
	reflection.Register(s)
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```
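To sanity-check the service without going through Envoy, a small gRPC client can call ShouldRateLimit directly. The sketch below assumes the service above is listening on localhost:8089; the descriptor entries are illustrative:

```go
package main

import (
	"log"
	"time"

	pb "github.com/envoyproxy/go-control-plane/envoy/api/v2/ratelimit"
	rls "github.com/envoyproxy/go-control-plane/envoy/service/ratelimit/v2"
	"golang.org/x/net/context"
	"google.golang.org/grpc"
)

func main() {
	// connect to the rate limit service started above
	conn, err := grpc.Dial("localhost:8089", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("failed to dial: %v", err)
	}
	defer conn.Close()

	client := rls.NewRateLimitServiceClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// send the same kind of request Envoy would send for the demo route
	resp, err := client.ShouldRateLimit(ctx, &rls.RateLimitRequest{
		Domain: "envoy",
		Descriptors: []*pb.RateLimitDescriptor{
			{Entries: []*pb.RateLimitDescriptor_Entry{
				{Key: "destination_cluster", Value: "cost"},
			}},
		},
	})
	if err != nil {
		log.Fatalf("ShouldRateLimit failed: %v", err)
	}
	log.Printf("overall code: %v", resp.OverallCode)
}
```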
For the complete project, see the GitHub repository.
PS:
- The token bucket algorithm is used for rate limiting. Token bucket and leaky bucket have the same effect but work in opposite directions, and the token bucket is easier to understand. Over time, the system adds tokens to the bucket at a constant interval of 1/QPS (every 10 ms if QPS=100). Once the bucket is full, no more tokens are added. When a new request arrives, it takes one token, and is blocked or denied service if no token is available. (A minimal sketch follows this list.)
- Note that this implementation is a single point of failure.
- A Dockerfile is included in the code repository, so you can build an image and test it yourself.
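To make the algorithm concrete, here is a minimal, self-contained token bucket sketch in Go (a toy illustration of the idea, not the juju/ratelimit implementation used above):

```go
package main

import (
	"fmt"
	"time"
)

// tokenBucket is a toy token bucket: refill() adds one token per
// fillInterval up to capacity, and take() consumes one if available.
type tokenBucket struct {
	tokens       chan struct{}
	fillInterval time.Duration
}

func newTokenBucket(fillInterval time.Duration, capacity int) *tokenBucket {
	b := &tokenBucket{
		tokens:       make(chan struct{}, capacity),
		fillInterval: fillInterval,
	}
	go b.refill()
	return b
}

func (b *tokenBucket) refill() {
	for range time.Tick(b.fillInterval) {
		select {
		case b.tokens <- struct{}{}: // add a token if the bucket is not full
		default: // bucket full: discard the token
		}
	}
}

// take reports whether a token was available (i.e. the request is allowed).
func (b *tokenBucket) take() bool {
	select {
	case <-b.tokens:
		return true
	default:
		return false
	}
}

func main() {
	// 100 QPS: one token every 10 ms, burst capacity of 10
	b := newTokenBucket(10*time.Millisecond, 10)
	time.Sleep(50 * time.Millisecond) // let a few tokens accumulate
	for i := 0; i < 8; i++ {
		fmt.Printf("request %d allowed: %v\n", i, b.take())
	}
}
```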
Conclusion
This post briefly describes Envoy's rate limit feature, provides a configuration file for global rate limiting, and implements a simple token-bucket-based rate limiter. I hope it helps you understand how Envoy's rate limit filter works with the gRPC protocol.