1. Background of Hystrix
As business complexity grows and systems are split into more and more services, a single user-facing API may sit on top of many layers of nested RPC calls, and the call chain can become very long. This causes the following problems:
- Reduced API availability
Take the official Hystrix example: suppose an application served by Tomcat depends on 30 services, each with a very high availability of 99.99%. The availability of the whole application is then 99.99% to the 30th power, about 99.7%, which means a failure rate of 0.3%.
So out of every 100 million requests, 300,000 fail. In terms of time, 0.3% of a 30-day month (720 hours) is more than 2 hours of downtime each month.
- Service fusing
To address the problem above, the idea of service fusing was proposed. Like a fuse in the real world, when an abnormal condition is detected the call to the service is cut off immediately instead of waiting for it to time out.
The trigger condition can vary by scenario, for example counting the number of failed calls within a time window.
- Service degradation
Fusing goes hand in hand with degradation. Degradation means that once a service has been fused, the server is no longer called. Instead, the client provides a local fallback and returns a default value.
The service level drops, but a degraded response is better than no response at all. Whether this is acceptable depends on the business scenario.
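The fusing and degradation ideas above can be sketched with the JDK alone. The following is a minimal illustration, not Hystrix's actual algorithm: the class name SimpleBreaker, its parameters, and the explicit nowMillis argument are all assumptions made for this example. It counts failed calls inside a time window, stops calling the server once the count reaches a threshold, and returns a local default value instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Illustrative only: a tiny "fuse" that blows after too many failures in a time window. */
final class SimpleBreaker {
    private final int failureThreshold;  // failures inside the window that blow the fuse
    private final long windowMillis;     // length of the counting window
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    SimpleBreaker(int failureThreshold, long windowMillis) {
        this.failureThreshold = failureThreshold;
        this.windowMillis = windowMillis;
    }

    /** Runs the remote call, or returns the fallback when the fuse is blown or the call fails. */
    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback, long nowMillis) {
        if (isOpen(nowMillis)) {
            return fallback.get();       // fused: the server is not called at all
        }
        try {
            return remoteCall.get();
        } catch (RuntimeException e) {
            recordFailure(nowMillis);
            return fallback.get();       // degrade to the local default value
        }
    }

    /** True while the number of recent failures is at or above the threshold. */
    synchronized boolean isOpen(long nowMillis) {
        // Forget failures that have slid out of the window.
        while (!failureTimes.isEmpty() && nowMillis - failureTimes.peekFirst() >= windowMillis) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() >= failureThreshold;
    }

    private synchronized void recordFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
    }

    public static void main(String[] args) {
        SimpleBreaker breaker = new SimpleBreaker(2, 1_000L);
        Supplier<String> brokenService = () -> { throw new IllegalStateException("service down"); };
        for (int i = 0; i < 3; i++) {
            String result = breaker.call(brokenService, () -> "default value", System.currentTimeMillis());
            System.out.println(result);
        }
    }
}
```

Passing the clock value in as a parameter keeps the sketch easy to test. A real breaker would also need a way to probe whether the service has recovered, which this example leaves to the window simply expiring.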
Microservice high-concurrency scenario
Reduce the runtime coupling between services (not their business dependency) and prevent the service avalanche effect by applying service degradation, fusing, and rate limiting.
- Service avalanche effect: when a service suddenly receives a burst of concurrent requests that the Tomcat server cannot handle, requests pile up, which may make other services unavailable as well.
- Fault tolerance: how errors are handled when a service becomes unavailable.
2. What Hystrix does
What is Hystrix?
Hystrix is a service-protection framework for microservices. It is Netflix's open-source latency and fault tolerance library for distributed systems, used to isolate failures of remote services. It provides thread and semaphore isolation to reduce the interference caused by resource contention between services, a graceful degradation (fallback) mechanism, and a circuit breaker mechanism that keeps services available. Together these mechanisms prevent cascading failures and keep the system resilient and available.
What Hystrix provides
- Service protection: protect services when requests pile up.
- Service isolation: make sure services do not affect one another, using semaphores and thread pools.
- Service degradation: when a service is unavailable, the caller does not keep waiting and a friendly message is returned immediately.
Main purpose: a good user experience and rate limiting under high concurrency.
- Service fusing: when the service reaches its maximum capacity, further access is refused outright and the degradation path is invoked, returning a friendly message so that the server does not go down.
Main purpose: to protect the service.
- Request pile-up: suppose Tomcat's thread pool is capped at 50 threads. The 51st request is blocked, and under load a large number of requests end up waiting. If too many requests pile up, the server may be paralyzed.
Underneath, Tomcat is HTTP plus a thread pool, and each request is handled by its own thread.
2.1 Why does the avalanche effect occur?
Underneath, Tomcat relies on a thread pool that handles all incoming requests. Assuming the pool creates at most 50 threads, any requests beyond 50 have to wait.
How to check whether requests share the same thread pool?
Print the name of the current thread: it consists of the thread pool name plus a thread ID.
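As a rough illustration of that check, the sketch below builds two pools whose threads are named after the pool and prints which thread ran each task. The pool names and the ThreadNameDemo class are made up for this example. In a Tomcat-based application the same Thread.currentThread().getName() call inside a controller typically shows names of the form http-nio-8080-exec-N, but that depends on the connector configuration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadNameDemo {
    /** Builds a fixed pool whose threads are named "<poolName>-<n>". */
    static ExecutorService namedPool(String poolName, int size) {
        AtomicInteger counter = new AtomicInteger(1);
        ThreadFactory factory = task -> new Thread(task, poolName + "-" + counter.getAndIncrement());
        return Executors.newFixedThreadPool(size, factory);
    }

    /** Returns the name of the pool thread that executed the task. */
    static String runAndGetThreadName(ExecutorService pool) {
        try {
            return pool.submit(() -> Thread.currentThread().getName()).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService orderPool = namedPool("order-pool", 2);
        ExecutorService memberPool = namedPool("member-pool", 2);
        try {
            System.out.println(runAndGetThreadName(orderPool));   // prints "order-pool-1"
            System.out.println(runAndGetThreadName(memberPool));  // prints "member-pool-1"
        } finally {
            orderPool.shutdown();
            memberPool.shutdown();
        }
    }
}
```

If two endpoints print thread names with the same pool prefix, they are served by the same pool.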
Service avalanche effect
Requests pile up in one shared thread pool. If all of its threads are busy with requests to a single slow service, no thread is left to accept requests for the other services, and the failure spreads. This is the service avalanche effect.
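The pile-up can be reproduced in miniature with a plain JDK executor standing in for Tomcat's pool. This is only a sketch under that assumption: SharedPoolDemo and its method are hypothetical names, and the slow service is simulated with a latch that is never released while the fast request waits.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SharedPoolDemo {
    /** Returns true if a fast request got stuck behind slow requests in the shared pool. */
    static boolean fastRequestBlocked(int poolSize) {
        ExecutorService sharedPool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch slowServiceDone = new CountDownLatch(1);
        try {
            // Requests to the slow service occupy every thread in the pool.
            for (int i = 0; i < poolSize; i++) {
                sharedPool.submit(() -> { slowServiceDone.await(); return "slow"; });
            }
            // A request to a different, fast service now has no thread to run on.
            Future<String> fastRequest = sharedPool.submit(() -> "fast");
            try {
                fastRequest.get(200, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true;  // still queued behind the slow requests
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            slowServiceDone.countDown();  // let the slow requests finish
            sharedPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast request blocked: " + fastRequestBlocked(2));
    }
}
```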
2.2 Service isolation
In a typical Tomcat deployment, multiple HTTP services share one thread pool. Suppose the database behind one HTTP service responds very slowly, so that service's response time grows. Most threads then block waiting for the data to come back, until that one service occupies the entire Tomcat thread pool or even brings down the whole Tomcat instance. If different HTTP services are isolated into different thread pools, one service filling up its pool will not cause catastrophic failures in the others. This is what thread isolation and semaphore isolation are for.
The purpose of thread isolation or semaphore isolation is to allocate a fixed share of resources to each service. When a service has used up its own share, its calls fail immediately instead of taking resources that belong to other services.
Service isolation means that service interfaces do not affect one another.
There are two ways for Hystrix to implement service isolation:
- Thread pool mode: each service interface has its own thread pool that manages and runs its requests.
Advantages: complete isolation, suitable for high concurrency and for preventing the avalanche effect. Disadvantage: high CPU and memory overhead.
- Counter mode: an atomic counter under the hood gives each service its own concurrency limit.
For example, each service interface may be accessed by at most 50 requests at the same time. Beyond 50 requests, you implement the rejection policy yourself.
For example, suppose OrderController exposes two services, orderIndex and findIndex.
In thread pool mode each of them has its own thread pool, so blocking one service does not make the other wait.
The price is that, with one thread pool per service, the CPU overhead is high.
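A minimal JDK-only sketch of this idea follows. It does not use Hystrix: two plain executors play the role of the per-service thread pools, and the class and method names are invented for the example. One pool is deliberately kept busy, and the other still answers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class IsolatedPoolsDemo {
    /** Calls findIndex on its own pool while orderIndex's pool is completely occupied. */
    static String callFindIndexWhileOrderIndexIsStuck() {
        ExecutorService orderIndexPool = Executors.newFixedThreadPool(1);
        ExecutorService findIndexPool = Executors.newFixedThreadPool(1);
        CountDownLatch orderIndexDone = new CountDownLatch(1);
        try {
            // orderIndex is slow and takes the only thread of its pool.
            orderIndexPool.submit(() -> { orderIndexDone.await(); return "orderIndex"; });
            // findIndex runs on a different pool, so it is not affected.
            return findIndexPool.submit(() -> "findIndex answered").get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            orderIndexDone.countDown();
            orderIndexPool.shutdown();
            findIndexPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callFindIndexWhileOrderIndexIsStuck());  // prints "findIndex answered"
    }
}
```

Every extra pool brings its own threads, which is where the additional CPU and memory cost mentioned above comes from.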
2.3 Service degradation
When a service is unavailable (for example the service is busy, the network is slow, or the server responds slowly), the client would otherwise keep waiting. Instead, an error message should be returned to the client right away: rather than waiting, call the fallback method to report that the service is currently unavailable.
The purpose of service degradation is to improve the user experience and prevent the avalanche effect.
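One way to picture degradation without any framework is a call with a time limit and a default answer. The sketch below is an assumption-laden illustration, not how Hystrix implements fallbacks: DegradeDemo, callWithFallback, and the timeout value are hypothetical, and a fresh single-thread executor per call is used only to keep the example short.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class DegradeDemo {
    /** Returns the remote result, or the fallback if the call is too slow or fails. */
    static String callWithFallback(Callable<String> remoteCall, long timeoutMillis, String fallback) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = worker.submit(remoteCall);
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return fallback;                     // do not keep the client waiting
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt status
            return fallback;
        } finally {
            worker.shutdownNow();                // interrupts a call that is still running
        }
    }

    public static void main(String[] args) {
        String slow = callWithFallback(() -> { Thread.sleep(2_000); return "member data"; },
                100, "service unavailable, please try again later");
        String fast = callWithFallback(() -> "member data", 100, "service unavailable");
        System.out.println(slow);
        System.out.println(fast);
    }
}
```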
2.4 Service fusing
Service fusing in microservices is triggered by too many requests (high concurrency). Set a limit, for example at most 100 threads serving requests at the same time. Requests above the limit are placed in a waiting queue. Once the queue is full, further requests are refused and never reach the service.
Service fusing and service degradation are used together.
Service fusing prevents the service from going down and so protects it.
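The limit-plus-queue behaviour described here maps naturally onto a bounded JDK ThreadPoolExecutor. The following sketch uses much smaller numbers than the 100 threads in the text so the effect is easy to see. RejectingPoolDemo and countRejected are names invented for this example, and a caller would normally run its fallback where the sketch merely counts rejections.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectingPoolDemo {
    /** Submits blocking requests and returns how many were refused. */
    static int countRejected(int threads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),      // bounded waiting queue
                new ThreadPoolExecutor.AbortPolicy());    // refuse when threads and queue are full
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();              // simulate a request that takes a long time
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                           // this is where a fallback would be returned
                }
            }
            return rejected;
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 2 requests run, 1 waits in the queue, the remaining 2 are refused.
        System.out.println(countRejected(2, 1, 5));  // prints 2
    }
}
```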
3. Using Hystrix
Introduction
Start the service and test it with JMeter:
1. Create a thread group:
2. Create an HTTP request:
Test:
Click start, and then open this URL in the browser: http://127.0.0.1:8080/order/findOrderIndex
As you can see, the request hangs for a long time because of the high concurrency.
Using Hystrix
Import the dependency:
<!-- Hystrix dependency -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
4.1 Mode 1: thread pool mode
Advantages:
- Thread pool isolation completely isolates the third-party application, and request threads are handed back quickly;
- The request thread can go on to accept new requests. If something goes wrong, the problem stays inside the isolated thread pool and does not affect other applications;
- When the failed application becomes available again, the thread pool clears up and recovers immediately, without a long recovery period;
- Independent thread pools improve concurrency.
Disadvantages:
The main disadvantage of thread pool isolation is the added computational (CPU) overhead: each command runs on a separate thread, which involves queuing, scheduling, and context switching.
Create OrderHystrixCommand
package com.snow.order.hystrix;

import com.alibaba.fastjson.JSONObject;
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixThreadPoolKey;
import com.netflix.hystrix.HystrixThreadPoolProperties;
import com.snow.order.service.MemberService;

/**
 * Service isolation using a thread pool.
 * Extends HystrixCommand<JSONObject>; the generic type must match the
 * return type used in the controller, which is JSONObject.
 */
public class OrderHystrixCommand extends HystrixCommand<JSONObject> {

    // Passed in through the constructor. The command is created with "new",
    // so an @Autowired field would not be injected by Spring here.
    private final MemberService memberService;

    public OrderHystrixCommand(MemberService memberService) {
        super(setter());
        this.memberService = memberService;
    }

    @Override
    protected JSONObject run() throws Exception {
        JSONObject member = memberService.getMember();
        System.out.println("Current thread name:" + Thread.currentThread().getName()
                + ",Order service calls member service:member:" + member);
        return member;
    }

    /**
     * Configures the isolation mechanism of the service.
     */
    private static Setter setter() {
        // Service grouping
        HystrixCommandGroupKey groupKey = HystrixCommandGroupKey.Factory.asKey("orders");
        // Service identification
        HystrixCommandKey commandKey = HystrixCommandKey.Factory.asKey("order");
        // Thread pool name (specified because each service has its own thread pool)
        HystrixThreadPoolKey threadPoolKey = HystrixThreadPoolKey.Factory.asKey("order-pool");

        // Thread pool configuration: core size 10, keep-alive time 15 minutes
        // (withKeepAliveTimeMinutes takes minutes, not seconds), queue rejection
        // threshold 100. Requests beyond the threshold are rejected.
        // Note: the threshold only applies when maxQueueSize is positive. With the
        // default of -1 Hystrix uses a SynchronousQueue and nothing is queued.
        HystrixThreadPoolProperties.Setter threadPoolProperties = HystrixThreadPoolProperties.Setter()
                .withCoreSize(10)
                .withKeepAliveTimeMinutes(15)
                .withQueueSizeRejectionThreshold(100);

        // Command properties
        HystrixCommandProperties.Setter commandProperties = HystrixCommandProperties.Setter()
                // Use a thread pool to implement service isolation
                .withExecutionIsolationStrategy(HystrixCommandProperties.ExecutionIsolationStrategy.THREAD)
                // Disable the execution timeout
                .withExecutionTimeoutEnabled(false);

        return HystrixCommand.Setter.withGroupKey(groupKey)
                .andCommandKey(commandKey)
                .andThreadPoolKey(threadPoolKey)
                .andThreadPoolPropertiesDefaults(threadPoolProperties)
                .andCommandPropertiesDefaults(commandProperties);
    }

    @Override
    protected JSONObject getFallback() {
        // Runs when the circuit is open, the request is rejected, or run() fails:
        // the service is treated as unavailable and a default response is returned.
        System.out.println("System error!");
        JSONObject jsonObject = new JSONObject();
        jsonObject.put("code", 500);
        jsonObject.put("msg", "System error!");
        return jsonObject;
    }
}
Used in OrderController
/**
 * Solves the service avalanche effect by isolating the call in a Hystrix thread pool.
 */
@RequestMapping("/orderIndexHystrix")
public Object orderIndexHystrix() throws InterruptedException {
    return new OrderHystrixCommand(memberService).execute();
}
Test
Running the JMeter test again shows that a request to http://127.0.0.1:8080/order/findOrderIndex no longer has to wait, because the call wrapped in OrderHystrixCommand runs in its own thread pool.
4.2 Mode 2: semaphore
An atomic counter (or semaphore) records how many threads are currently running. When a request comes in, the counter is checked first. If the number of running threads has reached the configured maximum, the request is rejected. Otherwise the request passes and the counter is incremented by 1. When the request returns, the counter is decremented by 1.
The biggest difference from thread pool isolation is that the dependent code is still executed on the request thread.
The semaphore size can be adjusted dynamically, whereas the thread pool size cannot.
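The counter logic described above can be sketched with java.util.concurrent.Semaphore. This is a simplified stand-in rather than Hystrix's own implementation, and SemaphoreGuard is a name made up for the example. Note that the dependency runs on the calling thread, and no extra thread is created.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class SemaphoreGuard {
    private final Semaphore permits;

    SemaphoreGuard(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    /** Runs the dependency on the caller's thread, or the fallback if the limit is reached. */
    <T> T call(Supplier<T> dependency, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {   // limit reached: reject without waiting
            return fallback.get();
        }
        try {
            return dependency.get();   // a permit is held while the call runs (counter + 1)
        } finally {
            permits.release();         // give the permit back (counter - 1)
        }
    }

    public static void main(String[] args) {
        SemaphoreGuard guard = new SemaphoreGuard(1);
        // While the outer call holds the only permit, the nested call is rejected.
        String nested = guard.call(
                () -> guard.call(() -> "inner call ran", () -> "inner call rejected"),
                () -> "outer call rejected");
        System.out.println(nested);  // prints "inner call rejected"
    }
}
```

Because tryAcquire never blocks, a rejected request costs almost nothing, which is the main appeal of this mode compared with a dedicated thread pool.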
Create OrderHystrixCommand2
package com.snow.order.hystrix;

import com.alibaba.fastjson.JSONObject;
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.snow.order.service.MemberService;

/**
 * Service isolation using a semaphore.
 */
public class OrderHystrixCommand2 extends HystrixCommand<JSONObject> {

    // Passed in through the constructor. The command is created with "new",
    // so an @Autowired field would not be injected by Spring here.
    private final MemberService memberService;

    public OrderHystrixCommand2(MemberService memberService) {
        super(setter());
        this.memberService = memberService;
    }

    @Override
    protected JSONObject run() throws Exception {
        JSONObject member = memberService.getMember();
        System.out.println("Current thread name:" + Thread.currentThread().getName()
                + ",Order service calls member service:member:" + member);
        return member;
    }

    private static Setter setter() {
        // Service grouping
        HystrixCommandGroupKey groupKey = HystrixCommandGroupKey.Factory.asKey("members");
        // Command properties: use semaphore isolation
        HystrixCommandProperties.Setter commandProperties = HystrixCommandProperties.Setter()
                .withExecutionIsolationStrategy(HystrixCommandProperties.ExecutionIsolationStrategy.SEMAPHORE)
                // The semaphore tracks how many requests are currently running. A request
                // arriving when 50 are already in flight is rejected. Otherwise it passes,
                // taking one permit and returning it when the request completes.
                .withExecutionIsolationSemaphoreMaxConcurrentRequests(50);
        return HystrixCommand.Setter.withGroupKey(groupKey)
                .andCommandPropertiesDefaults(commandProperties);
    }

    @Override
    protected JSONObject getFallback() {
        // Runs when the circuit is open, the request is rejected, or run() fails:
        // the service is treated as unavailable and a default response is returned.
        System.out.println("System error!");
        JSONObject jsonObject = new JSONObject();
        jsonObject.put("code", 500);
        jsonObject.put("msg", "System error!");
        return jsonObject;
    }
}
Used in OrderController
/**
 * Solves the service avalanche effect by isolating the call with a Hystrix semaphore.
 */
@RequestMapping("/orderIndexHystrix2")
public Object orderIndexHystrix2() throws InterruptedException {
    return new OrderHystrixCommand2(memberService).execute();
}
Test
Running the JMeter test again shows that a request to http://127.0.0.1:8080/order/findOrderIndex no longer has to wait, because at most 50 concurrent requests are let through to the member service and the rest receive the fallback immediately.
4.3 Application scenarios
Thread pool isolation:
- Third-party applications or interfaces
- High concurrency
Semaphore isolation:
- Internal applications or middleware (Redis)
- Low concurrency requirements