Introduction: The MSE microservice engine is launching a professional edition of service governance, an out-of-the-box, complete microservice governance solution that helps enterprises build microservice governance capabilities. If your system can quickly gain the complete full-link grayscale capability described in this article, and build further microservice governance practices on top of it, you will not only save considerable manpower and cost but also explore microservices with more confidence.
Author: Shi Mian
Review & proofreading: Wangtao, Yizhan
Editor & typesetter: Wen Yan
During this year's Double 11, cloud-native middleware completed the three-in-one integration of open source, in-house development, and commercialization, and was fully upgraded to middleware cloud products. Through Dubbo 3.0, MSE microservice governance supported the traffic peak of Alibaba Group's core Double 11 business. To date, 50% of users within the group are already using MSE microservice governance for their HSF and Dubbo 3.0 applications. Today, let's talk about the full-link grayscale capability in MSE service governance professional edition and some scenarios from its large-scale production practice.
Background
In a microservice architecture, developing a requirement often changes several microservices on the same call link at once. Grayscale release is needed to control the risk and blast radius of launching the new versions. Typically, each microservice has a gray environment or group that receives gray traffic. We want traffic that enters the upstream gray environment to also enter the downstream gray environment, so that a request always stays within the gray environment. Even if some microservices on the call link have no gray environment, requests leaving those applications should still return to the gray environment when they call downstream. With the full-link grayscale capability provided by MSE, you can achieve all of this without modifying any business code.
Features of MSE microservice governance full-link grayscale
As the flagship feature of MSE service governance professional edition, full-link grayscale has the following six characteristics:
- Fine-grained traffic can be introduced through custom rules
In addition to introducing traffic by proportion, we also support introducing Spring Cloud and Dubbo traffic by rule. Spring Cloud traffic can be introduced according to the request's cookie, header, or param values, or by random percentage; Dubbo traffic can be introduced according to service, method, and parameters.
- Full-link isolated traffic lanes
1) The required traffic is "dyed" by setting traffic rules, and "dyed" traffic is routed to the gray machines.
2) Gray traffic carries its gray tag downstream, forming a traffic lane exclusive to the gray environment. For applications without a gray environment, the untagged baseline environment is selected by default.
- End-to-end stable baseline environment
Untagged applications belong to the stable baseline version, that is, the stable online environment. When we release gray code, we can configure rules that direct specific online traffic to it, which controls the risk of the gray code.
- One-click dynamic traffic switching
Once traffic rules are defined, they can be started and stopped with one click, and added, deleted, modified, and queried as needed, all taking effect in real time. This makes steering gray traffic more convenient.
- Low-cost access based on Java Agent technology, without modifying a line of business code
MSE microservice governance is implemented with Java Agent bytecode enhancement. It seamlessly supports all Spring Cloud and Dubbo versions released in the past five years. Users can adopt it without changing a line of code or the existing business architecture, and can enable or disable it at any time without lock-in. Just enable MSE microservice governance professional edition and configure it online; the configuration takes effect in real time.
- Lossless online and offline capability for smoother releases
Once MSE microservice governance is enabled, applications gain lossless online and offline capability, so traffic is not lost during release, rollback, scale-out, and scale-in under heavy traffic.
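To make the first feature more concrete, here is a minimal sketch of how a "dyeing" rule for HTTP traffic could be evaluated. It is not MSE's implementation: `Rule`, `should_dye`, and the layout of the request dictionary are hypothetical names chosen for illustration, and Python is used only for brevity.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical gray rule: match one request attribute, or use a percentage."""
    source: Optional[str] = None  # "cookie", "header" or "param"; None = no attribute match
    key: str = ""
    value: str = ""
    percentage: int = 0           # 0-100, used only when no attribute match is configured


def should_dye(request: dict, rule: Rule, rng: random.Random) -> bool:
    """Return True if the request should be marked ("dyed") as gray traffic."""
    if rule.source is not None:
        # Attribute-based rule: exact match on a cookie, header or query parameter.
        return request.get(rule.source, {}).get(rule.key) == rule.value
    # Percentage-based rule: dye roughly `percentage` percent of requests.
    return rng.randrange(100) < rule.percentage


request = {"header": {"x-user-id": "100"}, "cookie": {}, "param": {}}
header_rule = Rule(source="header", key="x-user-id", value="100")
print(should_dye(request, header_rule, random.Random(0)))
```

A real rule engine would also need to handle Dubbo service, method, and parameter matching, which this sketch leaves out.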
Scenarios from large-scale production practice
This article introduces several common full-link grayscale schemes that MSE microservice governance has distilled from production practice while supporting key customers.
Scenario 1: automatically dye the traffic passing through a machine to achieve full-link grayscale
- After a request enters a tagged node, subsequent calls prefer nodes with the same tag; in other words, traffic passing through a tagged node is dyed.
- If a tagged call link cannot find a node with the same tag, it falls back to an untagged node.
- If a tagged call link passes through an untagged node and later calls an application that has a tagged node, tagged routing resumes.
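The three routing behaviours above can be sketched in a few lines. This is only an illustration under stated assumptions: `Instance`, `pick_candidates`, `trace`, and the in-memory `registry` are hypothetical stand-ins for the registry and the agent's routing logic, not MSE APIs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instance:
    app: str
    tag: Optional[str]  # None marks an untagged (baseline) node


def pick_candidates(instances: List[Instance], request_tag: Optional[str]) -> List[Instance]:
    """Prefer nodes carrying the request's tag; fall back to untagged nodes."""
    if request_tag is not None:
        same_tag = [i for i in instances if i.tag == request_tag]
        if same_tag:
            return same_tag
    # No tag on the request, or no node with that tag: use the baseline.
    return [i for i in instances if i.tag is None]


# Hypothetical registry: B has no gray node, A and C do.
registry = {
    "A": [Instance("A", None), Instance("A", "gray")],
    "B": [Instance("B", None)],
    "C": [Instance("C", None), Instance("C", "gray")],
}


def trace(request_tag: Optional[str]) -> List[Instance]:
    # The tag travels with the request, so it is still available after an untagged hop.
    return [pick_candidates(registry[app], request_tag)[0] for app in ("A", "B", "C")]


print([(i.app, i.tag) for i in trace("gray")])
# [('A', 'gray'), ('B', None), ('C', 'gray')]
```

The request falls back to the baseline node of B and returns to the gray node at C, because the tag is a property of the request rather than of the node that handled the previous hop.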
Scenario 2: achieve full-link grayscale by adding a specific header to the traffic
The client adds an identifier for the development environment to the request, and the access layer forwards the request to the gateway for that environment according to the identifier. That gateway uses an isolation plug-in to call the services of the matching project isolation environment, so the request stays in a closed loop within that isolation environment.
Scenario 3: full-link grayscale through custom routing rules
A specified header is added to gray requests, and the entire call link passes this header through. You then only need to configure routing rules for that header in the relevant applications, so that gray requests carrying the header enter the gray machines. This gives you full-link traffic grayscale on demand.
Practice of full-link grayscale
How can you quickly obtain the same full-link grayscale capability? Next, I will take you from 0 to 1 and quickly build it.
We assume that the application architecture consists of Ingress-nginx and a back-end microservice architecture (Spring Cloud). The back-end call link has three hops: shopping cart (A), trading center (B), and inventory center (C). The services perform service discovery through a Nacos registry, and end users access the back-end services through a client app or an H5 page.
Prerequisites
Install the Ingress-nginx component
Open the Container Service console, go to the application catalog, search for ack-ingress-nginx, select the kube-system namespace, and click Create. After installation, a deployment named ack-ingress-nginx-default-controller appears in the kube-system namespace, which indicates that the installation succeeded.
$ kubectl get deployment -n kube-system
NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
ack-ingress-nginx-default-controller   2/2     2            2           18h
Enable MSE microservice governance professional edition
- Enable MSE microservice governance professional edition to use the full-link grayscale capability.
- Open the Container Service console, go to the application catalog, search for ack-mse-pilot, and click Create.
- In the MSE service governance console, open the K8s cluster list, select the corresponding cluster and namespace, and enable microservice governance.
Deploy the demo applications
Save the following content to ingress-gray.yaml and run kubectl apply -f ingress-gray.yaml to deploy the applications. Here we deploy three applications, A, B, and C, each with a baseline version and a gray version.
# A application base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-a
  name: spring-cloud-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-a
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-a
      labels:
        app: spring-cloud-a
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-a
        ports:
        - containerPort: 20001
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
        livenessProbe:
          tcpSocket:
            port: 20001
          initialDelaySeconds: 10
          periodSeconds: 30
# A application gray version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-a-new
  name: spring-cloud-a-new
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-a-new
  strategy:
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-a
      labels:
        app: spring-cloud-a-new
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        - name: profiler.micro.service.tag.trace.enable
          value: "true"
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-a-new
        ports:
        - containerPort: 20001
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
        livenessProbe:
          tcpSocket:
            port: 20001
          initialDelaySeconds: 10
          periodSeconds: 30
# B application base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-b
  name: spring-cloud-b
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-b
  strategy:
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-b
      labels:
        app: spring-cloud-b
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-b
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
        livenessProbe:
          tcpSocket:
            port: 20002
          initialDelaySeconds: 10
          periodSeconds: 30
# B application gray version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-b-new
  name: spring-cloud-b-new
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-b-new
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-b
      labels:
        app: spring-cloud-b-new
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-b-new
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
        livenessProbe:
          tcpSocket:
            port: 20002
          initialDelaySeconds: 10
          periodSeconds: 30
# C application base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-c
  name: spring-cloud-c
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-c
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-c
      labels:
        app: spring-cloud-c
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-c:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-c
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
        livenessProbe:
          tcpSocket:
            port: 20003
          initialDelaySeconds: 10
          periodSeconds: 30
# C application gray version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-c-new
  name: spring-cloud-c-new
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-c-new
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-c
      labels:
        app: spring-cloud-c-new
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-c:0.1-SNAPSHOT
        imagePullPolicy: IfNotPresent
        name: spring-cloud-c-new
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
        livenessProbe:
          tcpSocket:
            port: 20003
          initialDelaySeconds: 10
          periodSeconds: 30
# Nacos Server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nacos-server
  name: nacos-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nacos-server
  template:
    metadata:
      labels:
        app: nacos-server
    spec:
      containers:
      - env:
        - name: MODE
          value: standalone
        image: nacos/nacos-server:latest
        imagePullPolicy: Always
        name: nacos-server
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
      dnsPolicy: ClusterFirst
      restartPolicy: Always
# Nacos Server Service configuration
---
apiVersion: v1
kind: Service
metadata:
  name: nacos-server
spec:
  ports:
  - port: 8848
    protocol: TCP
    targetPort: 8848
  selector:
    app: nacos-server
  type: ClusterIP
Hands-on practice
Scenario 1: automatically dye the traffic passing through a machine to achieve full-link grayscale
Sometimes the online baseline and gray environments can be distinguished by domain name, with the gray environment having its own configurable domain name. Suppose we reach the gray environment by visiting www.gray.com and the baseline environment by visiting www.base.com.
The call link is Ingress-nginx -> A -> B -> C, where A can be a Spring Boot application.
Note: for both the gray and base environments of entry application A, you need to turn on application A's "pass through by traffic ratio" switch in the MSE service governance console. This enables passing the current environment's tag downstream. With the switch on, when Ingress-nginx routes a request to the gray version of A, the x-mse-tag header is added automatically even if the request carries no such header, and its value comes from the tag configured for application A. If the original request already contains x-mse-tag:gray, the tag in the original request takes precedence.
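The precedence described in this note can be written down as a small function. This is a sketch of the described behaviour only, not the agent's code: `outgoing_headers` and its parameters are hypothetical, and header names are compared case-sensitively for simplicity even though HTTP header names are case-insensitive.

```python
from typing import Dict, Optional

TAG_HEADER = "x-mse-tag"  # header name taken from the article


def outgoing_headers(incoming: Dict[str, str], local_tag: Optional[str],
                     pass_through_enabled: bool) -> Dict[str, str]:
    """Headers the entry application would send downstream under the described behaviour."""
    headers = dict(incoming)
    if TAG_HEADER in headers:
        # A tag in the original request takes precedence.
        return headers
    if pass_through_enabled and local_tag is not None:
        # No tag on the request: stamp it with this node's own environment tag.
        headers[TAG_HEADER] = local_tag
    return headers


# Gray node of A, switch on, request without a tag: the gray tag is added.
print(outgoing_headers({}, local_tag="gray", pass_through_enabled=True))
# Base node of A, request already tagged: the request's tag is kept.
print(outgoing_headers({"x-mse-tag": "gray"}, local_tag=None, pass_through_enabled=True))
```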
For entry application A, configure two K8s Services: spring-cloud-a-base for the base version of A, and spring-cloud-a-gray for the gray version of A.
apiVersion: v1
kind: Service
metadata:
  name: spring-cloud-a-base
spec:
  ports:
  - name: http
    port: 20001
    protocol: TCP
    targetPort: 20001
  selector:
    app: spring-cloud-a
---
apiVersion: v1
kind: Service
metadata:
  name: spring-cloud-a-gray
spec:
  ports:
  - name: http
    port: 20001
    protocol: TCP
    targetPort: 20001
  selector:
    app: spring-cloud-a-new
Configure the entry Ingress rules so that www.base.com routes to the base version of application A and www.gray.com routes to the gray version of application A.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-base
spec:
  rules:
  - host: www.base.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-base
          servicePort: 20001
        path: /
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-gray
spec:
  rules:
  - host: www.gray.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-gray
          servicePort: 20001
        path: /
Result verification
Visiting www.base.com now routes to the baseline environment:
curl -H"Host:www.base.com" http://106.14.155.223/a
A[172.18.144.155] -> B[172.18.144.120] -> C[172.18.144.79]
Visiting www.gray.com now routes to the gray environment:
curl -H"Host:www.gray.com" http://106.14.155.223/a
Agray[172.18.144.160] -> Bgray[172.18.144.57] -> Cgray[172.18.144.157]
Going further, suppose entry application A has no gray environment, so requests hit the base environment of A, but they need to enter the gray environment at the A -> B hop. You can do this by adding the special header x-mse-tag, whose value is the tag of the environment you want to reach, for example gray.
curl -H"Host:www.base.com" -H"x-mse-tag:gray" http://106.14.155.223/a
A[172.18.144.155] -> Bgray[172.18.144.139] -> Cgray[172.18.144.8]
You can see that the first hop enters the base environment of A, but at the A -> B hop the request returns to the gray environment.
The advantage of this approach is its simple configuration: rules are configured only at the Ingress. When an application needs a gray release, you just deploy it in the gray environment and gray traffic naturally reaches the gray machines. Once verification passes, the gray image is released to the baseline environment. If several applications need a gray release at the same time, add all of them to the gray environment.
Best practices
- Tag all applications in the gray environment as gray; applications in the baseline environment are untagged by default.
- As a routine practice, steer 2% of online traffic into the gray environment.
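The article does not say how the 2% is selected. One common option, shown here purely as an assumption and not as what MSE or Ingress-nginx does, is to hash a stable request key into 100 buckets so that the same user consistently lands in the same environment. `in_gray_bucket` and the `user-N` keys are hypothetical.

```python
import hashlib


def in_gray_bucket(request_key: str, percent: int = 2) -> bool:
    """Map a stable key (for example a user id) to one of 100 buckets."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    # Keys in the first `percent` buckets are steered to the gray environment.
    return bucket < percent


sample = [f"user-{i}" for i in range(10000)]
share = sum(in_gray_bucket(key) for key in sample) / len(sample)
print(f"share of keys routed to gray: {share:.2%}")
```

Compared with a purely random split, this keeps each user's experience consistent across requests, at the cost of needing a stable key on every request.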
Scenario 2: achieve full-link grayscale by adding a specific header to the traffic
Some clients cannot change the domain name they request. They want to keep visiting www.demo.com and reach the gray environment by passing a different header. For example, as shown in the following figure, adding the header x-mse-tag:gray gives access to the gray environment.
The Ingress rules for the demo are now as follows. Note that several nginx.ingress.kubernetes.io/canary annotations are added:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-base
spec:
  rules:
  - host: www.demo.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-base
          servicePort: 20001
        path: /
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-gray
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "x-mse-tag"
    nginx.ingress.kubernetes.io/canary-by-header-value: "gray"
    nginx.ingress.kubernetes.io/canary-weight: "0"
spec:
  rules:
  - host: www.demo.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-gray
          servicePort: 20001
        path: /
Result verification
Visiting www.demo.com now routes to the baseline environment:
curl -H"Host:www.demo.com" http://106.14.155.223/a
A[172.18.144.155] -> B[172.18.144.56] -> C[172.18.144.156]
How do you access the gray environment? Just add the header x-mse-tag:gray to the request.
curl -H"Host:www.demo.com" -H"x-mse-tag:gray" http://106.14.155.223/a
Agray[172.18.144.82] -> Bgray[172.18.144.57] -> Cgray[172.18.144.8]
You can see that the Ingress routes the request directly to the gray environment of A according to this header.
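The effect of the canary annotations in this scenario can be approximated with the sketch below. It is a simplified model rather than Ingress-nginx code: `routes_to_canary` and `backend_for` are hypothetical names, the cookie-based rule is left out, and header names are compared case-sensitively for brevity.

```python
import random


def routes_to_canary(headers: dict, by_header: str, by_header_value: str,
                     weight: int, rng: random.Random) -> bool:
    """Simplified decision for a canary Ingress using the annotations from the article."""
    if headers.get(by_header) == by_header_value:
        # canary-by-header + canary-by-header-value: an exact match selects the canary.
        return True
    # Any other value is ignored and the decision falls through to canary-weight.
    return rng.randrange(100) < weight


def backend_for(headers: dict) -> str:
    is_canary = routes_to_canary(headers, "x-mse-tag", "gray", weight=0,
                                 rng=random.Random())
    return "spring-cloud-a-gray" if is_canary else "spring-cloud-a-base"


print(backend_for({"x-mse-tag": "gray"}))  # spring-cloud-a-gray
print(backend_for({}))                     # spring-cloud-a-base
```

With the weight set to 0, only requests carrying the matching header value reach the gray Service; everything else stays on the baseline.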
Going further
You can also use Ingress for more complex routing. For example, the client already sends a header, and you want to route on that existing header instead of adding a new one. As shown in the figure below, suppose we want requests whose x-user-id is 100 to enter the gray environment.
You only need to add the following four annotations:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-base
spec:
  rules:
  - host: www.demo.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-base
          servicePort: 20001
        path: /
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-base-gray
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "x-user-id"
    nginx.ingress.kubernetes.io/canary-by-header-value: "100"
    nginx.ingress.kubernetes.io/canary-weight: "0"
spec:
  rules:
  - host: www.demo.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-gray
          servicePort: 20001
        path: /
When a request carries the header and meets the condition, it enters the gray environment:
curl -H"Host:www.demo.com" -H"x-user-id:100" http://106.14.155.223/a
Agray[172.18.144.93] -> Bgray[172.18.144.24] -> Cgray[172.18.144.25]
If the request does not meet the condition, it enters the baseline environment:
curl -H"Host:www.demo.com" -H"x-user-id:101" http://106.14.155.223/a
A[172.18.144.91] -> B[172.18.144.22] -> C[172.18.144.95]
Compared with scenario 1, the advantage is that the client's domain name stays the same and environments are distinguished only by the request itself.
Scenario 3: full-link grayscale through custom routing rules
Sometimes we do not want automatic pass-through and automatic routing. Instead, we want each application along the microservice call chain to define its own gray rules. For example, application B wants only requests that meet its custom rule to be routed to its gray version, while application C may want gray rules that differ from B's. How should this be configured? See the following figure for the scenario:
Note that it is best to clear the parameters configured in scenarios 1 and 2 first.
The first step is to add the environment variable alicloud.service.header=x-user-id to entry application A (preferably to all entry applications, both gray and base). x-user-id is the header that needs to be passed through; the variable identifies this header so that it is passed through automatically.
Note: do not use x-mse-tag here. It is the system's default header and has special logic.
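Conceptually, the first step tells the agent which inbound header to copy onto outbound calls. The sketch below illustrates that idea only; how the agent does this internally is not described in the article, `headers_to_pass_through` is a hypothetical function, and the variable is treated as holding a single header name.

```python
from typing import Dict, Mapping

ENV_KEY = "alicloud.service.header"  # variable name taken from the article


def headers_to_pass_through(incoming: Mapping[str, str],
                            env: Mapping[str, str]) -> Dict[str, str]:
    """Return the headers that would be copied from the inbound to the outbound request."""
    name = env.get(ENV_KEY)
    if name and name in incoming:
        return {name: incoming[name]}
    # Variable not set, or the request does not carry the configured header.
    return {}


env = {ENV_KEY: "x-user-id"}
incoming = {"x-user-id": "100", "accept": "*/*"}
print(headers_to_pass_through(incoming, env))  # {'x-user-id': '100'}
```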
The second step is to configure tag routing rules for the intermediate application B in the MSE console.
The third step is to configure routing rules at the Ingress. Following scenario 2, use the configuration below:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: spring-cloud-a-base
spec:
  rules:
  - host: www.base.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-base
          servicePort: 20001
        path: /
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: 'true'
    nginx.ingress.kubernetes.io/canary-by-header: x-user-id
    nginx.ingress.kubernetes.io/canary-by-header-value: '100'
    nginx.ingress.kubernetes.io/canary-weight: '0'
  name: spring-cloud-a-gray
spec:
  rules:
  - host: www.base.com
    http:
      paths:
      - backend:
          serviceName: spring-cloud-a-gray
          servicePort: 20001
        path: /
Result verification
With the Ingress canary rule in place, a request carrying a qualifying header is routed to the gray environment of B:
curl 120.77.215.62/a -H "Host: www.base.com" -H "x-user-id: 100"
Agray[192.168.86.42] -> Bgray[192.168.74.4] -> C[192.168.86.33]
A request carrying a header that does not meet the condition is routed to the base environment of B:
curl 120.77.215.62/a -H "Host: www.base.com" -H "x-user-id: 101"
A[192.168.86.35] -> B[192.168.73.249] -> C[192.168.86.33]
Now remove the Ingress canary configuration and access the base version of A (the baseline entry application needs the alicloud.service.header environment variable). A request carrying a qualifying header is still routed to the gray environment of B:
curl 120.77.215.62/a -H "Host: www.base.com" -H "x-user-id: 100"
A[192.168.86.35] -> Bgray[192.168.74.4] -> C[192.168.86.33]
Accessing the base environment with a header that does not meet the condition routes to the base environment of B:
curl 120.77.215.62/a -H "Host: www.base.com" -H "x-user-id: 101"
A[192.168.86.35] -> B[192.168.73.249] -> C[192.168.86.33]
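The four results above follow from each application evaluating its own rule against the passed-through header. The sketch below reproduces that reasoning under stated assumptions: `GRAY_RULES` and `call_chain` are hypothetical, B's rule is assumed to be "x-user-id equals 100", C is assumed to have no rule, and whether A is gray is decided upstream by the Ingress.

```python
from typing import Callable, Dict, List, Optional

Headers = Dict[str, str]
Rule = Callable[[Headers], bool]

# Hypothetical per-application gray rules: only B defines one, C defines none.
GRAY_RULES: Dict[str, Optional[Rule]] = {
    "B": lambda h: h.get("x-user-id") == "100",
    "C": None,
}


def call_chain(headers: Headers, a_is_gray: bool) -> List[str]:
    """Return the versions visited for A -> B -> C."""
    path = ["Agray" if a_is_gray else "A"]
    for app in ("B", "C"):
        rule = GRAY_RULES[app]
        # x-user-id is passed through every hop, so each app evaluates its own rule.
        path.append(app + "gray" if rule is not None and rule(headers) else app)
    return path


print(" -> ".join(call_chain({"x-user-id": "100"}, a_is_gray=True)))   # Agray -> Bgray -> C
print(" -> ".join(call_chain({"x-user-id": "100"}, a_is_gray=False)))  # A -> Bgray -> C
print(" -> ".join(call_chain({"x-user-id": "101"}, a_is_gray=False)))  # A -> B -> C
```

Because routing at B does not depend on which version of A handled the request, B's gray rule keeps working after the Ingress canary configuration is removed.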
Summary
After 20 minutes of quick practice, you have full-link grayscale, a capability that is technically demanding to build. In fact, full-link grayscale is not that difficult!
Based on the full-link grayscale capability of MSE service governance, we can quickly put enterprise-grade full-link grayscale into practice. The three scenarios above are our standard scenarios for large-scale production use. Of course, you can also customize and adapt them to your own business on top of MSE service governance; even with multiple traffic sources, traffic can be steered precisely according to business needs.
At the same time, the observability of MSE service governance professional edition makes grayscale effectiveness measurable: you can see whether the gray release took effect and how well it is going, so that you know exactly where things stand.
- Second-level monitoring of gray traffic
Standardize the release process
In day-to-day releases, we often fall into the following mistaken beliefs:
- This change is small and the launch is urgent, so it can go online directly without testing.
- A release does not need to go through the gray process and can go online quickly.
- Gray release is useless, a mere formality; after the gray release, go fully online immediately without waiting to observe.
- Gray release is important, but a gray environment is hard to build and costs time and effort, so it is low priority.
These beliefs can lead to a bad release. Many failures are caused directly or indirectly by releases, so improving release quality and reducing errors is key to reducing online failures. To release safely, we need to standardize the release process.
Conclusion
As microservices grow in popularity, more and more companies adopt microservice frameworks. With high cohesion and low coupling, microservices offer better fault tolerance and suit rapid business iteration, which brings developers a lot of convenience. However, as the business grows, services are split further and further, and microservice governance becomes a headache.
Take full-link grayscale as an example. To verify the functional correctness of a new application version before it goes online, we also have to consider release efficiency. If the application is small, we can ensure correct releases simply by maintaining multiple environments. But once the business becomes large and complex, say a system made up of 100 microservices, then even if each service takes only one or two pods in each test/gray environment, maintaining that many environments brings huge cost and a serious drag on operational efficiency.
Is there a simpler and more efficient way to solve the problem of microservice governance?
The MSE microservice engine is launching a professional edition of service governance, an out-of-the-box, complete microservice governance solution that helps enterprises build microservice governance capabilities. If your system can quickly gain the complete full-link grayscale capability described in this article, and build further microservice governance practices on top of it, you will not only save considerable manpower and cost but also explore microservices with more confidence.