1. Overview
This article covers Dubbo's graceful shutdown, corresponding to the "Graceful Shutdown" section of the Dubbo User Guide.
It is defined as follows:
Dubbo implements graceful shutdown through the JDK's ShutdownHook, so if a user issues a forced shutdown command such as kill -9 PID, graceful shutdown is not executed; it only runs when the process is stopped with kill PID.
- We analyze this in detail in 「2. ShutdownHook」.
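As a minimal sketch of the JDK mechanism mentioned above (not Dubbo's own code), the following registers a shutdown hook with `Runtime#addShutdownHook`. The class name `ShutdownHookSketch`, the helper `register`, and the thread name are hypothetical and exist only for illustration.

```java
public class ShutdownHookSketch {

    // Registers a JVM shutdown hook that runs the given cleanup task.
    // The JVM runs hooks on normal exit and on SIGTERM (kill PID),
    // but not when the process is killed with SIGKILL (kill -9 PID).
    static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "SketchShutdownHook"); // hypothetical name
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        register(() -> System.out.println("Run shutdown hook now."));
        System.out.println("main() is returning; the hook runs as the JVM exits.");
    }
}
```

Because the hook only runs when the JVM is allowed to shut down in an orderly way, a process manager that escalates straight to kill -9 would skip it.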
The principle is as follows:
Service provider
- When stopping, it is first marked as not accepting new requests; new requests that arrive fail immediately with an error, so that the client retries against another machine. // <1>
- It then checks whether threads in the thread pool are still running. If so, it waits for all of them to finish, unless the wait times out, in which case it forces shutdown. // <2>
Service consumer
- When stopping, it no longer initiates new call requests; all new calls fail on the client side. // <3>
- It then checks whether any request is still waiting for a response and waits for those responses to return, unless the wait times out, in which case it forces shutdown. // <4>
- <1> <2>: Based on the READONLY_EVENT event; see "4.1.4 Elegant Close" under HeaderExchangeServer in Perfect Dubbo Source Analysis - Exchange Layer of NIO Server (4).
- <3> <4>: See "2.1.2 Send Request" and "2.1.3 Elegant Close" under HeaderExchangeChannel in Perfect Dubbo Source Analysis - Exchange Layer of NIO Server (4).
- Because those earlier articles treat each piece separately, what follows walks through the sequence as a whole.
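The principle in <1> to <4> can be sketched in a few lines. This is an illustrative model only, not Dubbo's implementation: the class `GracefulCloseSketch`, its `handle` and `close` methods, and the polling interval are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class GracefulCloseSketch {
    private final AtomicBoolean closing = new AtomicBoolean(false);
    private final AtomicInteger inFlight = new AtomicInteger(0);

    // <1>/<3>: once closing, new requests fail fast so the caller can retry elsewhere.
    public String handle(String request) {
        // Count the request first, then check the flag, so close() cannot miss it.
        inFlight.incrementAndGet();
        try {
            if (closing.get()) {
                throw new IllegalStateException("closing, request rejected: " + request);
            }
            return "echo:" + request;
        } finally {
            inFlight.decrementAndGet();
        }
    }

    // <2>/<4>: wait for in-flight work to drain, but no longer than the timeout.
    // Returns true if everything drained in time, false if the wait was cut short.
    public boolean close(long timeoutMillis) {
        closing.set(true);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (inFlight.get() > 0) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out: the caller would now force the close
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The same shape applies on both sides: the provider drains its thread pool, and the consumer drains requests that are still awaiting responses.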
2. ShutdownHook
The ShutdownHook that drives Dubbo's graceful shutdown is registered in AbstractConfig's static initializer block, with the following code:
    static {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                if (logger.isInfoEnabled()) {
                    logger.info("Run shutdown hook now.");
                }
                ProtocolConfig.destroyAll();
            }
        }, "DubboShutdownHook"));
    }
- Judging purely by code organization, this is not a good place for it. However, it is an appropriate place in the sense that it guarantees the ShutdownHook gets registered. Judging from the official TODO, it may be moved in the future.
- ProtocolConfig#destroyAll() method, code as follows:
     1: public static void destroyAll() {
     2:     // Ignore if destroyed
     3:     if (!destroyed.compareAndSet(false, true)) {
     4:         return;
     5:     }
     6:     // Destroy Registry related
     7:     AbstractRegistryFactory.destroyAll();
     8:
     9:     // Wait for service consumers to receive the registry's notification that the service provider is offline, increasing the success rate of graceful shutdown without retrying.
    10:     // Wait for registry notification
    11:     try {
    12:         Thread.sleep(ConfigUtils.getServerShutdownTimeout());
    13:     } catch (InterruptedException e) {
    14:         logger.warn("Interrupted unexpectedly when waiting for registry notification during shutdown process!");
    15:     }
    16:
    17:     // Destroy Protocol related
    18:     ExtensionLoader<Protocol> loader = ExtensionLoader.getExtensionLoader(Protocol.class);
    19:     for (String protocolName : loader.getLoadedExtensions()) {
    20:         try {
    21:             Protocol protocol = loader.getLoadedExtension(protocolName);
    22:             if (protocol != null) {
    23:                 protocol.destroy();
    24:             }
    25:         } catch (Throwable t) {
    26:             logger.warn(t.getMessage(), t);
    27:         }
    28:     }
    29: }
- Lines 2 to 5: Return immediately if already destroyed.
- Line 7: Call the AbstractRegistryFactory#destroyAll() method to destroy all Registries, which unregisters and unsubscribes the service providers and consumers in this application. For a detailed analysis, see 「2.1 AbstractRegistryFactory」.
- Lines 9 to 15: Sleep for a period of time so that service consumers in other applications can receive the registry's notification that this application's service provider is offline, which raises the success rate of graceful shutdown without retries.
- Of course, the wait is not fixed: developers can set it, in milliseconds, through the "dubbo.service.shutdown.wait" parameter. The ConfigUtils#getServerShutdownTimeout() method has the following code:
    public static int getServerShutdownTimeout() {
        // Default, 10 * 1000ms
        int timeout = Constants.DEFAULT_SERVER_SHUTDOWN_TIMEOUT;
        // Get the "dubbo.service.shutdown.wait" configuration item, in milliseconds
        String value = ConfigUtils.getProperty(Constants.SHUTDOWN_WAIT_KEY);
        if (value != null && value.length() > 0) {
            try {
                timeout = Integer.parseInt(value);
            } catch (Exception e) {
            }
        // If empty, get the "dubbo.service.shutdown.wait.seconds" configuration item, in seconds.
        // ps: This parameter is deprecated; "dubbo.service.shutdown.wait" is recommended
        } else {
            value = ConfigUtils.getProperty(Constants.SHUTDOWN_WAIT_SECONDS_KEY);
            if (value != null && value.length() > 0) {
                try {
                    timeout = Integer.parseInt(value) * 1000;
                } catch (Exception e) {
                }
            }
        }
        // Return
        return timeout;
    }
- Default 10 * 1000ms.
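To make the lookup order concrete, here is a standalone sketch that mirrors the logic above against a plain `java.util.Properties` object instead of Dubbo's ConfigUtils. The class `ShutdownWaitSketch` and the method `resolveTimeout` are hypothetical; only the two property keys and the 10 * 1000 ms default come from the code shown above.

```java
import java.util.Properties;

public class ShutdownWaitSketch {
    static final int DEFAULT_TIMEOUT_MS = 10 * 1000;

    // Resolves the shutdown wait in milliseconds:
    // 1) "dubbo.service.shutdown.wait" (milliseconds) wins if it is set;
    // 2) otherwise "dubbo.service.shutdown.wait.seconds" (seconds) is used;
    // 3) an unset or unparsable value falls back to the default.
    static int resolveTimeout(Properties props) {
        int timeout = DEFAULT_TIMEOUT_MS;
        String value = props.getProperty("dubbo.service.shutdown.wait");
        if (value != null && !value.isEmpty()) {
            try {
                timeout = Integer.parseInt(value);
            } catch (NumberFormatException ignored) {
                // keep the default
            }
        } else {
            value = props.getProperty("dubbo.service.shutdown.wait.seconds");
            if (value != null && !value.isEmpty()) {
                try {
                    timeout = Integer.parseInt(value) * 1000;
                } catch (NumberFormatException ignored) {
                    // keep the default
                }
            }
        }
        return timeout;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("dubbo.service.shutdown.wait", "15000");
        System.out.println(resolveTimeout(props));
    }
}
```

Note that a malformed millisecond value does not fall through to the seconds key; it simply leaves the default in place, as in the original method.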
- ISSUE#1021 "Enhancement for graceful shutdown" contains a very interesting discussion that is well worth reading:
Whether you use the most widely used version, 2.5.3, or the latest version, 2.5.7, graceful shutdown does not work unless a retry mechanism is configured. This change modifies a small amount of code and adds a configurable wait time, so that "graceful shutdown without retries" works in a simple way.
The main mechanism is to add a configurable wait time in two phases: after providers disconnect from the registry and before they stop responding, and after consumers remove their invokers and before they close the client. In hands-on testing so far, graceful shutdown works even when no retries are configured.
Since most companies that use Dubbo now turn off the retry mechanism to avoid avalanches and traffic storms in extreme cases, most interfaces cannot shut down gracefully under Dubbo's current settings. This change therefore raises the success rate of graceful shutdown without retries in a fairly simple way.
- Lines 17 to 28: Destroy all Protocols. There are currently two kinds of Protocol:
- The Protocol implementation class integrated with the Registry, RegistryProtocol, which focuses on service registration. For its destruction logic, see 「2.3 RegistryProtocol」.
- Protocol implementation classes for specific protocols, such as DubboProtocol for dubbo:// and HessianProtocol for hessian://, which focus on service exposure and reference. Because DubboProtocol is the most commonly used, we take it as the example in 「2.2 DubboProtocol」.
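The overall shape of #destroyAll() (run once, registries first, then wait, then protocols) can be sketched as follows. This is a simplified model under stated assumptions: `DestroyAllSketch`, the `steps` list, and the `waitMillis` parameter are hypothetical stand-ins, and the real registry and protocol teardown is replaced by recording step names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class DestroyAllSketch {
    private static final AtomicBoolean destroyed = new AtomicBoolean(false);
    // Records the order of the teardown steps; only the thread that wins the
    // compareAndSet below ever writes to it.
    static final List<String> steps = new ArrayList<>();

    static void destroyAll(long waitMillis) {
        // Only the first caller proceeds; later calls return immediately.
        if (!destroyed.compareAndSet(false, true)) {
            return;
        }
        steps.add("registries"); // stand-in for destroying all registries
        try {
            Thread.sleep(waitMillis); // give consumers time to see the offline notification
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        steps.add("protocols"); // stand-in for destroying all loaded protocols
    }

    public static void main(String[] args) {
        destroyAll(50);
        destroyAll(50); // no effect: already destroyed
        System.out.println(steps);
    }
}
```

The ordering matters: removing the provider from the registry before closing the servers is what gives consumers a chance to stop routing calls to it.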
2.1 AbstractRegistryFactory
The #destroyAll() method destroys all Registries. The code is as follows:
    private static final Map<String, Registry> REGISTRIES = new ConcurrentHashMap<String, Registry>();

    public static void destroyAll() {
        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("Close all registries " + getRegistries());
        }
        // Acquire lock
        LOCK.lock();
        try {
            // Destroy
            for (Registry registry : getRegistries()) {
                try {
                    registry.destroy();
                } catch (Throwable e) {
                    LOGGER.error(e.getMessage(), e);
                }
            }
            // Clear cache
            REGISTRIES.clear();
        } finally {
            // Release lock
            LOCK.unlock();
        }
    }
- Call the Registry#destroy() method to destroy each Registry.
- AbstractRegistry implements the common destruction logic: unregistering and unsubscribing. The code is as follows:
    @Override
    public void destroy() {
        // Destroyed, skipped
        if (!destroyed.compareAndSet(false, true)) {
            return;
        }
        if (logger.isInfoEnabled()) {
            logger.info("Destroy registry:" + getUrl());
        }
        // Unregister
        Set<URL> destroyRegistered = new HashSet<URL>(getRegistered());
        if (!destroyRegistered.isEmpty()) {
            for (URL url : new HashSet<URL>(getRegistered())) {
                if (url.getParameter(Constants.DYNAMIC_KEY, true)) {
                    try {
                        unregister(url); // Unregister
                        if (logger.isInfoEnabled()) {
                            logger.info("Destroy unregister url " + url);
                        }
                    } catch (Throwable t) {
                        logger.warn("Failed to unregister url " + url + " to registry " + getUrl() + " on destroy, cause: " + t.getMessage(), t);
                    }
                }
            }
        }
        // Unsubscribe
        Map<URL, Set<NotifyListener>> destroySubscribed = new HashMap<URL, Set<NotifyListener>>(getSubscribed());
        if (!destroySubscribed.isEmpty()) {
            for (Map.Entry<URL, Set<NotifyListener>> entry : destroySubscribed.entrySet()) {
                URL url = entry.getKey();
                for (NotifyListener listener : entry.getValue()) {
                    try {
                        unsubscribe(url, listener); // Unsubscribe
                        if (logger.isInfoEnabled()) {
                            logger.info("Destroy unsubscribe url " + url);
                        }
                    } catch (Throwable t) {
                        logger.warn("Failed to unsubscribe url " + url + " to registry " + getUrl() + " on destroy, cause: " + t.getMessage(), t);
                    }
                }
            }
        }
    }
- Both service providers and service consumers register with and subscribe to the Registry, so both registrations and subscriptions must be cancelled.
- FailbackRegistry, a subclass of AbstractRegistry, implements the common logic for destroying the retry task. The code is as follows:
    @Override
    public void destroy() {
        // Ignore if destroyed
        if (!canDestroy()) {
            return;
        }
        // Call parent method, unregister and unsubscribe
        super.destroy();
        // Destroy retry task
        try {
            retryFuture.cancel(true);
        } catch (Throwable t) {
            logger.warn(t.getMessage(), t);
        }
    }

    protected boolean canDestroy() {
        return destroyed.compareAndSet(false, true);
    }
- FailbackRegistry has multiple implementation classes, each with logic to destroy its own client connection. Take ZookeeperRegistry as an example. The code is as follows:
    @Override
    public void destroy() {
        // Call parent method, unregister and unsubscribe
        super.destroy();
        try {
            // Close the Zookeeper client connection
            zkClient.close();
        } catch (Exception e) {
            logger.warn("Failed to close zookeeper client " + getUrl() + ", cause: " + e.getMessage(), e);
        }
    }
2.2 DubboProtocol
The #destroy() method destroys all communication ExchangeClients and ExchangeServers. The code is as follows:
     1: @SuppressWarnings("Duplicates")
     2: @Override
     3: public void destroy() {
     4:     // Destroy all ExchangeServers
     5:     for (String key : new ArrayList<String>(serverMap.keySet())) {
     6:         ExchangeServer server = serverMap.remove(key);
     7:         if (server != null) {
     8:             try {
     9:                 if (logger.isInfoEnabled()) {
    10:                     logger.info("Close dubbo server: " + server.getLocalAddress());
    11:                 }
    12:                 server.close(ConfigUtils.getServerShutdownTimeout());
    13:             } catch (Throwable t) {
    14:                 logger.warn(t.getMessage(), t);
    15:             }
    16:         }
    17:     }
    18:
    19:     // Destroy all ExchangeClients
    20:     for (String key : new ArrayList<String>(referenceClientMap.keySet())) {
    21:         ExchangeClient client = referenceClientMap.remove(key);
    22:         if (client != null) {
    23:             try {
    24:                 if (logger.isInfoEnabled()) {
    25:                     logger.info("Close dubbo connect: " + client.getLocalAddress() + "-->" + client.getRemoteAddress());
    26:                 }
    27:                 client.close(ConfigUtils.getServerShutdownTimeout()); // Destroy
    28:             } catch (Throwable t) {
    29:                 logger.warn(t.getMessage(), t);
    30:             }
    31:         }
    32:     }
    33:     // Destroy all ghost ExchangeClients
    34:     for (String key : new ArrayList<String>(ghostClientMap.keySet())) {
    35:         ExchangeClient client = ghostClientMap.remove(key);
    36:         if (client != null) {
    37:             try {
    38:                 if (logger.isInfoEnabled()) {
    39:                     logger.info("Close dubbo connect: " + client.getLocalAddress() + "-->" + client.getRemoteAddress());
    40:                 }
    41:                 client.close(ConfigUtils.getServerShutdownTimeout()); // Destroy
    42:             } catch (Throwable t) {
    43:                 logger.warn(t.getMessage(), t);
    44:             }
    45:         }
    46:     }
    47:     // [TODO 8033] parameter callback
    48:     stubServiceMethodsMap.clear();
    49:     super.destroy();
    50: }
- An application can be both a service provider and a service consumer at the same time, so both ExchangeClients and ExchangeServers need to be closed.
- Lines 4 to 17: Loop and call the HeaderExchangeServer#close(timeout) method to destroy all ExchangeServers. For a detailed analysis, see 「2.2.1 HeaderExchangeServer」.
- Lines 19 to 32: Loop and call the ReferenceCountExchangeClient#close(timeout) method to destroy all ReferenceCountExchangeClients. Inside that method, HeaderExchangeClient#close(timeout) is called to close the HeaderExchangeClient object. For a detailed analysis, see 「2.2.2 HeaderExchangeClient」.
- Lines 33 to 46: Loop and call the LazyConnectExchangeClient#close(timeout) method to close the ghost clients. For more about LazyConnectExchangeClient, see 「5.2 LazyConnectExchangeClient」 in Perfect Dubbo Source Analysis - Remote Reference to Service Reference (Dubbo).
- Line 48: [TODO 8033] Parameter callback
- Line 49: Call the parent AbstractProtocol#destroy() method to destroy the Invokers and unexport the services (Exporters). The code is as follows:
     1: @Override
     2: public void destroy() {
     3:     // Destroy all service consumer Invokers of this protocol
     4:     for (Invoker<?> invoker : invokers) {
     5:         if (invoker != null) {
     6:             invokers.remove(invoker);
     7:             try {
     8:                 if (logger.isInfoEnabled()) {
     9:                     logger.info("Destroy reference: " + invoker.getUrl());
    10:                 }
    11:                 invoker.destroy();
    12:             } catch (Throwable t) {
    13:                 logger.warn(t.getMessage(), t);
    14:             }
    15:         }
    16:     }
    17:     // Destroy all service provider Exporters of this protocol
    18:     for (String key : new ArrayList<String>(exporterMap.keySet())) {
    19:         Exporter<?> exporter = exporterMap.remove(key);
    20:         if (exporter != null) {
    21:             try {
    22:                 if (logger.isInfoEnabled()) {
    23:                     logger.info("Unexport service: " + exporter.getInvoker().getUrl());
    24:                 }
    25:                 exporter.unexport();
    26:             } catch (Throwable t) {
    27:                 logger.warn(t.getMessage(), t);
    28:             }
    29:         }
    30:     }
    31: }
- Lines 3 to 16: Loop and destroy all service consumer Invokers (here, DubboInvoker) of this protocol (DubboProtocol). For a detailed analysis, see 「2.2.3 DubboInvoker」.
- Lines 17 to 30: Loop and destroy all service provider Exporters (here, DubboExporter) of this protocol (DubboProtocol). For a detailed analysis, see 「2.2.4 DubboExporter」.
2.2.1 HeaderExchangeServer
The overall process of the #close(timeout) method is as follows:
- Red box section: Because ProtocolListenerWrapper and ProtocolFilterWrapper are Dubbo SPI Wrapper implementation classes of Protocol, they are called first when the DubboProtocol#destroy() method is invoked. At present they are just a layer of wrapping with no logic of their own. The code is as follows:
    // ProtocolListenerWrapper.java
    @Override
    public void destroy() {
        protocol.destroy();
    }

    // ProtocolFilterWrapper.java
    @Override
    public void destroy() {
        protocol.destroy();
    }
- Green box section: HeaderExchangeServer's graceful close process, detailed in "4.1.4 Elegant Close" of Perfect Dubbo Source Analysis - Exchange Layer of NIO Server (4).
- Yellow box section: NettyServer closes the actual Netty server components, detailed in "Shut down the server" of Perfect Dubbo Source Analysis - Netty4 Implementation of NIO Server (6).
- ExecutorUtil provides two closing methods, which are analyzed in detail in 「3. ExecutorUtil」.
2.2.2 HeaderExchangeClient
The overall process of the #close(timeout) method is as follows:
- Red box section: HeaderExchangeClient's graceful close process, detailed in "2.1.3 Elegant Close" of Perfect Dubbo Source Analysis - Exchange Layer of NIO Server (4).
- Green box section: NettyClient closes the actual Netty client components, detailed in "Close channel" of Perfect Dubbo Source Analysis - Netty4 Implementation of NIO Server (6).
2.2.3 DubboInvoker
The #destroy() method destroys the ExchangeClients. The code is as follows:
     1: @Override
     2: public void destroy() {
     3:     // Ignore if destroyed
     4:     if (super.isDestroyed()) {
     5:         return;
     6:     } else {
     7:         // double check to avoid dup close
     8:         // Double-checked locking, to avoid closing twice
     9:         destroyLock.lock();
    10:         try {
    11:             if (super.isDestroyed()) {
    12:                 return;
    13:             }
    14:             // Mark as destroyed
    15:             super.destroy();
    16:             // Remove from `invokers`
    17:             if (invokers != null) {
    18:                 invokers.remove(this);
    19:             }
    20:             // Close ExchangeClients
    21:             for (ExchangeClient client : clients) {
    22:                 try {
    23:                     client.close(ConfigUtils.getServerShutdownTimeout());
    24:                 } catch (Throwable t) {
    25:                     logger.warn(t.getMessage(), t);
    26:                 }
    27:             }
    28:         } finally {
    29:             // Release lock
    30:             destroyLock.unlock();
    31:         }
    32:     }
    33: }
- The code is easy to follow; see the comments. Here we only go through a few of the methods involved.
- Parent AbstractInvoker#isDestroyed() method, which determines whether the Invoker has been destroyed. The code is as follows:
    /**
     * Whether it is destroyed
     */
    private AtomicBoolean destroyed = new AtomicBoolean(false);

    public boolean isDestroyed() {
        return destroyed.get();
    }
- Parent AbstractInvoker#destroy() method, which marks the Invoker as destroyed. The code is as follows:
    /**
     * Whether it is available
     */
    private volatile boolean available = true;

    @Override
    public void destroy() {
        if (!destroyed.compareAndSet(false, true)) {
            return;
        }
        setAvailable(false);
    }

    protected void setAvailable(boolean available) {
        this.available = available;
    }
- It also marks the DubboInvoker as no longer available.
- Calling the #invoke(Invocation) method after the Invoker has been marked as destroyed throws an RpcException. The code is as follows:
    @Override
    public Result invoke(Invocation inv) throws RpcException {
        if (destroyed.get()) {
            throw new RpcException("Rpc invoker for service " + this + " on consumer " + NetUtils.getLocalHost()
                    + " use dubbo version " + Version.getVersion()
                    + " is DESTROYED, can not be invoked any more!");
        }
        // ...omit other code
    }
- Lines 20 to 27: Loop over the clients and call the ReferenceCountExchangeClient#close(timeout) method to close each one. In fact, the clients have already been closed in the DubboProtocol#destroy() method. This looks like duplication, but it is not: a DubboInvoker must also be destroyed when its remote service provider goes offline, and at that point the client connection has to be closed. Therefore, DubboInvoker must have this logic.
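The destroy-once pattern used by DubboInvoker (a cheap check, then a lock, then a second check) can be isolated in a small sketch. This is not Dubbo's class: `InvokerDestroySketch`, its `clientCloseCount` counter, and the exception type are assumptions chosen to keep the example self-contained.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class InvokerDestroySketch {
    private final AtomicBoolean destroyed = new AtomicBoolean(false);
    private final ReentrantLock destroyLock = new ReentrantLock();
    // Counts how many times the (hypothetical) client would have been closed.
    final AtomicInteger clientCloseCount = new AtomicInteger();

    public String invoke(String arg) {
        // After destruction, every call fails fast.
        if (destroyed.get()) {
            throw new IllegalStateException("invoker is DESTROYED, can not be invoked any more");
        }
        return "result:" + arg;
    }

    public void destroy() {
        if (destroyed.get()) {
            return; // first check, without taking the lock
        }
        destroyLock.lock();
        try {
            if (destroyed.get()) {
                return; // second check, now under the lock
            }
            destroyed.set(true);                // mark as destroyed
            clientCloseCount.incrementAndGet(); // stand-in for closing the clients
        } finally {
            destroyLock.unlock();
        }
    }
}
```

Because the second check happens under the lock, concurrent callers of destroy() close the clients at most once, and later invoke() calls are rejected.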
2.2.4 DubboExporter
The #unexport() method cancels the exposure. The code is as follows:
    /**
     * Service Key
     */
    private final String key;
    /**
     * Exporter collection
     *
     * key: Service Key
     *
     * The value is actually {@link com.alibaba.dubbo.rpc.protocol.AbstractProtocol#exporterMap}
     */
    private final Map<String, Exporter<?>> exporterMap;

    @Override
    public void unexport() {
        // Unexport
        super.unexport();
        // Remove itself
        exporterMap.remove(key);
    }
- Call the parent AbstractExporter#unexport() method to unexport. The code is as follows:
    /**
     * Invoker object
     */
    private final Invoker<T> invoker;
    /**
     * Whether the service has been unexported
     */
    private volatile boolean unexported = false;

    @Override
    public void unexport() {
        // Mark as unexported
        if (unexported) {
            return;
        }
        unexported = true;
        // Destroy
        getInvoker().destroy();
    }
- The invoker is shown in the following figure:
- This Invoker is created by JavassistProxyFactory and is actually an implementation of the AbstractProxyInvoker abstract class, so its #destroy() method is as follows:
    @Override
    public void destroy() {
    }
- The method body is empty.
2.3 RegistryProtocol
The #destroy() method unexports all Exporters. The code is as follows:
    /**
     * Collection of bound relationships.
     *
     * key: Service Dubbo URL
     */
    private final Map<String, ExporterChangeableWrapper<?>> bounds = new ConcurrentHashMap<String, ExporterChangeableWrapper<?>>();

    @Override
    public void destroy() {
        // Get the Exporter list
        List<Exporter<?>> exporters = new ArrayList<Exporter<?>>(bounds.values());
        // Unexport all Exporters
        for (Exporter<?> exporter : exporters) {
            exporter.unexport();
        }
        // Clear
        bounds.clear();
    }
- Loop and call the ExporterChangeableWrapper#unexport() method to cancel the service exposure. The code is as follows:
    /**
     * The exposed Exporter object
     */
    private Exporter<T> exporter;

    @Override
    public void unexport() {
        String key = getCacheKey(this.originInvoker);
        // Remove from `bounds`
        bounds.remove(key);
        // Unexport
        exporter.unexport();
    }
- Because the service provider integrates the Configurator of configuration rules, ExporterChangeableWrapper is needed to keep the original Invoker object.
- As a result, none of the unexport logic described earlier can remove the ExporterChangeableWrapper entries from bounds; that has to be done through RegistryProtocol's #destroy() method.
- Also, the exposed Exporter object (exporter) that is called here has already been unexported through the AbstractExporter#unexport() method. Even so, this logic cannot be removed, because the ExporterChangeableWrapper#unexport() method may be called from elsewhere.
3. ExecutorUtil
3.1 gracefulShutdown
The #gracefulShutdown(executor, timeout) method shuts down gracefully: it stops new tasks from being submitted and lets existing tasks finish.
    public static void gracefulShutdown(Executor executor, int timeout) {
        // Ignore if not an ExecutorService, or already shut down
        if (!(executor instanceof ExecutorService) || isShutdown(executor)) {
            return;
        }
        // Shut down: reject new tasks and finish existing ones
        final ExecutorService es = (ExecutorService) executor;
        try {
            es.shutdown(); // Disable new tasks from being submitted <1>
        } catch (SecurityException ex2) {
            return;
        } catch (NullPointerException ex2) {
            return;
        }
        // Wait for existing tasks to finish. Force all tasks to end if the wait times out
        try {
            if (!es.awaitTermination(timeout, TimeUnit.MILLISECONDS)) {
                es.shutdownNow();
            }
        } catch (InterruptedException ex) {
            // On InterruptedException, also force all tasks to end
            es.shutdownNow();
            Thread.currentThread().interrupt();
        }
        // If still not shut down, start a new thread to close it
        if (!isShutdown(es)) {
            newThreadToCloseExecutor(es);
        }
    }
3.2 shutdownNow
The #shutdownNow(executor, timeout) method forces shutdown, including interrupting tasks that are already executing.
    public static void shutdownNow(Executor executor, final int timeout) {
        // Ignore if not an ExecutorService, or already shut down
        if (!(executor instanceof ExecutorService) || isShutdown(executor)) {
            return;
        }
        // Shut down immediately, interrupting running tasks
        final ExecutorService es = (ExecutorService) executor;
        try {
            es.shutdownNow(); // <1>
        } catch (SecurityException ex2) {
            return;
        } catch (NullPointerException ex2) {
            return;
        }
        // Wait for the existing tasks to be interrupted
        try {
            es.awaitTermination(timeout, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        // If still not shut down, start a new thread to close it
        if (!isShutdown(es)) {
            newThreadToCloseExecutor(es);
        }
    }
- Unlike the #gracefulShutdown(executor, timeout) method, the #shutdownNow() method is called at <1> instead of the #shutdown() method.
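The difference between the two JDK calls at <1> can be observed directly. The sketch below uses only `java.util.concurrent`; the class `ExecutorCloseSketch`, the method `runAndClose`, and the 200 ms task duration are assumptions made for the demonstration and are not part of ExecutorUtil.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ExecutorCloseSketch {

    // Starts one task that sleeps briefly, closes the pool while it is running,
    // and returns true if the task ran to completion, false if it was interrupted.
    static boolean runAndClose(boolean graceful) {
        ExecutorService es = Executors.newSingleThreadExecutor();
        AtomicBoolean completed = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        es.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(200);
                completed.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // task was interrupted by shutdownNow()
            }
        });
        try {
            started.await(); // make sure the task is running before closing
            if (graceful) {
                es.shutdown();    // reject new tasks, let the running task finish
            } else {
                es.shutdownNow(); // reject new tasks and interrupt the running task
            }
            es.awaitTermination(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("shutdown():    task completed = " + runAndClose(true));
        System.out.println("shutdownNow(): task completed = " + runAndClose(false));
    }
}
```

Interruption is cooperative: shutdownNow() only ends tasks that respond to interrupts, which is one reason ExecutorUtil keeps a fallback for pools that still have not shut down.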
3.3 newThreadToCloseExecutor
The #newThreadToCloseExecutor(ExecutorService) method starts a new thread that repeatedly forces the shutdown.
    private static void newThreadToCloseExecutor(final ExecutorService es) {
        if (!isShutdown(es)) {
            shutdownExecutor.execute(new Runnable() {
                public void run() {
                    try {
                        // Loop up to 1000 times, forcing the thread pool to terminate
                        for (int i = 0; i < 1000; i++) {
                            // Shut down immediately, interrupting running tasks
                            es.shutdownNow();
                            // Wait for the existing tasks to be interrupted
                            if (es.awaitTermination(10, TimeUnit.MILLISECONDS)) {
                                break;
                            }
                        }
                    } catch (InterruptedException ex) {
                        Thread.currentThread().interrupt();
                    } catch (Throwable e) {
                        logger.warn(e.getMessage(), e);
                    }
                }
            });
        }
    }
666. Easter Egg
In theory, when a service provider is about to shut down, the general process is as follows:
Provider => registry: remove me
Provider => consumer: I'm about to close, stop calling me
All consumers => provider: OK, got it
Provider => consumer: finishes processing all outstanding requests
Provider shuts down
In reality, though, relying on consumers to reply and confirm is very complex. So Dubbo's choice is:
- The provider removes itself from the registry, then sleeps for a period of time (configurable by the developer) so that consumers can be notified. Of course, this is not guaranteed to succeed. For example, a consumer may be unable to reach the registry while still being connected to the provider.
- The provider notifies consumers that it is about to close and that they should stop calling it. Once all notifications have been sent, it waits until the outstanding requests are processed, then closes the local server and thread pool.
Of course, the consumer also shuts down gracefully by waiting for all the requests it has sent to finish, which is relatively simple.
Recommended reading articles: