Spring Cloud Config Server node migration issues: pay special attention to this!

Keywords: Java Spring github git Attribute

Preface:

It is strongly recommended to use domestic open source configuration centers, such as Ctrip's Apollo configuration center and Alibaba's Nacos registry and configuration center.

However, when choosing the actual architecture, depending on project size, business complexity and other factors, some projects still choose Spring Cloud Config, which the Spring Cloud official website also recommends. Especially where performance requirements are not very high, Spring Cloud Config basically meets the need, and because it manages configuration in Git it supports version control naturally.

Moreover, developers in the GitHub community have built a simple configuration management interface that addresses some of Spring Cloud Config's "flaws" and have open-sourced it, for example spring-cloud-config-admin, another work by Super Columbia (programmer DD). The address of the project is: https://dyc87112.github.io/sp...

Spring Cloud version used in this article: Edgware.SR3, Spring Boot version: 1.5.10.RELEASE

Problem analysis:

Personally, I think this problem is representative, and working through it shows how the official project improved on it. When using Spring Cloud Config, if the configuration center server is migrated, you may run into the problem described in DD's blog:
http://blog.didispace.com/Spr...

Here I will briefly outline the problem raised in that article:

When a Spring Cloud Config configuration center node is migrated or deployed in containers (so its IP changes), the Config Server health check fails and reports an error, because it is still using the IP of the node from before the migration.

This article takes that problem as a starting point, extends it, and goes through the source code to explore the cause and the fix.

The prerequisite is that service registration and discovery go through DiscoveryClient. If we use Eureka as the registry, the implementation class is EurekaDiscoveryClient.
The client finds the configuration center through Eureka, which requires the following configuration:

spring.cloud.config.discovery.service-id=config-server
spring.cloud.config.discovery.enabled=true

The key here is spring.cloud.config.discovery.enabled, which defaults to false; setting it to true turns on service discovery for the configuration center. This activates the bootstrap configuration class DiscoveryClientConfigServiceBootstrapConfiguration, which looks up the configuration center service.

Next, let's look at the source code for this class:

@ConditionalOnProperty(value = "spring.cloud.config.discovery.enabled", matchIfMissing = false) 
@Configuration
// Import the utility auto-configuration class
@Import({ UtilAutoConfiguration.class })
// Enable service discovery
@EnableDiscoveryClient 
public class DiscoveryClientConfigServiceBootstrapConfiguration {
@Autowired
private ConfigClientProperties config;
@Autowired
private ConfigServerInstanceProvider instanceProvider;
private HeartbeatMonitor monitor = new HeartbeatMonitor();
@Bean
public ConfigServerInstanceProvider configServerInstanceProvider(
                DiscoveryClient discoveryClient) {
    return new ConfigServerInstanceProvider(discoveryClient);
}

// Context refresh event listener; fires when the service starts, or when a refresh is triggered through /refresh or through the message bus via /bus/refresh
@EventListener(ContextRefreshedEvent.class)
public void startup(ContextRefreshedEvent event) {
    refresh();
}

// Heartbeat event listener; fires after the client fetches registry information from Eureka.
@EventListener(HeartbeatEvent.class)
public void heartbeat(HeartbeatEvent event) {
    if (monitor.update(event.getValue())) {
            refresh();
    }
}

// Fetches a configuration center instance from the registry, then sets that instance's url on the uri field of ConfigClientProperties.
private void refresh() {
    try {
        String serviceId = this.config.getDiscovery().getServiceId();
        ServiceInstance server = this.instanceProvider
                        .getConfigServerInstance(serviceId);
        String url = getHomePage(server);
        if (server.getMetadata().containsKey("password")) {
                String user = server.getMetadata().get("user");
                user = user == null ? "user" : user;
                this.config.setUsername(user);
                String password = server.getMetadata().get("password");
                this.config.setPassword(password);
        }
        if (server.getMetadata().containsKey("configPath")) {
                String path = server.getMetadata().get("configPath");
                if (url.endsWith("/") && path.startsWith("/")) {
                        url = url.substring(0, url.length() - 1);
                }
                url = url + path;
        }
        this.config.setUri(url);
    }
    catch (Exception ex) {
            if (config.isFailFast()) {
                    throw ex;
            }
            else {
                    logger.warn("Could not locate configserver via discovery", ex);
            }
    }
 }
}

A context refresh event listener, @EventListener(ContextRefreshedEvent.class), is registered here. When the configuration is refreshed through the message bus (/bus/refresh) or by requesting the client's /refresh endpoint directly, the event fires automatically and the refresh() method of this class is invoked to fetch a configuration center instance from the Eureka registry.
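
The URL handling inside refresh() can be tried in isolation. The following is a minimal sketch, not the Spring class itself: ConfigUriBuilder and the sample address are made-up names for illustration, and only the "configPath" joining rule from the listing above is reproduced.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the URL handling in refresh(); ConfigUriBuilder is a hypothetical name.
class ConfigUriBuilder {

    // Appends the optional "configPath" metadata entry to the instance home page,
    // dropping one slash when both sides supply it.
    static String build(String homePage, Map<String, String> metadata) {
        String url = homePage;
        if (metadata.containsKey("configPath")) {
            String path = metadata.get("configPath");
            if (url.endsWith("/") && path.startsWith("/")) {
                url = url.substring(0, url.length() - 1);
            }
            url = url + path;
        }
        return url;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("configPath", "/config");
        // prints http://10.0.0.5:8888/config
        System.out.println(build("http://10.0.0.5:8888/", metadata));
    }
}
```

Whatever this method returns is what ends up in the uri field, so the client's target address is only as fresh as the last refresh() call.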

Here, ConfigServerInstanceProvider wraps the DiscoveryClient interface and obtains the instance through the following method:

@Retryable(interceptor = "configServerRetryInterceptor")
public ServiceInstance getConfigServerInstance(String serviceId) {
    logger.debug("Locating configserver (" + serviceId + ") via discovery");
    List<ServiceInstance> instances = this.client.getInstances(serviceId);
    if (instances.isEmpty()) {
            throw new IllegalStateException(
                            "No instances found of configserver (" + serviceId + ")");
    }
    ServiceInstance instance = instances.get(0);
    logger.debug(
                    "Located configserver (" + serviceId + ") via discovery: " + instance);
    return instance;
}

As the source code above shows, the full list of instances is looked up by service id, that is, the value of the spring.cloud.config.discovery.service-id configuration item, and instances.get(0) takes the first instance from that list. The order of the list returned by the registry is not fixed and may differ on each call.
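
A minimal sketch of what this selection rule implies, using plain strings in place of ServiceInstance objects. FirstInstancePicker and the addresses are hypothetical names for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for ConfigServerInstanceProvider; FirstInstancePicker is a hypothetical name.
class FirstInstancePicker {

    // Mirrors instances.get(0): only the head of the list is ever used.
    static String pick(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("No instances found of configserver");
        }
        return instances.get(0);
    }

    public static void main(String[] args) {
        // The same two instances, returned in a different order by two registry calls.
        List<String> firstCall = Arrays.asList("http://10.0.0.5:8888/", "http://10.0.0.6:8888/");
        List<String> secondCall = Arrays.asList("http://10.0.0.6:8888/", "http://10.0.0.5:8888/");
        System.out.println(pick(firstCall));  // prints http://10.0.0.5:8888/
        System.out.println(pick(secondCall)); // prints http://10.0.0.6:8888/
    }
}
```

Because every instance but the first is discarded, the client has no fallback address if the chosen one turns out to be unreachable.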

Fetching the latest properties from the configuration center is implemented by the locate() method of the ConfigServicePropertySourceLocator class. Let's go further into the source code of this class to see the concrete implementation:

@Override
@Retryable(interceptor = "configServerRetryInterceptor")
public org.springframework.core.env.PropertySource<?> locate(
        org.springframework.core.env.Environment environment) {
    
    // Get the current client configuration properties, override is used to preferentially use spring.cloud.config.application, profile, label (if configured)
    ConfigClientProperties properties = this.defaultProperties.override(environment);
    CompositePropertySource composite = new CompositePropertySource("configService");

    // The RestTemplate can be customized through the public setRestTemplate(RestTemplate restTemplate) method. If none is set, the default one built by getSecureRestTemplate(properties) is used. The default timeout in that method is 3 minutes and 5 seconds, which is relatively long. To shorten it, you have to supply a custom RestTemplate.
    RestTemplate restTemplate = this.restTemplate == null ? getSecureRestTemplate(properties)
                    : this.restTemplate;
    Exception error = null;
    String errorBody = null;
    logger.info("Fetching config from server at: " + properties.getRawUri());
    try {
            String[] labels = new String[] { "" };
            if (StringUtils.hasText(properties.getLabel())) {
                    labels = StringUtils.commaDelimitedListToStringArray(properties.getLabel());
            }
            String state = ConfigClientStateHolder.getState();
            // Try all the labels until one works
            for (String label : labels) {
      
            // Loop over the labels (branches); for each one, use the restTemplate to request the uri set in the config properties. The method is shown below.
                Environment result = getRemoteEnvironment(restTemplate,
                                properties, label.trim(), state);
                if (result != null) {
                        logger.info(String.format("Located environment: name=%s, profiles=%s, label=%s, version=%s, state=%s",
                                        result.getName(),
                                        result.getProfiles() == null ? "" : Arrays.asList(result.getProfiles()),
                                        result.getLabel(), result.getVersion(), result.getState()));
                        ...... 
                        if (StringUtils.hasText(result.getState()) || StringUtils.hasText(result.getVersion())) {
                                HashMap<String, Object> map = new HashMap<>();
                                putValue(map, "config.client.state", result.getState());
                                putValue(map, "config.client.version", result.getVersion());
                                
                                // Set the latest version number of the Git repository in the current environment.
                                composite.addFirstPropertySource(new MapPropertySource("configClient", map));
                        }
                        return composite;
                    }
            }
    }
    ...... // Ignore partial source code
    }

From the uri source in the method, you can see that it was obtained from properties.getRawUri().

The method that fetches the Environment from the configuration center server:

private Environment getRemoteEnvironment(RestTemplate restTemplate, ConfigClientProperties properties,
                                                                             String label, String state) {
    String path = "/{name}/{profile}";
    String name = properties.getName();
    String profile = properties.getProfile();
    String token = properties.getToken();
    String uri = properties.getRawUri();
    ......// Ignore partial source code
    response = restTemplate.exchange(uri + path, HttpMethod.GET,
                    entity, Environment.class, args);
    }
    ......
    Environment result = response.getBody();
    return result;
}

From the above analysis, we can see that the latest properties are fetched from the remote configuration center at the fixed uri returned by properties.getRawUri(), with the request made through the restTemplate.

The properties.getRawUri() seen in the source code is a fixed value. So why is there a problem when the configuration center is migrated, or when containers obtain their IP dynamically?

The reason is as follows. After the configuration center is migrated, once the old instance has gone without renewing its lease for longer than the registry's expiration time, it is kicked out of the registry. (The Eureka registry defaults to 90 seconds; in fact this value is not exact, and the official source code marks it as a bug, which can be discussed in a separate article.) When we then trigger a refresh event through /refresh or /bus/refresh, the uri is updated to an available configuration center instance. The ConfigServicePropertySourceLocator used at that point is a newly created object, so the properties are fetched through the latest uri.

However, neither the ConfigServerHealthIndicator object nor the ConfigServicePropertySourceLocator object it depends on is re-instantiated. They are still the objects initialized at service startup, so the value of properties.getRawUri() inside them has not changed and still points to the node from before the migration.
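
The effect can be reproduced with a few plain objects. This is a simplified model under stated assumptions, not the Spring classes: Props, Locator and HealthIndicator are hypothetical stand-ins for ConfigClientProperties, ConfigServicePropertySourceLocator and ConfigServerHealthIndicator, and the addresses are made up.

```java
// Simplified model, not the Spring classes: all names here are hypothetical.
class StaleUriDemo {

    // Stand-in for ConfigClientProperties.
    static class Props {
        private final String uri;
        Props(String uri) { this.uri = uri; }
        String getRawUri() { return uri; }
    }

    // Stand-in for ConfigServicePropertySourceLocator: keeps the properties it was built with.
    static class Locator {
        private final Props defaultProperties;
        Locator(Props props) { this.defaultProperties = props; }
        String targetUri() { return defaultProperties.getRawUri(); }
    }

    // Stand-in for ConfigServerHealthIndicator: keeps the locator it was built with.
    static class HealthIndicator {
        private final Locator locator;
        HealthIndicator(Locator locator) { this.locator = locator; }
        String checkedUri() { return locator.targetUri(); }
    }

    public static void main(String[] args) {
        // Objects created once at service startup.
        Locator startupLocator = new Locator(new Props("http://10.0.0.5:8888/"));
        HealthIndicator health = new HealthIndicator(startupLocator);

        // A refresh builds a new locator with the migrated address...
        Locator refreshedLocator = new Locator(new Props("http://10.0.0.6:8888/"));

        // ...but the health indicator still holds the startup locator.
        System.out.println("refresh fetches from: " + refreshedLocator.targetUri());
        System.out.println("health check probes:  " + health.checkedUri());
    }
}
```

In this model the refresh path reads the new address while the health check keeps probing the old one, which matches the failing health check described at the start of the article.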

This is also a design flaw of Spring Cloud Config: even if an instance can be obtained after the configuration is refreshed, that does not mean a request to that instance will succeed. When the network is unreachable, for example, the client should retry against other machines through load balancing, so that it still gets the latest environment configuration data.

Solution:

This problem has been fixed in version 2.x.x of Spring Cloud Config on GitHub. The fix does not use a Ribbon-style soft load balancing approach, presumably to reduce coupling between frameworks.

In this version, the uri field of the ConfigClientProperties class is changed from a String to a String[] array, and the URIs of all available configuration center instances found through DiscoveryClient are set on it.

The ConfigServicePropertySourceLocator.locate() method then iterates over that array. When a request to one uri fails, a ResourceAccessException is thrown; the catch block handles it and the next node is tried. If the last node also fails, the exception is rethrown and the run ends.
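
The retry idea can be sketched without Spring. The following is a minimal model under stated assumptions: FailoverFetch and Fetcher are hypothetical names, Fetcher stands in for the RestTemplate call, and IOException plays the role of ResourceAccessException.

```java
import java.io.IOException;

// Simplified sketch of the 2.x retry idea; all names here are hypothetical.
class FailoverFetch {

    // Stand-in for the RestTemplate call; throws IOException when a node is unreachable.
    interface Fetcher {
        String fetch(String uri) throws IOException;
    }

    // Tries each uri in order; an unreachable node moves on to the next one,
    // and the failure of the last node is rethrown to the caller.
    static String fetchFromAny(String[] uris, Fetcher fetcher) throws IOException {
        for (int i = 0; i < uris.length; i++) {
            try {
                return fetcher.fetch(uris[i]);
            } catch (IOException e) {
                if (i == uris.length - 1) {
                    throw e;
                }
            }
        }
        return null; // only reached when uris is empty
    }

    public static void main(String[] args) throws IOException {
        String[] uris = { "http://10.0.0.5:8888/", "http://10.0.0.6:8888/" };
        // Pretend the first (migrated) node no longer answers.
        Fetcher fetcher = uri -> {
            if (uri.contains("10.0.0.5")) {
                throw new IOException("connect timed out: " + uri);
            }
            return "environment from " + uri;
        };
        // prints environment from http://10.0.0.6:8888/
        System.out.println(fetchFromAny(uris, fetcher));
    }
}
```

The point of the design is that a single dead node no longer fails the whole fetch; only when every listed node is unreachable does the caller see the exception.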

At the same time, the request read timeout, requestReadTimeout, is moved into ConfigClientProperties as a configurable item.
Part of the source code is as follows:

private Environment getRemoteEnvironment(RestTemplate restTemplate,
        ConfigClientProperties properties, String label, String state) {
    String path = "/{name}/{profile}";
    String name = properties.getName();
    String profile = properties.getProfile();
    String token = properties.getToken();
    int noOfUrls = properties.getUri().length;
    if (noOfUrls > 1) {
            logger.info("Multiple Config Server Urls found listed.");
    }
    for (int i = 0; i < noOfUrls; i++) {
        Credentials credentials = properties.getCredentials(i);
        String uri = credentials.getUri();
        String username = credentials.getUsername();
        String password = credentials.getPassword();
        logger.info("Fetching config from server at : " + uri);
        try {
             ...... 
                response = restTemplate.exchange(uri + path, HttpMethod.GET, entity,
                                Environment.class, args);
        }
        catch (HttpClientErrorException e) {
                if (e.getStatusCode() != HttpStatus.NOT_FOUND) {
                        throw e;
                }
        }
        catch (ResourceAccessException e) {
                logger.info("Connect Timeout Exception on Url - " + uri
                                + ". Will be trying the next url if available");
                if (i == noOfUrls - 1)
                        throw e;
                else
                        continue;
        }
        if (response == null || response.getStatusCode() != HttpStatus.OK) {
                return null;
        }
        Environment result = response.getBody();
        return result;
    }
    return null;
}

Conclusion:

This article analyzes, at the source code level, the problem that appears after a Config Server node is migrated and how that problem comes about. It also uses the source code to show how the official Spring Cloud Config project fixed it.

Of course, most projects now use a recent version of Spring Cloud, which brings in Spring Cloud Config 2.x.x by default, so the problem described in this article will not occur.

If you choose Spring Cloud Config as the configuration center, it is recommended that you run the relevant tests against the "CAP theoretical model" before going to production, to make sure no unpredictable problems arise.

If you are interested, you can refer to the latest source code on GitHub:

https://github.com/spring-clo...


Posted by blackmamba on Sat, 12 Oct 2019 03:30:48 -0700