Java program deployment settings for CPU and memory resource constraints in K8s containers

Keywords: Java jvm Docker JDK

Background

When a Java program runs in a Docker container on k8s, we set CPU and memory limits on the container, but the JVM's parameters are not tied to those limits. As a result, the CPU and memory that the JVM perceives are those of the whole k8s worker node. The problem is that when the Java program in the container uses more memory than the memory limit, the container is killed for running out of memory (OOM) and restarts. Many JVM parameters are also adaptive: memory allocation at startup and GC-related parameters, for example, are adjusted dynamically based on the detected CPU and memory. If the JVM in the container perceives an incorrect number of CPU cores, program performance can also suffer significantly.
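To see the mismatch described above, a small program can print what the JVM itself believes it has. This is only a minimal sketch: the class name JvmView and the helper toMiB are made up for illustration, and the numbers printed depend on the JDK version and the host.

```java
public class JvmView {
    // Formats a byte count as whole mebibytes for readability.
    static String toMiB(long bytes) {
        return (bytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Number of processors the JVM believes it may use.
        System.out.println("availableProcessors = " + rt.availableProcessors());
        // Upper bound the JVM will try to use for the heap.
        System.out.println("max heap            = " + toMiB(rt.maxMemory()));
    }
}
```

Running it once on the worker node and once inside a limited container shows whether the JVM has picked up the container limits or the node's resources.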

Memory

Several parameters control how Java uses memory. Traditionally, -Xms and -Xmx set the initial and maximum Java heap size, respectively. However, the heap is only part of the memory the JVM uses, so the container's memory limit must be set to the maximum heap size plus some headroom. This keeps the Java process from exceeding the memory limit allocated to the container. How much headroom to add has to be determined through experience and testing.
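The gap between heap and total memory can be observed from inside the program. The sketch below uses the standard MemoryMXBean to print heap and non-heap usage; the helper suggestedLimitMiB and the 1024/256 figures are hypothetical and only illustrate the "heap plus headroom" arithmetic. Note that the non-heap figure reported here does not cover everything (thread stacks and direct buffers, for example), so it is a lower bound for the headroom, not a measurement of it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapVsNonHeap {
    // Hypothetical helper: memory limit = maximum heap + headroom for everything else.
    static long suggestedLimitMiB(long maxHeapMiB, long headroomMiB) {
        return maxHeapMiB + headroomMiB;
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long mib = 1024L * 1024L;
        // Memory currently used inside the Java heap.
        System.out.println("heap used     = " + mem.getHeapMemoryUsage().getUsed() / mib + " MiB");
        // Memory used outside the heap (metaspace, code cache, and so on).
        System.out.println("non-heap used = " + mem.getNonHeapMemoryUsage().getUsed() / mib + " MiB");
        // Illustrative numbers only: a 1024 MiB heap plus 256 MiB of headroom.
        System.out.println("limit for -Xmx1024m with 256 MiB headroom = "
                + suggestedLimitMiB(1024, 256) + " MiB");
    }
}
```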

The JVM later introduced the UseCGroupMemoryLimitForHeap parameter, which lets the JVM size the heap automatically based on the memory limit we set on the container. This spares us from deciding by hand how much space to give the heap. Testing is still needed to determine the total memory the Java program occupies. To use it, add the following parameters to the java command: java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
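One way to check whether the heap was actually sized from the container limit is to compare the JVM's maximum heap with the limit the kernel exposes. The sketch below assumes a cgroup v1 layout, where the limit is in /sys/fs/cgroup/memory/memory.limit_in_bytes; on other layouts the file will not be found and the program simply reports -1. The class name CgroupHeapCheck and the helper parseLimit are illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CgroupHeapCheck {
    // Parses the content of a limit file; returns -1 when it is not a plain number.
    static long parseLimit(String raw) {
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed cgroup v1 location of the container memory limit.
        Path limitFile = Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes");
        long limit = -1L;
        if (Files.isReadable(limitFile)) {
            limit = parseLimit(new String(Files.readAllBytes(limitFile), StandardCharsets.UTF_8));
        }
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("cgroup memory limit = " + limit + " bytes (-1 means not found)");
        System.out.println("JVM max heap        = " + maxHeap + " bytes");
        if (limit > 0 && maxHeap > limit) {
            System.out.println("warning: max heap exceeds the container memory limit");
        }
    }
}
```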

CPU

The parameters above do not completely solve the problem. The JVM's GC-related parameters depend on the number of CPU cores: the more cores are available, the more threads are allocated to the GC. If the correct number of CPU cores is not made visible to the JVM in the container, it sees the CPU count of the entire k8s worker node. For example, suppose we limit the container to 2 cores but the worker node has 32 cores. The JVM in this container will allocate many threads to the GC, which seriously interferes with the normal Java threads.
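To put rough numbers on the 2-core versus 32-core example, the sketch below applies the heuristic commonly described for HotSpot's default ParallelGCThreads: one thread per CPU up to 8, then 5/8 of a thread for each additional CPU. Treat the formula as an assumption rather than a specification; the exact default depends on the JVM version and on which collector is selected.

```java
public class GcThreadEstimate {
    // Assumed HotSpot default: all CPUs up to 8, then 5/8 of each additional CPU.
    static int parallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        // What the JVM would pick if it saw only the container's 2 cores.
        System.out.println("2 cores  -> " + parallelGcThreads(2) + " GC threads");
        // What it would pick if it saw the whole 32-core worker node.
        System.out.println("32 cores -> " + parallelGcThreads(32) + " GC threads");
    }
}
```

Under this assumption the JVM would start 23 GC threads on the 32-core node, all competing for the 2 cores the container is actually allowed to use.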

The influence of the number of CPUs on JVM GC

The JVM provides the ActiveProcessorCount parameter to set this value. However, this parameter is only supported from Java 1.8.0_191 onward. Next, I ran a test on my laptop (8 cores in total) to see how this parameter affects the GC parameters.

Step1: Write a hello world program.

root@kyle:~# cat Hello.java
public class Hello {
    public static void main(String[] args) {
        System.out.println("hello world");
    }
}

Step2: Compile

root@kyle:~# javac Hello.java

Step3: Run without parameters

root@kyle:~# java -XX:+PrintFlagsFinal Hello > init.txt
[Global flags]
     intx ActiveProcessorCount                      = -1                                  {product}
    uintx AdaptiveSizeDecrementScaleFactor          = 4                                   {product}
    uintx AdaptiveSizeMajorGCDecayTimeScale         = 10                                  {product}
    uintx AdaptiveSizePausePolicy                   = 0                                   {product}
    uintx AdaptiveSizePolicyCollectionCostMargin    = 50                                  {product}
…

Step4: Run with different parameter values

root@kyle:~# java -XX:ActiveProcessorCount=1 -XX:+PrintFlagsFinal Hello > p1.txt
root@kyle:~# java -XX:ActiveProcessorCount=2 -XX:+PrintFlagsFinal Hello > p2.txt
root@kyle:~# java -XX:ActiveProcessorCount=4 -XX:+PrintFlagsFinal Hello > p4.txt
root@kyle:~# java -XX:ActiveProcessorCount=8 -XX:+PrintFlagsFinal Hello > p8.txt

Step5: See how different parameters affect GC:
Comparison of one processor with two processors:

root@kyle:~# diff p1.txt p2.txt
2c2
<      intx ActiveProcessorCount                     := 1                                   {product}
---
>      intx ActiveProcessorCount                     := 2                                   {product}
304c304
<     uintx MarkSweepDeadRatio                        = 5                                   {product}
---
>     uintx MarkSweepDeadRatio                        = 1                                   {product}
311c311
<     uintx MaxHeapFreeRatio                          = 70                                  {manageable}
---
>     uintx MaxHeapFreeRatio                          = 100                                 {manageable}
335,336c335,336
<     uintx MinHeapDeltaBytes                        := 196608                              {product}
<     uintx MinHeapFreeRatio                          = 40                                  {manageable}
---
>     uintx MinHeapDeltaBytes                        := 524288                              {product}
>     uintx MinHeapFreeRatio                          = 0                                   {manageable}
388c388
<     uintx ParallelGCThreads                         = 0                                   {product}
---
>     uintx ParallelGCThreads                         = 2                                   {product}
682,683c682,683
<      bool UseParallelGC                             = false                               {product}
<      bool UseParallelOldGC                          = false                               {product}
---
>      bool UseParallelGC                            := true                                {product}
>      bool UseParallelOldGC                          = true                                {product}

Comparison of 2 processors with 4 processors:

root@kyle:~# diff p2.txt p4.txt
2c2
<      intx ActiveProcessorCount                     := 2                                   {product}
---
>      intx ActiveProcessorCount                     := 4                                   {product}
59c59
<      intx CICompilerCount                          := 2                                   {product}
---
>      intx CICompilerCount                          := 3                                   {product}
388c388
<     uintx ParallelGCThreads                         = 2                                   {product}
---
>     uintx ParallelGCThreads                         = 4                                   {product}

Comparison of 4 processors with 8 processors:

root@kyle:~# diff p4.txt p8.txt
2c2
<      intx ActiveProcessorCount                     := 4                                   {product}
---
>      intx ActiveProcessorCount                     := 8                                   {product}
59c59
<      intx CICompilerCount                          := 3                                   {product}
---
>      intx CICompilerCount                          := 4                                   {product}
388c388
<     uintx ParallelGCThreads                         = 4                                   {product}
---
>     uintx ParallelGCThreads                         = 8                                   {product}

Comparison of no parameter with 8 processors:

root@kyle:~# diff init.txt p8.txt
2c2
<      intx ActiveProcessorCount                      = -1                                  {product}
---
>      intx ActiveProcessorCount                     := 8                                   {product}

As the comparisons above show, leaving this parameter unset is equivalent to setting it to the machine's core count (8 cores on the current system). Among the 2-, 4-, and 8-core settings, only ParallelGCThreads and CICompilerCount change. With a single core, however, UseParallelGC and UseParallelOldGC both become false, and several other parameters change as well; see the diff p1.txt p2.txt comparison above.
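The same values can also be read from inside a running program instead of diffing PrintFlagsFinal output. The sketch below uses com.sun.management.HotSpotDiagnosticMXBean, so it assumes a HotSpot-based JVM; the class name GcFlagsProbe and the helper flag are illustrative. Run it with different -XX:ActiveProcessorCount values to watch the flags and the selected collectors change.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcFlagsProbe {
    // Returns the current value of a HotSpot VM option as a string.
    // Throws IllegalArgumentException if the option does not exist on this JVM.
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + Runtime.getRuntime().availableProcessors());
        System.out.println("ParallelGCThreads   = " + flag("ParallelGCThreads"));
        System.out.println("CICompilerCount     = " + flag("CICompilerCount"));
        // The collector names reveal whether a parallel or a serial collector was chosen.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName());
        }
    }
}
```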

The impact of the number of CPUs on Java programs

The number of CPUs affects not only JVM GC performance but also the Java program's worker threads. The following call is commonly used in Java libraries to decide how many worker threads to create based on the number of CPUs. If the parameters are not set correctly in the Docker container, actual program performance can suffer considerably.

Runtime.getRuntime().availableProcessors()

The following code, excerpted from the aliyun-log-java-producer library, sets the default number of IO threads used to send loghub data from the number of available processors.

// ProducerConfig.java:
public class ProducerConfig {
  public static final int DEFAULT_IO_THREAD_COUNT =
      Math.max(Runtime.getRuntime().availableProcessors(), 1);
  // ... (remainder of the class omitted)
}
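In our own code, the same pattern can be made less sensitive to a wrong processor count by allowing an explicit override. The sketch below is not taken from any library: the class IoThreads, the helper resolveThreadCount, and the system property name app.ioThreads are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class IoThreads {
    // Uses the override when it is a positive integer, otherwise falls back
    // to the detected processor count (never less than 1).
    static int resolveThreadCount(String override, int detected) {
        if (override != null) {
            try {
                int n = Integer.parseInt(override.trim());
                if (n > 0) {
                    return n;
                }
            } catch (NumberFormatException ignored) {
                // Fall through to the detected value.
            }
        }
        return Math.max(detected, 1);
    }

    public static void main(String[] args) {
        // "app.ioThreads" is a made-up property name used only for this example.
        int threads = resolveThreadCount(
                System.getProperty("app.ioThreads"),
                Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        System.out.println("IO thread pool size = " + threads);
        pool.shutdown();
    }
}
```

Started with -Dapp.ioThreads=2, the pool has 2 threads regardless of how many processors the JVM detects.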

OpenJDK Version

We run the following command to check the JDK version. OpenJDK 8 supports the UseCGroupMemoryLimitForHeap parameter from version 1.8.0_131 and the ActiveProcessorCount parameter from version 1.8.0_191.

root@kyle:~# java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

Improvement plan

If the JDK version we are using supports both parameters, we only need to add the UseCGroupMemoryLimitForHeap parameter when starting the Java program and set ActiveProcessorCount to the CPU limit actually assigned to the container. If the current JDK version is older than 1.8.0_191, ActiveProcessorCount is not supported, and there are two options:

  1. Recommended: upgrade to 1.8.0_191 or later, then configure ActiveProcessorCount based on the CPU limit.
  2. Without upgrading the JDK, directly set the parameters that ActiveProcessorCount would otherwise influence, for example ParallelGCThreads and CICompilerCount. On versions before 1.8.0_131, use the -Xms and -Xmx parameters to size the heap. Note that these two parameters only set the heap size, and the actual memory limit should be larger than that. This approach is not best practice, because it gives up the JVM's automatic tuning of these parameters. Most importantly, it does not stop the many Java libraries that base their logic on availableProcessors() from seeing the wrong value.
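When the CPU limit is written in Kubernetes notation (for example 500m or 2), it has to be turned into a whole number before it can be passed to ActiveProcessorCount. The sketch below shows one possible conversion; rounding up to the next whole core with a minimum of 1 is our own choice, not something the JVM or Kubernetes prescribes, and the class JvmFlags and its methods are hypothetical.

```java
public class JvmFlags {
    // Converts a Kubernetes CPU quantity such as "500m", "2" or "1.5"
    // into a whole processor count, rounding up and never going below 1.
    static int processorCount(String cpuLimit) {
        String s = cpuLimit.trim();
        if (s.endsWith("m")) {
            long milli = Long.parseLong(s.substring(0, s.length() - 1));
            return (int) Math.max(1L, (milli + 999L) / 1000L);
        }
        return (int) Math.max(1L, (long) Math.ceil(Double.parseDouble(s)));
    }

    // Builds the JVM options discussed in this article for a given CPU limit.
    static String flags(String cpuLimit) {
        return "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
                + " -XX:ActiveProcessorCount=" + processorCount(cpuLimit);
    }

    public static void main(String[] args) {
        // Example: a container limited to half a core still gets one processor.
        System.out.println("java " + flags("500m") + " Hello");
        System.out.println("java " + flags("2") + " Hello");
    }
}
```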

Posted by dirty_n4ppy on Sun, 19 May 2019 10:16:36 -0700