Preface
This article requires the reader to be familiar with the general garbage collection principles built into the JVM. Heap memory is divided into Eden, Survivor and Tenured/Old spaces. Generation assumption and other different GC algorithms are beyond the scope of this paper.
Minor GC
Reclaiming memory from younger generation spaces, including Eden and Survivor areas, is called Minor GC. This definition is clear and easy to understand. However, when the Minor GC event occurs, there are some interesting things to note:
1. When the JVM is unable to allocate space for a new object, Minor GC will be triggered, for example, when the Eden area is full. So the higher the allocation rate, the more frequent the Minor GC will be executed.
2. When the memory pool is full, all the contents will be copied, and the pointer will track the free memory from 0. The Eden and survivor areas are marked and copied, replacing the classic mark, scan, compress and clean operations. So there is no memory fragmentation in the Eden and Survivor areas. The write pointer always stays at the top of the memory pool being used.
3. When performing Minor GC operation, permanent generation will not be affected. References from the permanent generation to the young generation are treated as GC roots, and references from the young generation to the permanent generation are directly ignored in the tag phase.
4. Challenge the conventional cognition that all minor GCS will trigger "stop the world" and stop the application threads. For most applications, the delay caused by pauses is negligible. The truth is that most of the objects in the Eden area can be considered garbage and will never be copied to the Survivor area or the space for the elderly. If, on the contrary, most of the new objects in Eden area do not meet GC conditions, Minor GC will pause for a long time.
So the situation of Minor GC is quite clear - every time Minor GC cleans up the memory of the younger generation.
Major GC vs Full GC
It should be noted that at present, these terms are not formally defined either in the JVM specification or in the garbage collection research paper. But we can see from the first sight that these definitions are correct based on what we already know. Minor GC should be designed to clean the memory in the Youth Zone simply:
Major GC is to clean up the old generation.
Full GC cleans up the entire heap space - both young and old.
Unfortunately, it's actually a little complicated and confusing. First of all, many major GCs are triggered by Minor GC, so in many cases, it is impossible to separate the two kinds of GC. On the other hand, many modern garbage collection mechanisms clean up part of the permanent generation space, so using the word "cleaning" is only partially correct.
This makes it unnecessary for us to pay attention to whether it is called Major GC or Full GC. We should pay attention to whether the current GC stops all application threads or can process concurrently without stopping application threads.
This confusion is even built into JVM standard tools. The following example explains what I mean very well. Let's compare the trace records that two different tools, Concurrent Mark and Sweep collector (-XX:+UseConcMarkSweepGC), output at runtime in the JVM.
The first attempt to output through jstat:
my-precious: me$ jstat -gc -t 4235 1sTime S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT 5.7 34048.0 34048.0 0.0 34048.0 272640.0 194699.7 1756416.0 181419.9 18304.0 17865.1 2688.0 2497.6 3 0.275 0 0.000 0.275 6.7 34048.0 34048.0 34048.0 0.0 272640.0 247555.4 1756416.0 263447.9 18816.0 18123.3 2688.0 2523.1 4 0.359 0 0.000 0.359 7.7 34048.0 34048.0 0.0 34048.0 272640.0 257729.3 1756416.0 345109.8 19072.0 18396.6 2688.0 2550.3 5 0.451 0 0.000 0.451 8.7 34048.0 34048.0 34048.0 34048.0 272640.0 272640.0 1756416.0 444982.5 19456.0 18681.3 2816.0 2575.8 7 0.550 0 0.000 0.550 9.7 34048.0 34048.0 34046.7 0.0 272640.0 16777.0 1756416.0 587906.3 20096.0 19235.1 2944.0 2631.8 8 0.720 0 0.000 0.72010.7 34048.0 34048.0 0.0 34046.2 272640.0 80171.6 1756416.0 664913.4 20352.0 19495.9 2944.0 2657.4 9 0.810 0 0.000 0.81011.7 34048.0 34048.0 34048.0 0.0 272640.0 129480.8 1756416.0 745100.2 20608.0 19704.5 2944.0 2678.4 10 0.896 0 0.000 0.89612.7 34048.0 34048.0 0.0 34046.6 272640.0 164070.7 1756416.0 822073.7 20992.0 19937.1 3072.0 2702.8 11 0.978 0 0.000 0.97813.7 34048.0 34048.0 34048.0 0.0 272640.0 211949.9 1756416.0 897364.4 21248.0 20179.6 3072.0 2728.1 12 1.087 1 0.004 1.09114.7 34048.0 34048.0 0.0 34047.1 272640.0 245801.5 1756416.0 597362.6 21504.0 20390.6 3072.0 2750.3 13 1.183 2 0.050 1.23315.7 34048.0 34048.0 0.0 34048.0 272640.0 21474.1 1756416.0 757347.0 22012.0 20792.0 3200.0 2791.0 15 1.336 2 0.050 1.38616.7 34048.0 34048.0 34047.0 0.0 272640.0 48378.0 1756416.0 838594.4 22268.0 21003.5 3200.0 2813.2 16 1.433 2 0.050 1.484
This fragment is extracted the 17th second after the JVM starts. Based on this information, we can get the result that we have run Minor GC for 12 times and Full GC for 2 times, with a total time span of 50ms. You can get the same results with GUI based tools like jconsole or jvisualvm.
java -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC eu.plumbr.demo.GarbageProducer3.157: [GC (Allocation Failure) 3.157: [ParNew: 272640K->34048K(306688K), 0.0844702 secs] 272640K->69574K(2063104K), 0.0845560 secs] [Times: user=0.23 sys=0.03, real=0.09 secs] 4.092: [GC (Allocation Failure) 4.092: [ParNew: 306688K->34048K(306688K), 0.1013723 secs] 342214K->136584K(2063104K), 0.1014307 secs] [Times: user=0.25 sys=0.05, real=0.10 secs] ... cut for brevity ...11.292: [GC (Allocation Failure) 11.292: [ParNew: 306686K->34048K(306688K), 0.0857219 secs] 971599K->779148K(2063104K), 0.0857875 secs] [Times: user=0.26 sys=0.04, real=0.09 secs] 12.140: [GC (Allocation Failure) 12.140: [ParNew: 306688K->34046K(306688K), 0.0821774 secs] 1051788K->856120K(2063104K), 0.0822400 secs] [Times: user=0.25 sys=0.03, real=0.08 secs] 12.989: [GC (Allocation Failure) 12.989: [ParNew: 306686K->34048K(306688K), 0.1086667 secs] 1128760K->931412K(2063104K), 0.1087416 secs] [Times: user=0.24 sys=0.04, real=0.11 secs] 13.098: [GC (CMS Initial Mark) [1 CMS-initial-mark: 897364K(1756416K)] 936667K(2063104K), 0.0041705 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 13.102: [CMS-concurrent-mark-start]13.341: [CMS-concurrent-mark: 0.238/0.238 secs] [Times: user=0.36 sys=0.01, real=0.24 secs] 13.341: [CMS-concurrent-preclean-start]13.350: [CMS-concurrent-preclean: 0.009/0.009 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 13.350: [CMS-concurrent-abortable-preclean-start]13.878: [GC (Allocation Failure) 13.878: [ParNew: 306688K->34047K(306688K), 0.0960456 secs] 1204052K->1010638K(2063104K), 0.0961542 secs] [Times: user=0.29 sys=0.04, real=0.09 secs] 14.366: [CMS-concurrent-abortable-preclean: 0.917/1.016 secs] [Times: user=2.22 sys=0.07, real=1.01 secs] 14.366: [GC (CMS Final Remark) [YG occupancy: 182593 K (306688 K)]14.366: [Rescan (parallel) , 0.0291598 secs]14.395: [weak refs processing, 0.0000232 secs]14.395: [class unloading, 0.0117661 secs]14.407: [scrub symbol table, 0.0015323 secs]14.409: [scrub string table, 0.0003221 secs][1 CMS-remark: 976591K(1756416K)] 1159184K(2063104K), 0.0462010 secs] [Times: user=0.14 sys=0.00, real=0.05 secs] 14.412: [CMS-concurrent-sweep-start]14.633: [CMS-concurrent-sweep: 0.221/0.221 secs] [Times: user=0.37 sys=0.00, real=0.22 secs] 14.633: [CMS-concurrent-reset-start]14.636: [CMS-concurrent-reset: 0.002/0.002 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
Before nodding in agreement, let's look at the output of the garbage collection logs from the same JVM that started the collection. Obviously - XX: + PrintGCDetails tells us a different and more detailed story:
Based on this information, we can see that 12 Minor GC starts to be different from the above. The difference is that a single GC runs twice in different stages of the permanent generation:
1. In the initial marking stage, it took 0.0041705 seconds, or about 4ms. This phase pauses the "stop the world" event, stops all application threads, and starts tagging.
2. Parallel execution of marking and cleaning phase. These are parallel to application threads.
3. In the last Remark stage, it took 0.046, 2010 seconds about 46 Ms. This stage will pause all events again.
4. Perform cleaning operations in parallel. As its name implies, this phase is also parallel and does not stop other threads.
So, as we can see from the garbage collection log, it's actually just a Major GC to clean up the space of the elderly, rather than a Full GC twice.
If you make a decision later, the data provided by jstat will guide you to make the right decision. It correctly lists two situations in which all events are suspended, causing all threads to stop for a total of 50ms. But if you try to optimize throughput, you will be misled. The list only lists the initial mark and final mark stages of recycling, and the output of jstat does not see the concurrent work.
conclusion
Considering this situation, it is better to avoid thinking in the way of Minor, Major and Full GC. Instead, you should monitor application latency or throughput, and then associate GC events with results.
You are welcome to pay attention to my public blog, where the articles will be updated and the sorted data will also be put.
As these GC events happen, you need to pay extra attention to some information, whether GC events force all application threads to stop or process some events in parallel.
Last
Welcome everyone to exchange, like the article, remember to pay attention to my point of praise, thank you for your support!