When will Go trigger GC?

As a relatively young language, Go was often criticized in its early days because the stop-the-world (STW) pauses in its garbage collection (GC) mechanism lasted too long.

That naturally raises a question: since a GC cycle is where STW begins, when exactly does Go trigger GC?

Today, Fried Fish will walk you through it.

What is GC

In computer science, garbage collection (GC) is a mechanism for managing memory automatically. The garbage collector tries to reclaim objects that the program no longer uses, along with the memory they occupy.

John McCarthy invented garbage collection around 1959 to simplify manual memory management in Lisp (source: Wikipedia).

Why GC

Managing memory by hand is tedious, and getting it wrong is costly: freeing memory incorrectly or leaking it makes a program unstable (a continuous leak) and can even crash it outright.

GC trigger scenario

GC trigger scenarios fall into two main categories:

System trigger: the runtime checks its built-in conditions and, when one is met, runs GC to keep the whole application available.
Manual trigger: the developer calls the runtime.GC function in business code to trigger GC.

System trigger

For system triggers, the src/runtime/mgc.go file in the Go source code defines three GC trigger kinds, as follows:

const (
 gcTriggerHeap gcTriggerKind = iota
 gcTriggerTime
 gcTriggerCycle
)

gcTriggerHeap: triggered when the allocated heap size reaches the threshold (the trigger heap size calculated by the GC controller).
gcTriggerTime: triggered when the time since the last GC cycle exceeds a set period. The period is given by the runtime.forcegcperiod variable, which is 2 minutes by default.
gcTriggerCycle: starts a new GC cycle if the requested cycle has not already been started. This is the kind used by the manually triggered runtime.GC function.

Manual trigger

For manual triggering, runtime.GC is the only function Go provides, so there is no further classification.

What we do need to think about is this: in which business scenarios would we need to step in and force a GC?

Scenarios that call for a forced trigger are extremely rare. One is a business method that used a large amount of memory and wants it reclaimed right after it finishes; another is debugging.
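As a rough illustration of the first scenario, the sketch below allocates a large buffer, lets it become unreachable, and then forces a collection. It uses only the public runtime.GC, runtime.ReadMemStats, and runtime.KeepAlive functions; heapAfterRelease is a hypothetical helper name, and the 32 MiB size is an arbitrary choice for the demonstration.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAfterRelease allocates size bytes, records HeapAlloc while the buffer
// is still reachable, then forces a GC and records HeapAlloc again.
func heapAfterRelease(size int) (before, after uint64) {
	buf := make([]byte, size)
	for i := range buf {
		buf[i] = 1 // touch the memory so the allocation is really used
	}

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	before = m.HeapAlloc
	runtime.KeepAlive(buf) // buf is reachable up to this point, not after

	runtime.GC() // blocks until the forced cycle has completed
	runtime.ReadMemStats(&m)
	after = m.HeapAlloc
	return before, after
}

func main() {
	before, after := heapAfterRelease(32 << 20)
	fmt.Printf("HeapAlloc before: %d KiB, after: %d KiB\n", before>>10, after>>10)
}
```

In most programs the runtime would reclaim this memory on its own at the next regular cycle, which is why forcing it is rarely needed.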

Basic process

Now that we know when Go triggers GC, let's look at the code path that triggers it. The manually triggered runtime.GC function is a good entry point.

The core code is as follows:

func GC() {
 n := atomic.Load(&work.cycles)
 gcWaitOnMark(n)

 gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})
  
 gcWaitOnMark(n + 1)

 for atomic.Load(&work.cycles) == n+1 && sweepone() != ^uintptr(0) {
  sweep.nbgsweep++
  Gosched()
 }
  
 for atomic.Load(&work.cycles) == n+1 && atomic.Load(&mheap_.sweepers) != 0 {
  Gosched()
 }
  
 mp := acquirem()
 cycle := atomic.Load(&work.cycles)
 if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
  mProf_PostSweep()
 }
 releasem(mp)
}

Before starting a new GC cycle, gcWaitOnMark is called to wait until the previous cycle's sweep termination, mark, and mark termination have completed.
gcStart is then called to start a new GC cycle, which enters the scan and mark phase.
gcWaitOnMark is called again to wait until mark termination of the new cycle has completed.
sweepone is called in a loop to sweep the remaining unswept heap spans so that sweeping finishes before GC returns. Between sweeps, and while waiting for other sweepers to finish, Gosched is called to yield the processor.
Once this cycle is essentially complete, mProf_PostSweep is called. It publishes a snapshot of the heap profile as of the last mark termination.
Finally, the M is released.
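One consequence of the steps above that can be observed from outside the runtime is that runtime.GC does not return until a full cycle has completed. A minimal sketch, assuming only the public runtime.ReadMemStats API and its NumGC counter; forcedCycles is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime"
)

// forcedCycles calls runtime.GC n times and returns how many GC cycles
// completed in the meantime, according to MemStats.NumGC.
func forcedCycles(n int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		runtime.GC() // returns only after the cycle it started has finished
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("cycles completed during 3 calls:", forcedCycles(3))
}
```

The count can be larger than the number of calls if the runtime also starts cycles of its own during the loop, so the sketch only relies on it being at least n.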

Where is it triggered

Having read through the basic GC process, we have a general picture. But some readers may still have a question.

The title of this article is "When will Go trigger GC?", and we covered the trigger conditions earlier. But where does Go actually implement the trigger mechanism? It does not seem to appear anywhere in the process above.

Monitoring thread

In essence, when the Go runtime is initialized, it starts a goroutine that handles forced GC.

The code is as follows:

func init() {
 go forcegchelper()
}

func forcegchelper() {
 forcegc.g = getg()
 lockInit(&forcegc.lock, lockRankForcegc)
 for {
  lock(&forcegc.lock)
  if forcegc.idle != 0 {
   throw("forcegc: phase error")
  }
  atomic.Store(&forcegc.idle, 1)
  goparkunlock(&forcegc.lock, waitReasonForceGCIdle, traceEvGoBlock, 1)
  // this goroutine is explicitly resumed by sysmon
  if debug.gctrace > 0 {
   println("GC forced")
  }

  gcStart(gcTrigger{kind: gcTriggerTime, now: nanotime()})
 }
}

The point to note here is that forcegchelper calls goparkunlock to put its goroutine to sleep, which avoids unnecessary resource overhead while it waits.

While it sleeps, sysmon, the system monitoring thread, keeps watch and wakes it up when needed:

func sysmon() {
 ...
 for {
  ...
  // check if we need to force a GC
  if t := (gcTrigger{kind: gcTriggerTime, now: now}); t.test() && atomic.Load(&forcegc.idle) != 0 {
   lock(&forcegc.lock)
   forcegc.idle = 0
   var list gList
   list.push(forcegc.g)
   injectglist(&list)
   unlock(&forcegc.lock)
  }
  if debug.schedtrace > 0 && lasttrace+int64(debug.schedtrace)*1000000 <= now {
   lasttrace = now
   schedtrace(debug.scheddetail > 0)
  }
  unlock(&sched.sysmonlock)
 }
}

The core behavior of this code is that, on each iteration of the for loop, it tests a gcTriggerTime trigger against the current time now to check whether the forced GC period (2 minutes by default) has elapsed since the last GC.

If the condition is met, forcegc.g is put on the global run queue to be scheduled again, which wakes up the forcegchelper goroutine shown above.

Heap memory request

With timed triggering understood, the other scenario is heap allocation, so where to look is clear.

It is mallocgc, the function the runtime uses to allocate heap memory. The core code is as follows:

func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
 shouldhelpgc := false
 ...
 if size <= maxSmallSize {
  if noscan && size < maxTinySize {
   ...
   // Allocate a new maxTinySize block.
   span = c.alloc[tinySpanClass]
   v := nextFreeFast(span)
   if v == 0 {
    v, span, shouldhelpgc = c.nextFree(tinySpanClass)
   }
   ...
   spc := makeSpanClass(sizeclass, noscan)
   span = c.alloc[spc]
   v := nextFreeFast(span)
   if v == 0 {
    v, span, shouldhelpgc = c.nextFree(spc)
   }
   ...
  }
 } else {
  shouldhelpgc = true
  span = c.allocLarge(size, needzero, noscan)
  ...
 }

 if shouldhelpgc {
  if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {
   gcStart(t)
  }
 }

 return x
}

Small objects: when allocating a small object, if the current span has no free slot, nextFree is called to obtain a new usable span, which may trigger GC.
Large objects: allocating a large object of more than 32 KB may trigger GC.
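The allocation-driven trigger can be observed from user code with a small experiment. The sketch below keeps allocating 1 MiB slices, which are above the 32 KB large-object limit, and watches MemStats.NumGC until the runtime has completed a cycle on its own. allocUntilGC and sink are hypothetical names, and the call to debug.SetGCPercent(100) is only there to make sure collection is enabled with the default setting for the duration of the experiment.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink keeps the most recent allocation reachable so the compiler cannot
// optimize the allocation away.
var sink []byte

// allocUntilGC allocates up to maxChunks slices of chunk bytes each and
// stops as soon as the runtime has completed a GC cycle without anyone
// calling runtime.GC. It returns the number of chunks allocated and
// whether a cycle was observed.
func allocUntilGC(chunk, maxChunks int) (chunks int, triggered bool) {
	old := debug.SetGCPercent(100) // ensure GC is enabled for the experiment
	defer debug.SetGCPercent(old)

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	start := m.NumGC

	for chunks = 1; chunks <= maxChunks; chunks++ {
		sink = make([]byte, chunk)
		runtime.ReadMemStats(&m)
		if m.NumGC > start {
			return chunks, true
		}
	}
	return maxChunks, false
}

func main() {
	chunks, ok := allocUntilGC(1<<20, 1024)
	fmt.Printf("GC observed: %v after %d MiB allocated\n", ok, chunks)
}
```

The exact number of chunks depends on the Go version and on what else is on the heap, so the sketch reports it rather than assuming a value. Running any program with GODEBUG=gctrace=1 prints a line per cycle and is another way to watch these triggers.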

Summary

In this article, we introduced the two categories of scenarios in which Go triggers GC and walked through the specific scenarios in each.

This gives a general understanding. If you are interested in the internal implementation, you can read the source alongside the code in this article.

Note, however, that the details may well change when Go is upgraded. What matters is learning the ideas!

Posted by crazy/man on Fri, 12 Nov 2021 06:03:04 -0800

Programmer Group