Reposted from: https://segmentfault.com/a/1190000020086769
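Go's GC has been criticized ever since the language was born. After tri-color marking arrived in v1.5 and the hybrid write barrier in v1.8, however, a normal GC pause has shrunk to roughly 10µs, which is a very good result. Let's explore how Go's GC works.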
How tri-color marking works
Let's first look at a diagram to get a rough idea of tri-color marking:
The algorithm:
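1. Put every object into the white set.
2. Traverse objects starting from the roots; each white object reached is moved from the white set to the grey set.
3. Walk the grey set: move every white object referenced by a grey object into the grey set, and move each grey object that has been scanned into the black set.
4. Repeat step 3 until the grey set is empty.
5. Once step 4 finishes, the objects still in the white set are unreachable, that is, garbage, and are reclaimed.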
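The five steps above can be sketched as a small stop-the-world toy in Go. This is only an illustration of the algorithm, not how the runtime stores colors: the `object` type, the `mark` and `garbage` helpers and the demo heap are all hypothetical names invented for this example.

```go
package main

import "fmt"

// color is the tri-color state of an object in this toy model.
type color int

const (
	white color = iota // not reached yet
	grey               // reached, references not scanned yet
	black              // reached and fully scanned
)

// object is a hypothetical heap object that only holds references.
type object struct {
	name string
	refs []*object
}

// mark runs steps 1-4 and returns the final color of every object.
func mark(roots, objs []*object) map[*object]color {
	colors := make(map[*object]color, len(objs))
	for _, o := range objs { // step 1: everything starts white
		colors[o] = white
	}
	var greyQueue []*object
	for _, r := range roots { // step 2: roots become grey
		if colors[r] == white {
			colors[r] = grey
			greyQueue = append(greyQueue, r)
		}
	}
	for len(greyQueue) > 0 { // steps 3 and 4: drain the grey set
		o := greyQueue[0]
		greyQueue = greyQueue[1:]
		for _, ref := range o.refs {
			if colors[ref] == white {
				colors[ref] = grey
				greyQueue = append(greyQueue, ref)
			}
		}
		colors[o] = black
	}
	return colors
}

// garbage returns the names of objects that are still white (step 5).
func garbage(roots, objs []*object) []string {
	colors := mark(roots, objs)
	var dead []string
	for _, o := range objs {
		if colors[o] == white {
			dead = append(dead, o.name)
		}
	}
	return dead
}

// demoHeap builds A -> B -> C (reachable) and E -> D (unreachable).
func demoHeap() (roots, objs []*object) {
	c := &object{name: "C"}
	b := &object{name: "B", refs: []*object{c}}
	a := &object{name: "A", refs: []*object{b}}
	d := &object{name: "D"}
	e := &object{name: "E", refs: []*object{d}}
	return []*object{a}, []*object{a, b, c, d, e}
}

func main() {
	roots, objs := demoHeap()
	fmt.Println(garbage(roots, objs)) // prints [D E]
}
```

Note that E references D, yet both are collected: only reachability from the roots matters.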
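Write barrier
Go does not stop the world while tri-color marking is in progress, so objects can still be modified during marking.
Consider the following situation.
Suppose the collector is working through the grey set: it has scanned object A and marked everything A references, and it now starts scanning the references of object D. At that moment another goroutine changes the D->E reference, producing the state shown in the figure below.
Could E then never be scanned and be mistaken for a white object, that is, garbage?
The write barrier exists to solve exactly this problem. With the write barrier in place, E is treated as live after the steps above. Even if A later drops E, E is not reclaimed in this GC cycle; it is reclaimed in the next one.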
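To make the lost-object problem concrete, here is a minimal single-threaded replay of one such interleaving. It is a sketch under assumptions: the object layout (A already scanned, D still grey and pointing to E) is my own reading of the scenario, and `node`, `collector`, `write` and `run` are hypothetical names. The barrier here simply shades both the old and the new value on every pointer write, which is a simplification of what the runtime does.

```go
package main

import "fmt"

// node is a toy heap object with a single outgoing pointer.
type node struct {
	name   string
	marked bool // true once the collector has reached the node
	ref    *node
}

// collector is a toy marker with an explicit grey queue.
type collector struct {
	grey    []*node
	barrier bool // whether pointer writes shade the values involved
}

// shade greys a node that has not been reached yet.
func (c *collector) shade(n *node) {
	if n != nil && !n.marked {
		n.marked = true
		c.grey = append(c.grey, n)
	}
}

// scanOne scans one grey node; it reports false when no grey node is left.
func (c *collector) scanOne() bool {
	if len(c.grey) == 0 {
		return false
	}
	n := c.grey[0]
	c.grey = c.grey[1:]
	c.shade(n.ref)
	return true
}

// write is the mutator's pointer store, with an optional barrier.
func (c *collector) write(slot **node, ptr *node) {
	if c.barrier {
		c.shade(*slot) // old value
		c.shade(ptr)   // new value
	}
	*slot = ptr
}

// run replays the interleaving and reports whether E was marked.
func run(barrier bool) bool {
	e := &node{name: "E"}
	a := &node{name: "A"}
	d := &node{name: "D", ref: e}
	c := &collector{barrier: barrier}
	c.shade(a)
	c.shade(d)
	c.scanOne() // A is scanned and is now black

	// The mutator runs "concurrently" at this point.
	c.write(&a.ref, d.ref) // the black object A now points to E
	c.write(&d.ref, nil)   // D drops E before D is scanned

	for c.scanOne() {
	}
	return e.marked
}

func main() {
	fmt.Println("without barrier, E survives:", run(false)) // false
	fmt.Println("with barrier, E survives:   ", run(true))  // true
}
```

Without the barrier, E is reachable through A but stays white, so a real collector would free a live object.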
The hybrid write barrier was enabled starting with Go 1.8. Its pseudocode is:
writePointer(slot, ptr):
    shade(*slot)
    if current stack is grey:
        shade(ptr)
    *slot = ptr
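The hybrid write barrier shades both the "old pointer" (the value being overwritten in the slot) and the "new pointer".
The old pointer is shaded because other running threads may concurrently copy that pointer's value into a register or a local variable on the stack.
Copying a pointer into a register or a stack local does not go through the write barrier, so the pointer could end up unmarked. Consider the following case: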
[go] b = obj
[go] oldx = nil
[gc] scan oldx…
[go] oldx = b.x // copy b.x into a local variable; no write barrier involved
[go] b.x = ptr // the write barrier should shade the old value of b.x
[gc] scan b…
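If the write barrier did not shade the old value, the object held in oldx would never be scanned.
The new pointer is shaded because other running threads may move the pointer to a different location. Consider the following case: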
[go] a = ptr
[go] b = obj
[gc] scan b…
[go] b.x = a // the write barrier should shade the new value of b.x
[go] a = nil
[gc] scan a…
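If the write barrier did not shade the new value, ptr would never be scanned.
With the hybrid write barrier, the GC does not need to rescan every G's stack after concurrent marking ends, which reduces the STW time in Mark Termination.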
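The two traces above can be replayed against a direct Go rendering of the `writePointer` pseudocode. This is a sketch, not runtime code: `obj`, `marker`, `oldValueKept` and `newValueKept` are hypothetical names, the `stackGrey` flag stands in for "the current stack has not been scanned yet", and everything runs on one goroutine so the interleaving is fixed.

```go
package main

import "fmt"

// obj is a toy heap object with one pointer field, x.
type obj struct {
	marked bool
	x      *obj
}

// marker holds the state the barrier needs in this toy model.
type marker struct {
	stackGrey bool   // true while the mutator's stack is still unscanned
	queue     []*obj // grey objects waiting to be scanned
}

// shade greys an object that has not been marked yet.
func (m *marker) shade(o *obj) {
	if o != nil && !o.marked {
		o.marked = true
		m.queue = append(m.queue, o)
	}
}

// writePointer follows the hybrid write barrier pseudocode line by line.
func (m *marker) writePointer(slot **obj, ptr *obj) {
	m.shade(*slot)
	if m.stackGrey {
		m.shade(ptr)
	}
	*slot = ptr
}

// oldValueKept replays the first trace: the stack was scanned before
// oldx was loaded, so only shading the old value keeps it alive.
func oldValueKept() bool {
	old := &obj{}
	b := &obj{x: old}
	m := &marker{stackGrey: false} // [gc] stack already scanned
	oldx := b.x                    // plain load, no barrier
	m.writePointer(&b.x, &obj{})   // barrier shades the old value
	return oldx.marked
}

// newValueKept replays the second trace: b was scanned before the
// store, and the local holding ptr is cleared before the stack scan.
func newValueKept() bool {
	ptr := &obj{}
	b := &obj{marked: true}       // [gc] b already scanned
	m := &marker{stackGrey: true} // stack not scanned yet
	m.writePointer(&b.x, ptr)     // barrier shades the new value
	// the local variable would be set to nil here, before the stack scan
	return ptr.marked
}

func main() {
	fmt.Println(oldValueKept(), newValueKept()) // prints true true
}
```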
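Besides the write barrier, every object allocated while a GC is in progress is marked black immediately; this is done in the runtime's mallocgc function.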
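Collection flow
Go's GC is a concurrent GC: most of the GC work runs at the same time as ordinary Go code, which makes the GC flow fairly complex.
A GC cycle has four phases:
Sweep Termination: sweep any spans that have not been swept yet. A new GC cycle can start only after the previous cycle's sweeping is complete
Mark: scan all roots and every object reachable from them, marking them so they are not reclaimed
Mark Termination: finish the marking work and rescan some of the roots (requires STW)
Sweep: sweep spans according to the mark results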
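The figure below shows a fairly complete GC flow, with the four phases color-coded:
Two kinds of background tasks (Gs) run during GC: background mark workers and the background sweeper.
Background mark workers are started on demand. The number that can work at the same time is about 25% of the number of Ps, which is the basis for Go's stated goal of spending 25% of CPU on GC.
One background sweeper is started at program startup and is woken when the sweep phase begins.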
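As a rough illustration of the 25% figure, the sketch below rounds a quarter of the P count to a whole number of dedicated workers. It is a deliberate simplification: `dedicatedWorkers` is a hypothetical helper, and the real controller also uses fractional (part-time) workers when rounding would miss the target by too much, which this sketch ignores.

```go
package main

import "fmt"

// backgroundUtilization is the target share of CPU for background
// marking described in the text (25%).
const backgroundUtilization = 0.25

// dedicatedWorkers rounds 25% of the P count to the nearest whole
// number of workers. Simplified: fractional workers are not modelled.
func dedicatedWorkers(procs int) int {
	return int(float64(procs)*backgroundUtilization + 0.5)
}

func main() {
	for _, p := range []int{4, 8, 16} {
		fmt.Printf("P=%d -> %d dedicated mark worker(s)\n", p, dedicatedWorkers(p))
	}
}
```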
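At present a full GC cycle stops the world (STW) twice: once at the start of the Mark phase and once in the Mark Termination phase.
The first STW prepares the root scan and enables the write barrier and mutator assists.
The second STW rescans some of the roots and disables the write barrier and mutator assists.
Note that not every root scan needs STW. Scanning objects on a stack, for example, only requires stopping the G that owns that stack.
The write barrier is implemented as a hybrid write barrier, which greatly reduces the duration of the second STW.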
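The pauses described above can be observed from user code with the public `runtime` API. The sketch below forces a few cycles with `runtime.GC()` and reads the cycle count and the most recent pause from `runtime.MemStats`; `forcedCycles` is a hypothetical helper name, and the pause value printed will vary from machine to machine.

```go
package main

import (
	"fmt"
	"runtime"
)

// forcedCycles forces n collections and returns how many GC cycles
// completed in the meantime according to MemStats.NumGC.
func forcedCycles(n int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		runtime.GC() // blocks until the collection is complete
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("completed cycles:", forcedCycles(3))

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// PauseNs is a circular buffer; this index is the most recent pause.
	fmt.Println("last STW pause (ns):", m.PauseNs[(m.NumGC+255)%256])
}
```

Running a program with `GODEBUG=gctrace=1` prints a per-cycle summary as well, which is often the quickest way to see the phases in practice.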
Source code analysis
gcStart
func gcStart(mode gcMode, trigger gcTrigger) {
// Since this is called from malloc and malloc is called in
// the guts of a number of libraries that might be holding
// locks, don't attempt to start GC in non-preemptible or
// potentially unstable situations.
// Check whether the current g can be preempted; do not trigger GC if it cannot
mp := acquirem()
if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
releasem(mp)
return
}
releasem(mp)
mp = nil
// Pick up the remaining unswept/not being swept spans concurrently
//
// This shouldn't happen if we're being invoked in background
// mode since proportional sweep should have just finished
// sweeping everything, but rounding errors, etc, may leave a
// few spans unswept. In forced mode, this is necessary since
// GC can be forced at any point in the sweeping cycle.
//
// We check the transition condition continuously here in case
// this G gets delayed in to the next GC cycle.
// Sweep any spans left unswept by the previous cycle
for trigger.test() && gosweepone() != ^uintptr(0) {
sweep.nbgsweep++
}
// Perform GC initialization and the sweep termination
// transition.
semacquire(&work.startSema)
// Re-check transition condition under transition lock.
// Check that the gcTrigger condition still holds
if !trigger.test() {
semrelease(&work.startSema)
return
}
// For stats, check if this GC was forced by the user
// Record whether this GC was forced; a user can force one by calling runtime.GC()
work.userForced = trigger.kind == gcTriggerAlways || trigger.kind == gcTriggerCycle
// In gcstoptheworld debug mode, upgrade the mode accordingly.
// We do this after re-checking the transition condition so
// that multiple goroutines that detect the heap trigger don't
// start multiple STW GCs.
// Set the GC mode
if mode == gcBackgroundMode {
if debug.gcstoptheworld == 1 {
mode = gcForceMode
} else if debug.gcstoptheworld == 2 {
mode = gcForceBlockMode
}
}
// Ok, we're doing it! Stop everybody else
semacquire(&worldsema)
if trace.enabled {
traceGCStart()
}
// Start the background mark workers
if mode == gcBackgroundMode {
gcBgMarkStartWorkers()
}
// Reset the mark-related GC state
gcResetMarkState()
work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
if work.stwprocs > ncpu {
// This is used to compute CPU time of the STW phases,
// so it can't be more than ncpu, even if GOMAXPROCS is.
work.stwprocs = ncpu
}
work.heap0 = atomic.Load64(&memstats.heap_live)
work.pauseNS = 0
work.mode = mode
now := nanotime()
work.tSweepTerm = now
work.pauseStart = now
if trace.enabled {
traceGCSTWStart(1)
}
// Stop the world (STW)
systemstack(stopTheWorldWithSema)
// Finish sweep before we start concurrent scan.
// Finish sweeping the previous cycle's garbage so that the previous GC is complete
systemstack(func() {
finishsweep_m()
})
// clearpools before we start the GC. If we wait they memory will not be
// reclaimed until the next GC cycle.
// Clear sync.Pool, sched.sudogcache and sched.deferpool. Not expanded here: sync.Pool was covered earlier, and the rest will come up in later articles
clearpools()
// Increment the GC cycle count
work.cycles++
if mode == gcBackgroundMode { // Do as much work concurrently as possible
gcController.startCycle()
work.heapGoal = memstats.next_gc
// Enter concurrent mark phase and enable
// write barriers.
//
// Because the world is stopped, all Ps will
// observe that write barriers are enabled by
// the time we start the world and begin
// scanning.
//
// Write barriers must be enabled before assists are
// enabled because they must be enabled before
// any non-leaf heap objects are marked. Since
// allocations are blocked until assists can
// happen, we want enable assists as early as
// possible.
// Set the GC phase to _GCmark (this is what enables the write barrier)
setGCPhase(_GCmark)
// Update the state used by the background mark workers
gcBgMarkPrepare() // Must happen before assist enable.
// Compute and queue the root scanning jobs, and initialize the related scan state
gcMarkRootPrepare()
// Mark all active tinyalloc blocks. Since we're
// allocating from these, they need to be black like
// other allocations. The alternative is to blacken
// the tiny block on every allocation from it, which
// would slow down the tiny allocator.
// Mark the active tiny alloc blocks
gcMarkTinyAllocs()
// At this point all Ps have enabled the write
// barrier, thus maintaining the no white to
// black invariant. Enable mutator assists to
// put back-pressure on fast allocating
// mutators.
// Set gcBlackenEnabled to 1 so that mutator assists and mark workers may start blackening objects
atomic.Store(&gcBlackenEnabled, 1)
// Assists and workers can start the moment we start
// the world.
gcController.markStartTime = now
// Concurrent mark.
systemstack(func() {
now = startTheWorldWithSema(trace.enabled)
})
work.pauseNS += now - work.pauseStart
work.tMark = now
} else {
// Non-concurrent (STW) mode
// Record the start time of the mark and mark termination phases
if trace.enabled {
// Switch to mark termination STW.
traceGCSTWDone()
traceGCSTWStart(0)
}
t := nanotime()
work.tMark, work.tMarkTerm = t, t
work.heapGoal = work.heap0
// Perform mark termination. This will restart the world.
// With the world stopped: mark, sweep, then start the world again
gcMarkTermination(memstats.triggerRatio)
}
semrelease(&work.startSema)
}
gcBgMarkStartWorkers
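This function prepares the goroutines that do the background mark work. They do not start working immediately; they wait until the GC phase has been set to _GCmark, which gcStart above does after calling this function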
func gcBgMarkStartWorkers() {
// Background marking is performed by per-P G's. Ensure that
// each P has a background GC G.
for _, p := range allp {
if p.gcBgMarkWorker == 0 {
go gcBgMarkWorker(p)
// Wait for the bgMarkReady signal from the gcBgMarkWorker goroutine before continuing
notetsleepg(&work.bgMarkReady, -1)
noteclear(&work.bgMarkReady)
}
}
}
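The start-then-wait handshake in gcBgMarkStartWorkers can be sketched with ordinary channels. This is an analogy only: the runtime uses its internal note primitives (`notetsleepg`, `notewakeup`) rather than channels, and `startWorkers` is a hypothetical name.

```go
package main

import "fmt"

// startWorkers launches one worker per id and, like the runtime loop,
// waits for each worker's ready signal before launching the next one.
func startWorkers(n int) []int {
	ready := make(chan int)
	var started []int
	for id := 0; id < n; id++ {
		go func(id int) {
			// Initialization would happen here.
			ready <- id // tell the starter that this worker is ready
			// A real worker would now park until the scheduler wakes it.
		}(id)
		started = append(started, <-ready) // block until the worker is ready
	}
	return started
}

func main() {
	fmt.Println(startWorkers(4)) // prints [0 1 2 3]
}
```

Because the starter blocks on every worker in turn, the workers become ready strictly one after another.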
gcBgMarkWorker
The function run by each background mark worker
func gcBgMarkWorker(_p_ *p) {
gp := getg()
// Used to re-acquire the p and m after the worker wakes from parking
type parkInfo struct {
m muintptr // Release this m on park.
attach puintptr // If non-nil, attach to this p on park.
}
// We pass park to a gopark unlock function, so it can't be on
// the stack (see gopark). Prevent deadlock from recursively
// starting GC by disabling preemption.
gp.m.preemptoff = "GC worker init"
park := new(parkInfo)
gp.m.preemptoff = ""
// Record the m and p in park so they can be passed to gopark and recovered when gcController.findRunnable wakes the worker
park.m.set(acquirem())
park.attach.set(_p_)
// Inform gcBgMarkStartWorkers that this worker is ready.
// After this point, the background mark worker is scheduled
// cooperatively by gcController.findRunnable. Hence, it must
// never be preempted, as this would put it into _Grunnable
// and put it on a run queue. Instead, when the preempt flag
// is set, this puts itself into _Gwaiting to be woken up by
// gcController.findRunnable at the appropriate time.
// Wake gcBgMarkStartWorkers out of notetsleepg so that it can continue and eventually return
notewakeup(&work.bgMarkReady)
for {
// Go to sleep until woken by gcController.findRunnable.
// We can't releasem yet since even the call to gopark
// may be preempted.
// Park this g
gopark(func(g *g, parkp unsafe.Pointer) bool {
park := (*parkInfo)(parkp)
// The worker G is no longer running, so it's
// now safe to allow preemption.
// Release the m that was acquired earlier
releasem(park.m.ptr())
// If the worker isn't attached to its P,
// attach now. During initialization and after
// a phase change, the worker may have been
// running on a different P. As soon as we
// attach, the owner P may schedule the
// worker, so this must be done after the G is
// stopped.
// Attach to the P that was recorded in park above
if park.attach != 0 {
p := park.attach.ptr()
park.attach.set(nil)
// cas the worker because we may be
// racing with a new worker starting
// on this P.
if !p.gcBgMarkWorker.cas(0, guintptr(unsafe.Pointer(g))) {
// The P got a new worker.
// Exit this worker.
return false
}
}
return true
}, unsafe.Pointer(park), waitReasonGCWorkerIdle, traceEvGoBlock, 0)
// Loop until the P dies and disassociates this
// worker (the P may later be reused, in which case
// it will get a new worker) or we failed to associate.
// Check whether the P's gcBgMarkWorker is still this G; if not, end this worker
if _p_.gcBgMarkWorker.ptr() != gp {
break
}
// Disable preemption so we can use the gcw. If the
// scheduler wants to preempt us, we'll stop draining,
// dispose the gcw, and then preempt.
// The m was released in gopark's unlock function above; acquire it again here
park.m.set(acquirem())
if gcBlackenEnabled == 0 {
throw("gcBgMarkWorker: blackening not enabled")
}
startTime := nanotime()
// Record when this mark worker started
_p_.gcMarkWorkerStartTime = startTime
decnwait := atomic.Xadd(&work.nwait, -1)
if decnwait == work.nproc {
println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
throw("work.nwait was > work.nproc")
}
// Switch to g0 to do the work
systemstack(func() {
// Mark our goroutine preemptible so its stack
// can be scanned. This lets two mark workers
// scan each other (otherwise, they would
// deadlock). We must not modify anything on
// the G stack. However, stack shrinking is
// disabled for mark workers, so it is safe to
// read from the G stack.
// Put the G into the waiting state so that another g can scan its stack (two workers can scan each other's stacks)
casgstatus(gp, _Grunning, _Gwaiting)
switch _p_.gcMarkWorkerMode {
default:
throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
case gcMarkWorkerDedicatedMode:
// Dedicated mode: this worker does nothing but marking
gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
if gp.preempt {
// Preempted: move every G in the local run queue to the global run queue
// We were preempted. This is
// a useful signal to kick
// everything out of the run
// queue so it can run
// somewhere else.
lock(&sched.lock)
for {
gp, _ := runqget(_p_)
if gp == nil {
break
}
globrunqput(gp)
}
unlock(&sched.lock)
}
// Go back to draining, this time
// without preemption.
// Continue marking
gcDrain(&_p_.gcw, gcDrainNoBlock|gcDrainFlushBgCredit)
case gcMarkWorkerFractionalMode:
// Mark until preempted
gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
case gcMarkWorkerIdleMode:
// Mark only while the P is otherwise idle
gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
}
// Switch the G from the waiting state back to running
casgstatus(gp, _Gwaiting, _Grunning)
})
// If we are nearing the end of mark, dispose
// of the cache promptly. We must do this
// before signaling that we're no longer
// working so that other workers can't observe
// no workers and no work while we have this
// cached, and before we compute done.
// Promptly flush the local work cache to the global queue
if gcBlackenPromptly {
_p_.gcw.dispose()
}
// Account for time.
// Accumulate the elapsed time
duration := nanotime() - startTime
switch _p_.gcMarkWorkerMode {
case gcMarkWorkerDedicatedMode:
atomic.Xaddint64(&gcController.dedicatedMarkTime, duration)
atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 1)
case gcMarkWorkerFractionalMode:
atomic.Xaddint64(&gcController.fractionalMarkTime, duration)
atomic.Xaddint64(&_p_.gcFractionalMarkTime, duration)
case gcMarkWorkerIdleMode:
atomic.Xaddint64(&gcController.idleMarkTime, duration)
}
// Was this the last worker and did we run out
// of work?
incnwait := atomic.Xadd(&work.nwait, +1)
if incnwait > work.nproc {
println("runtime: p.gcMarkWorkerMode=", _p_.gcMarkWorkerMode,
"work.nwait=", incnwait, "work.nproc=", work.nproc)
throw("work.nwait > work.nproc")
}
// If this worker reached a background mark completion
// point, signal the main GC goroutine.
if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
// Make this G preemptible and disassociate it
// as the worker for this P so
// findRunnableGCWorker doesn't try to
// schedule it.
// Detach this worker from the P and release the m
_p_.gcBgMarkWorker.set(nil)
releasem(park.m.ptr())
gcMarkDone()
// Disable preemption and prepare to reattach
// to the P.
//
// We may be running on a different P at this
// point, so we can't reattach until this G is
// parked.
park.m.set(acquirem())
park.attach.set(_p_)
}
}
}
gcDrain
The core implementation of tri-color marking
gcDrain scans roots and objects and blackens grey objects, until all roots and objects have been marked
func gcDrain(gcw *gcWork, flags gcDrainFlags) {
if !writeBarrier.needed {
throw("gcDrain phase incorrect")
}
gp := getg().m.curg
// Whether to return when the preempt flag is seen
preemptible := flags&gcDrainUntilPreempt != 0
// Whether to block waiting for work when none is available
blocking := flags&(gcDrainUntilPreempt|gcDrainIdle|gcDrainFractional|gcDrainNoBlock) == 0
// Whether to flush background scan credit, which reduces mutator assists and wakes waiting Gs
flushBgCredit := flags&gcDrainFlushBgCredit != 0
// Whether this is idle-time marking
idle := flags&gcDrainIdle != 0
// Record the scan work already done at entry
initScanWork := gcw.scanWork
// checkWork is the scan work before performing the next
// self-preempt check.
// Choose the check function for the current mode
checkWork := int64(1<<63 - 1)
var check func() bool
if flags&(gcDrainIdle|gcDrainFractional) != 0 {
checkWork = initScanWork + drainCheckThreshold
if idle {
check = pollWork
} else if flags&gcDrainFractional != 0 {
check = pollFractionalWorkerExit
}
}
// Drain root marking jobs.
// Scan the roots if root scanning has not finished
if work.markrootNext < work.markrootJobs {
for !(preemptible && gp.preempt) {
job := atomic.Xadd(&work.markrootNext, +1) - 1
if job >= work.markrootJobs {
break
}
// Run one root scanning job
markroot(gcw, job)
if check != nil && check() {
goto done
}
}
}
// Drain heap marking jobs.
// Loop until preempted
for !(preemptible && gp.preempt) {
// Try to keep work available on the global queue. We used to
// check if there were waiting workers, but it's better to
// just keep work available than to make workers wait. In the
// worst case, we'll do O(log(_WorkbufSize)) unnecessary
// balances.
if work.full == 0 {
// Balance work: if the global mark queue is empty, hand part of the local work over to it
gcw.balance()
}
var b uintptr
if blocking {
b = gcw.get()
} else {
b = gcw.tryGetFast()
if b == 0 {
b = gcw.tryGet()
}
}
// No work could be obtained; leave the loop
if b == 0 {
// work barrier reached or tryGet failed.
break
}
// Scan the object just obtained
scanobject(b, gcw)
// Flush background scan work credit to the global
// account if we've accumulated enough locally so
// mutator assists can draw on it.
// Once the local scan work reaches gcCreditSlack, add it to the global counter in one batch
if gcw.scanWork >= gcCreditSlack {
atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)
if flushBgCredit {
gcFlushBgCredit(gcw.scanWork - initScanWork)
initScanWork = 0
}
checkWork -= gcw.scanWork
gcw.scanWork = 0
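// When the scan work reaches checkWork, the threshold for the next self-preempt check, call the check function for this mode:
// pollWork in idle mode, pollFractionalWorkerExit in fractional mode (both chosen near the top of this function)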
if checkWork <= 0 {
checkWork += drainCheckThreshold
if check != nil && check() {
break
}
}
}
}
// In blocking mode, write barriers are not allowed after this
// point because we must preserve the condition that the work
// buffers are empty.
done:
// Flush remaining scan work credit.
if gcw.scanWork > 0 {
// Add the remaining scan work to the global counter
atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)
if flushBgCredit {
gcFlushBgCredit(gcw.scanWork - initScanWork)
}
gcw.scanWork = 0
}
}
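One detail of gcDrain worth isolating is the batching of scan-work credit: each worker counts locally and only touches the shared counter once a threshold is reached, then flushes the remainder at the end. The sketch below shows that pattern with `sync/atomic`. The names `drain` and `creditSlack` and the value 2000 are assumptions for illustration; they are not the runtime's identifiers or constants.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// creditSlack is a hypothetical flush threshold for this sketch.
const creditSlack = 2000

// drain adds up per-object scan work locally and flushes it to the
// shared counter in batches. It returns the number of flushes.
func drain(global *int64, work []int64) (flushes int) {
	var local int64
	for _, w := range work {
		local += w
		if local >= creditSlack {
			atomic.AddInt64(global, local) // one atomic op per batch
			local = 0
			flushes++
		}
	}
	if local > 0 { // flush whatever is left, like the done: label
		atomic.AddInt64(global, local)
		flushes++
	}
	return flushes
}

func main() {
	var global int64
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			drain(&global, []int64{1500, 600, 100})
		}()
	}
	wg.Wait()
	fmt.Println("total scan work:", atomic.LoadInt64(&global)) // 4 * 2200 = 8800
}
```

Batching keeps contention on the shared counter low, at the cost of the global value lagging slightly behind the true total.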
markroot
This function scans the root objects
func markroot(gcw *gcWork, i uint32) {
// TODO(austin): This is a bit ridiculous. Compute and store
// the bases in gcMarkRootPrepare instead of the counts.
baseFlushCache := uint32(fixedRootCount)
baseData := baseFlushCache + uint32(work.nFlushCacheRoots)
baseBSS := baseData + uint32(work.nDataRoots)
baseSpans := baseBSS + uint32(work.nBSSRoots)
baseStacks := baseSpans + uint32(work.nSpanRoots)
end := baseStacks + uint32(work.nStackRoots)
// Note: if you add a case here, please also update heapdump.go:dumproots.
switch {
// Flush the spans held in the mcache
case baseFlushCache <= i && i < baseData:
flushmcache(int(i - baseFlushCache))
// Scan the data segment (initialized global variables)
case baseData <= i && i < baseBSS:
for _, datap := range activeModules() {
markrootBlock(datap.data, datap.edata-datap.data, datap.gcdatamask.bytedata, gcw, int(i-baseData))
}
// Scan the BSS segment (zero-initialized global variables)
case baseBSS <= i && i < baseSpans:
for _, datap := range activeModules() {
markrootBlock(datap.bss, datap.ebss-datap.bss, datap.gcbssmask.bytedata, gcw, int(i-baseBSS))
}
// Scan the finalizer queue
case i == fixedRootFinalizers:
// Only do this once per GC cycle since we don't call
// queuefinalizer during marking.
if work.markrootDone {
break
}
for fb := allfin; fb != nil; fb = fb.alllink {
cnt := uintptr(atomic.Load(&fb.cnt))
scanblock(uintptr(unsafe.Pointer(&fb.fin[0])), cnt*unsafe.Sizeof(fb.fin[0]), &finptrmask[0], gcw)
}
// Free the stacks of dead Gs
case i == fixedRootFreeGStacks:
// Only do this once per GC cycle; preferably
// concurrently.
if !work.markrootDone {
// Switch to the system stack so we can call
// stackfree.
systemstack(markrootFreeGStacks)
}
// Scan MSpan.specials
case baseSpans <= i && i < baseStacks:
// mark MSpan.specials
markrootSpans(gcw, int(i-baseSpans))
default:
// the rest is scanning goroutine stacks
// Find the g to scan
var gp *g
if baseStacks <= i && i < end {
gp = allgs[i-baseStacks]
} else {
throw("markroot: bad index")
}
// remember when we've first observed the G blocked
// needed only to output in traceback
status := readgstatus(gp) // We are not in a scan state
if (status == _Gwaiting || status == _Gsyscall) && gp.waitsince == 0 {
gp.waitsince = work.tstart
}
// scang must be done on the system stack in case
// we're trying to scan our own stack.
// Hand the scan over to g0
systemstack(func() {
// If this is a self-scan, put the user G in
// _Gwaiting to prevent self-deadlock. It may
// already be in _Gwaiting if this is a mark
// worker or we're in mark termination.
userG := getg().m.curg
selfScan := gp == userG && readgstatus(userG) == _Grunning
// If we are scanning our own stack, switch our own g's status
if selfScan {
casgstatus(userG, _Grunning, _Gwaiting)
userG.waitreason = waitReasonGarbageCollectionScan
}
// TODO: scang blocks until gp's stack has
// been scanned, which may take a while for
// running goroutines. Consider doing this in
// two phases where the first is non-blocking:
// we scan the stacks we can and ask running
// goroutines to scan themselves; and the
// second blocks.
// Scan the g's stack
scang(gp, gcw)
if selfScan {
casgstatus(userG, _Gwaiting, _Grunning)
}
})
}
}
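The way markroot maps a flat job index onto the different kinds of roots is easier to see in isolation. The sketch below reproduces only the range arithmetic. `rootCounts` and `classify` are hypothetical names, and the fixed indices (finalizers at 0, free G stacks at 1, ranges starting at 2) are an assumption about the runtime version quoted in this article.

```go
package main

import "fmt"

// Fixed root job indices, assumed to match the quoted runtime version.
const (
	fixedRootFinalizers = iota
	fixedRootFreeGStacks
	fixedRootCount
)

// rootCounts holds how many jobs each variable-size root group has.
type rootCounts struct {
	flushCache, data, bss, spans, stacks uint32
}

// classify returns which kind of root job index i refers to.
func classify(c rootCounts, i uint32) string {
	baseFlushCache := uint32(fixedRootCount)
	baseData := baseFlushCache + c.flushCache
	baseBSS := baseData + c.data
	baseSpans := baseBSS + c.bss
	baseStacks := baseSpans + c.spans
	end := baseStacks + c.stacks

	switch {
	case i == fixedRootFinalizers:
		return "finalizers"
	case i == fixedRootFreeGStacks:
		return "free G stacks"
	case baseFlushCache <= i && i < baseData:
		return "flush mcache"
	case baseData <= i && i < baseBSS:
		return "data"
	case baseBSS <= i && i < baseSpans:
		return "bss"
	case baseSpans <= i && i < baseStacks:
		return "span specials"
	case baseStacks <= i && i < end:
		return "stack"
	}
	return "bad index"
}

func main() {
	c := rootCounts{flushCache: 1, data: 2, bss: 3, spans: 4, stacks: 5}
	for _, i := range []uint32{0, 1, 2, 4, 7, 8, 16, 17} {
		fmt.Printf("job %2d -> %s\n", i, classify(c, i))
	}
}
```

Because every job is just an integer, workers can claim jobs with a single atomic increment, which is what gcDrain does with work.markrootNext.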
markrootBlock
Scans the region [b0, b0+n0) according to ptrmask0
func markrootBlock(b0, n0 uintptr, ptrmask0 *uint8, gcw *gcWork, shard int) {
if rootBlockBytes%(8*sys.PtrSize) != 0 {
// This is necessary to pick byte offsets in ptrmask0.
throw("rootBlockBytes must be a multiple of 8*ptrSize")
}
b := b0 + uintptr(shard)*rootBlockBytes
// Return immediately if this shard lies beyond the region [b0, b0+n0)
if b >= b0+n0 {
return
}
ptrmask := (*uint8)(add(unsafe.Pointer(ptrmask0), uintptr(shard)*(rootBlockBytes/(8*sys.PtrSize))))
n := uintptr(rootBlockBytes)
if b+n > b0+n0 {
n = b0 + n0 - b
}
// Scan this shard.
// Scan the given shard of the block
scanblock(b, n, ptrmask, gcw)
}
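The shard arithmetic in markrootBlock is simple enough to check by hand. The sketch below computes the start address and length of one shard and clamps the last shard to the end of the region. `shardRange` is a hypothetical helper, and the 256 KiB shard size is an assumption about the value of rootBlockBytes.

```go
package main

import "fmt"

// rootBlockBytes is the shard size; 256 KiB is assumed here.
const rootBlockBytes = 256 << 10

// shardRange returns the start and length of the given shard inside
// [b0, b0+n0). ok is false when the shard lies past the region.
func shardRange(b0, n0 uintptr, shard int) (b, n uintptr, ok bool) {
	b = b0 + uintptr(shard)*rootBlockBytes
	if b >= b0+n0 {
		return 0, 0, false
	}
	n = rootBlockBytes
	if b+n > b0+n0 {
		n = b0 + n0 - b // the last shard may be shorter
	}
	return b, n, true
}

func main() {
	const base, size = 0x1000, 600 << 10 // a 600 KiB region
	for shard := 0; shard < 4; shard++ {
		b, n, ok := shardRange(base, size, shard)
		fmt.Printf("shard %d: start=%#x len=%d ok=%v\n", shard, b, n, ok)
	}
}
```

A 600 KiB region therefore yields two full shards, one 88 KiB shard, and nothing for shard 3.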
scanblock
func scanblock(b0, n0 uintptr, ptrmask *uint8, gcw *gcWork) {
// Use local copies of original parameters, so that a stack trace
// due to one of the throws below shows the original block
// base and extent.
b := b0
n := n0
for i := uintptr(0); i < n; {
// Find bits for the next word.
// Find the corresponding bits in the bitmap
bits := uint32(*addb(ptrmask, i/(sys.PtrSize*8)))
if bits == 0 {
i += sys.PtrSize * 8
continue
}
for j := 0; j < 8 && i < n; j++ {
if bits&1 != 0 {
// This word holds a pointer
// Same work as in scanobject; see comments there.
obj := *(*uintptr)(unsafe.Pointer(b + i))
if obj != 0 {
// If the address resolves to a heap object, grey it
if obj, span, objIndex := findObject(obj, b, i); obj != 0 {
greyobject(obj, b, i, span, gcw, objIndex)
}
}
}
bits >>= 1
i += sys.PtrSize
}
}
}
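The bitmap walk in scanblock can be tried out without any unsafe memory access. In the sketch below one bit describes one pointer-sized word, and the function returns the indices of the words that the mask says hold pointers. `pointerSlots` is a hypothetical name, and the loop counts words rather than bytes to keep the example short.

```go
package main

import "fmt"

// pointerSlots walks a pointer bitmap (one bit per word, least
// significant bit first) and returns the word indices that are
// flagged as pointers, looking at the first nwords words only.
func pointerSlots(ptrmask []byte, nwords int) []int {
	var slots []int
	for i := 0; i < nwords; {
		bits := ptrmask[i/8]
		if bits == 0 {
			i += 8 // no pointers in these eight words, skip them
			continue
		}
		for j := 0; j < 8 && i < nwords; j++ {
			if bits&1 != 0 {
				slots = append(slots, i)
			}
			bits >>= 1
			i++
		}
	}
	return slots
}

func main() {
	// Words 0 and 2 and word 23 are pointers; the middle byte is empty.
	mask := []byte{0x05, 0x00, 0x80}
	fmt.Println(pointerSlots(mask, 24)) // prints [0 2 23]
}
```

The early `continue` for an all-zero byte is the same shortcut the runtime code takes to skip pointer-free runs quickly.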
greyobject
Greying an object amounts to finding its mark bit, marking it live, and putting it on the work queue
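Stripped of its debugging branches, the idea can be sketched in a few lines. This is a toy under assumptions: `span`, `workQueue` and `grey` are hypothetical stand-ins, objects are identified by index, and the mark bit is set with a plain OR where the runtime uses an atomic one.

```go
package main

import "fmt"

// span is a toy span: one mark bit per object, plus a noscan flag
// for spans whose objects contain no pointers.
type span struct {
	markBits []uint8
	noscan   bool
}

// workQueue is a toy grey queue holding object indices.
type workQueue struct {
	objs []int
}

// grey marks object idx and queues it for scanning if needed.
// It reports whether the object was put on the queue.
func grey(s *span, idx int, q *workQueue) bool {
	bytep := &s.markBits[idx/8]
	mask := uint8(1) << (uint(idx) % 8)
	if *bytep&mask != 0 {
		return false // already marked, nothing to do
	}
	*bytep |= mask // the runtime sets this bit atomically
	if s.noscan {
		return false // no pointers inside: effectively black already
	}
	q.objs = append(q.objs, idx)
	return true
}

func main() {
	s := &span{markBits: make([]uint8, 2)}
	q := &workQueue{}
	fmt.Println(grey(s, 3, q), grey(s, 3, q), q.objs) // prints true false [3]
}
```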
func greyobject(obj, base, off uintptr, span *mspan, gcw *gcWork, objIndex uintptr) {
// obj should be start of allocation, and so must be at least pointer-aligned.
if obj&(sys.PtrSize-1) != 0 {
throw("greyobject: obj not pointer-aligned")
}
mbits := span.markBitsForIndex(objIndex)
if useCheckmark {
// Debug-only path: verify that every object has been marked correctly
if !mbits.isMarked() {
// This object was not marked
printlock()
print("runtime:greyobject: checkmarks finds unexpected unmarked object obj=", hex(obj), "\n")
print("runtime: found obj at *(", hex(base), "+", hex(off), ")\n")
// Dump the source (base) object
gcDumpObject("base", base, off)
// Dump the object
gcDumpObject("obj", obj, ^uintptr(0))
getg().m.traceback = 2
throw("checkmark found unmarked object")
}
hbits := heapBitsForAddr(obj)
if hbits.isCheckmarked(span.elemsize) {
return
}
hbits.setCheckmarked(span.elemsize)
if !hbits.isCheckmarked(span.elemsize) {
throw("setCheckmarked and isCheckmarked disagree")
}
} else {
if debug.gccheckmark > 0 && span.isFree(objIndex) {
print("runtime: marking free object ", hex(obj), " found at *(", hex(base), "+", hex(off), ")\n")
gcDumpObject("base", base, off)
gcDumpObject("obj", obj, ^uintptr(0))
getg().m.traceback = 2
throw("marking free object")
}
// If marked we have nothing to do.
// The object is already marked; nothing more to do
if mbits.isMarked() {
return
}
// mbits.setMarked() // Avoid extra call overhead with manual inlining.
// Mark the object
atomic.Or8(mbits.bytep, mbits.mask)
// If this is a noscan object, fast-track it to black
// instead of greying it.
// A noscan object contains no pointers, so marking it is enough; it skips the queue and is effectively black right away
if span.spanclass.noscan() {
gcw.bytesMarked += uint64(span.elemsize)
return
}
}
// Queue the obj for scanning. The PREFETCH(obj) logic has been removed but
// seems like a nice optimization that can be added back in.
// There needs to be time between the PREFETCH and the use.
// Previously we put the obj in an 8 element buffer that is drained at a rate
// to give the PREFETCH time to do its work.
// Use of PREFETCHNTA might be more appropriate than PREFETCH
// Try the fast path to enqueue the object and fall back to the slow path if it fails; greying is then complete
if !gcw.putFast(obj) {
gcw.put(obj)
}