Reading the GMP Model from the Source Code

March 6, 2023 · 15 min read

path: src/runtime/runtime2.go

Go version 1.16.1

G

type g struct {
 // Stack parameters.
 // stack describes the actual stack memory: [stack.lo, stack.hi).
 // stackguard0 is the stack pointer compared in the Go stack growth prologue.
 // It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption.
 // stackguard1 is the stack pointer compared in the C stack growth prologue.
 // It is stack.lo+StackGuard on g0 and gsignal stacks.
 // It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash).
 stack       stack   // offset known to runtime/cgo
 stackguard0 uintptr // offset known to liblink
 stackguard1 uintptr // offset known to liblink

 _panic       *_panic // innermost panic - offset known to liblink
 _defer       *_defer // innermost defer
 m            *m      // current m; offset known to arm liblink
 sched        gobuf
 syscallsp    uintptr        // if status==Gsyscall, syscallsp = sched.sp to use during gc
 syscallpc    uintptr        // if status==Gsyscall, syscallpc = sched.pc to use during gc
 stktopsp     uintptr        // expected sp at top of stack, to check in traceback
 param        unsafe.Pointer // passed parameter on wakeup
 atomicstatus uint32
 stackLock    uint32 // sigprof/scang lock; TODO: fold in to atomicstatus
 goid         int64
 schedlink    guintptr
 waitsince    int64      // approx time when the g become blocked
 waitreason   waitReason // if status==Gwaiting

 preempt       bool // preemption signal, duplicates stackguard0 = stackpreempt
 preemptStop   bool // transition to _Gpreempted on preemption; otherwise, just deschedule
 preemptShrink bool // shrink stack at synchronous safe point

 // asyncSafePoint is set if g is stopped at an asynchronous
 // safe point. This means there are frames on the stack
 // without precise pointer information.
 asyncSafePoint bool

 paniconfault bool // panic (instead of crash) on unexpected fault address
 gcscandone   bool // g has scanned stack; protected by _Gscan bit in status
 throwsplit   bool // must not split stack
 // activeStackChans indicates that there are unlocked channels
 // pointing into this goroutine's stack. If true, stack
 // copying needs to acquire channel locks to protect these
 // areas of the stack.
 activeStackChans bool
 // parkingOnChan indicates that the goroutine is about to
 // park on a chansend or chanrecv. Used to signal an unsafe point
 // for stack shrinking. It's a boolean value, but is updated atomically.
 parkingOnChan uint8

 raceignore     int8     // ignore race detection events
 sysblocktraced bool     // StartTrace has emitted EvGoInSyscall about this goroutine
 sysexitticks   int64    // cputicks when syscall has returned (for tracing)
 traceseq       uint64   // trace event sequencer
 tracelastp     puintptr // last P emitted an event for this goroutine
 lockedm        muintptr
 sig            uint32
 writebuf       []byte
 sigcode0       uintptr
 sigcode1       uintptr
 sigpc          uintptr
 gopc           uintptr         // pc of go statement that created this goroutine
 ancestors      *[]ancestorInfo // ancestor information goroutine(s) that created this goroutine (only used if debug.tracebackancestors)
 startpc        uintptr         // pc of goroutine function
 racectx        uintptr
 waiting        *sudog         // sudog structures this g is waiting on (that have a valid elem ptr); in lock order
 cgoCtxt        []uintptr      // cgo traceback context
 labels         unsafe.Pointer // profiler labels
 timer          *timer         // cached timer for time.Sleep
 selectDone     uint32         // are we participating in a select and did someone win the race?

 // Per-G GC state

 // gcAssistBytes is this G's GC assist credit in terms of
 // bytes allocated. If this is positive, then the G has credit
 // to allocate gcAssistBytes bytes without assisting. If this
 // is negative, then the G must correct this by performing
 // scan work. We track this in bytes to make it fast to update
 // and check for debt in the malloc hot path. The assist ratio
 // determines how this corresponds to scan work debt.
 gcAssistBytes int64
}

M

type m struct {
 g0      *g     // goroutine with scheduling stack
 morebuf gobuf  // gobuf arg to morestack
 divmod  uint32 // div/mod denominator for arm - known to liblink

 // Fields not known to debuggers.
 procid        uint64       // for debuggers, but offset not hard-coded
 gsignal       *g           // signal-handling g
 goSigStack    gsignalStack // Go-allocated signal handling stack
 sigmask       sigset       // storage for saved signal mask
 tls           [6]uintptr   // thread-local storage (for x86 extern register)
 mstartfn      func()
 curg          *g       // current running goroutine
 caughtsig     guintptr // goroutine running during fatal signal
 p             puintptr // attached p for executing go code (nil if not executing go code)
 nextp         puintptr
 oldp          puintptr // the p that was attached before executing a syscall
 id            int64
 mallocing     int32
 throwing      int32
 preemptoff    string // if != "", keep curg running on this m
 locks         int32
 dying         int32
 profilehz     int32
 spinning      bool // m is out of work and is actively looking for work
 blocked       bool // m is blocked on a note
 newSigstack   bool // minit on C thread called sigaltstack
 printlock     int8
 incgo         bool   // m is executing a cgo call
 freeWait      uint32 // if == 0, safe to free g0 and delete m (atomic)
 fastrand      [2]uint32
 needextram    bool
 traceback     uint8
 ncgocall      uint64      // number of cgo calls in total
 ncgo          int32       // number of cgo calls currently in progress
 cgoCallersUse uint32      // if non-zero, cgoCallers in use temporarily
 cgoCallers    *cgoCallers // cgo traceback if crashing in cgo call
 doesPark      bool        // non-P running threads: sysmon and newmHandoff never use .park
 park          note
 alllink       *m // on allm
 schedlink     muintptr
 lockedg       guintptr
 createstack   [32]uintptr // stack that created this thread.
 lockedExt     uint32      // tracking for external LockOSThread
 lockedInt     uint32      // tracking for internal lockOSThread
 nextwaitm     muintptr    // next m waiting for lock
 waitunlockf   func(*g, unsafe.Pointer) bool
 waitlock      unsafe.Pointer
 waittraceev   byte
 waittraceskip int
 startingtrace bool
 syscalltick   uint32
 freelink      *m // on sched.freem

 // mFixup is used to synchronize OS related m state (credentials etc)
 // use mutex to access.
 mFixup struct {
  lock mutex
  fn   func(bool) bool
 }

 // these are here because they are too large to be on the stack
 // of low-level NOSPLIT functions.
 libcall   libcall
 libcallpc uintptr // for cpu profiler
 libcallsp uintptr
 libcallg  guintptr
 syscall   libcall // stores syscall parameters on windows

 vdsoSP uintptr // SP for traceback while in VDSO call (0 if not in call)
 vdsoPC uintptr // PC for traceback while in VDSO call

 // preemptGen counts the number of completed preemption
 // signals. This is used to detect when a preemption is
 // requested, but fails. Accessed atomically.
 preemptGen uint32

 // Whether this is a pending preemption signal on this M.
 // Accessed atomically.
 signalPending uint32

 dlogPerM

 mOS

 // Up to 10 locks held by this m, maintained by the lock ranking code.
 locksHeldLen int
 locksHeld    [10]heldLockInfo
}

P

type p struct {
 id          int32
 status      uint32 // one of pidle/prunning/...
 link        puintptr
 schedtick   uint32     // incremented on every scheduler call
 syscalltick uint32     // incremented on every system call
 sysmontick  sysmontick // last tick observed by sysmon
 m           muintptr   // back-link to associated m (nil if idle)
 mcache      *mcache
 pcache      pageCache
 raceprocctx uintptr

 deferpool    [5][]*_defer // pool of available defer structs of different sizes (see panic.go)
 deferpoolbuf [5][32]*_defer

 // Cache of goroutine ids, amortizes accesses to runtime·sched.goidgen.
 goidcache    uint64
 goidcacheend uint64

 // Queue of runnable goroutines. Accessed without lock.
 runqhead uint32
 runqtail uint32
 runq     [256]guintptr
 // runnext, if non-nil, is a runnable G that was ready'd by
 // the current G and should be run next instead of what's in
 // runq if there's time remaining in the running G's time
 // slice. It will inherit the time left in the current time
 // slice. If a set of goroutines is locked in a
 // communicate-and-wait pattern, this schedules that set as a
 // unit and eliminates the (potentially large) scheduling
 // latency that otherwise arises from adding the ready'd
 // goroutines to the end of the run queue.
 runnext guintptr

 // Available G's (status == Gdead)
 gFree struct {
  gList
  n int32
 }

 sudogcache []*sudog
 sudogbuf   [128]*sudog

 // Cache of mspan objects from the heap.
 mspancache struct {
  // We need an explicit length here because this field is used
  // in allocation codepaths where write barriers are not allowed,
  // and eliminating the write barrier/keeping it eliminated from
  // slice updates is tricky, moreso than just managing the length
  // ourselves.
  len int
  buf [128]*mspan
 }

 tracebuf traceBufPtr

 // traceSweep indicates the sweep events should be traced.
 // This is used to defer the sweep start event until a span
 // has actually been swept.
 traceSweep bool
 // traceSwept and traceReclaimed track the number of bytes
 // swept and reclaimed by sweeping in the current sweep loop.
 traceSwept, traceReclaimed uintptr

 palloc persistentAlloc // per-P to avoid mutex

 _ uint32 // Alignment for atomic fields below

 // The when field of the first entry on the timer heap.
 // This is updated using atomic functions.
 // This is 0 if the timer heap is empty.
 timer0When uint64

 // The earliest known nextwhen field of a timer with
 // timerModifiedEarlier status. Because the timer may have been
 // modified again, there need not be any timer with this value.
 // This is updated using atomic functions.
 // This is 0 if the value is unknown.
 timerModifiedEarliest uint64

 // Per-P GC state
 gcAssistTime         int64 // Nanoseconds in assistAlloc
 gcFractionalMarkTime int64 // Nanoseconds in fractional mark worker (atomic)

 // gcMarkWorkerMode is the mode for the next mark worker to run in.
 // That is, this is used to communicate with the worker goroutine
 // selected for immediate execution by
 // gcController.findRunnableGCWorker. When scheduling other goroutines,
 // this field must be set to gcMarkWorkerNotWorker.
 gcMarkWorkerMode gcMarkWorkerMode
 // gcMarkWorkerStartTime is the nanotime() at which the most recent
 // mark worker started.
 gcMarkWorkerStartTime int64

 // gcw is this P's GC work buffer cache. The work buffer is
 // filled by write barriers, drained by mutator assists, and
 // disposed on certain GC state transitions.
 gcw gcWork

 // wbBuf is this P's GC write barrier buffer.
 //
 // TODO: Consider caching this in the running G.
 wbBuf wbBuf

 runSafePointFn uint32 // if 1, run sched.safePointFn at next safe point

 // statsSeq is a counter indicating whether this P is currently
 // writing any stats. Its value is even when not, odd when it is.
 statsSeq uint32

 // Lock for timers. We normally access the timers while running
 // on this P, but the scheduler can also do it from a different P.
 timersLock mutex

 // Actions to take at some time. This is used to implement the
 // standard library's time package.
 // Must hold timersLock to access.
 timers []*timer

 // Number of timers in P's heap.
 // Modified using atomic instructions.
 numTimers uint32

 // Number of timerModifiedEarlier timers on P's heap.
 // This should only be modified while holding timersLock,
 // or while the timer status is in a transient state
 // such as timerModifying.
 adjustTimers uint32

 // Number of timerDeleted timers in P's heap.
 // Modified using atomic instructions.
 deletedTimers uint32

 // Race context used while executing timer functions.
 timerRaceCtx uintptr

 // preempt is set to indicate that this P should be enter the
 // scheduler ASAP (regardless of what G is running on it).
 preempt bool

 pad cpu.CacheLinePad
}

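Go's scheduling model

M (machine): a kernel thread. P (processor): a logical processor. G (goroutine): a coroutine.

G: the Go runtime's description of a goroutine. A G stores the entry address of the code to run concurrently, its context, its running environment (the associated P and M), its stack, and other execution-related information. Creating, sleeping, resuming, and stopping a G are all managed by the Go runtime.

The Go runtime's monitor thread watches how Gs are scheduled, so a G does not block a system thread for long; the runtime scheduler automatically switches to another G. A newly created or resumed G is added to a run queue, where it waits for an M to take it and run it.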
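The behavior described above can be observed from user code. The following is a minimal sketch, not runtime code: the function name `runOnOneP` and the two goroutine roles are made up for illustration. With a single P, one goroutine waits on a channel while the other still gets to run, which suggests that a waiting G gives up the P instead of holding it.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runOnOneP runs two goroutines on a single P and returns the order in
// which they recorded their work.
func runOnOneP() []string {
	prev := runtime.GOMAXPROCS(1) // one P: only one G runs Go code at a time
	defer runtime.GOMAXPROCS(prev)

	var (
		mu    sync.Mutex
		order []string
		wg    sync.WaitGroup
	)
	record := func(s string) {
		mu.Lock()
		order = append(order, s)
		mu.Unlock()
	}

	started := make(chan struct{})
	unblock := make(chan struct{})
	wg.Add(2)

	go func() { // waiter: waits on a channel receive
		defer wg.Done()
		close(started)
		<-unblock
		record("waiter resumed")
	}()
	go func() { // worker: makes progress while the waiter waits
		defer wg.Done()
		<-started
		record("worker ran")
		close(unblock)
	}()

	wg.Wait()
	return order
}

func main() {
	for _, s := range runOnOneP() {
		fmt.Println(s)
	}
}
```

The channel operations fix the order of the two records, so the worker's line always comes first even though only one P is available.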

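M: an OS kernel thread, the entity the operating system schedules and executes. An M is only responsible for execution: Ms are repeatedly woken up or created, and then run.
P: represents the resources an M needs to run Gs; it is an abstraction for managing those resources. A P is not a piece of code but a management data structure. Its main job is to reduce the complexity of M managing G by adding an indirect control layer, and it controls Go's degree of parallelism.

A P holds a queue of Gs, which lets scheduling be isolated per P: detaching a P from an M also detaches that M from the whole string of Gs it was running.

A G is not an execution body; it stores the metadata of a concurrent execution body, including the entry function, stack, and context. Because a G only holds metadata, G objects can be reused to reduce allocation and collection: reusing one only requires resetting its metadata to new values.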
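The reuse idea can be sketched with a toy free list. This is an illustration only: the types `g` and `gFreeList` below are simplified stand-ins invented for this example (loosely mirroring the `gFree` field of the `p` struct), not the runtime's actual implementation.

```go
package main

import "fmt"

// g is a toy stand-in for the runtime's g: metadata only, no code of its own.
type g struct {
	id    int
	entry func() // entry function of the concurrent task
	next  *g     // link used while the g sits on the free list
}

// gFreeList is a hypothetical, simplified free list of finished Gs.
type gFreeList struct {
	head   *g
	nextID int
}

// get reuses a finished g if one is available, otherwise allocates a new one.
// Either way the metadata is reset to fresh values.
func (l *gFreeList) get(entry func()) *g {
	gp := l.head
	if gp != nil {
		l.head = gp.next
		gp.next = nil
	} else {
		gp = &g{}
	}
	l.nextID++
	gp.id = l.nextID
	gp.entry = entry
	return gp
}

// put returns a finished g to the free list so it can be reused later.
func (l *gFreeList) put(gp *g) {
	gp.entry = nil
	gp.next = l.head
	l.head = gp
}

func main() {
	var free gFreeList

	g1 := free.get(func() { fmt.Println("task A") })
	g1.entry()
	free.put(g1)

	g2 := free.get(func() { fmt.Println("task B") })
	g2.entry()
	fmt.Println("same object reused:", g1 == g2)
}
```

The second task gets the same object as the first, with a new id and a new entry function, so no extra allocation is needed.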

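An M only executes. When an M starts, it enters the runtime's management code, and that code must acquire a P before it can schedule anything.

The number of Ps defaults to the number of CPU cores. The number of Ms is roughly the same as the number of Ps, but the runtime creates Ms dynamically depending on the current state, up to a maximum of 10000. Gs and Ps are in a many-to-many (m:n) relationship, where the number of Gs, m, can reach the thousands or more and is far larger than the number of Ps, n.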
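These limits can be inspected from a program using the public `runtime` and `runtime/debug` packages. The helper name `schedulerLimits` is made up for this sketch, and the temporary value 20000 is arbitrary; it is only there because `debug.SetMaxThreads` reports the previous limit when a new one is set.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// schedulerLimits reports the logical CPU count, the current number of Ps,
// and the current cap on OS threads (Ms).
func schedulerLimits() (cpus, procs, maxThreads int) {
	cpus = runtime.NumCPU()
	// GOMAXPROCS(0) only queries the current value; it does not change it.
	procs = runtime.GOMAXPROCS(0)
	// SetMaxThreads returns the previous limit, so set a temporary value
	// and immediately restore the original one.
	maxThreads = debug.SetMaxThreads(20000)
	debug.SetMaxThreads(maxThreads)
	return cpus, procs, maxThreads
}

func main() {
	cpus, procs, maxThreads := schedulerLimits()
	fmt.Println("logical CPUs:", cpus)
	fmt.Println("number of Ps:", procs)
	fmt.Println("max Ms:", maxThreads)
}
```

Unless the `GOMAXPROCS` environment variable or an earlier call changed it, the number of Ps should match the logical CPU count.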

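How the work-stealing algorithm works

An M and a P together form a runtime environment. Each P has a local queue of runnable Gs, and the M schedules and runs the Gs in that queue one after another. If the local queue is empty, the M takes a batch of Gs from the global queue; if the global queue is also empty, it steals a batch of Gs from another P.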
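The lookup order described above can be sketched as a small simulation. The types `p` and `sched` and the function `findRunnable` here are toy versions written for this example. The batch sizes (a share of the global queue, about half of a victim's queue) are simplifying assumptions, and other work sources the real scheduler checks are left out.

```go
package main

import "fmt"

// p is a toy processor with a local run queue of goroutine ids.
type p struct {
	runq []int
}

// sched is a toy scheduler: a global queue plus all Ps.
type sched struct {
	global []int
	ps     []*p
}

// findRunnable returns the next goroutine id for pp and where it came from.
// Order: local queue, then global queue, then steal from another P.
func (s *sched) findRunnable(pp *p) (gid int, from string, ok bool) {
	if len(pp.runq) > 0 {
		gid, pp.runq = pp.runq[0], pp.runq[1:]
		return gid, "local", true
	}
	if len(s.global) > 0 {
		// Take a share of the global queue: run one, keep the rest locally.
		n := len(s.global)/len(s.ps) + 1
		if n > len(s.global) {
			n = len(s.global)
		}
		batch := s.global[:n]
		s.global = s.global[n:]
		pp.runq = append(pp.runq, batch[1:]...)
		return batch[0], "global", true
	}
	for _, victim := range s.ps {
		if victim == pp || len(victim.runq) == 0 {
			continue
		}
		// Steal roughly half of the victim's local queue.
		n := (len(victim.runq) + 1) / 2
		stolen := victim.runq[:n]
		victim.runq = victim.runq[n:]
		pp.runq = append(pp.runq, stolen[1:]...)
		return stolen[0], "stolen", true
	}
	return 0, "", false
}

func main() {
	p0 := &p{}
	p1 := &p{runq: []int{10, 11, 12, 13}}
	s := &sched{global: []int{1, 2, 3}, ps: []*p{p0, p1}}
	for {
		gid, from, ok := s.findRunnable(p0)
		if !ok {
			break
		}
		fmt.Printf("p0 runs g%d (%s)\n", gid, from)
	}
}
```

In this run p0 starts with an empty local queue, drains the global queue first, and only then steals from p1.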

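When M, P, and G are created

The idle P list is initialized during program startup; that is when the Ps are created. The first G is also created during initialization.

Every concurrent call (go statement) initializes a new G task and then wakes an M to run it. The wake-up does not target a specific thread. The runtime first tries to use the current thread M; if that is not possible, it takes an available M from the scheduler's global idle-M list; if there is none, it creates a new M, which then binds a P and runs the G. Ms and Ps do not correspond one to one; they are paired on demand.

The M thread contains the logic for scheduling and switching stacks, but it must acquire a P before it can run. In other words, M is self-driven but needs the cooperation of a P.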

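The life of a goroutine

A goroutine is created with go func().
There are two kinds of queues that store Gs: the local queue of each P (the local scheduler) and the global G queue. A newly created G is first placed in the P's local queue; if that queue is already full, it is placed in the global queue.
A G can only run on an M, and an M must hold a P to run Go code; while it runs, M and P are bound 1:1. The M pops a runnable G from the P's local queue and executes it. If the P's local queue is empty, it takes a runnable G from another M-P pair.
An M schedules and executes Gs in a loop.
When an M is executing a G and a syscall or another blocking operation occurs, the M blocks; if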
